Skip to content

Computer Systems A Programmers Perspective

Modules

Chapter List

  • A Tour of Computer Systems
    • 1.1 Information Is Bits + Context
    • 1.2 Programs Are Translated by Other Programs into Different Forms
    • 1.3 It Pays to Understand How Compilation Systems Work
    • 1.4 Processors Read and Interpret Instructions Stored in Memory
      • 1.4.1 Hardware Organization of a System
      • 1.4.2 Running the hello Program
    • 1.5 Caches Matter
    • 1.6 Storage Devices Form a Hierarchy
    • 1.7 The Operating System Manages the Hardware
      • 1.7.1 Processes
      • 1.7.2 Threads
      • 1.7.3 Virtual Memory
      • 1.7.4 Files
    • 1.8 Systems Communicate with Other Systems Using Networks
    • 1.9 Important Themes
      • 1.9.1 Concurrency and Parallelism
      • 1.9.2 The Importance of Abstractions in Computer Systems
    • 1.10 Summary
  • Representing and Manipulating Information
    • 2.1 Information Storage
      • 2.1.1 Hexadecimal Notation
      • 2.1.2 Words
      • 2.1.3 Data Sizes
      • 2.1.4 Addressing and Byte Ordering
      • 2.1.5 Representing Strings
      • 2.1.6 Representing Code
      • 2.1.7 Introduction to Boolean Algebra
      • 2.1.8 Bit-Level Operations in C
      • 2.1.9 Logical Operations in C
      • 2.1.10 Shift Operations in C
    • 2.2 Integer Representations
      • 2.2.1 Integral Data Types
      • 2.2.2 Unsigned Encodings
      • 2.2.3 Two’s-Complement Encodings
      • 2.2.4 Conversions Between Signed and Unsigned
      • 2.2.5 Signed vs. Unsigned in C
      • 2.2.6 Expanding the Bit Representation of a Number
      • 2.2.7 Truncating Numbers
      • 2.2.8 Advice on Signed vs. Unsigned
    • 2.3 Integer Arithmetic
      • 2.3.1 Unsigned Addition
      • 2.3.2 Two’s-Complement Addition
      • 2.3.3 Two’s-Complement Negation
      • 2.3.4 Unsigned Multiplication
      • 2.3.5 Two’s-Complement Multiplication
      • 2.3.6 Multiplying by Constants
      • 2.3.7 Dividing by Powers of Two
      • 2.3.8 Final Thoughts on Integer Arithmetic
    • 2.4 Floating Point
      • 2.4.1 Fractional Binary Numbers
      • 2.4.2 IEEE Floating-Point Representation
      • 2.4.3 Example Numbers
      • 2.4.4 Rounding
      • 2.4.5 Floating-Point Operations
      • 2.4.6 Floating Point in C
    • 2.5 Summary
  • Machine-Level Representation of Programs
    • 3.1 A Historical Perspective
    • 3.2 Program Encodings
      • 3.2.1 Machine-Level Code
      • 3.2.2 Code Examples
      • 3.2.3 Notes on Formatting 3.3 Data Formats
    • 3.4 Accessing Information
      • 3.4.1 Operand Specifiers
      • 3.4.2 Data Movement Instructions
      • 3.4.3 Data Movement Example
    • 3.5 Arithmetic and Logical Operations
      • 3.5.1 Load Effective Address
      • 3.5.2 Unary and Binary Operations
      • 3.5.3 Shift Operations
      • 3.5.4 Discussion
      • 3.5.5 Special Arithmetic Operations
    • 3.6 Control
      • 3.6.1 Condition Codes
      • 3.6.2 Accessing the Condition Codes
      • 3.6.3 Jump Instructions and Their Encodings
      • 3.6.4 Translating Conditional Branches
      • 3.6.5 Loops
      • 3.6.6 Conditional Move Instructions
      • 3.6.7 Switch Statements
    • 3.7 Procedures
      • 3.7.1 Stack Frame Structure
      • 3.7.2 Transferring Control
      • 3.7.3 Register Usage Conventions
      • 3.7.4 Procedure Example
      • 3.7.5 Recursive Procedures
    • 3.8 Array Allocation and Access
      • 3.8.1 Basic Principles
      • 3.8.2 Pointer Arithmetic
      • 3.8.3 Nested Arrays
      • 3.8.4 Fixed-Size Arrays
      • 3.8.5 Variable-Size Arrays
    • 3.9 Heterogeneous Data Structures
      • 3.9.1 Structures
      • 3.9.2 Unions
      • 3.9.3 Data Alignment
    • 3.10 Putting It Together: Understanding Pointers
    • 3.11 Life in the Real World: Using the gdb Debugger
    • 3.12 Out-of-Bounds Memory References and Buffer Overflow
      • 3.12.1 Thwarting Buffer Overflow Attacks
    • 3.13 x86-64: Extending IA32 to 64 Bits
      • 3.13.1 History and Motivation for x86-64
      • 3.13.2 An Overview of x86-64
      • 3.13.3 Accessing Information
      • 3.13.4 Control
      • 3.13.5 Data Structures
      • 3.13.6 Concluding Observations about x86-64
    • 3.14 Machine-Level Representations of Floating-Point Programs
    • 3.15 Summary
    • Bibliographic Notes
  • Processor Architecture
    • 4.1 The Y86 Instruction Set Architecture
      • 4.1.1 Programmer-Visible State
      • 4.1.2 Y86 Instructions
      • 4.1.3 Instruction Encoding
      • 4.1.4 Y86 Exceptions
      • 4.1.5 Y86 Programs
      • 4.1.6 Some Y86 Instruction Details
    • 4.2 Logic Design and the Hardware Control Language HCL
      • 4.2.1 Logic Gates
      • 4.2.2 Combinational Circuits and HCL Boolean Expressions
      • 4.2.3 Word-Level Combinational Circuits and HCL Integer Expressions
      • 4.2.4 Set Membership
      • 4.2.5 Memory and Clocking
    • 4.3 Sequential Y86 Implementations
      • 4.3.1 Organizing Processing into Stages
      • 4.3.2 SEQ Hardware Structure
      • 4.3.3 SEQ Timing
      • 4.3.4 SEQ Stage Implementations
    • 4.4 General Principles of Pipelining
      • 4.4.1 Computational Pipelines
      • 4.4.2 A Detailed Look at Pipeline Operation
      • 4.4.3 Limitations of Pipelining
      • 4.4.4 Pipelining a System with Feedback
    • 4.5 Pipelined Y86 Implementations
      • 4.5.1 SEQ+: Rearranging the Computation Stages
      • 4.5.2 Inserting Pipeline Registers
      • 4.5.3 Rearranging and Relabeling Signals
      • 4.5.4 Next PC Prediction
      • 4.5.5 Pipeline Hazards
      • 4.5.6 Avoiding Data Hazards by Stalling
      • 4.5.7 Avoiding Data Hazards by Forwarding
      • 4.5.8 Load/Use Data Hazards
      • 4.5.9 Exception Handling
      • 4.5.10 PIPE Stage Implementations
      • 4.5.11 Pipeline Control Logic
      • 4.5.12 Performance Analysis
      • 4.5.13 Unfinished Business
    • 4.6 Summary
      • 4.6.1 Y86 Simulators
  • Optimizing Program Performance
    • 5.1 Capabilities and Limitations of Optimizing Compilers
    • 5.2 Expressing Program Performance
    • 5.3 Program Example
    • 5.4 Eliminating Loop Inefficiencies
    • 5.5 Reducing Procedure Calls
    • 5.6 Eliminating Unneeded Memory References
    • 5.7 Understanding Modern Processors
      • 5.7.1 Overall Operation
      • 5.7.2 Functional Unit Performance
      • 5.7.3 An Abstract Model of Processor Operation
    • 5.8 Loop Unrolling
    • 5.9 Enhancing Parallelism
      • 5.9.1 Multiple Accumulators
      • 5.9.2 Reassociation Transformation
    • 5.10 Summary of Results for Optimizing Combining Code
    • 5.11 Some Limiting Factors
      • 5.11.1 Register Spilling
      • 5.11.2 Branch Prediction and Misprediction Penalties
      • 5.12 Understanding Memory Performance
      • 5.12.1 Load Performance
      • 5.12.2 Store Performance
      • 5.13 Life in the Real World: Performance Improvement Techniques
      • 5.14 Identifying and Eliminating Performance Bottlenecks
      • 5.14.1 Program Profiling
      • 5.14.2 Using a Profiler to Guide Optimization
      • 5.14.3 Amdahl’s Law
    • 5.15 Summary
  • The Memory Hierarchy
    • 6.1 Storage Technologies
      • 6.1.1 Random-Access Memory
      • 6.1.2 Disk Storage
      • 6.1.3 Solid State Disks
      • 6.1.4 Storage Technology Trends
    • 6.2 Locality
      • 6.2.1 Locality of References to Program Data
      • 6.2.2 Locality of Instruction Fetches
      • 6.2.3 Summary of Locality
    • 6.3 The Memory Hierarchy
      • 6.3.1 Caching in the Memory Hierarchy
      • 6.3.2 Summary of Memory Hierarchy Concepts
    • 6.4 Cache Memories
      • 6.4.1 Generic Cache Memory Organization
      • 6.4.2 Direct-Mapped Caches
      • 6.4.3 Set Associative Caches
      • 6.4.4 Fully Associative Caches
      • 6.4.5 Issues with Writes
      • 6.4.6 Anatomy of a Real Cache Hierarchy
      • 6.4.7 Performance Impact of Cache Parameters
    • 6.5 Writing Cache-friendly Code
    • 6.6 Putting It Together: The Impact of Caches on Program Performance
      • 6.6.1 The Memory Mountain
      • 6.6.2 Rearranging Loops to Increase Spatial Locality
      • 6.6.3 Exploiting Locality in Your Programs
    • 6.7 Summary
  • Linking
    • 7.1 Compiler Drivers
    • 7.2 Static Linking
    • 7.3 Object Files
    • 7.4 Relocatable Object Files
    • 7.5 Symbols and Symbol Tables
    • 7.6 Symbol Resolution
      • 7.6.1 How Linkers Resolve Multiply Defined Global Symbols
      • 7.6.2 Linking with Static Libraries
      • 7.6.3 How Linkers Use Static Libraries to Resolve References
    • 7.7 Relocation
      • 7.7.1 Relocation Entries
      • 7.7.2 Relocating Symbol References
    • 7.8 Executable Object Files
    • 7.9 Loading Executable Object Files
    • 7.10 Dynamic Linking with Shared Libraries
    • 7.11 Loading and Linking Shared Libraries from Applications
    • 7.12 Position-Independent Code (PIC)
    • 7.13 Tools for Manipulating Object Files
    • 7.14 Summary
  • Exceptional Control Flow
    • 8.1 Exceptions
      • 8.1.1 Exception Handling
      • 8.1.2 Classes of Exceptions
      • 8.1.3 Exceptions in Linux/IA32 Systems
    • 8.2 Processes
      • 8.2.1 Logical Control Flow
      • 8.2.2 Concurrent Flows
      • 8.2.3 Private Address Space
      • 8.2.4 User and Kernel Modes
      • 8.2.5 Context Switches
    • 8.3 System Call Error Handling
    • 8.4 Process Control
      • 8.4.1 Obtaining Process IDs
      • 8.4.2 Creating and Terminating Processes
      • 8.4.3 Reaping Child Processes
      • 8.4.4 Putting Processes to Sleep
      • 8.4.5 Loading and Running Programs
      • 8.4.6 Using fork and execve to Run Programs
    • 8.5 Signals
      • 8.5.1 Signal Terminology
      • 8.5.2 Sending Signals
      • 8.5.3 Receiving Signals
      • 8.5.4 Signal Handling Issues
      • 8.5.5 Portable Signal Handling
      • 8.5.6 Explicitly Blocking and Unblocking Signals
      • 8.5.7 Synchronizing Flows to Avoid Nasty Concurrency Bugs
    • 8.6 Nonlocal Jumps
    • 8.7 Tools for Manipulating Processes
    • 8.8 Summary
  • Virtual Memory
    • 9.1 Physical and Virtual Addressing
    • 9.2 Address Spaces
    • 9.3 VM as a Tool for Caching
      • 9.3.1 DRAM Cache Organization
      • 9.3.2 Page Tables
      • 9.3.3 Page Hits
      • 9.3.4 Page Faults
      • 9.3.5 Allocating Pages
      • 9.3.6 Locality to the Rescue Again
    • 9.4 VM as a Tool for Memory Management
    • 9.5 VM as a Tool for Memory Protection
    • 9.6 Address Translation
      • 9.6.1 Integrating Caches and VM
      • 9.6.2 Speeding up Address Translation with a TLB
      • 9.6.3 Multi-Level Page Tables
      • 9.6.4 Putting It Together: End-to-end Address Translation
    • 9.7 Case Study: The Intel Core i7/Linux Memory System
      • 9.7.1 Core i7 Address Translation
      • 9.7.2 Linux Virtual Memory System
    • 9.8 Memory Mapping
      • 9.8.1 Shared Objects Revisited
      • 9.8.2 The fork Function Revisited
      • 9.8.3 The execve Function Revisited
      • 9.8.4 User-level Memory Mapping with the mmap Function
    • 9.9 Dynamic Memory Allocation
      • 9.9.1 The malloc and free Functions
      • 9.9.2 Why Dynamic Memory Allocation?
      • 9.9.3 Allocator Requirements and Goals
      • 9.9.4 Fragmentation
      • 9.9.5 Implementation Issues
      • 9.9.6 Implicit Free Lists
      • 9.9.7 Placing Allocated Blocks
      • 9.9.8 Splitting Free Blocks
      • 9.9.9 Getting Additional Heap Memory
      • 9.9.10 Coalescing Free Blocks
      • 9.9.11 Coalescing with Boundary Tags
      • 9.9.12 Putting It Together: Implementing a Simple Allocator
      • 9.9.13 Explicit Free Lists
      • 9.9.14 Segregated Free Lists
      • 9.10 Garbage Collection
      • 9.10.1 Garbage Collector Basics
      • 9.10.2 Mark&Sweep Garbage Collectors
      • 9.10.3 Conservative Mark&Sweep for C Programs
    • 9.11 Common Memory-Related Bugs in C Programs
      • 9.11.1 Dereferencing Bad Pointers
      • 9.11.2 Reading Uninitialized Memory
      • 9.11.3 Allowing Stack Buffer Overflows
      • 9.11.4 Assuming that Pointers and the Objects They Point to Are the Same Size
      • 9.11.5 Making Off-by-One Errors
      • 9.11.6 Referencing a Pointer Instead of the Object It Points to
      • 9.11.7 Misunderstanding Pointer Arithmetic
      • 9.11.8 Referencing Nonexistent Variables
      • 9.11.9 Referencing Data in Free Heap Blocks
      • 9.11.10 Introducing Memory Leaks
    • 9.12 Summary
  • System-Level I/O
    • 10.1 Unix I/O
    • 10.2 Opening and Closing Files
    • 10.3 Reading and Writing Files
    • 10.4 Robust Reading and Writing with the Rio Package
      • 10.4.1 Rio Unbuffered Input and Output Functions
      • 10.4.2 Rio Buffered Input Functions
    • 10.5 Reading File Metadata
    • 10.6 Sharing Files
    • 10.7 I/O Redirection
    • 10.8 Standard I/O
    • 10.9 Putting It Together: Which I/O Functions Should I Use?
    • 10.10 Summary
  • Network Programming
    • 11.1 The Client-Server Programming Model
    • 11.2 Networks
    • 11.3 The Global IP Internet
      • 11.3.1 IP Addresses
      • 11.3.2 Internet Domain Names
      • 11.3.3 Internet Connections
    • 11.4 The Sockets Interface
      • 11.4.1 Socket Address Structures
      • 11.4.2 The socket Function
      • 11.4.3 The connect Function
      • 11.4.4 The open_clientfd Function
      • 11.4.5 The bind Function
      • 11.4.6 The listen Function
      • 11.4.7 The open_listenfd Function
      • 11.4.8 The accept Function
      • 11.4.9 Example Echo Client and Server
    • 11.5 Web Servers
      • 11.5.1 Web Basics
      • 11.5.2 Web Content
      • 11.5.3 HTTP Transactions
      • 11.5.4 Serving Dynamic Content
    • 11.6 Putting It Together: The Tiny Web Server
    • 11.7 Summary
  • Concurrent Programming
    • 12.1 Concurrent Programming with Processes
      • 12.1.1 A Concurrent Server Based on Processes
      • 12.1.2 Pros and Cons of Processes
    • 12.2 Concurrent Programming with I/O Multiplexing
      • 12.2.1 A Concurrent Event-Driven Server Based on I/O Multiplexing
      • 12.2.2 Pros and Cons of I/O Multiplexing
    • 12.3 Concurrent Programming with Threads
      • 12.3.1 Thread Execution Model
      • 12.3.2 Posix Threads
      • 12.3.3 Creating Threads
      • 12.3.4 Terminating Threads
      • 12.3.5 Reaping Terminated Threads
      • 12.3.6 Detaching Threads
      • 12.3.7 Initializing Threads
      • 12.3.8 A Concurrent Server Based on Threads
    • 12.4 Shared Variables in Threaded Programs
      • 12.4.1 Threads Memory Model
      • 12.4.2 Mapping Variables to Memory
      • 12.4.3 Shared Variables
    • 12.5 Synchronizing Threads with Semaphores
      • 12.5.1 Progress Graphs
      • 12.5.2 Semaphores
      • 12.5.3 Using Semaphores for Mutual Exclusion
      • 12.5.4 Using Semaphores to Schedule Shared Resources
      • 12.5.5 Putting It Together: A Concurrent Server Based on
    • 12.6 Using Prethreading
  • Threads for Parallelism
    • 12.7 Other
      • 12.7.1 Thread Safety Concurrency Issues
      • 12.7.2 Reentrancy
      • 12.7.3 Using Existing Library Functions in Threaded Programs
      • 12.7.4 Races
      • 12.7.5 Deadlocks
    • 12.8 Summary 988 Bibliographic Notes
  • Error Handling
    • A.1 Error Handling in Unix Systems
    • A.2 Error-Handling Wrappers