Computer Systems: A Programmer’s Perspective
Chapter List
- A Tour of Computer Systems
- 1.1 Information Is Bits + Context
- 1.2 Programs Are Translated by Other Programs into Different Forms
- 1.3 It Pays to Understand How Compilation Systems Work
- 1.4 Processors Read and Interpret Instructions Stored in Memory
- 1.4.1 Hardware Organization of a System
- 1.4.2 Running the hello Program
- 1.5 Caches Matter
- 1.6 Storage Devices Form a Hierarchy
- 1.7 The Operating System Manages the Hardware
- 1.7.1 Processes
- 1.7.2 Threads
- 1.7.3 Virtual Memory
- 1.7.4 Files
- 1.8 Systems Communicate with Other Systems Using Networks
- 1.9 Important Themes
- 1.9.1 Concurrency and Parallelism
- 1.9.2 The Importance of Abstractions in Computer Systems
- 1.10 Summary
- Representing and Manipulating Information
- 2.1 Information Storage
- 2.1.1 Hexadecimal Notation
- 2.1.2 Words
- 2.1.3 Data Sizes
- 2.1.4 Addressing and Byte Ordering
- 2.1.5 Representing Strings
- 2.1.6 Representing Code
- 2.1.7 Introduction to Boolean Algebra
- 2.1.8 Bit-Level Operations in C
- 2.1.9 Logical Operations in C
- 2.1.10 Shift Operations in C
- 2.2 Integer Representations
- 2.2.1 Integral Data Types
- 2.2.2 Unsigned Encodings
- 2.2.3 Two’s-Complement Encodings
- 2.2.4 Conversions Between Signed and Unsigned
- 2.2.5 Signed vs. Unsigned in C
- 2.2.6 Expanding the Bit Representation of a Number
- 2.2.7 Truncating Numbers
- 2.2.8 Advice on Signed vs. Unsigned
- 2.3 Integer Arithmetic
- 2.3.1 Unsigned Addition
- 2.3.2 Two’s-Complement Addition
- 2.3.3 Two’s-Complement Negation
- 2.3.4 Unsigned Multiplication
- 2.3.5 Two’s-Complement Multiplication
- 2.3.6 Multiplying by Constants
- 2.3.7 Dividing by Powers of Two
- 2.3.8 Final Thoughts on Integer Arithmetic
- 2.4 Floating Point
- 2.4.1 Fractional Binary Numbers
- 2.4.2 IEEE Floating-Point Representation
- 2.4.3 Example Numbers
- 2.4.4 Rounding
- 2.4.5 Floating-Point Operations
- 2.4.6 Floating Point in C
- 2.5 Summary
- Machine-Level Representation of Programs
- 3.1 A Historical Perspective
- 3.2 Program Encodings
- 3.2.1 Machine-Level Code
- 3.2.2 Code Examples
- 3.2.3 Notes on Formatting
- 3.3 Data Formats
- 3.4 Accessing Information
- 3.4.1 Operand Specifiers
- 3.4.2 Data Movement Instructions
- 3.4.3 Data Movement Example
- 3.5 Arithmetic and Logical Operations
- 3.5.1 Load Effective Address
- 3.5.2 Unary and Binary Operations
- 3.5.3 Shift Operations
- 3.5.4 Discussion
- 3.5.5 Special Arithmetic Operations
- 3.6 Control
- 3.6.1 Condition Codes
- 3.6.2 Accessing the Condition Codes
- 3.6.3 Jump Instructions and Their Encodings
- 3.6.4 Translating Conditional Branches
- 3.6.5 Loops
- 3.6.6 Conditional Move Instructions
- 3.6.7 Switch Statements
- 3.7 Procedures
- 3.7.1 Stack Frame Structure
- 3.7.2 Transferring Control
- 3.7.3 Register Usage Conventions
- 3.7.4 Procedure Example
- 3.7.5 Recursive Procedures
- 3.8 Array Allocation and Access
- 3.8.1 Basic Principles
- 3.8.2 Pointer Arithmetic
- 3.8.3 Nested Arrays
- 3.8.4 Fixed-Size Arrays
- 3.8.5 Variable-Size Arrays
- 3.9 Heterogeneous Data Structures
- 3.9.1 Structures
- 3.9.2 Unions
- 3.9.3 Data Alignment
- 3.10 Putting It Together: Understanding Pointers
- 3.11 Life in the Real World: Using the gdb Debugger
- 3.12 Out-of-Bounds Memory References and Buffer Overflow
- 3.12.1 Thwarting Buffer Overflow Attacks
- 3.13 x86-64: Extending IA32 to 64 Bits
- 3.13.1 History and Motivation for x86-64
- 3.13.2 An Overview of x86-64
- 3.13.3 Accessing Information
- 3.13.4 Control
- 3.13.5 Data Structures
- 3.13.6 Concluding Observations about x86-64
- 3.14 Machine-Level Representations of Floating-Point Programs
- 3.15 Summary
- Bibliographic Notes
- Processor Architecture
- 4.1 The Y86 Instruction Set Architecture
- 4.1.1 Programmer-Visible State
- 4.1.2 Y86 Instructions
- 4.1.3 Instruction Encoding
- 4.1.4 Y86 Exceptions
- 4.1.5 Y86 Programs
- 4.1.6 Some Y86 Instruction Details
- 4.2 Logic Design and the Hardware Control Language HCL
- 4.2.1 Logic Gates
- 4.2.2 Combinational Circuits and HCL Boolean Expressions
- 4.2.3 Word-Level Combinational Circuits and HCL Integer Expressions
- 4.2.4 Set Membership
- 4.2.5 Memory and Clocking
- 4.3 Sequential Y86 Implementations
- 4.3.1 Organizing Processing into Stages
- 4.3.2 SEQ Hardware Structure
- 4.3.3 SEQ Timing
- 4.3.4 SEQ Stage Implementations
- 4.4 General Principles of Pipelining
- 4.4.1 Computational Pipelines
- 4.4.2 A Detailed Look at Pipeline Operation
- 4.4.3 Limitations of Pipelining
- 4.4.4 Pipelining a System with Feedback
- 4.5 Pipelined Y86 Implementations
- 4.5.1 SEQ+: Rearranging the Computation Stages
- 4.5.2 Inserting Pipeline Registers
- 4.5.3 Rearranging and Relabeling Signals
- 4.5.4 Next PC Prediction
- 4.5.5 Pipeline Hazards
- 4.5.6 Avoiding Data Hazards by Stalling
- 4.5.7 Avoiding Data Hazards by Forwarding
- 4.5.8 Load/Use Data Hazards
- 4.5.9 Exception Handling
- 4.5.10 PIPE Stage Implementations
- 4.5.11 Pipeline Control Logic
- 4.5.12 Performance Analysis
- 4.5.13 Unfinished Business
- 4.6 Summary
- 4.6.1 Y86 Simulators
- Optimizing Program Performance
- 5.1 Capabilities and Limitations of Optimizing Compilers
- 5.2 Expressing Program Performance
- 5.3 Program Example
- 5.4 Eliminating Loop Inefficiencies
- 5.5 Reducing Procedure Calls
- 5.6 Eliminating Unneeded Memory References
- 5.7 Understanding Modern Processors
- 5.7.1 Overall Operation
- 5.7.2 Functional Unit Performance
- 5.7.3 An Abstract Model of Processor Operation
- 5.8 Loop Unrolling
- 5.9 Enhancing Parallelism
- 5.9.1 Multiple Accumulators
- 5.9.2 Reassociation Transformation
- 5.10 Summary of Results for Optimizing Combining Code
- 5.11 Some Limiting Factors
- 5.11.1 Register Spilling
- 5.11.2 Branch Prediction and Misprediction Penalties
- 5.12 Understanding Memory Performance
- 5.12.1 Load Performance
- 5.12.2 Store Performance
- 5.13 Life in the Real World: Performance Improvement Techniques
- 5.14 Identifying and Eliminating Performance Bottlenecks
- 5.14.1 Program Profiling
- 5.14.2 Using a Profiler to Guide Optimization
- 5.14.3 Amdahl’s Law
- 5.15 Summary
- The Memory Hierarchy
- 6.1 Storage Technologies
- 6.1.1 Random-Access Memory
- 6.1.2 Disk Storage
- 6.1.3 Solid State Disks
- 6.1.4 Storage Technology Trends
- 6.2 Locality
- 6.2.1 Locality of References to Program Data
- 6.2.2 Locality of Instruction Fetches
- 6.2.3 Summary of Locality
- 6.3 The Memory Hierarchy
- 6.3.1 Caching in the Memory Hierarchy
- 6.3.2 Summary of Memory Hierarchy Concepts
- 6.4 Cache Memories
- 6.4.1 Generic Cache Memory Organization
- 6.4.2 Direct-Mapped Caches
- 6.4.3 Set Associative Caches
- 6.4.4 Fully Associative Caches
- 6.4.5 Issues with Writes
- 6.4.6 Anatomy of a Real Cache Hierarchy
- 6.4.7 Performance Impact of Cache Parameters
- 6.5 Writing Cache-friendly Code
- 6.6 Putting It Together: The Impact of Caches on Program Performance
- 6.6.1 The Memory Mountain
- 6.6.2 Rearranging Loops to Increase Spatial Locality
- 6.6.3 Exploiting Locality in Your Programs
- 6.7 Summary
- Linking
- 7.1 Compiler Drivers
- 7.2 Static Linking
- 7.3 Object Files
- 7.4 Relocatable Object Files
- 7.5 Symbols and Symbol Tables
- 7.6 Symbol Resolution
- 7.6.1 How Linkers Resolve Multiply Defined Global Symbols
- 7.6.2 Linking with Static Libraries
- 7.6.3 How Linkers Use Static Libraries to Resolve References
- 7.7 Relocation
- 7.7.1 Relocation Entries
- 7.7.2 Relocating Symbol References
- 7.8 Executable Object Files
- 7.9 Loading Executable Object Files
- 7.10 Dynamic Linking with Shared Libraries
- 7.11 Loading and Linking Shared Libraries from Applications
- 7.12 Position-Independent Code (PIC)
- 7.13 Tools for Manipulating Object Files
- 7.14 Summary
- Exceptional Control Flow
- 8.1 Exceptions
- 8.1.1 Exception Handling
- 8.1.2 Classes of Exceptions
- 8.1.3 Exceptions in Linux/IA32 Systems
- 8.2 Processes
- 8.2.1 Logical Control Flow
- 8.2.2 Concurrent Flows
- 8.2.3 Private Address Space
- 8.2.4 User and Kernel Modes
- 8.2.5 Context Switches
- 8.3 System Call Error Handling
- 8.4 Process Control
- 8.4.1 Obtaining Process IDs
- 8.4.2 Creating and Terminating Processes
- 8.4.3 Reaping Child Processes
- 8.4.4 Putting Processes to Sleep
- 8.4.5 Loading and Running Programs
- 8.4.6 Using fork and execve to Run Programs
- 8.5 Signals
- 8.5.1 Signal Terminology
- 8.5.2 Sending Signals
- 8.5.3 Receiving Signals
- 8.5.4 Signal Handling Issues
- 8.5.5 Portable Signal Handling
- 8.5.6 Explicitly Blocking and Unblocking Signals
- 8.5.7 Synchronizing Flows to Avoid Nasty Concurrency Bugs
- 8.6 Nonlocal Jumps
- 8.7 Tools for Manipulating Processes
- 8.8 Summary
- Virtual Memory
- 9.1 Physical and Virtual Addressing
- 9.2 Address Spaces
- 9.3 VM as a Tool for Caching
- 9.3.1 DRAM Cache Organization
- 9.3.2 Page Tables
- 9.3.3 Page Hits
- 9.3.4 Page Faults
- 9.3.5 Allocating Pages
- 9.3.6 Locality to the Rescue Again
- 9.4 VM as a Tool for Memory Management
- 9.5 VM as a Tool for Memory Protection
- 9.6 Address Translation
- 9.6.1 Integrating Caches and VM
- 9.6.2 Speeding up Address Translation with a TLB
- 9.6.3 Multi-Level Page Tables
- 9.6.4 Putting It Together: End-to-end Address Translation
- 9.7 Case Study: The Intel Core i7/Linux Memory System
- 9.7.1 Core i7 Address Translation
- 9.7.2 Linux Virtual Memory System
- 9.8 Memory Mapping
- 9.8.1 Shared Objects Revisited
- 9.8.2 The fork Function Revisited
- 9.8.3 The execve Function Revisited
- 9.8.4 User-level Memory Mapping with the mmap Function
- 9.9 Dynamic Memory Allocation
- 9.9.1 The malloc and free Functions
- 9.9.2 Why Dynamic Memory Allocation?
- 9.9.3 Allocator Requirements and Goals
- 9.9.4 Fragmentation
- 9.9.5 Implementation Issues
- 9.9.6 Implicit Free Lists
- 9.9.7 Placing Allocated Blocks
- 9.9.8 Splitting Free Blocks
- 9.9.9 Getting Additional Heap Memory
- 9.9.10 Coalescing Free Blocks
- 9.9.11 Coalescing with Boundary Tags
- 9.9.12 Putting It Together: Implementing a Simple Allocator
- 9.9.13 Explicit Free Lists
- 9.9.14 Segregated Free Lists
- 9.10 Garbage Collection
- 9.10.1 Garbage Collector Basics
- 9.10.2 Mark&Sweep Garbage Collectors
- 9.10.3 Conservative Mark&Sweep for C Programs
- 9.11 Common Memory-Related Bugs in C Programs
- 9.11.1 Dereferencing Bad Pointers
- 9.11.2 Reading Uninitialized Memory
- 9.11.3 Allowing Stack Buffer Overflows
- 9.11.4 Assuming that Pointers and the Objects They Point to Are the Same Size
- 9.11.5 Making Off-by-One Errors
- 9.11.6 Referencing a Pointer Instead of the Object It Points to
- 9.11.7 Misunderstanding Pointer Arithmetic
- 9.11.8 Referencing Nonexistent Variables
- 9.11.9 Referencing Data in Free Heap Blocks
- 9.11.10 Introducing Memory Leaks
- 9.12 Summary
- System-Level I/O
- 10.1 Unix I/O
- 10.2 Opening and Closing Files
- 10.3 Reading and Writing Files
- 10.4 Robust Reading and Writing with the Rio Package
- 10.4.1 Rio Unbuffered Input and Output Functions
- 10.4.2 Rio Buffered Input Functions
- 10.5 Reading File Metadata
- 10.6 Sharing Files
- 10.7 I/O Redirection
- 10.8 Standard I/O
- 10.9 Putting It Together: Which I/O Functions Should I Use?
- 10.10 Summary
- Network Programming
- 11.1 The Client-Server Programming Model
- 11.2 Networks
- 11.3 The Global IP Internet
- 11.3.1 IP Addresses
- 11.3.2 Internet Domain Names
- 11.3.3 Internet Connections
- 11.4 The Sockets Interface
- 11.4.1 Socket Address Structures
- 11.4.2 The socket Function
- 11.4.3 The connect Function
- 11.4.4 The open_clientfd Function
- 11.4.5 The bind Function
- 11.4.6 The listen Function
- 11.4.7 The open_listenfd Function
- 11.4.8 The accept Function
- 11.4.9 Example Echo Client and Server
- 11.5 Web Servers
- 11.5.1 Web Basics
- 11.5.2 Web Content
- 11.5.3 HTTP Transactions
- 11.5.4 Serving Dynamic Content
- 11.6 Putting It Together: The Tiny Web Server
- 11.7 Summary
- Concurrent Programming
- 12.1 Concurrent Programming with Processes
- 12.1.1 A Concurrent Server Based on Processes
- 12.1.2 Pros and Cons of Processes
- 12.2 Concurrent Programming with I/O Multiplexing
- 12.2.1 A Concurrent Event-Driven Server Based on I/O Multiplexing
- 12.2.2 Pros and Cons of I/O Multiplexing
- 12.3 Concurrent Programming with Threads
- 12.3.1 Thread Execution Model
- 12.3.2 Posix Threads
- 12.3.3 Creating Threads
- 12.3.4 Terminating Threads
- 12.3.5 Reaping Terminated Threads
- 12.3.6 Detaching Threads
- 12.3.7 Initializing Threads
- 12.3.8 A Concurrent Server Based on Threads
- 12.4 Shared Variables in Threaded Programs
- 12.4.1 Threads Memory Model
- 12.4.2 Mapping Variables to Memory
- 12.4.3 Shared Variables
- 12.5 Synchronizing Threads with Semaphores
- 12.5.1 Progress Graphs
- 12.5.2 Semaphores
- 12.5.3 Using Semaphores for Mutual Exclusion
- 12.5.4 Using Semaphores to Schedule Shared Resources
- 12.5.5 Putting It Together: A Concurrent Server Based on Prethreading
- 12.6 Using Threads for Parallelism
- 12.7 Other Concurrency Issues
- 12.7.1 Thread Safety
- 12.7.2 Reentrancy
- 12.7.3 Using Existing Library Functions in Threaded Programs
- 12.7.4 Races
- 12.7.5 Deadlocks
- 12.8 Summary
- Bibliographic Notes
- Error Handling
- A.1 Error Handling in Unix Systems
- A.2 Error-Handling Wrappers