From "Designing Data-Intensive Applications"
Implementing Serializable Isolation
Key Insight
Serializable isolation guarantees that concurrent transactions produce the same result as if they had run one at a time, in some serial order. It prevents all race conditions, but it is hard to implement efficiently. There are three main techniques: actual serial execution, two-phase locking (2PL), and serializable snapshot isolation (SSI). Actual serial execution runs transactions one at a time on a single thread. It is feasible when every transaction is short and the data it touches fits in memory, as is typical for OLTP workloads. To avoid waiting on network round trips in the middle of a transaction, the application submits the whole transaction to the database as a stored procedure. The cost is that write throughput is limited to a single CPU core, unless the data can be partitioned so that each transaction touches only one partition.
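Actual serial execution with stored procedures can be illustrated with a minimal sketch. Everything named here (`SerialExecutor`, `submit`, `transfer`, the in-memory dict used as the "database") is hypothetical and chosen for illustration; it is not the API of any real database. The point it shows is that when a single worker thread runs each submitted procedure to completion before starting the next, no locking is needed.

```python
import queue
import threading
from concurrent.futures import Future


class SerialExecutor:
    """Hypothetical in-memory store that runs procedures one at a time."""

    def __init__(self):
        self.data = {}
        self._inbox = queue.Queue()
        # A single worker thread is the only code that touches self.data
        # once procedures start arriving.
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, procedure, *args):
        """Queue a whole transaction (a 'stored procedure') for execution."""
        result = Future()
        self._inbox.put((procedure, args, result))
        return result

    def _run(self):
        while True:
            procedure, args, result = self._inbox.get()
            try:
                result.set_result(procedure(self.data, *args))
            except Exception as exc:  # the procedure aborted
                result.set_exception(exc)


def transfer(data, src, dst, amount):
    # Validate before mutating, so an abort leaves no partial writes.
    if data.get(src, 0) < amount:
        raise ValueError("insufficient funds")
    data[src] -= amount
    data[dst] = data.get(dst, 0) + amount


db = SerialExecutor()
db.data.update({"alice": 100, "bob": 0})  # set up before any submission

# Twelve transfers of 10 are requested, but only 100 is available.
futures = [db.submit(transfer, "alice", "bob", 10) for _ in range(12)]
outcomes = []
for future in futures:
    try:
        future.result(timeout=5)
        outcomes.append("ok")
    except ValueError:
        outcomes.append("aborted")
```

Because the procedures never interleave, the balance check and the update cannot race with each other. The sketch also makes the limitation visible: all writes funnel through one thread, so throughput cannot exceed what one core can do.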
Two-phase locking (2PL) was the standard way to implement serializability for about 30 years. It is a pessimistic concurrency control mechanism: if anything might go wrong, a transaction waits until it is safe to proceed. A transaction must acquire a shared lock before reading an object and an exclusive lock before writing it, and it holds its locks until it commits or aborts. Writers therefore block readers and readers block writers, which significantly reduces concurrency. 2PL prevents lost updates and write skew, and it prevents phantoms when combined with predicate locks or index-range locks. The costs are lower throughput, higher contention, more frequent deadlocks, and unpredictable latencies: a single long-running or high-contention transaction can stall many others queued behind it.
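The shared/exclusive lock rules described above can be sketched as a small lock table. This is a simplified, single-threaded illustration under stated assumptions: `LockTable`, `try_acquire`, and `release_all` are hypothetical names, a refused request returns `False` instead of making the caller wait, and there is no deadlock detection, which a real 2PL implementation would need.

```python
from collections import defaultdict


class LockTable:
    """Hypothetical lock table: key -> {transaction: "S" or "X"}."""

    def __init__(self):
        self.holders = defaultdict(dict)

    def try_acquire(self, txn, key, mode):
        """Return True if txn now holds `mode` ("S" or "X") on key."""
        held = self.holders[key]
        others = [m for t, m in held.items() if t != txn]
        if mode == "S":
            # A shared lock is compatible only with other shared locks.
            compatible = all(m == "S" for m in others)
        else:
            # An exclusive lock is compatible with no other holder.
            compatible = not others
        if not compatible:
            return False  # a real database would make txn wait here
        if held.get(txn) != "X":  # never downgrade an exclusive lock
            held[txn] = mode      # grant, or upgrade S to X
        return True

    def release_all(self, txn):
        """Called at commit or abort: locks are only released at the end."""
        for held in self.holders.values():
            held.pop(txn, None)


locks = LockTable()
t1_reads = locks.try_acquire("T1", "row", "S")   # reader
t2_reads = locks.try_acquire("T2", "row", "S")   # readers share
t2_writes = locks.try_acquire("T2", "row", "X")  # blocked by reader T1
locks.release_all("T1")                          # T1 commits
t2_retry = locks.try_acquire("T2", "row", "X")   # upgrade now succeeds
t3_reads = locks.try_acquire("T3", "row", "S")   # blocked by writer T2
```

If T1 had also asked to upgrade to an exclusive lock while T2 was waiting for the same upgrade, neither could proceed. That is the kind of deadlock the database has to detect and resolve by aborting one of the transactions.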
Serializable snapshot isolation (SSI) is a more recent, optimistic concurrency control technique. It provides full serializability with only a small performance penalty compared to snapshot isolation. Transactions proceed without blocking, in the hope that no conflict occurs. When a transaction wants to commit, the database checks whether isolation was violated and aborts the transaction if so. To make that check, SSI detects two situations: reads of a stale MVCC object version, and writes that affect prior reads. It does so by tracking transaction IDs and which index entries each transaction has accessed. Its key advantage is that it avoids blocking, which makes query latencies more predictable and stable, especially for read-heavy workloads. SSI is used in PostgreSQL (since version 9.1) and FoundationDB, and it shows promise for scaling serializable isolation beyond the single-core limit of serial execution and the contention problems of 2PL. Short read-write transactions are still preferred, because they keep the abort rate low.
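The optimistic commit-time check can be sketched with a toy multi-version store. This is a deliberately coarse approximation, not how PostgreSQL or FoundationDB implement SSI: it aborts a writing transaction whenever any key it read has a newer committed version, whereas real implementations track more precisely which reads were affected by which writes. The names `Store`, `Txn`, `SerializationFailure`, and `go_off_call` are hypothetical. The scenario is write skew: two doctors each check that at least two people are on call, then each takes themselves off call.

```python
class SerializationFailure(Exception):
    """Raised at commit when a premise the transaction read is outdated."""


class Store:
    def __init__(self, initial):
        self.commit_seq = 0
        # key -> list of (commit sequence number, value), oldest first
        self.versions = {k: [(0, v)] for k, v in initial.items()}

    def begin(self):
        return Txn(self, self.commit_seq)


class Txn:
    def __init__(self, store, snapshot):
        self.store, self.snapshot = store, snapshot
        self.reads, self.writes = set(), {}

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        self.reads.add(key)  # remember the premise
        # Snapshot read: newest version committed at or before our snapshot.
        visible = [v for seq, v in self.store.versions[key]
                   if seq <= self.snapshot]
        return visible[-1]

    def write(self, key, value):
        self.writes[key] = value  # buffered until commit

    def commit(self):
        if not self.writes:
            return  # read-only: the snapshot was consistent, nothing to check
        for key in self.reads:
            newest_seq = self.store.versions[key][-1][0]
            if newest_seq > self.snapshot:
                raise SerializationFailure(key)  # our read is now stale
        self.store.commit_seq += 1
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append(
                (self.store.commit_seq, value))


def go_off_call(txn, me):
    on_call = sum(txn.read(k) for k in ("alice_on_call", "bob_on_call"))
    if on_call >= 2:
        txn.write(me, False)


store = Store({"alice_on_call": True, "bob_on_call": True})
t1, t2 = store.begin(), store.begin()  # both see the same snapshot
go_off_call(t1, "alice_on_call")
go_off_call(t2, "bob_on_call")
t1.commit()
try:
    t2.commit()
    t2_outcome = "committed"
except SerializationFailure:
    t2_outcome = "aborted"

# Retrying on a fresh snapshot sees that only one doctor is left on call.
t3 = store.begin()
go_off_call(t3, "bob_on_call")
t3.commit()
```

Neither transaction ever waited for the other; the conflict only surfaced when the second one tried to commit. Under plain snapshot isolation both would have committed and nobody would be on call.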
Summarized from Designing Data-Intensive Applications by Martin Kleppmann.