From "Designing Data-Intensive Applications" by Martin Kleppmann
Advanced Concurrency Anomalies: Lost Updates, Write Skew, and Phantoms
Key Insight
Beyond dirty writes, transactions can encounter other complex concurrency conflicts. The 'lost update problem' occurs when two transactions concurrently perform a read-modify-write cycle, and one transaction's update is overwritten by the other without incorporating its changes, leading to lost data. Examples include two clients concurrently incrementing a counter, or multiple users simultaneously editing a wiki page where a later save overwrites an earlier one. Solutions include database-provided atomic update operations, explicit application-level locking using `SELECT ... FOR UPDATE`, or automatic lost update detection found in some snapshot isolation implementations, such as PostgreSQL's repeatable read.
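The counter example can be sketched in a few lines of Python with the standard-library `sqlite3` module. This is only an illustration: the `counters` table, the `likes` key, and the helper names are assumptions made for the sketch, and the unsafe interleaving is replayed by hand on a single connection rather than produced by real concurrent clients.

```python
import sqlite3

def make_db():
    # Hypothetical schema: a single named counter starting at 0.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE counters (key TEXT PRIMARY KEY, value INTEGER NOT NULL)")
    conn.execute("INSERT INTO counters VALUES ('likes', 0)")
    conn.commit()
    return conn

def read_value(conn):
    return conn.execute("SELECT value FROM counters WHERE key = 'likes'").fetchone()[0]

def unsafe_increment_interleaved(conn):
    # Read-modify-write: both clients read before either one writes.
    seen_by_a = read_value(conn)
    seen_by_b = read_value(conn)
    conn.execute("UPDATE counters SET value = ? WHERE key = 'likes'", (seen_by_a + 1,))
    # Client B overwrites A's result with a value computed from a stale read.
    conn.execute("UPDATE counters SET value = ? WHERE key = 'likes'", (seen_by_b + 1,))
    conn.commit()
    return read_value(conn)

def atomic_increment_twice(conn):
    # Atomic update: the database reads and writes in one operation,
    # so there is no window for another client's write to be lost.
    for _ in range(2):
        conn.execute("UPDATE counters SET value = value + 1 WHERE key = 'likes'")
    conn.commit()
    return read_value(conn)

lost = unsafe_increment_interleaved(make_db())
safe = atomic_increment_twice(make_db())
print(lost, safe)  # 1 2
```

Two increments leave the counter at 1 in the read-modify-write version and at 2 in the atomic version. Where the new value cannot be expressed as a single `UPDATE`, the explicit-locking and lost-update-detection options described above apply instead.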
A more subtle anomaly is 'write skew', which generalizes the lost update problem. It occurs when two transactions read the same objects, make decisions based on those reads, and then update *different* objects, so that an application-level invariant is violated even though neither transaction overwrites the other. A classic example is two on-call doctors who simultaneously decide to go off call: each sees '2 doctors on call' and updates only their own record, and the combined result is '0 doctors on call', violating the 'at least one doctor' rule. Neither atomic single-object operations nor automatic lost update detection prevents write skew, because the conflicting writes touch different objects; true serializable isolation is usually required.
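The doctor example can be replayed in plain Python. This is a simplified model, not a real database: snapshot isolation is imitated by handing each transaction its own deep copy of the data, and the `shift_lock` used in the second half is a hypothetical stand-in for running the check and the write serially, which is the effect serializable isolation guarantees. All names here are assumptions made for the sketch.

```python
import copy
import threading

def count_on_call(rows):
    return sum(1 for row in rows.values() if row["on_call"])

def go_off_call_from_snapshot(db, snapshot, doctor):
    # The decision is based on the snapshot; the write touches only this doctor's row.
    if count_on_call(snapshot) >= 2:
        db[doctor]["on_call"] = False
        return True
    return False

def simulate_write_skew():
    db = {"alice": {"on_call": True}, "bob": {"on_call": True}}
    # Both transactions take their snapshot before either one writes.
    snapshot_alice = copy.deepcopy(db)
    snapshot_bob = copy.deepcopy(db)
    go_off_call_from_snapshot(db, snapshot_alice, "alice")
    go_off_call_from_snapshot(db, snapshot_bob, "bob")
    return count_on_call(db)

shift_lock = threading.Lock()  # hypothetical: one lock covering the whole shift

def go_off_call_serialized(db, doctor):
    # Check and write happen as one unit, so the second request sees the first one's write.
    with shift_lock:
        if count_on_call(db) >= 2:
            db[doctor]["on_call"] = False
            return True
        return False

def simulate_serialized():
    db = {"alice": {"on_call": True}, "bob": {"on_call": True}}
    threads = [
        threading.Thread(target=go_off_call_serialized, args=(db, name))
        for name in ("alice", "bob")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return count_on_call(db)

print(simulate_write_skew())  # 0: the invariant is broken
print(simulate_serialized())  # 1: the second request is refused
```

Note that the two writes in `simulate_write_skew` never touch the same row, which is why a mechanism that only watches for conflicting writes to a single object does not notice the problem.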
'Phantoms' arise when a transaction's write changes the result set of a search query in another transaction, often by inserting a new row that matches a condition previously checked as absent. For example, a meeting room booking system checks for conflicting bookings, finds none, and inserts a new booking; a concurrent transaction might do the same, leading to a double-booking because the absence check became stale. While snapshot isolation prevents simple phantom reads for read-only queries, read-write transactions are vulnerable to phantoms causing write skew. Strategies like 'materializing conflicts' by introducing artificial lock objects are a complex last resort to address phantoms when serializable isolation is not used.
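The idea behind materializing conflicts can be sketched for the room-booking case. The phantom has no existing row to lock, so the sketch creates one lockable object per room and hour in advance and makes every booking lock the slots it covers before running the absence check. This is an in-memory illustration under stated assumptions: `SLOT_LOCKS`, `book_room`, the two rooms, and the hourly granularity are all hypothetical, and a real system would lock rows in a slots table rather than Python locks.

```python
import threading
from contextlib import ExitStack

# Hypothetical materialized slots: one lock per (room, hour), created up front.
SLOT_LOCKS = {
    (room, hour): threading.Lock()
    for room in ("A", "B")
    for hour in range(24)
}

def book_room(bookings, room, start_hour, end_hour, user):
    slots = [(room, hour) for hour in range(start_hour, end_hour)]
    with ExitStack() as stack:
        # Acquire slot locks in a fixed order so overlapping requests cannot deadlock.
        for slot in sorted(slots):
            stack.enter_context(SLOT_LOCKS[slot])
        # The absence check is now safe: any overlapping booking needs a lock we hold.
        conflict = any(
            r == room and s < end_hour and start_hour < e
            for (r, s, e, _user) in bookings
        )
        if conflict:
            return False
        bookings.append((room, start_hour, end_hour, user))
        return True

def run_concurrent_bookings():
    bookings = []
    results = []

    def attempt(start_hour, end_hour, user):
        results.append(book_room(bookings, "A", start_hour, end_hour, user))

    # Two overlapping requests for room A: 10-12 and 11-13 share the 11:00 slot.
    threads = [
        threading.Thread(target=attempt, args=(10, 12, "user1")),
        threading.Thread(target=attempt, args=(11, 13, "user2")),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return bookings, results

bookings, results = run_concurrent_bookings()
print(len(bookings), sorted(results))  # 1 [False, True]
```

The cost is visible even in this small sketch: the slot table exists only to carry locks, and the concurrency-control concern leaks into the application's data model, which is why the approach is described as a last resort.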