From "Designing Data-Intensive Applications"
Consistency Models: Eventual Consistency and Linearizability
Key Insight
In a replicated database, two nodes can show different data at the same moment because write requests reach them at different times. This happens regardless of the replication method (single-leader, multi-leader, or leaderless). Most replicated databases provide at least 'eventual consistency': if writes stop and you wait for some unspecified length of time, all read requests will eventually return the same value, so any temporary inconsistency resolves itself. The guarantee is weak because it says nothing about when the replicas converge; until they do, a read may return any value or nothing at all. For instance, a client might write a value and then immediately read the old value because the read was routed to a different replica.
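The stale-read scenario above can be sketched in a few lines of Python. This is a toy in-memory model and not any real database's replication protocol: the `Replica` and `Cluster` classes, the `converge` method, and the single-leader layout are all assumptions made for illustration.

```python
from collections import deque


class Replica:
    """Hypothetical replica: buffers incoming writes until told to apply them."""

    def __init__(self):
        self.data = {}
        self.pending = deque()  # writes received but not yet applied

    def enqueue(self, key, value):
        self.pending.append((key, value))

    def apply_pending(self):
        # Apply buffered writes in the order they were received.
        while self.pending:
            key, value = self.pending.popleft()
            self.data[key] = value

    def read(self, key):
        # None models "no value yet" for a key the replica has not seen.
        return self.data.get(key)


class Cluster:
    """Hypothetical single-leader setup with asynchronously updated followers."""

    def __init__(self, follower_count=2):
        self.leader = Replica()
        self.followers = [Replica() for _ in range(follower_count)]

    def write(self, key, value):
        self.leader.data[key] = value      # leader applies immediately
        for follower in self.followers:
            follower.enqueue(key, value)   # followers lag behind

    def converge(self):
        # Models "writes stop and enough time passes": every follower catches up.
        for follower in self.followers:
            follower.apply_pending()


cluster = Cluster()
cluster.write("x", 1)
stale = cluster.followers[0].read("x")   # read routed to a lagging follower
cluster.converge()
fresh = cluster.followers[0].read("x")   # same follower after convergence
```

Here `stale` is `None` even though the write has already succeeded on the leader, while `fresh` is `1` once the follower has caught up. Nothing in the model bounds how long the gap between the two reads may last, which is exactly the weakness of the guarantee.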
Eventual consistency is hard for application developers because it behaves so differently from a variable in an ordinary single-threaded program, where a read that follows an assignment always returns the assigned value. Bugs caused by its weak guarantees are often subtle and hard to find by testing; they typically surface only during a system fault (e.g., a network interruption) or under high concurrency. Stronger consistency models make application development easier by giving more predictable behavior, but they may come at the cost of worse performance or reduced fault tolerance. Distributed consistency models are mainly about coordinating the state of replicas in the face of delays and faults, which distinguishes them from transaction isolation levels, whose purpose is to avoid race conditions between concurrent transactions.
Linearizability, also known as atomic consistency, strong consistency, immediate consistency, or external consistency, is one of the strongest consistency models in common use. Its core idea is to create the illusion that there is only a single copy of the data and that every operation on it is atomic. Every client then sees the same view of the data and does not have to worry about replication lag. Linearizability is a 'recency guarantee': as soon as one client successfully completes a write, every subsequent read from any client must see the newly written value, so a read never returns data from a stale cache or replica. For example, suppose Alice sees the final score of the 2014 FIFA World Cup and tells Bob, but Bob then reloads the page on his phone and sees the game still in progress. Bob's request started after he heard the result from Alice, so he expects a result at least as recent as hers; getting an older one is a violation of linearizability. A linearizable system must also ensure that once any read has returned a new value, subsequent reads do not revert to an older value.
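The recency rule in the Alice and Bob example can be expressed as a small check over a recorded history of operations. The sketch below is a simplification and not a full linearizability checker: it assumes a single register whose values carry increasing version numbers, and it only flags reads that return an older version than some operation that had already finished before the read began. The `Op` record, its field names, and `find_stale_reads` are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    client: str    # who issued the operation
    kind: str      # "write" or "read"
    version: int   # version written, or version returned by the read
    start: float   # time the request was sent
    end: float     # time the response was received


def find_stale_reads(history):
    """Return reads that observed an older version than an operation
    which had already completed before the read started."""
    stale = []
    for read in history:
        if read.kind != "read":
            continue
        for earlier in history:
            finished_before = earlier.end < read.start
            if finished_before and earlier.version > read.version:
                stale.append(read)
                break
    return stale


# Version 0 = "game in progress", version 1 = "final score".
history = [
    Op("scorekeeper", "write", 1, start=0.0, end=1.0),
    Op("alice", "read", 1, start=2.0, end=3.0),
    Op("bob", "read", 0, start=4.0, end=5.0),  # served by a lagging replica
]
violations = find_stale_reads(history)
```

With this history, `violations` contains only Bob's read: it began after both the write and Alice's read had completed, yet it returned the older version. Operations that overlap in time are deliberately not compared, since either order is acceptable for concurrent requests.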
Source: "Designing Data-Intensive Applications" by Martin Kleppmann.