
From "Designing Data-Intensive Applications"

Author: Martin Kleppmann
Publisher: O'Reilly Media, Inc.
Year: 2017
Category: Computers

Chapter 12: The Future of Data Systems
Key Insight 2 from this chapter

Managing Consistency and Event Ordering in Distributed Systems

Key Insight

When multiple storage systems hold copies of the same data to serve different access patterns, it is critical to define inputs and outputs clearly: which data representations are derived from which sources? A recommended approach is to write data first to a system-of-record database, capture every change to it through change data capture (CDC), and apply those changes to derived systems, such as a search index, in exactly the order they occurred. This keeps the derived systems consistent with the system of record. By contrast, if an application writes directly to both a search index and a database concurrently, conflicting writes may be processed in different orders by each system, leaving them permanently inconsistent.
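The CDC pattern described above can be illustrated with a minimal in-memory sketch. The `Database` and `SearchIndex` classes here are hypothetical stand-ins, not APIs from the book: all writes go to the system of record first, and the derived index is built only by replaying the database's change log in order.

```python
class Database:
    """System of record: applies writes and appends each change to a log."""
    def __init__(self):
        self.records = {}
        self.change_log = []  # ordered (key, value) changes, captured by CDC

    def write(self, key, value):
        self.records[key] = value
        self.change_log.append((key, value))  # capture the change in order

class SearchIndex:
    """Derived system: built solely by consuming the change log."""
    def __init__(self):
        self.index = {}

    def apply(self, change):
        key, value = change
        # Applied in log order, so conflicting writes resolve the same way
        # as they did in the system of record.
        self.index[key] = value

db = Database()
idx = SearchIndex()

# Two conflicting writes to the same key: the log fixes their order.
db.write("user:1", "Alice")
db.write("user:1", "Alicia")

for change in db.change_log:  # replay in the order the database chose
    idx.apply(change)

assert idx.index["user:1"] == db.records["user:1"]  # derived state consistent
```

Because the index never sees a write except through the ordered log, it cannot diverge from the database the way it could if the application wrote to both systems directly.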

To mitigate such inconsistencies, all user input should ideally be funneled through a single system that establishes a total ordering for all writes. This principle, akin to state machine replication or total order broadcast, significantly simplifies the process of deriving other data representations by applying changes in the determined sequence. Log-based derived data systems, in particular, often enable deterministic and idempotent updates, which greatly enhances fault recovery. While distributed transactions, such as those using two-phase commit (2PC) with two-phase locking (2PL), also aim for consistency by enforcing an order and atomic commits, protocols like XA have poor fault tolerance and performance, making log-based derived data a more promising integration strategy for diverse data systems, despite typically offering only asynchronous updates rather than linearizability.
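The fault-recovery benefit of deterministic, idempotent log consumers can be sketched as follows, assuming (as an illustration, not a prescribed design) that each event carries a monotonically increasing offset. After a crash, replaying the log from an earlier point is safe: events at or below the last recorded offset are skipped, so each change takes effect exactly once.

```python
class DerivedStore:
    """Consumes an ordered event log; replay after a crash is harmless."""
    def __init__(self):
        self.state = {}
        self.last_offset = -1  # would be a durable checkpoint in a real system

    def consume(self, offset, key, value):
        if offset <= self.last_offset:
            return  # already applied: idempotent replay
        self.state[key] = value
        self.last_offset = offset

store = DerivedStore()
log = [(0, "a", 1), (1, "b", 2), (2, "a", 3)]

for event in log:
    store.consume(*event)

# Simulate crash recovery: replaying the entire log changes nothing.
for event in log:
    store.consume(*event)

assert store.state == {"a": 3, "b": 2}
```

This is the contrast with XA-style distributed transactions drawn in the text: instead of coordinating an atomic commit across systems, the consumer can simply be restarted and re-fed the log.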

However, constructing a globally totally ordered event log runs into limits as systems grow. If event throughput exceeds what a single machine can handle, the log must be partitioned, and the order of events across different partitions becomes ambiguous. In geographically distributed deployments, network delays make a separate leader in each datacenter necessary, so events originating in different datacenters have no defined order relative to one another. Microservices, which typically manage their durable state independently, and client-side applications that operate offline run into the same problem: events from different sources may be observed in different orders. Most consensus algorithms, which provide total order broadcast, assume that a single node can handle the entire event throughput and offer no mechanism for multiple nodes to share the work of ordering events. Designing ordering schemes that scale across partitions and datacenters while preserving subtle causal dependencies between events remains an open research problem.
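The loss of total order under partitioning can be made concrete with a small sketch. Events are routed to partitions by key (the routing function and event names are illustrative assumptions): each partition's log stays internally ordered, but there is no defined order between events in different partitions. Routing by key at least keeps all events for one key in a single ordered partition.

```python
import zlib

NUM_PARTITIONS = 2

def partition_for(key):
    # Deterministic routing; zlib.crc32 avoids Python's per-process
    # randomized str hash, so the same key always lands in the same partition.
    return zlib.crc32(key.encode()) % NUM_PARTITIONS

partitions = [[] for _ in range(NUM_PARTITIONS)]

# Interleaved events for two keys; "e*" and "f*" are made-up event names.
events = [("user:1", "e1"), ("user:2", "f1"), ("user:1", "e2"), ("user:2", "f2")]
for key, event in events:
    partitions[partition_for(key)].append((key, event))

# Within a partition, order is well defined; across partitions it is not.
p = partition_for("user:1")
user1_events = [e for k, e in partitions[p] if k == "user:1"]
assert user1_events == ["e1", "e2"]  # per-key order preserved
```

Per-key ordering is often enough in practice, but it is exactly the weaker guarantee the text describes: nothing fixes the relative order of `e1` and `f1` once they live in different partitions.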
