From "Designing Data-Intensive Applications"

Author: Martin Kleppmann
Publisher: "O'Reilly Media, Inc."
Year: 2017
Category: Computers

Chapter 5: Replication
Key Insight 6 from this chapter

Multi-Leader Replication and Conflict Resolution

Key Insight

Multi-leader replication, also known as master-master or active/active replication, extends the leader-based model by allowing more than one node to accept writes. Each leader processes its local writes and simultaneously acts as a follower of the other leaders, forwarding its changes to them asynchronously. The added complexity rarely pays off within a single datacenter, but the model offers clear advantages where a single point of write entry is too restrictive.
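As a rough illustration of this dual role, the following Python sketch models two leaders that apply writes locally and ship them to each other later. The `Leader` class, its `outbox` queue, and the method names are hypothetical and exist only for this example; it deliberately contains no conflict handling, so it also shows how the two replicas can end up disagreeing.

```python
from collections import deque


class Leader:
    """Hypothetical node that accepts local writes and replays peers' writes."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.outbox = deque()  # local changes not yet shipped to the peer

    def write(self, key, value):
        # Acting as a leader: apply the write locally, then queue it.
        self.data[key] = value
        self.outbox.append((key, value))

    def replicate_to(self, peer):
        # Asynchronous propagation: runs some time after the local writes.
        while self.outbox:
            key, value = self.outbox.popleft()
            peer.apply_remote(key, value)

    def apply_remote(self, key, value):
        # Acting as a follower: apply without re-queueing, to avoid loops.
        self.data[key] = value


a, b = Leader("dc-a"), Leader("dc-b")
a.write("title", "A")   # accepted locally in datacenter A
b.write("title", "B")   # accepted concurrently in datacenter B
a.replicate_to(b)
b.replicate_to(a)

# Each leader blindly applied the other's write, so the replicas diverged.
print(a.data["title"], b.data["title"])  # B A
```

Both writes succeeded from the client's point of view, yet the two leaders now hold different values for the same key.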

The main use cases are multi-datacenter deployments, where a leader in each datacenter processes local writes, which reduces write latency and improves tolerance of datacenter and network outages. Client applications that must work offline, such as mobile calendar apps, follow the same pattern: each device holds a local database that acts as a leader and syncs its changes once connectivity is restored. Real-time collaborative editing tools such as Google Docs face a similar problem: they apply local edits instantly and replicate them asynchronously to other users instead of imposing locks.

The most significant challenge in a multi-leader setup is resolving the write conflicts that arise when the same data is modified concurrently on different leaders. In a single-leader system the second writer either blocks until the first write completes or is aborted, so conflicts never reach the replicas; in a multi-leader system both writes succeed locally and the conflict is detected only later, asynchronously. Resolution must therefore be convergent: every replica has to arrive at the same final value once all changes have been replicated. Strategies range from the simple but data-loss-prone 'last write wins' (LWW), which keeps the write with the highest timestamp or unique ID and discards the others, to merging values (for example, taking the union of two shopping carts) or explicitly recording the conflict for the application to resolve. Research into 'Conflict-Free Replicated Datatypes' (CRDTs) and operational transformation aims to automate this process. The replication topology matters as well: in an all-to-all topology, writes can arrive at a replica in a different order than they were made, which violates causality unless a mechanism such as version vectors is used to order events correctly.
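The strategies named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular database implements them: the function names `lww`, `merge_carts`, and `compare` are hypothetical, versions are modelled as `(timestamp, replica_id, value)` tuples, and version vectors as plain dictionaries mapping a replica ID to a counter.

```python
def lww(a, b):
    """Last write wins. Each version is (timestamp, replica_id, value).

    The replica ID breaks timestamp ties, so every replica picks the
    same winner regardless of the order in which it sees the versions.
    The losing write is silently discarded.
    """
    return max(a, b, key=lambda v: (v[0], v[1]))


def merge_carts(a, b):
    """Merge two conflicting shopping carts (sets of items) by union.

    No addition is lost, but a plain union cannot express removals.
    """
    return a | b


def compare(vv_a, vv_b):
    """Compare two version vectors (dict: replica_id -> counter)."""
    keys = vv_a.keys() | vv_b.keys()
    a_ge = all(vv_a.get(k, 0) >= vv_b.get(k, 0) for k in keys)
    b_ge = all(vv_b.get(k, 0) >= vv_a.get(k, 0) for k in keys)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "a_after_b"   # a has seen everything b has seen
    if b_ge:
        return "b_after_a"
    return "concurrent"      # neither saw the other: a real conflict


v1 = (1700000000.0, "dc-a", "B")
v2 = (1700000001.5, "dc-b", "C")
winner = lww(v1, v2)                       # v2; the write "B" is lost
cart = merge_carts({"milk", "eggs"}, {"milk", "flour"})
ordered = compare({"a": 2, "b": 1}, {"a": 1, "b": 1})
conflict = compare({"a": 2, "b": 0}, {"a": 1, "b": 1})
```

Here `lww` converges because it is symmetric in its arguments, `merge_carts` keeps both users' additions, and `compare` reports `"concurrent"` only when neither write could have known about the other, which is the case that actually needs resolution.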