
From "Designing Data-Intensive Applications"

Author: Martin Kleppmann
Publisher: "O'Reilly Media, Inc."
Year: 2017
Category: Computers

Chapter 4: Encoding and Evolution
Key Insight 5 from this chapter

Diverse Dataflow Patterns and Compatibility Requirements


Dataflow through databases involves a writing process encoding data and a reading process decoding it. In environments with rolling upgrades or multiple applications accessing the same database, older and newer code versions often coexist. This necessitates both backward compatibility (newer code can read data written by older code) and forward compatibility (older code can read data written by newer code) for database schemas. A critical consideration is ensuring that older code reading, modifying, and rewriting data does not inadvertently discard newer fields it doesn't understand. This highlights the 'data outlives code' principle: stored data persists in its original encoding long after application code changes. Schema evolution allows a database to functionally appear as if it uses a single schema, even though it contains a mix of historical encodings.
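The unknown-field pitfall can be illustrated with a minimal sketch (the record layout and "photo_url" field are hypothetical, chosen only for illustration): older code that decodes a record into a generic structure, rather than a fixed model class with only the fields it knows, can rewrite the record without losing fields added by newer code.

```python
import json

# Hypothetical record written by a NEWER app version: it includes a
# "photo_url" field that older code does not know about.
stored = json.dumps({
    "user_id": 42,
    "name": "Alice",
    "photo_url": "https://example.com/alice.jpg",
})

# OLDER code performs a read-modify-write. Decoding into a plain dict
# keeps the unknown "photo_url" key intact; a fixed schema class with
# only known fields would silently drop it on rewrite.
record = json.loads(stored)
record["name"] = "Alice Smith"   # the only field old code means to change
rewritten = json.dumps(record)

# The newer field survives the round trip.
assert json.loads(rewritten)["photo_url"] == "https://example.com/alice.jpg"
```

Whether unknown fields survive depends on the decoding style: schema-driven formats such as Avro and Protocol Buffers handle this at the library level, while hand-rolled model classes often do not.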

Service-oriented architectures (SOAs) and microservices rely on dataflow through service calls, commonly implemented via REST or RPC. Clients encode requests, servers decode and process them, then encode responses for clients to decode. Independent deployability of services is a key goal, requiring robust compatibility between different versions of service APIs. RPC attempts to abstract network communication as local function calls (location transparency), but this abstraction is fundamentally flawed due to inherent differences: network calls are unpredictable (prone to timeouts, retries, and variable latency) and require data serialization, unlike predictable, fast local calls with direct object references. Datatype translation across different programming languages also poses challenges for RPC frameworks.
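The mismatch between network calls and local calls can be made concrete with a sketch (the function names and the simulated flaky service are hypothetical): a caller over the network must choose timeouts and retry policy explicitly, and a retried request may execute more than once, an idempotency concern that local function calls never raise.

```python
import socket

def fetch_with_retry(call, retries=3, timeout=1.0):
    """Invoke a remote call, retrying on transient network failures.

    Unlike a local call with a direct object reference, the network call
    can time out or fail partway, so the caller must bound the wait and
    decide how many times to retry.
    """
    last_error = None
    for attempt in range(retries):
        try:
            return call(timeout=timeout)
        except (socket.timeout, ConnectionError) as exc:
            last_error = exc  # transient failure: try again
    raise last_error

# A stand-in "remote" call that fails twice before succeeding,
# simulating the unpredictable network the paragraph describes.
state = {"calls": 0}

def flaky_call(timeout):
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("simulated network failure")
    return "response"

result = fetch_with_retry(flaky_call)
assert result == "response" and state["calls"] == 3
```

Note that the remote operation ran three times to produce one result; if it had side effects (such as charging a payment), the service would need to deduplicate retries, which is exactly the kind of concern location transparency hides.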

Asynchronous message-passing systems, using message brokers (queues) or distributed actor frameworks, offer another dataflow pattern. Message brokers enhance reliability by buffering messages, redelivering upon crashes, and decoupling senders from recipients, facilitating one-to-many communication. Communication is typically one-way and asynchronous. Distributed actor frameworks extend this message-passing model across nodes, transparently encoding and decoding messages over the network. For rolling upgrades in these systems, backward and forward compatibility of message encodings is vital to ensure different application versions can communicate seamlessly. This often means replacing default language-specific serialization with more compatible formats like Protocol Buffers.
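The compatibility requirement for mixed producer and consumer versions can be sketched as follows (the broker is simulated with an in-process queue, and the "priority" field is a hypothetical addition in a newer version): an old consumer must tolerate unknown fields (forward compatibility), and a new consumer must supply defaults for fields absent from old messages (backward compatibility), mirroring how schema-driven formats like Protocol Buffers treat unknown and missing optional fields.

```python
import json
from queue import Queue

broker = Queue()  # stand-in for a real message broker (e.g. RabbitMQ, Kafka)

# A NEW producer adds a "priority" field older consumers don't know about.
broker.put(json.dumps({"task": "resize", "image_id": 7, "priority": "high"}))
# An OLD producer omits "priority" entirely.
broker.put(json.dumps({"task": "resize", "image_id": 8}))

def old_consumer(raw):
    msg = json.loads(raw)
    # Forward compatibility: the unknown "priority" key is simply ignored.
    return (msg["task"], msg["image_id"])

def new_consumer(raw):
    msg = json.loads(raw)
    # Backward compatibility: a missing newer field falls back to a default.
    return (msg["task"], msg["image_id"], msg.get("priority", "normal"))

first, second = broker.get(), broker.get()
assert old_consumer(first) == ("resize", 7)
assert new_consumer(second) == ("resize", 8, "normal")
```

During a rolling upgrade both consumer versions drain the same queue, so both directions of compatibility must hold simultaneously; language-specific default serializers (such as Java's built-in serialization) typically break one or both, which is why the text recommends replacing them.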
