From "Fundamentals of Software Architecture"
Caching Models, Data Management, and Collisions in Space-Based Architecture
Key Insight
Space-based architecture predominantly employs replicated caching, in which each processing unit contains an in-memory data grid that is automatically synchronized across all units sharing the same named cache. This model offers high performance and robust fault tolerance with no single point of failure, making it suitable for small caches (under 100 megabytes) of relatively static data with low update rates. Conversely, for large caches (over 500 megabytes) or dynamic data with high update rates, distributed caching is used, which requires an external centralized cache server. While distributed caching ensures high data consistency, it introduces remote-access latency and a single point of failure. A hybrid near-cache model, consisting of a distributed 'full backing cache' and smaller 'front caches' within each processing unit that are kept in sync with the backing cache but not with one another, is not recommended because it produces inconsistent performance across processing units. The choice between replicated and distributed caching ultimately balances data consistency needs against performance and fault tolerance requirements.
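The sizing guidance above can be sketched as a simple selection helper. The function name and the update-rate cutoff are illustrative assumptions; the chapter gives concrete thresholds only for cache size:

```python
def choose_caching_model(cache_size_mb: float, updates_per_sec: float,
                         relatively_static: bool) -> str:
    """Illustrative heuristic for picking a space-based caching model.

    Thresholds follow the chapter's rules of thumb: replicated caching
    suits small (< 100 MB), relatively static caches with low update
    rates; distributed caching suits large (> 500 MB) or frequently
    updated caches.
    """
    HIGH_UPDATE_RATE = 50  # assumed cutoff; the book gives no exact number
    if cache_size_mb < 100 and relatively_static and updates_per_sec < HIGH_UPDATE_RATE:
        return "replicated"   # in-memory grid in every processing unit
    if cache_size_mb > 500 or updates_per_sec >= HIGH_UPDATE_RATE:
        return "distributed"  # external centralized cache server
    return "either (weigh consistency vs. performance and fault tolerance)"
```

For borderline cases the helper deliberately punts: the chapter frames the decision as a trade-off between data consistency on one side and performance plus fault tolerance on the other.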
The architecture uses dedicated components for data management, since processing units never access databases directly. Data pumps are asynchronous messaging components that send updates from processing units (which own those updates) to the database, providing eventual consistency. They support guaranteed delivery and ordered messaging (FIFO queueing) and decouple processing units from data writers. Data pumps carry contracts such as JSON or XML schemas, often containing only the changed values (e.g., a new phone number along with the customer ID). Data writers accept messages from data pumps and update the database; they can be domain-based (handling all updates for a domain such as 'customer') or dedicated to each processing unit class for improved scalability. Data readers fetch data from the database and send it to processing units via a 'reverse data pump' when a cache must be reloaded, for example after all cache instances crash, during a redeployment, or when archived data is needed. Together, data writers and readers form a data abstraction layer that decouples processing units from the underlying database structure and allows incremental database changes through transformation logic.
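A minimal in-process sketch of a data pump feeding a domain-based data writer, assuming a plain FIFO queue and a JSON message shape. A real system would use a durable message broker for guaranteed delivery; all names and the message format here are illustrative, not from the book:

```python
import json
import queue

data_pump = queue.Queue()  # FIFO queue preserves update order

def send_update(customer_id: str, changed_fields: dict) -> None:
    """Processing unit side: pump only the changed values plus the key."""
    message = {"customer_id": customer_id, **changed_fields}
    data_pump.put(json.dumps(message))  # JSON contract between pump and writer

def customer_data_writer(database: dict) -> None:
    """Domain-based writer: drains the pump and applies updates to storage."""
    while not data_pump.empty():
        update = json.loads(data_pump.get())
        row = database.setdefault(update.pop("customer_id"), {})
        row.update(update)  # eventual consistency: the DB lags the cache

# Usage: a changed phone number travels with only the customer ID.
db = {}
send_update("cust-42", {"phone": "555-0100"})
customer_data_writer(db)
# db is now {"cust-42": {"phone": "555-0100"}}
```

Note that the processing unit returns immediately after `send_update`; the database catches up whenever the writer drains the queue, which is what makes the consistency eventual rather than transactional.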
Data collisions are a potential issue in active/active replicated caching: they occur when the same data is updated concurrently in different cache instances within the replication-latency window. For example, if Service A updates a blue widget count to 490 units while Service B concurrently updates it to 495 units, the cross-replication overwrites leave both caches with incorrect, out-of-sync values (490 and 495 instead of the correct 485). The collision rate can be estimated with the formula `Collision Rate = N * (UR^2 / S) * RL`, where N is the number of service instances, UR is the update rate, S is the cache size in rows, and RL is the replication latency (expressed in the same time unit as the update rate). With 5 instances, 20 updates per second (72,000 hourly updates), 50,000 rows, and 100 milliseconds of latency, the formula yields approximately 14.4 collisions per hour (0.02% of updates). Reducing replication latency to 1 millisecond decreases collisions to about 0.1 per hour, while reducing cache size to 10,000 rows increases them to 72 per hour. Replication latency, ideally derived from production measurements (100 milliseconds is a common planning number), significantly affects data consistency, and knowing peak update rates is crucial for meaningful collision-rate calculations.
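The collision-rate formula and the chapter's worked numbers can be reproduced directly; only the function and parameter names are invented for illustration:

```python
def collision_rate(num_instances: int, updates_per_hour: float,
                   cache_rows: int, latency_ms: float) -> float:
    """Expected data collisions per hour in active/active replicated caching.

    Collision Rate = N * (UR^2 / S) * RL, with the replication latency
    converted from milliseconds to hours so its unit matches the update rate.
    """
    latency_hours = latency_ms / 3_600_000
    return num_instances * (updates_per_hour ** 2 / cache_rows) * latency_hours

# The chapter's example: 5 instances, 72,000 updates/hour (20/sec),
# 50,000 rows, 100 ms replication latency.
print(round(collision_rate(5, 72_000, 50_000, 100), 1))  # → 14.4 per hour
print(round(collision_rate(5, 72_000, 50_000, 1), 3))    # 1 ms latency → 0.144
print(round(collision_rate(5, 72_000, 10_000, 100), 1))  # smaller cache → 72.0
```

The unit conversion is the easy part to get wrong: because the update rate is per hour, the 100 ms latency must also be expressed in hours before multiplying.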
Summary of Fundamentals of Software Architecture by Mark Richards and Neal Ford.