From "Designing Data-Intensive Applications"

Author: Martin Kleppmann
Publisher: "O'Reilly Media, Inc."
Year: 2017
Category: Computers

Chapter 3: Storage and Retrieval
Key Insight 5 from this chapter

Distinction Between OLTP and OLAP Workloads and Data Warehousing

Key Insight

Early databases were used mainly for 'transaction processing'. A 'transaction' originally corresponded to a commercial transaction, such as making a sale or placing an order, and later came to mean any group of reads and writes that forms a logical unit. This interactive, low-latency access pattern is now known as Online Transaction Processing (OLTP). OLTP applications typically handle a high volume of requests, each touching a small number of records that are looked up by key through an index. These systems prioritize high availability and low latency, and because the workload consists of many random accesses, disk seek time is frequently the performance bottleneck.

Databases are also increasingly used for data analytics, a pattern termed Online Analytical Processing (OLAP). An OLAP query usually scans a very large number of records, reads only a few columns from each, and computes aggregate statistics such as counts, sums, or averages rather than returning raw data. These queries are typically written by business analysts for decision support. The two workloads differ along several axes. OLTP performs small, key-based reads and random-access, low-latency writes against the current state of the data. OLAP performs large aggregations over a history of events, receives its data through bulk imports (Extract, Transform, Load, or ETL) or event streams, and is usually limited by disk bandwidth rather than seek time.
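The contrast between the two access patterns can be sketched with a small in-memory SQLite database. This is only an illustration: the `sales` table, its columns, and the helper names `oltp_lookup` and `olap_total_by_product` are hypothetical and do not come from the book.

```python
import sqlite3

# Hypothetical sales table, used only to illustrate the two query shapes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "sale_id INTEGER PRIMARY KEY, product TEXT, store TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, "apple", "north", 2.0),
        (2, "banana", "north", 1.5),
        (3, "apple", "south", 2.5),
        (4, "apple", "north", 3.0),
    ],
)


def oltp_lookup(sale_id):
    """OLTP-style access: fetch one whole record by its primary key."""
    return conn.execute(
        "SELECT sale_id, product, store, amount FROM sales WHERE sale_id = ?",
        (sale_id,),
    ).fetchone()


def olap_total_by_product():
    """OLAP-style access: scan every row, read two columns, return aggregates."""
    return conn.execute(
        "SELECT product, SUM(amount) FROM sales "
        "GROUP BY product ORDER BY product"
    ).fetchall()


print(oltp_lookup(3))
print(olap_total_by_product())
```

The first query touches a single row through the primary-key index. The second query has to read every row but only the `product` and `amount` columns, which is the pattern that makes bandwidth, not seek time, the limiting factor at scale.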

To keep expensive analytical queries from degrading business-critical OLTP systems, large enterprises commonly run analytics on a separate 'data warehouse'. The warehouse holds a read-only copy of data from the various OLTP systems. Data is extracted from those systems, transformed into an analysis-friendly schema, cleaned, and loaded into the warehouse, a process known as ETL. Data warehouses are optimized for analytical access patterns and usually use a relational model organized as a 'star schema' or, when dimensions are further broken down into subdimensions, a 'snowflake schema'. At the center is a fact table in which each row represents an individual event (e.g., a customer purchase). Its columns reference 'dimension tables' (e.g., product, date, store) through foreign keys, which lets analysts query flexibly across what can be petabytes of historical data for business intelligence.
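A minimal sketch of the ETL step and a star schema, again using in-memory SQLite, may make the structure more concrete. Everything named here is an assumption made for illustration: the `OLTP_ORDERS` rows, the `dim_product`, `dim_store`, and `fact_sales` tables, and the functions `build_warehouse` and `revenue_by_category`. A real warehouse would also carry a date dimension and far more attributes.

```python
import sqlite3

# Hypothetical rows exported from an OLTP system; note the messy product names.
OLTP_ORDERS = [
    {"product": " Apple", "category": "fruit", "city": "Oslo", "qty": 2, "price": 1.0},
    {"product": "banana", "category": "fruit", "city": "Oslo", "qty": 1, "price": 0.5},
    {"product": "Soap", "category": "household", "city": "Lyon", "qty": 1, "price": 3.0},
    {"product": "apple ", "category": "fruit", "city": "Lyon", "qty": 4, "price": 1.0},
]


def build_warehouse(orders):
    """Extract rows, clean them, and load them into a tiny star schema."""
    wh = sqlite3.connect(":memory:")
    wh.executescript(
        """
        CREATE TABLE dim_product (
            product_sk INTEGER PRIMARY KEY, name TEXT UNIQUE, category TEXT);
        CREATE TABLE dim_store (
            store_sk INTEGER PRIMARY KEY, city TEXT UNIQUE);
        CREATE TABLE fact_sales (
            product_sk INTEGER REFERENCES dim_product(product_sk),
            store_sk INTEGER REFERENCES dim_store(store_sk),
            quantity INTEGER,
            net_price REAL);
        """
    )
    for order in orders:
        # Transform: normalize the product name so duplicates collapse.
        name = order["product"].strip().lower()
        wh.execute(
            "INSERT OR IGNORE INTO dim_product (name, category) VALUES (?, ?)",
            (name, order["category"]),
        )
        wh.execute(
            "INSERT OR IGNORE INTO dim_store (city) VALUES (?)", (order["city"],)
        )
        # Look up the surrogate keys that the fact row will reference.
        product_sk = wh.execute(
            "SELECT product_sk FROM dim_product WHERE name = ?", (name,)
        ).fetchone()[0]
        store_sk = wh.execute(
            "SELECT store_sk FROM dim_store WHERE city = ?", (order["city"],)
        ).fetchone()[0]
        # Load: one fact row per event.
        wh.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
            (product_sk, store_sk, order["qty"], order["price"]),
        )
    return wh


def revenue_by_category(wh):
    """Join the fact table to one dimension and aggregate by an attribute."""
    return wh.execute(
        """
        SELECT p.category, SUM(f.quantity * f.net_price)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_sk = f.product_sk
        GROUP BY p.category
        ORDER BY p.category
        """
    ).fetchall()


warehouse = build_warehouse(OLTP_ORDERS)
print(revenue_by_category(warehouse))
```

The fact table stores only keys and measures, while descriptive attributes such as `category` live in the dimension tables. Grouping by a different attribute is then a matter of joining a different dimension, without changing the fact rows.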
