From "Designing Data-Intensive Applications"
Distinction Between OLTP and OLAP Workloads and Data Warehousing
Key Insight
Database usage has evolved significantly from its early focus on 'transaction processing'. The term 'transaction' originally referred to commercial transactions such as making a sale or paying a salary; it now denotes any group of reads and writes that forms a logical unit. This interactive, low-latency access pattern is known as Online Transaction Processing (OLTP). OLTP applications typically handle a high volume of requests, each touching a small number of records looked up by key through an index. These systems prioritize high availability and low latency, and because their workload consists of many random accesses, disk seek time is frequently the performance bottleneck.
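The OLTP pattern described above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The `accounts` table, its columns, and the `transfer` helper are hypothetical names chosen for illustration, not taken from the book. Each request touches a couple of rows located by primary key and groups its writes into one transaction.

```python
import sqlite3


def make_accounts_db() -> sqlite3.Connection:
    """Create a tiny in-memory table standing in for an OLTP store (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " account_id INTEGER PRIMARY KEY,"  # the primary key is the lookup index
        " owner TEXT NOT NULL,"
        " balance_cents INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?, ?)",
        [(1, "alice", 10_000), (2, "bob", 5_000)],
    )
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src_id: int, dst_id: int, amount_cents: int) -> None:
    """Move money between two accounts as one logical unit of writes."""
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        debited = conn.execute(
            "UPDATE accounts SET balance_cents = balance_cents - ?"
            " WHERE account_id = ? AND balance_cents >= ?",
            (amount_cents, src_id, amount_cents),
        )
        if debited.rowcount != 1:
            raise ValueError("unknown source account or insufficient funds")
        credited = conn.execute(
            "UPDATE accounts SET balance_cents = balance_cents + ? WHERE account_id = ?",
            (amount_cents, dst_id),
        )
        if credited.rowcount != 1:
            raise ValueError("unknown destination account")


def balance(conn: sqlite3.Connection, account_id: int) -> int:
    """Small key-based read: fetch one record by its primary key."""
    row = conn.execute(
        "SELECT balance_cents FROM accounts WHERE account_id = ?", (account_id,)
    ).fetchone()
    return row[0]
```

A call such as `transfer(conn, 1, 2, 2_500)` reads and writes only two rows, which is why this kind of workload is dominated by random access rather than by large scans.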
In contrast, databases are increasingly used for data analytics, a pattern termed Online Analytical Processing (OLAP). OLAP queries usually scan vast numbers of records but read only a few columns of each, and they compute aggregate statistics such as counts, sums, or averages rather than returning raw rows. They are typically written by business analysts for decision support. The two patterns differ along several axes: OLTP performs small, key-based reads and random-access writes against the current state of the data, whereas OLAP performs large-scale aggregations over a history of events that arrives via bulk import (Extract, Transform, Load, or ETL). For OLAP, disk bandwidth rather than seek time is the primary bottleneck.
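For contrast, an analytical query might look like the following sketch, again using sqlite3. The `sales_events` table and its synthetic rows are assumptions made up for this example. The query reads every row but only three of the five columns, and returns a handful of aggregates instead of raw records.

```python
import sqlite3


def build_sales_events(n_events: int = 10_000) -> sqlite3.Connection:
    """Bulk-load synthetic historical events (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales_events ("
        " event_id INTEGER PRIMARY KEY,"
        " product_category TEXT,"
        " quantity INTEGER,"
        " net_price_cents INTEGER,"
        " customer_note TEXT)"  # wide column that the analytical query never reads
    )
    rows = (
        (
            i,
            "fruit" if i % 2 == 0 else "candy",
            1 + i % 3,
            100 + (i % 5) * 50,
            "not used by the query",
        )
        for i in range(n_events)
    )
    conn.executemany("INSERT INTO sales_events VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def revenue_by_category(conn: sqlite3.Connection) -> list:
    """Scan all events and return (category, event count, revenue, average price)."""
    return conn.execute(
        "SELECT product_category,"
        "       COUNT(*),"
        "       SUM(quantity * net_price_cents),"
        "       AVG(net_price_cents)"
        " FROM sales_events"
        " GROUP BY product_category"
        " ORDER BY product_category"
    ).fetchall()
```

No index helps here, because every row contributes to the result; the cost is governed by how quickly the rows can be read sequentially.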
To keep resource-intensive analytical queries from degrading critical OLTP systems, large enterprises commonly run a separate 'data warehouse' for OLAP. The warehouse holds a read-only copy of data from the various OLTP systems: data is extracted from them, transformed into an analysis-friendly schema, cleaned, and loaded into the warehouse (the ETL process). Data warehouses are optimized for analytical access patterns and often use a relational model organized as a 'star schema' or 'snowflake schema'. At the center is a fact table in which each row represents an individual event (e.g., a customer purchase); its foreign keys reference 'dimension tables' (e.g., product, date, store) that describe the event. This layout enables flexible querying across petabytes of historical data for business intelligence.
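The ETL step and the star schema can be sketched together on a toy scale. Everything below is an illustrative assumption: the `RAW_ORDERS` rows stand in for data extracted from an OLTP system, and the table and column names (`fact_sales`, `dim_product`, `dim_store`, `dim_date`) are chosen for this example. A real warehouse would load in bulk rather than row by row.

```python
import sqlite3
from datetime import date

# Hypothetical rows as they might arrive from an OLTP order system (the "extract" step).
RAW_ORDERS = [
    {"sku": " ab-1 ", "product": "Bananas", "category": "Fruit",
     "store": "Seattle", "sold_on": "2024-01-05", "qty": 3, "price_cents": 120},
    {"sku": "AB-1", "product": "Bananas", "category": "fruit",
     "store": "seattle ", "sold_on": "2024-01-06", "qty": 1, "price_cents": 120},
    {"sku": "CD-2", "product": "Fish food", "category": "Pet supplies",
     "store": "Palo Alto", "sold_on": "2024-01-06", "qty": 2, "price_cents": 450},
]

SCHEMA = """
CREATE TABLE dim_product (product_sk INTEGER PRIMARY KEY, sku TEXT UNIQUE, name TEXT, category TEXT);
CREATE TABLE dim_store   (store_sk INTEGER PRIMARY KEY, city TEXT UNIQUE);
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER, weekday TEXT);
CREATE TABLE fact_sales (
    date_key        INTEGER REFERENCES dim_date(date_key),
    product_sk      INTEGER REFERENCES dim_product(product_sk),
    store_sk        INTEGER REFERENCES dim_store(store_sk),
    quantity        INTEGER,
    net_price_cents INTEGER
);
"""


def load_warehouse(raw_orders: list) -> sqlite3.Connection:
    """Transform raw order rows and load them into a small star schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    for row in raw_orders:
        # Transform and clean: normalize keys so duplicates collapse into one dimension row.
        sku = row["sku"].strip().upper()
        category = row["category"].strip().lower()
        city = row["store"].strip().title()
        day = date.fromisoformat(row["sold_on"])
        date_key = day.year * 10_000 + day.month * 100 + day.day  # e.g. 20240105

        # Load dimensions; INSERT OR IGNORE skips rows that already exist.
        conn.execute(
            "INSERT OR IGNORE INTO dim_product (sku, name, category) VALUES (?, ?, ?)",
            (sku, row["product"], category),
        )
        conn.execute("INSERT OR IGNORE INTO dim_store (city) VALUES (?)", (city,))
        conn.execute(
            "INSERT OR IGNORE INTO dim_date VALUES (?, ?, ?, ?)",
            (date_key, day.year, day.month, day.strftime("%A")),
        )

        # Look up surrogate keys, then load one fact row per event.
        (product_sk,) = conn.execute(
            "SELECT product_sk FROM dim_product WHERE sku = ?", (sku,)
        ).fetchone()
        (store_sk,) = conn.execute(
            "SELECT store_sk FROM dim_store WHERE city = ?", (city,)
        ).fetchone()
        conn.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
            (date_key, product_sk, store_sk, row["qty"], row["price_cents"]),
        )
    conn.commit()
    return conn


def revenue_by_category_and_city(conn: sqlite3.Connection) -> list:
    """Join the fact table to two dimensions and aggregate revenue."""
    return conn.execute(
        "SELECT p.category, s.city, SUM(f.quantity * f.net_price_cents)"
        " FROM fact_sales AS f"
        " JOIN dim_product AS p ON f.product_sk = p.product_sk"
        " JOIN dim_store AS s ON f.store_sk = s.store_sk"
        " GROUP BY p.category, s.city"
        " ORDER BY p.category, s.city"
    ).fetchall()
```

Because the descriptive attributes live in the dimension tables, the same fact rows can be grouped by category, city, weekday, or any combination without changing how the events are stored.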
Access the complete Designing Data-Intensive Applications summary with audio narration, key takeaways, and actionable insights from Martin Kleppmann.