Chapter 3 - Designing Data-Intensive Applications

Syan S.P ยท December 10, 2024

Chapter 3 Storage and Retrieval

PART 1: Storage Engines & Log-Structured Approaches

1. Storage Engine Types

  • Log-Structured (LSM Trees): Optimized for writes using append-only, immutable segments.
  • Page-Oriented (B-Trees): Optimized for reads with update-in-place architecture.

2. Log-Structured Engines

  • Use append-only logs for persistent storage.
  • Data is written to segment files and later compacted to remove duplicates and retain the latest version.
  • Hash Index: An in-memory map of key โ†’ offset in the file for fast lookup; requires sufficient RAM.
  • Tombstones are used to mark deletions instead of removing data immediately.

3. SSTables and LSM Trees

  • SSTables (Sorted String Tables): Store sorted key-value pairs in immutable files.
  • Use a memtable (an in-memory tree) to buffer writes before flushing to disk.
  • Sorted structure allows faster merging compared to unordered logs.
  • Bloom filters help avoid unnecessary lookups in segments.

Compaction Strategies

  • Size-tiered compaction: Merge smaller segments into larger ones.
  • Leveled compaction: Move data across levels while keeping non-overlapping key ranges per level.

PART 2: B-Trees and Index Comparisons

1. B-Trees

  • Store data in fixed-size pages.
  • Organized as a tree with root, internal nodes, and leaf nodes.
  • Well-suited for range queries and efficient reads.

2. B-Tree Operations

  • Use update-in-place for writes; may split pages when inserting.
  • Rely on write-ahead logs (WAL) for crash recovery.
  • Use latches (lightweight locks) to support concurrent access.

3. Optimizations

  • Sibling pointers on leaf nodes help with efficient range scans.
  • Some systems use copy-on-write instead of in-place updates for crash safety and consistency.
  • Systems like SQLite use B-trees for both table storage and indexes.

4. B-Trees vs LSM Trees

Feature B-Trees LSM Trees
Best for Read-heavy workloads Write-heavy workloads
Write Strategy Update-in-place Append-only with compaction
Storage Consistent performance Lower write amplification on SSDs
Compression Limited Typically better
Semantics Strong transactional semantics Eventual consistency more common

PART 3: Advanced Indexes, In-Memory Databases, and Analytics

1. Secondary and Multi-Column Indexes

  • Secondary Index: Multiple rows may share the same index key.

Storage Approaches

  • Heap file: Data stored separately from index.
  • Clustered index: Rows stored inline with index entries.
  • Covering index: Partial rows stored in index to optimize query performance.

Multi-Column Indexing

  • Concatenated index: Index built on multiple columns in a defined order.
  • Multi-dimensional index: Suitable for spatial or geo data; often use space-filling curves for mapping.
  • Use Levenshtein automata to support edit-distance and approximate text matching.

3. In-Memory Databases

  • RAM is affordable enough to keep the entire database in memory.

Durability Options

  • Logs, snapshots, battery-backed RAM, or replication.
  • Faster due to the absence of disk encoding/decoding overhead.
  • Can support advanced structures like sets and queues (e.g., Redis).

4. OLTP vs OLAP

  • OLTP (Online Transaction Processing): Focused on fast, frequent, small-scale writes and reads.
  • OLAP (Online Analytical Processing): Focused on large scans, aggregates, and historical analysis.
  • ETL pipelines are used to load OLTP data into data warehouses for OLAP to avoid performance impacts.

5. Column-Oriented Storage

  • Store data by column instead of by row.
  • Ideal for analytics: read fewer columns, compress better.

Compression Strategies

  • Bitmap encoding
  • Dictionary encoding

  • Sort order improves filtering and enhances compression ratios.

6. Writes in Column Stores

  • Use LSM-tree style approach for writes.
  • Data is buffered in memory and then flushed to disk in batches.
  • Some systems store the same data in multiple sort orders to optimize different query patterns.

Twitter, Facebook