TRACK B · HLD · MODULE B3 · WEEK 13

CACHING

Cache-Aside · Write-Through · Write-Back · Stampede Prevention
LRU · LFU · TTL · Redis Structures · CDN · Rate Limiting
10 TOPICS · 4 TASKS · 5 REDIS STRUCTURES · MODULE B3
Latency ladder:
RAM      ~100 ns
Redis    ~0.5 ms
SSD DB   1–5 ms
HDD DB   10–50 ms
Redis is 10–100× faster than a DB query
In-process cache is 100,000× faster than disk
Cache = trading space for time
Cache Patterns
Four strategies — choose based on consistency needs and write frequency
Cache-Aside
MOST COMMON
Application controls all cache interactions. Miss → app fetches DB and populates cache. Write → app invalidates (deletes) cache key.
R1 cache.get(key) → HIT → return
R2 MISS → db.get(key)
R3 cache.set(key, val, TTL)
W1 db.update(key, val)
W2 cache.delete(key) ← safer than update
✓ Only caches requested data (no wasted memory)
✓ DB failures degrade gracefully
✗ First request after miss is slow
✗ Brief stale window between write and invalidation
Write-Through
STRONG CONSISTENCY
Every write goes to BOTH cache and DB synchronously. Client confirmed only after both succeed. Reads always find fresh data in cache.
W1 db.write(key, val)
W2 cache.set(key, val)
W3 confirm to client
R1 cache.get(key) → always fresh
✓ Read-after-write consistency guaranteed
✓ Cache always has latest data
✗ Adds latency to every write (two writes)
✗ Caches data that may never be re-read
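A minimal in-process sketch of the write-through flow above. `WriteThroughCache` and its map-backed "DB" are illustrative stand-ins, not a real client:

```java
import java.util.HashMap;
import java.util.Map;

// Write-through: every write hits cache AND backing store before the caller is ACKed.
class WriteThroughCache {
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> db    = new HashMap<>(); // stand-in for the real DB

    void put(String key, String val) {
        db.put(key, val);    // W1: durable write first
        cache.put(key, val); // W2: cache updated before we confirm
        // W3: only now would we ACK the client; both copies agree
    }

    String get(String key) {
        return cache.get(key); // reads always find fresh data after a write
    }
}
```

The write order (DB first, then cache) matters: if the cache write happened first and the DB write failed, readers would see data that was never persisted.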
Write-Back
HIGH-WRITE WORKLOADS
Write to cache immediately (fast ACK). DB updated asynchronously in background. Risk: data loss if cache crashes before flush.
W1 cache.set(key, val) → ACK client
W2 (background) db.write(key, val)
R1 cache.get(key) → always fresh
✓ Lowest write latency
✓ Batches DB writes (more efficient)
✗ Data loss if cache crashes before flush
✗ DB temporarily inconsistent with cache
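A write-back sketch under the same assumptions (maps standing in for cache and DB); the single-thread executor plays the role of the async flusher, and the data-loss window is exactly the gap before it runs:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Write-back: ACK after the cache write; a background thread flushes to the "DB" later.
class WriteBackCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    final Map<String, String> db = new ConcurrentHashMap<>(); // stand-in store, exposed for illustration
    private final ExecutorService flusher = Executors.newSingleThreadExecutor();

    void put(String key, String val) {
        cache.put(key, val);                               // fast path: caller is ACKed here
        flusher.submit(() -> db.put(key, cache.get(key))); // async flush; lost if the process dies first
    }

    String get(String key) { return cache.get(key); }      // cache is authoritative, reads are fresh

    void close() {                                         // drain pending flushes
        flusher.shutdown();
        try { flusher.awaitTermination(5, TimeUnit.SECONDS); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
```

A production write-back cache would batch flushes and persist a write-ahead log; this sketch only shows the ordering and the crash-loss window.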
Read-Through
SIMPLE APP CODE
Cache layer transparently fetches from DB on miss. Application talks only to cache — never to DB directly. Cache library handles miss logic.
R1 app.get(key) → asks cache
R2 cache HIT → return val
R3 MISS → cache fetches DB
R4 cache stores + returns val
✓ Simpler application code
✓ Cache manages all miss logic
✗ Cold start: all first reads slow
✗ Less control over what gets cached
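In-process, read-through is essentially what `ConcurrentHashMap.computeIfAbsent` gives you: the caller never sees the miss path. A sketch with a hypothetical `loader` standing in for the DB fetch:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Read-through: the cache owns the miss logic; callers only ever talk to the cache.
class ReadThroughCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> loader; // stand-in for the DB fetch on miss

    ReadThroughCache(Function<String, String> loader) { this.loader = loader; }

    String get(String key) {
        // On a miss, computeIfAbsent invokes the loader once and stores the result.
        return cache.computeIfAbsent(key, loader);
    }
}
```

Libraries like Guava/Caffeine provide this as a `LoadingCache` with TTL and size bounds; the point here is only the shape of the API.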
Cache-Aside in Java — the full pattern
class UserService {
    private final Cache  cache;
    private final UserDB db;

    public User getUser(long userId) {
        String key = "user:" + userId;

        User cached = cache.get(key);
        if (cached != null) return cached;   // ✅ CACHE HIT

        User user = db.findById(userId);        // DB query on miss
        if (user != null)                       // avoid caching "not found" (or cache a sentinel with short TTL)
            cache.set(key, user, Duration.ofMinutes(30)); // Populate with TTL
        return user;                            // CACHE MISS
    }

    public void updateUser(long userId, UserUpdate update) {
        db.update(userId, update);
        cache.delete("user:" + userId); // ← DELETE safer than SET (avoids stale-write race)
    }
}
Hit rate target: Cache is only worthwhile if hit rate exceeds ~80%. Monitor your cache hit rate constantly. If it's low: cache size too small, TTL too short, or you're caching long-tail data that's rarely re-read.
Eviction Policies
When the cache is full — which entry gets removed?
LRU
Least Recently Used. Evict entry not accessed for longest time. Recently accessed = likely to be re-accessed.
Implementation: DoublyLinkedList + HashMap
O(1) get, O(1) put, O(1) evict
Redis allkeys-lru (note: the Redis default is noeviction; set maxmemory-policy to enable LRU)
LFU
Least Frequently Used. Evict entry with fewest total accesses. Avoids cache pollution from one-time large scans.
Implementation: HashMap + per-frequency buckets (doubly linked lists)
O(1) get/put with buckets; a min-heap variant is O(log n)
Redis allkeys-lfu (since Redis 4.0)
TTL
Time-To-Live. Each entry has an expiry timestamp. After TTL, entry is invalid and re-fetched on next access.
cache.set(key, val, 3600)
Use TTL jitter to avoid stampede:
TTL = base + random(0, base*0.1)
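One way to get O(1) LFU is the bucket scheme: a value map, a frequency map, and one LinkedHashSet per frequency (on a tie, the oldest insertion at the lowest frequency is evicted). A single-threaded, illustrative sketch:

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;

// Minimal O(1) LFU: evict the key with the fewest accesses; LinkedHashSet keeps
// insertion order so ties evict the least-recently-added key at minFreq.
class LFUCache {
    private final int capacity;
    private final Map<Integer, Integer> vals  = new HashMap<>();
    private final Map<Integer, Integer> freqs = new HashMap<>();
    private final Map<Integer, LinkedHashSet<Integer>> buckets = new HashMap<>();
    private int minFreq = 0;

    LFUCache(int capacity) { this.capacity = capacity; }

    int get(int key) {
        if (!vals.containsKey(key)) return -1;
        touch(key);
        return vals.get(key);
    }

    void put(int key, int val) {
        if (capacity == 0) return;
        if (vals.containsKey(key)) { vals.put(key, val); touch(key); return; }
        if (vals.size() == capacity) {                 // evict least-frequent key
            int evict = buckets.get(minFreq).iterator().next();
            buckets.get(minFreq).remove(evict);
            vals.remove(evict); freqs.remove(evict);
        }
        vals.put(key, val); freqs.put(key, 1); minFreq = 1;
        buckets.computeIfAbsent(1, f -> new LinkedHashSet<>()).add(key);
    }

    private void touch(int key) {                      // promote key to the next bucket
        int f = freqs.get(key);
        buckets.get(f).remove(key);
        if (f == minFreq && buckets.get(f).isEmpty()) minFreq++;
        freqs.put(key, f + 1);
        buckets.computeIfAbsent(f + 1, x -> new LinkedHashSet<>()).add(key);
    }
}
```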
LRU Cache — DoublyLinkedList + HashMap (O(1) all ops) · JAVA
class LRUCache {
    private static final class Node {
        final int key; int val; Node prev, next;
        Node(int key, int val) { this.key = key; this.val = val; }
    }

    private final int capacity;
    private final Map<Integer, Node> map = new HashMap<>(); // generics need Integer, not int
    private final Node head = new Node(0, 0), tail = new Node(0, 0); // Sentinel nodes
    // One lock for both paths: get() also reorders the list, so even "reads" mutate
    // shared state and a read/write lock would be unsound here.
    private final ReentrantLock lock = new ReentrantLock();

    LRUCache(int capacity) {
        this.capacity = capacity;
        head.next = tail; tail.prev = head;
    }

    public int get(int key) {
        lock.lock();
        try {
            Node n = map.get(key);
            if (n == null) return -1;
            moveToFront(n);  // Most-recently-used → head
            return n.val;
        } finally { lock.unlock(); }
    }

    public void put(int key, int val) {
        lock.lock();
        try {
            Node n = map.get(key);
            if (n != null) { n.val = val; moveToFront(n); }
            else {
                n = new Node(key, val);
                map.put(key, n); addToFront(n);
                if (map.size() > capacity) evictTail(); // Remove LRU
            }
        } finally { lock.unlock(); }
    }

    private void moveToFront(Node n) { unlink(n); addToFront(n); }
    private void addToFront(Node n)  { n.next = head.next; n.prev = head; head.next.prev = n; head.next = n; }
    private void unlink(Node n)      { n.prev.next = n.next; n.next.prev = n.prev; }
    private void evictTail()         { Node lru = tail.prev; unlink(lru); map.remove(lru.key); }
}
TTL Jitter pattern: If many cache keys share the same TTL, they all expire simultaneously → thundering herd. Always add random jitter: TTL = base_ttl + random(0, base_ttl * 0.1). This spreads expiry events across time and prevents concurrent stampedes.
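The jitter rule above as a tiny helper, assuming TTLs in seconds:

```java
import java.util.concurrent.ThreadLocalRandom;

// TTL jitter: base TTL plus up to 10% random spread, so keys populated together
// don't all expire in the same instant (the stampede trigger).
final class Ttl {
    static long withJitter(long baseSeconds) {
        long jitter = ThreadLocalRandom.current().nextLong(0, Math.max(1, baseSeconds / 10));
        return baseSeconds + jitter;
    }
}
```

Usage: `cache.set(key, val, Ttl.withJitter(3600))` yields TTLs uniformly spread over [3600, 3960) seconds.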
Cache Invalidation
"Only two hard things in CS: cache invalidation and naming things" — Phil Karlton
Three invalidation strategies compared · PATTERNS
Strategy 1: Delete on Write (safest, most common)
void updateProduct(id, product) {
    db.update(id, product);
    cache.delete("product:" + id);                          // ← DELETE, not SET
    cache.delete("products:category:" + product.categoryId); // ← invalidate list too
}
// Next read misses → fetches fresh from DB → repopulates
// Safe: avoids stale-write race condition

Strategy 2: Version-Based Keys (no explicit invalidation)
// Embed version in key. When data changes, bump version.
"product:42:v8" → "product:42:v9"   // old key naturally expires via TTL
// Application always reads latest version key.
// Great for: config data, feature flags, rarely-changing reference data

Strategy 3: Event-Based (CDC + Kafka)
// DB → Change Data Capture → Kafka → Cache Invalidation Service → Redis.delete(key)
DB_writes → Debezium/CDC → Kafka.publish(change_event) → CacheConsumer → cache.delete(key)
// Decoupled, real-time, works for distributed caches
// Cost: more infrastructure, event delivery latency, ordering guarantees needed
The Invalidation Race Condition — and how to prevent it · CONCURRENT
// Timeline of a nasty race:
Thread A: UPDATE product WHERE id=42        // starts write
Thread B: SELECT product WHERE id=42 → MISS   // reads old DB value (before A commits)
Thread A: cache.delete("product:42")         // invalidates
Thread B: cache.set("product:42", stale_val) // writes OLD value back!
// Result: cache holds stale data indefinitely until TTL expires 😱

Prevention:
  1. Use TTL as a safety net — even if stale, it expires eventually
  2. Use CAS (Compare-And-Set) — only write to cache if key still absent
     cache.setnx("product:42", freshVal)  ← only sets if key doesn't exist
  3. Use write lock during update:
     db.beginTransaction();
     db.update(id, product);
     cache.delete(key);        // ← delete INSIDE transaction before commit
     db.commit();
Never "update" cache on write — always delete. If you SET the cache on write and also SET it on read-miss, two concurrent operations can race to write stale vs fresh values. Deleting is atomic and safe — the next read will always get fresh data from the DB.
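Prevention step 2 (CAS) can be mimicked in-process with `ConcurrentHashMap.putIfAbsent`, which mirrors SETNX semantics; a sketch, not a Redis client:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// CAS-style repopulation: only write if the key is still absent, so a slow reader
// holding a stale value can't clobber a fresher entry written after an invalidation.
class CasCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    boolean setIfAbsent(String key, String val) {
        return cache.putIfAbsent(key, val) == null; // true only if we won the race
    }

    String get(String key)    { return cache.get(key); }
    void delete(String key)   { cache.remove(key); }
}
```

This narrows the race window rather than eliminating it entirely, which is why the TTL safety net in step 1 still matters.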
Cache Stampede (Thundering Herd)
When a popular cache key expires — 10,000 concurrent requests all hit the DB
// STAMPEDE VISUALISED — popular key expires at T=0
// DB load: OVERLOAD · Latency: SPIKING · Cache hit rate: RESTORING (once repopulated)
⚡ All concurrent misses pile onto DB simultaneously → overload → latency cascade
Three stampede prevention strategies · PATTERNS
Strategy 1: Mutex/Lock on Miss (most reliable)
String get(String key) {
    String val = cache.get(key);
    if (val != null) return val;                 // HIT — fast path

    // MISS — exactly ONE thread refreshes; others wait
    boolean locked = cache.setnx("lock:" + key, "1", 10); // TTL in seconds: lock auto-expires if the holder dies
    if (locked) {
        try {
            val = db.get(key);
            cache.set(key, val, TTL);
            return val;
        } finally { cache.delete("lock:"+key); }
    } else {
        Thread.sleep(50); return get(key); // Retry — lock holder will populate cache
    }
}

Strategy 2: Probabilistic Early Expiration (PER)
// Before TTL expires, probabilistically start refreshing
if (ttlRemaining < -Math.log(Math.random()) * recomputeTime) {
    refresh(); // One request triggers early refresh; others get (slightly stale) cached value
}

Strategy 3: Background Refresh (keep popular keys always warm)
// Scheduled job: refresh popular keys 30s before TTL expires
// Requires: popularity tracking (hit count per key)
scheduler.scheduleAtFixedRate(() -> hotKeys.forEach(k -> cache.refresh(k)), 30, 30, SECONDS);
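Strategy 2's decision rule, runnable with β = 1 (the XFetch formulation): the refresh probability rises as the TTL runs out, scaled by how long a recompute takes:

```java
import java.util.concurrent.ThreadLocalRandom;

// Probabilistic early expiration: each request rolls a die; the closer the key is
// to expiry (relative to recompute cost), the likelier this request triggers a refresh.
final class Per {
    static boolean shouldRefresh(long ttlRemainingMs, long recomputeMs) {
        double r = ThreadLocalRandom.current().nextDouble();  // [0, 1)
        return ttlRemainingMs < -Math.log(r) * recomputeMs;   // beta = 1
    }
}
```

An already-expired key (ttlRemaining ≤ 0) always refreshes; a key far from expiry almost never does, so at most a handful of requests recompute early instead of thousands stampeding at T=0.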
CDN stale-while-revalidate: Modern CDNs support Cache-Control: stale-while-revalidate=30 — serve stale content immediately while refreshing in background. This eliminates cache misses for CDN-served content entirely. The same principle applies to application-level caches.
Redis Data Structures
Five structures — each optimised for a specific access pattern
🔑
String
KEY → VALUE
SET user:42:name "Ajay"
GET user:42:name
INCR view_count:post:123
SETEX session:abc 3600 data
USE: Counters, flags, sessions, rate limit tokens, feature flags
📦
Hash
KEY → {FIELD: VALUE}
HSET user:42 name "Ajay" tier "pro"
HGET user:42 name
HMGET user:42 name tier email
HINCRBY user:42 login_count 1
USE: User profiles, config objects, shopping carts — fetch individual fields
📋
List
ORDERED · DUPLICATES OK
RPUSH queue:email email1 email2
LPOP queue:email
LRANGE feed:user:42 0 49
LLEN queue:email
USE: Job queues (FIFO), activity feeds, recent items, notifications
🎯
Set
UNIQUE MEMBERS
SADD active_users 42 88 123
SISMEMBER active_users 42
SUNION tags:post:1 tags:post:2
SINTER followers:A followers:B
USE: Tagging, unique visitor tracking, mutual friends, union/intersection
🏆
Sorted Set
SCORE-RANKED MEMBERS
ZADD leaderboard 9800 "alice"
ZRANGE leaderboard 0 9 REV WITHSCORES
ZADD rate:user:42 {ts} {reqId}
ZRANGEBYSCORE rate:user:42 (now-60) now
USE: Leaderboards, sliding window rate limiting, priority queues, geo proximity
Sliding Window Rate Limiter — Redis Sorted Set · REDIS
// Per-user: 100 requests per 60-second window
// ZADD rate_limit:{userId} {timestamp_ms} {unique_request_id}
// ZRANGEBYSCORE rate_limit:{userId} (now-60000) now → requests in last 60s

boolean isAllowed(String userId, int limit, long windowMs) {
    long   now        = System.currentTimeMillis();
    long   windowStart = now - windowMs;
    String key        = "rate:" + userId;

    // (pseudocode: the pipeline API shown is illustrative; real Redis clients differ)
    return redis.pipeline(pipe -> {
        pipe.zremrangebyscore(key, 0, windowStart);          // Remove expired entries
        pipe.zadd(key, now, UUID.randomUUID().toString()); // Add this request
        pipe.zcard(key);                                       // Count in window
        pipe.expire(key, windowMs / 1000);                    // Auto-expire key
    }).getCard() <= limit;
}
// ✅ Exact sliding window · ✅ Multi-server safe (shared Redis)
// Memory: O(requests in window) per user — scale with caution for large limits
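The same sliding-window-log idea without Redis: a single-node, in-memory sketch keeping a per-user deque of timestamps (the Sorted Set plays this role in the distributed version above):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sliding window log, single node: keep every request timestamp inside the window,
// drop the expired ones, allow the request if the remaining count is under the limit.
class SlidingWindowLimiter {
    private final int limit;
    private final long windowMs;
    private final Map<String, Deque<Long>> log = new ConcurrentHashMap<>();

    SlidingWindowLimiter(int limit, long windowMs) {
        this.limit = limit;
        this.windowMs = windowMs;
    }

    synchronized boolean isAllowed(String userId, long nowMs) {
        Deque<Long> ts = log.computeIfAbsent(userId, u -> new ArrayDeque<>());
        while (!ts.isEmpty() && ts.peekFirst() <= nowMs - windowMs) ts.pollFirst(); // expire old
        if (ts.size() >= limit) return false;  // window full → reject
        ts.addLast(nowMs);                     // record this request
        return true;
    }
}
```

Like the Redis version it is exact but O(requests-in-window) memory per user; unlike it, state lives in one process, so it cannot coordinate across servers.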
CDN — Content Delivery Network
Geographic caching at the edge — serving content from nearest node
// WITHOUT CDN vs WITH CDN
No CDN:    User (Mumbai) ──────────────────→ Origin (US-East)       ~200ms every request
With CDN:  User (Mumbai) ──→ CDN Edge (Singapore) ← HIT             ~15ms ✅
           CDN Edge (Singapore) ──MISS──→ Origin (US-East)          ~200ms (first time only)
Cache-Control headers for static vs dynamic vs private · HTTP
// STATIC content (images, CSS, JS, fonts)
Cache-Control: max-age=31536000, immutable
// CDN caches for 1 year. Use hash-based filenames for cache busting.
app.js → app.8f3d92ab.js   // Content hash in filename → new deploy = new URL = fresh CDN cache

// DYNAMIC content (API responses, partially personalised pages)
Cache-Control: max-age=60, stale-while-revalidate=30
// CDN caches for 60s. During next 30s after expiry, serve stale while fetching fresh.
// Eliminates miss latency for CDN — no thundering herd at origin.

// PRIVATE content (user-specific responses)
Cache-Control: private, no-store
// CDN does NOT cache. Request always reaches origin. Each user's data is unique.

// Vary header — separate cache per encoding/format
Vary: Accept-Encoding
// CDN stores separate versions for gzip and uncompressed.
INTERVIEW TIPS — CACHING
QUESTION → STRONG ANSWER
"How does cache-aside work?" → Application checks cache; on miss queries DB and populates with TTL. On write, deletes (not updates) the cache key. Delete is safer — avoids the stale-write race.
"What is thundering herd?" → Popular key expires → thousands of concurrent misses all query the DB simultaneously → overload. Prevention: mutex lock on miss, probabilistic early expiration, or background refresh.
"Write-through vs write-back?" → Write-through: strong consistency, adds write latency. Write-back: fast writes, risk of data loss if the cache crashes before flush. Use write-back for non-critical high-frequency counters (view counts).
"Redis data structure for rate limiting?" → Sorted Set with score=timestamp, member=requestId. Sliding window: ZRANGEBYSCORE over (now-window, now) + ZCARD = count in window. Atomic with a pipeline.
"How to handle cache invalidation?" → Delete on write is safest. TTL as a safety net. For complex dependencies, event-based invalidation via CDC + Kafka. Never set a stale value — always delete and let the next read fetch fresh.
01
Cache Strategy Selection — 6 Scenarios
~1 hr

For each: choose cache-aside / write-through / write-back / no-cache. Justify including the consistency trade-off you're accepting.

  1. E-commerce product page (price + inventory — overselling is bad)
  2. Social media post like count (500M likes/day)
  3. Bank account balance before authorising a debit
  4. User authentication session (JWT + server-side session)
  5. Trending hashtags computed every 5 minutes
  6. Real-time stock price feed (traders need latest price)

For each wrong choice: what bad outcome would it cause?

02
Thread-Safe LRU Cache with TTL
~2 hrs · code
API:
  int get(int key)                          → -1 if absent or expired
  void put(int key, int value, long ttlMs)  → evict LRU if at capacity

Requirements:
  - O(1) get and put (DoublyLinkedList + HashMap)
  - Thread-safe: concurrent get/put from multiple threads
  - TTL per entry: expired entries not returned, evicted lazily on access
  - Bonus: background sweeper thread evicts expired entries proactively

Test: 8 threads × 100K ops, verify:
  - Zero ConcurrentModificationException
  - Eviction order correct (LRU evicted first, not MRU)
  - TTL respected: entries expired after their TTL not returned
03
Cache Stampede Prevention Design
~1.5 hrs

A news website homepage is cached in Redis with TTL=60s. The homepage is computed from 50 DB queries taking 500ms. The site has 1M concurrent users.

Problem: Every 60 seconds, a stampede floods the DB with 50K simultaneous queries.

Design a solution that:

  1. Eliminates the stampede completely
  2. Keeps homepage latency <50ms for 99% of requests
  3. Handles Redis failure gracefully (fallback strategy)
  4. Explain the consistency trade-off you're making

Pick one of the three strategies (mutex, PER, background refresh) and justify why it's best for this scenario.

04
Distributed Rate Limiter — 3 Algorithms
~2 hrs · code

Implement a distributed rate limiter: 100 requests per 60-second window per user. Multi-server deployment with shared Redis.

Implement all three and compare:

A) Fixed Window Counter:
   INCR rate:{userId}:{minute}
   EXPIRE rate:{userId}:{minute} 60
   Simple but allows burst at window boundary:
   100 at 0:59 + 100 at 1:00 = 200 requests in 2 seconds

B) Sliding Window Log (Sorted Set):
   ZADD rate:{userId} {timestamp} {requestId}
   ZREMRANGEBYSCORE to remove old entries
   ZCARD for count in window
   Exact but O(requests) memory per user

C) Sliding Window Counter (hybrid):
   Count of previous window × (1 - elapsed/window) + current window count
   Approximate but O(1) memory

For each:
  - Show Redis commands
  - Time complexity per request
  - Memory per user
  - Accuracy at window boundary
  - When to use in production
MODULE B3 · CACHING · CHECKLIST
Know the latency numbers: RAM vs Redis vs DB — WHY cache works
Cache-aside: full read + write path, why delete is safer than update
Write-through vs write-back: trade-offs and when each is appropriate
LRU: DoublyLinkedList + HashMap implementation, O(1) all ops
LFU vs LRU: when LFU is better (avoids cache pollution from scans)
TTL jitter: why it prevents stampede and how to implement
Cache invalidation race condition: why delete is safe, update is not
Cache stampede: 3 prevention strategies (mutex, PER, background refresh)
All 5 Redis data structures + their use cases
Sliding window rate limiter using Redis Sorted Set
CDN: static vs dynamic caching, cache busting, stale-while-revalidate
✏️ Tasks 1–3: strategy selection, LRU with TTL, stampede design
✏️ Capstone: distributed rate limiter — all 3 algorithms implemented
// NEXT MODULE
B4 — Message Queues
Why async messaging · Kafka architecture + partitions + consumer groups
RabbitMQ vs Kafka · At-least-once vs exactly-once · Dead letter queues
Fan-out patterns · Event sourcing · CQRS · Backpressure