TRACK B · HLD · MODULE B3 · WEEK 13

CACHING

Cache-Aside · Write-Through · Write-Back · Stampede Prevention
LRU · LFU · TTL · Redis Structures · CDN · Rate Limiting
10 TOPICS · 4 TASKS · 5 REDIS STRUCTURES · MODULE B3
Latency ladder:
RAM      ~100 ns
Redis    ~0.5 ms
SSD DB   1–5 ms
HDD DB   10–50 ms
Redis is 10–100× faster than a DB query
In-process cache is 100,000× faster than disk
Cache = trading space for time
Cache Patterns
Four strategies — choose based on consistency needs and write frequency
Cache-Aside
MOST COMMON
Application controls all cache interactions. Miss → app fetches DB and populates cache. Write → app invalidates (deletes) cache key.
R1 cache.get(key) → HIT → return
R2 MISS → db.get(key)
R3 cache.set(key, val, TTL)
W1 db.update(key, val)
W2 cache.delete(key) ← safer than update
✓ Only caches requested data (no wasted memory)
✓ DB failures degrade gracefully
✗ First request after miss is slow
✗ Brief stale window between write and invalidation
Write-Through
STRONG CONSISTENCY
Every write goes to BOTH cache and DB synchronously. Client confirmed only after both succeed. Reads always find fresh data in cache.
W1 db.write(key, val)
W2 cache.set(key, val)
W3 confirm to client
R1 cache.get(key) → always fresh
✓ Read-after-write consistency guaranteed
✓ Cache always has latest data
✗ Adds latency to every write (two writes)
✗ Caches data that may never be re-read
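A minimal in-process sketch of the write-through flow above. `WriteThroughCache` and its map-backed "DB" are illustrative stand-ins, not a real client:

```java
import java.util.HashMap;
import java.util.Map;

// Write-through: every write hits cache AND backing store before the caller is ACKed.
class WriteThroughCache {
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> db    = new HashMap<>(); // stand-in for the real DB

    void put(String key, String val) {
        db.put(key, val);    // W1: durable write first
        cache.put(key, val); // W2: cache updated before we confirm
        // W3: only now would we ACK the client; both copies agree
    }

    String get(String key) {
        return cache.get(key); // reads always find fresh data after a write
    }
}
```

The write order (DB first, then cache) matters: if the cache write happened first and the DB write failed, readers would see data that was never persisted.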
Write-Back
HIGH-WRITE WORKLOADS
Write to cache immediately (fast ACK). DB updated asynchronously in background. Risk: data loss if cache crashes before flush.
W1 cache.set(key, val) → ACK client
W2 (background) db.write(key, val)
R1 cache.get(key) → always fresh
✓ Lowest write latency
✓ Batches DB writes (more efficient)
✗ Data loss if cache crashes before flush
✗ DB temporarily inconsistent with cache
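A write-back sketch under the same assumptions (maps standing in for cache and DB); the single-thread executor plays the role of the async flusher, and the data-loss window is exactly the gap before it runs:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Write-back: ACK after the cache write; a background thread flushes to the "DB" later.
class WriteBackCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    final Map<String, String> db = new ConcurrentHashMap<>(); // stand-in store, exposed for illustration
    private final ExecutorService flusher = Executors.newSingleThreadExecutor();

    void put(String key, String val) {
        cache.put(key, val);                               // fast path: caller is ACKed here
        flusher.submit(() -> db.put(key, cache.get(key))); // async flush; lost if the process dies first
    }

    String get(String key) { return cache.get(key); }      // cache is authoritative, reads are fresh

    void close() {                                         // drain pending flushes
        flusher.shutdown();
        try { flusher.awaitTermination(5, TimeUnit.SECONDS); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
```

A production write-back cache would batch flushes and persist a write-ahead log; this sketch only shows the ordering and the crash-loss window.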
Read-Through
SIMPLE APP CODE
Cache layer transparently fetches from DB on miss. Application talks only to cache — never to DB directly. Cache library handles miss logic.
R1 app.get(key) → asks cache
R2 cache HIT → return val
R3 MISS → cache fetches DB
R4 cache stores + returns val
✓ Simpler application code
✓ Cache manages all miss logic
✗ Cold start: all first reads slow
✗ Less control over what gets cached
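In-process, read-through is essentially what `ConcurrentHashMap.computeIfAbsent` gives you: the caller never sees the miss path. A sketch with a hypothetical `loader` standing in for the DB fetch:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Read-through: the cache owns the miss logic; callers only ever talk to the cache.
class ReadThroughCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> loader; // stand-in for the DB fetch on miss

    ReadThroughCache(Function<String, String> loader) { this.loader = loader; }

    String get(String key) {
        // On a miss, computeIfAbsent invokes the loader once and stores the result.
        return cache.computeIfAbsent(key, loader);
    }
}
```

Libraries like Guava/Caffeine provide this as a `LoadingCache` with TTL and size bounds; the point here is only the shape of the API.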
Cache-Aside in Java — the full pattern
class UserService {
    private final Cache  cache;
    private final UserDB db;

    public User getUser(long userId) {
        String key = "user:" + userId;

        User cached = cache.get(key);
        if (cached != null) return cached;   // ✅ CACHE HIT

        User user = db.findById(userId);        // DB query on miss
        if (user != null)                       // avoid caching "not found" (or cache a sentinel with short TTL)
            cache.set(key, user, Duration.ofMinutes(30)); // Populate with TTL
        return user;                            // CACHE MISS
    }

    public void updateUser(long userId, UserUpdate update) {
        db.update(userId, update);
        cache.delete("user:" + userId); // ← DELETE safer than SET (avoids stale-write race)
    }
}
Hit rate target: Cache is only worthwhile if hit rate exceeds ~80%. Monitor your cache hit rate constantly. If it's low: cache size too small, TTL too short, or you're caching long-tail data that's rarely re-read.
Eviction Policies
When the cache is full — which entry gets removed?
LRU
Least Recently Used. Evict entry not accessed for longest time. Recently accessed = likely to be re-accessed.
Implementation: DoublyLinkedList + HashMap
O(1) get, O(1) put, O(1) evict
Redis allkeys-lru (note: the Redis default is noeviction; set maxmemory-policy to enable LRU)
LFU
Least Frequently Used. Evict entry with fewest total accesses. Avoids cache pollution from one-time large scans.
Implementation: HashMap + per-frequency buckets (doubly linked lists)
O(1) get/put with buckets; a min-heap variant is O(log n)
Redis allkeys-lfu (since Redis 4.0)
TTL
Time-To-Live. Each entry has an expiry timestamp. After TTL, entry is invalid and re-fetched on next access.
cache.set(key, val, 3600)
Use TTL jitter to avoid stampede:
TTL = base + random(0, base*0.1)
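One way to get O(1) LFU is the bucket scheme: a value map, a frequency map, and one LinkedHashSet per frequency (on a tie, the oldest insertion at the lowest frequency is evicted). A single-threaded, illustrative sketch:

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;

// Minimal O(1) LFU: evict the key with the fewest accesses; LinkedHashSet keeps
// insertion order so ties evict the least-recently-added key at minFreq.
class LFUCache {
    private final int capacity;
    private final Map<Integer, Integer> vals  = new HashMap<>();
    private final Map<Integer, Integer> freqs = new HashMap<>();
    private final Map<Integer, LinkedHashSet<Integer>> buckets = new HashMap<>();
    private int minFreq = 0;

    LFUCache(int capacity) { this.capacity = capacity; }

    int get(int key) {
        if (!vals.containsKey(key)) return -1;
        touch(key);
        return vals.get(key);
    }

    void put(int key, int val) {
        if (capacity == 0) return;
        if (vals.containsKey(key)) { vals.put(key, val); touch(key); return; }
        if (vals.size() == capacity) {                 // evict least-frequent key
            int evict = buckets.get(minFreq).iterator().next();
            buckets.get(minFreq).remove(evict);
            vals.remove(evict); freqs.remove(evict);
        }
        vals.put(key, val); freqs.put(key, 1); minFreq = 1;
        buckets.computeIfAbsent(1, f -> new LinkedHashSet<>()).add(key);
    }

    private void touch(int key) {                      // promote key to the next bucket
        int f = freqs.get(key);
        buckets.get(f).remove(key);
        if (f == minFreq && buckets.get(f).isEmpty()) minFreq++;
        freqs.put(key, f + 1);
        buckets.computeIfAbsent(f + 1, x -> new LinkedHashSet<>()).add(key);
    }
}
```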
LRU Cache — DoublyLinkedList + HashMap (O(1) all ops) · JAVA
class LRUCache {
    private static final class Node {
        final int key; int val; Node prev, next;
        Node(int key, int val) { this.key = key; this.val = val; }
    }

    private final int capacity;
    private final Map<Integer, Node> map = new HashMap<>(); // generics need Integer, not int
    private final Node head = new Node(0, 0), tail = new Node(0, 0); // Sentinel nodes
    // One lock for both paths: get() also reorders the list, so even "reads" mutate
    // shared state and a read/write lock would be unsound here.
    private final ReentrantLock lock = new ReentrantLock();

    LRUCache(int capacity) {
        this.capacity = capacity;
        head.next = tail; tail.prev = head;
    }

    public int get(int key) {
        lock.lock();
        try {
            Node n = map.get(key);
            if (n == null) return -1;
            moveToFront(n);  // Most-recently-used → head
            return n.val;
        } finally { lock.unlock(); }
    }

    public void put(int key, int val) {
        lock.lock();
        try {
            Node n = map.get(key);
            if (n != null) { n.val = val; moveToFront(n); }
            else {
                n = new Node(key, val);
                map.put(key, n); addToFront(n);
                if (map.size() > capacity) evictTail(); // Remove LRU
            }
        } finally { lock.unlock(); }
    }

    private void moveToFront(Node n) { unlink(n); addToFront(n); }
    private void addToFront(Node n)  { n.next = head.next; n.prev = head; head.next.prev = n; head.next = n; }
    private void unlink(Node n)      { n.prev.next = n.next; n.next.prev = n.prev; }
    private void evictTail()         { Node lru = tail.prev; unlink(lru); map.remove(lru.key); }
}
TTL Jitter pattern: If many cache keys share the same TTL, they all expire simultaneously → thundering herd. Always add random jitter: TTL = base_ttl + random(0, base_ttl * 0.1). This spreads expiry events across time and prevents concurrent stampedes.
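The jitter rule above as a tiny helper, assuming TTLs in seconds:

```java
import java.util.concurrent.ThreadLocalRandom;

// TTL jitter: base TTL plus up to 10% random spread, so keys populated together
// don't all expire in the same instant (the stampede trigger).
final class Ttl {
    static long withJitter(long baseSeconds) {
        long jitter = ThreadLocalRandom.current().nextLong(0, Math.max(1, baseSeconds / 10));
        return baseSeconds + jitter;
    }
}
```

Usage: `cache.set(key, val, Ttl.withJitter(3600))` yields TTLs uniformly spread over [3600, 3960) seconds.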
Cache Invalidation
"Only two hard things in CS: cache invalidation and naming things" — Phil Karlton
Three invalidation strategies compared · PATTERNS
Strategy 1: Delete on Write (safest, most common)
void updateProduct(id, product) {
    db.update(id, product);
    cache.delete("product:" + id);                          // ← DELETE, not SET
    cache.delete("products:category:" + product.categoryId); // ← invalidate list too
}
// Next read misses → fetches fresh from DB → repopulates
// Safe: avoids stale-write race condition

Strategy 2: Version-Based Keys (no explicit invalidation)
// Embed version in key. When data changes, bump version.
"product:42:v8" → "product:42:v9"   // old key naturally expires via TTL
// Application always reads latest version key.
// Great for: config data, feature flags, rarely-changing reference data

Strategy 3: Event-Based (CDC + Kafka)
// DB → Change Data Capture → Kafka → Cache Invalidation Service → Redis.delete(key)
DB_writes → Debezium/CDC → Kafka.publish(change_event) → CacheConsumer → cache.delete(key)
// Decoupled, real-time, works for distributed caches
// Cost: more infrastructure, event delivery latency, ordering guarantees needed
The Invalidation Race Condition — and how to prevent it · CONCURRENT
// Timeline of a nasty race:
Thread A: UPDATE product WHERE id=42        // starts write
Thread B: SELECT product WHERE id=42 → MISS   // reads old DB value (before A commits)
Thread A: cache.delete("product:42")         // invalidates
Thread B: cache.set("product:42", stale_val) // writes OLD value back!
// Result: cache holds stale data indefinitely until TTL expires 😱

Prevention:
  1. Use TTL as a safety net — even if stale, it expires eventually
  2. Use CAS (Compare-And-Set) — only write to cache if key still absent
     cache.setnx("product:42", freshVal)  ← only sets if key doesn't exist
  3. Use write lock during update:
     db.beginTransaction();
     db.update(id, product);
     cache.delete(key);        // ← delete INSIDE transaction before commit
     db.commit();
Never "update" cache on write — always delete. If you SET the cache on write and also SET it on read-miss, two concurrent operations can race to write stale vs fresh values. Deleting is atomic and safe — the next read will always get fresh data from the DB.
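Prevention step 2 (CAS) can be mimicked in-process with `ConcurrentHashMap.putIfAbsent`, which mirrors SETNX semantics; a sketch, not a Redis client:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// CAS-style repopulation: only write if the key is still absent, so a slow reader
// holding a stale value can't clobber a fresher entry written after an invalidation.
class CasCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    boolean setIfAbsent(String key, String val) {
        return cache.putIfAbsent(key, val) == null; // true only if we won the race
    }

    String get(String key)    { return cache.get(key); }
    void delete(String key)   { cache.remove(key); }
}
```

This narrows the race window rather than eliminating it entirely, which is why the TTL safety net in step 1 still matters.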
Cache Stampede (Thundering Herd)
When a popular cache key expires — 10,000 concurrent requests all hit the DB
// STAMPEDE VISUALISED — popular key expires at T=0
// DB load: OVERLOAD · Latency: SPIKING · Cache hit rate: RESTORING (once repopulated)
⚡ All concurrent misses pile onto DB simultaneously → overload → latency cascade
Three stampede prevention strategies · PATTERNS
Strategy 1: Mutex/Lock on Miss (most reliable)
String get(String key) {
    String val = cache.get(key);
    if (val != null) return val;                 // HIT — fast path

    // MISS — exactly ONE thread refreshes; others wait
    boolean locked = cache.setnx("lock:" + key, "1", 10); // TTL in seconds: lock auto-expires if the holder dies
    if (locked) {
        try {
            val = db.get(key);
            cache.set(key, val, TTL);
            return val;
        } finally { cache.delete("lock:"+key); }
    } else {
        Thread.sleep(50); return get(key); // Retry — lock holder will populate cache
    }
}

Strategy 2: Probabilistic Early Expiration (PER)
// Before TTL expires, probabilistically start refreshing
if (ttlRemaining < -Math.log(Math.random()) * recomputeTime) {
    refresh(); // One request triggers early refresh; others get (slightly stale) cached value
}

Strategy 3: Background Refresh (keep popular keys always warm)
// Scheduled job: refresh popular keys 30s before TTL expires
// Requires: popularity tracking (hit count per key)
scheduler.scheduleAtFixedRate(() -> hotKeys.forEach(k -> cache.refresh(k)), 30, 30, SECONDS);
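Strategy 2's decision rule, runnable with β = 1 (the XFetch formulation): the refresh probability rises as the TTL runs out, scaled by how long a recompute takes:

```java
import java.util.concurrent.ThreadLocalRandom;

// Probabilistic early expiration: each request rolls a die; the closer the key is
// to expiry (relative to recompute cost), the likelier this request triggers a refresh.
final class Per {
    static boolean shouldRefresh(long ttlRemainingMs, long recomputeMs) {
        double r = ThreadLocalRandom.current().nextDouble();  // [0, 1)
        return ttlRemainingMs < -Math.log(r) * recomputeMs;   // beta = 1
    }
}
```

An already-expired key (ttlRemaining ≤ 0) always refreshes; a key far from expiry almost never does, so at most a handful of requests recompute early instead of thousands stampeding at T=0.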
CDN stale-while-revalidate: Modern CDNs support Cache-Control: stale-while-revalidate=30 — serve stale content immediately while refreshing in background. This eliminates cache misses for CDN-served content entirely. The same principle applies to application-level caches.
Redis Data Structures
Five structures — each optimised for a specific access pattern
🔑
String
KEY → VALUE
SET user:42:name "Ajay"
GET user:42:name
INCR view_count:post:123
SETEX session:abc 3600 data
USE: Counters, flags, sessions, rate limit tokens, feature flags
📦
Hash
KEY → {FIELD: VALUE}
HSET user:42 name "Ajay" tier "pro"
HGET user:42 name
HMGET user:42 name tier email
HINCRBY user:42 login_count 1
USE: User profiles, config objects, shopping carts — fetch individual fields
📋
List
ORDERED · DUPLICATES OK
RPUSH queue:email email1 email2
LPOP queue:email
LRANGE feed:user:42 0 49
LLEN queue:email
USE: Job queues (FIFO), activity feeds, recent items, notifications
🎯
Set
UNIQUE MEMBERS
SADD active_users 42 88 123
SISMEMBER active_users 42
SUNION tags:post:1 tags:post:2
SINTER followers:A followers:B
USE: Tagging, unique visitor tracking, mutual friends, union/intersection
🏆
Sorted Set
SCORE-RANKED MEMBERS
ZADD leaderboard 9800 "alice"
ZRANGE leaderboard 0 9 REV WITHSCORES
ZADD rate:user:42 {ts} {reqId}
ZRANGEBYSCORE rate:user:42 (now-60) now
USE: Leaderboards, sliding window rate limiting, priority queues, geo proximity
Sliding Window Rate Limiter — Redis Sorted Set · REDIS
// Per-user: 100 requests per 60-second window
// ZADD rate_limit:{userId} {timestamp_ms} {unique_request_id}
// ZRANGEBYSCORE rate_limit:{userId} (now-60000) now → requests in last 60s

boolean isAllowed(String userId, int limit, long windowMs) {
    long   now        = System.currentTimeMillis();
    long   windowStart = now - windowMs;
    String key        = "rate:" + userId;

    // (pseudocode: the pipeline API shown is illustrative; real Redis clients differ)
    return redis.pipeline(pipe -> {
        pipe.zremrangebyscore(key, 0, windowStart);          // Remove expired entries
        pipe.zadd(key, now, UUID.randomUUID().toString()); // Add this request
        pipe.zcard(key);                                       // Count in window
        pipe.expire(key, windowMs / 1000);                    // Auto-expire key
    }).getCard() <= limit;
}
// ✅ Exact sliding window · ✅ Multi-server safe (shared Redis)
// Memory: O(requests in window) per user — scale with caution for large limits
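The same sliding-window-log idea without Redis: a single-node, in-memory sketch keeping a per-user deque of timestamps (the Sorted Set plays this role in the distributed version above):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sliding window log, single node: keep every request timestamp inside the window,
// drop the expired ones, allow the request if the remaining count is under the limit.
class SlidingWindowLimiter {
    private final int limit;
    private final long windowMs;
    private final Map<String, Deque<Long>> log = new ConcurrentHashMap<>();

    SlidingWindowLimiter(int limit, long windowMs) {
        this.limit = limit;
        this.windowMs = windowMs;
    }

    synchronized boolean isAllowed(String userId, long nowMs) {
        Deque<Long> ts = log.computeIfAbsent(userId, u -> new ArrayDeque<>());
        while (!ts.isEmpty() && ts.peekFirst() <= nowMs - windowMs) ts.pollFirst(); // expire old
        if (ts.size() >= limit) return false;  // window full → reject
        ts.addLast(nowMs);                     // record this request
        return true;
    }
}
```

Like the Redis version it is exact but O(requests-in-window) memory per user; unlike it, state lives in one process, so it cannot coordinate across servers.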
CDN — Content Delivery Network
Geographic caching at the edge — serving content from nearest node
// WITHOUT CDN vs WITH CDN
No CDN:    User (Mumbai) ──────────────────→ Origin (US-East)       ~200ms every request
With CDN:  User (Mumbai) ──→ CDN Edge (Singapore) ← HIT             ~15ms ✅
           CDN Edge (Singapore) ──MISS──→ Origin (US-East)          ~200ms (first time only)
Cache-Control headers for static vs dynamic vs private · HTTP
// STATIC content (images, CSS, JS, fonts)
Cache-Control: max-age=31536000, immutable
// CDN caches for 1 year. Use hash-based filenames for cache busting.
app.js → app.8f3d92ab.js   // Content hash in filename → new deploy = new URL = fresh CDN cache

// DYNAMIC content (API responses, partially personalised pages)
Cache-Control: max-age=60, stale-while-revalidate=30
// CDN caches for 60s. During next 30s after expiry, serve stale while fetching fresh.
// Eliminates miss latency for CDN — no thundering herd at origin.

// PRIVATE content (user-specific responses)
Cache-Control: private, no-store
// CDN does NOT cache. Request always reaches origin. Each user's data is unique.

// Vary header — separate cache per encoding/format
Vary: Accept-Encoding
// CDN stores separate versions for gzip and uncompressed.
INTERVIEW TIPS — CACHING
QUESTION → STRONG ANSWER
"How does cache-aside work?" → Application checks cache; on miss queries DB and populates with TTL. On write, deletes (not updates) the cache key. Delete is safer — avoids the stale-write race.
"What is thundering herd?" → Popular key expires → thousands of concurrent misses all query the DB simultaneously → overload. Prevention: mutex lock on miss, probabilistic early expiration, or background refresh.
"Write-through vs write-back?" → Write-through: strong consistency, adds write latency. Write-back: fast writes, risk of data loss if the cache crashes before flush. Use write-back for non-critical high-frequency counters (view counts).
"Redis data structure for rate limiting?" → Sorted Set with score=timestamp, member=requestId. Sliding window: ZRANGEBYSCORE over (now-window, now) + ZCARD = count in window. Atomic with a pipeline.
"How to handle cache invalidation?" → Delete on write is safest. TTL as a safety net. For complex dependencies, event-based invalidation via CDC + Kafka. Never set a stale value — always delete and let the next read fetch fresh.
01
Cache Strategy Selection — 6 Scenarios
~1 hr

For each: choose cache-aside / write-through / write-back / no-cache. Justify including the consistency trade-off you're accepting.

  1. E-commerce product page (price + inventory — overselling is bad)
  2. Social media post like count (500M likes/day)
  3. Bank account balance before authorising a debit
  4. User authentication session (JWT + server-side session)
  5. Trending hashtags computed every 5 minutes
  6. Real-time stock price feed (traders need latest price)

For each wrong choice: what bad outcome would it cause?

02
Thread-Safe LRU Cache with TTL
~2 hrs · code
API:
  int get(int key)                          → -1 if absent or expired
  void put(int key, int value, long ttlMs)  → evict LRU if at capacity

Requirements:
  - O(1) get and put (DoublyLinkedList + HashMap)
  - Thread-safe: concurrent get/put from multiple threads
  - TTL per entry: expired entries not returned, evicted lazily on access
  - Bonus: background sweeper thread evicts expired entries proactively

Test: 8 threads × 100K ops, verify:
  - Zero ConcurrentModificationException
  - Eviction order correct (LRU evicted first, not MRU)
  - TTL respected: entries expired after their TTL not returned
03
Cache Stampede Prevention Design
~1.5 hrs

A news website homepage is cached in Redis with TTL=60s. The homepage is computed from 50 DB queries taking 500ms. The site has 1M concurrent users.

Problem: Every 60 seconds, a stampede floods the DB with 50K simultaneous queries.

Design a solution that:

  1. Eliminates the stampede completely
  2. Keeps homepage latency <50ms for 99% of requests
  3. Handles Redis failure gracefully (fallback strategy)
  4. Explain the consistency trade-off you're making

Pick one of the three strategies (mutex, PER, background refresh) and justify why it's best for this scenario.

04
Distributed Rate Limiter — 3 Algorithms
~2 hrs · code

Implement a distributed rate limiter: 100 requests per 60-second window per user. Multi-server deployment with shared Redis.

Implement all three and compare:

A) Fixed Window Counter:
   INCR rate:{userId}:{minute}
   EXPIRE rate:{userId}:{minute} 60
   Simple but allows burst at window boundary:
   100 at 0:59 + 100 at 1:00 = 200 requests in 2 seconds

B) Sliding Window Log (Sorted Set):
   ZADD rate:{userId} {timestamp} {requestId}
   ZREMRANGEBYSCORE to remove old entries
   ZCARD for count in window
   Exact but O(requests) memory per user

C) Sliding Window Counter (hybrid):
   Count of previous window × (1 - elapsed/window) + current window count
   Approximate but O(1) memory

For each:
  - Show Redis commands
  - Time complexity per request
  - Memory per user
  - Accuracy at window boundary
  - When to use in production
MODULE B3 · CACHING · CHECKLIST
Know the latency numbers: RAM vs Redis vs DB — WHY cache works
Cache-aside: full read + write path, why delete is safer than update
Write-through vs write-back: trade-offs and when each is appropriate
LRU: DoublyLinkedList + HashMap implementation, O(1) all ops
LFU vs LRU: when LFU is better (avoids cache pollution from scans)
TTL jitter: why it prevents stampede and how to implement
Cache invalidation race condition: why delete is safe, update is not
Cache stampede: 3 prevention strategies (mutex, PER, background refresh)
All 5 Redis data structures + their use cases
Sliding window rate limiter using Redis Sorted Set
CDN: static vs dynamic caching, cache busting, stale-while-revalidate
✏️ Tasks 1–3: strategy selection, LRU with TTL, stampede design
✏️ Capstone: distributed rate limiter — all 3 algorithms implemented
// NEXT MODULE
B4 — Message Queues
Why async messaging · Kafka architecture + partitions + consumer groups
RabbitMQ vs Kafka · At-least-once vs exactly-once · Dead letter queues
Fan-out patterns · Event sourcing · CQRS · Backpressure