Cache Patterns
Four strategies — choose based on consistency needs and write frequency
Cache-Aside
MOST COMMON
Application controls all cache interactions. Miss → app fetches DB and populates cache. Write → app invalidates (deletes) cache key.
R1 cache.get(key) → HIT → return
R2 MISS → db.get(key)
R3 cache.set(key, val, TTL)
W1 db.update(key, val)
W2 cache.delete(key) ← safer than update
✓ Only caches requested data (no wasted memory)
✓ DB failures degrade gracefully
✗ First request after miss is slow
✗ Brief stale window between write and invalidation
Write-Through
STRONG CONSISTENCY
Every write goes to BOTH cache and DB synchronously. Client confirmed only after both succeed. Reads always find fresh data in cache.
W1 db.write(key, val)
W2 cache.set(key, val)
W3 confirm to client
R1 cache.get(key) → always fresh
✓ Read-after-write consistency guaranteed
✓ Cache always has latest data
✗ Adds latency to every write (two writes)
✗ Caches data that may never be re-read
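Write-Through sketch [JAVA]
A hedged sketch of the write path above; Cache and Database are assumed interfaces (get/set and write), not a specific library.
class WriteThroughStore {
    private final Cache cache;        // assumed interface: get(key), set(key, val)
    private final Database db;        // assumed interface: write(key, val)

    WriteThroughStore(Cache cache, Database db) { this.cache = cache; this.db = db; }

    public void save(String key, String val) {
        db.write(key, val);           // W1: durable write first
        cache.set(key, val);          // W2: then cache, so reads from here on are fresh
        // W3: only now return to the caller (both writes succeeded)
    }

    public String read(String key) {
        return cache.get(key);        // R1: always fresh if all writes go through save()
    }
}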
Write-Back
HIGH-WRITE WORKLOADS
Write to cache immediately (fast ACK). DB updated asynchronously in background. Risk: data loss if cache crashes before flush.
W1 cache.set(key, val) → ACK client
W2 (background) db.write(key, val)
R1 cache.get(key) → always fresh
✓ Lowest write latency
✓ Batches DB writes (more efficient)
✗ Data loss if cache crashes before flush
✗ DB temporarily inconsistent with cache
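Write-Back sketch [JAVA]
A hedged sketch of the async flush described above; the Cache/Database interfaces and the batch size of 100 are illustrative assumptions.
class WriteBackStore {
    private final Cache cache;                    // assumed interface: get(key), set(key, val)
    private final Database db;                    // assumed interface: write(key, val)
    private final BlockingQueue<String> dirtyKeys = new LinkedBlockingQueue<>();

    WriteBackStore(Cache cache, Database db) { this.cache = cache; this.db = db; }

    public void save(String key, String val) {
        cache.set(key, val);                      // W1: fast path, client is ACKed immediately
        dirtyKeys.offer(key);                     // mark this key as needing a DB flush
    }

    // W2: run periodically in the background (e.g. from a ScheduledExecutorService)
    void flushOnce() {
        List<String> batch = new ArrayList<>();
        dirtyKeys.drainTo(batch, 100);            // batch up to 100 keys per flush
        for (String key : batch) {
            db.write(key, cache.get(key));        // if the cache crashes before this runs, these writes are lost
        }
    }
}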
Read-Through
SIMPLE APP CODE
Cache layer transparently fetches from DB on miss. Application talks only to cache — never to DB directly. Cache library handles miss logic.
R1 app.get(key) → asks cache
R2 cache HIT → return val
R3 MISS → cache fetches DB
R4 cache stores + returns val
✓ Simpler application code
✓ Cache manages all miss logic
✗ Cold start: all first reads slow
✗ Less control over what gets cached
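Read-Through sketch [JAVA]
A hedged illustration of the loader-callback idea (LoadingCache implementations such as Guava or Caffeine behave this way); the class below is an assumed toy wrapper, not a real library.
class ReadThroughCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader;          // miss logic lives in the cache, e.g. id -> db.findById(id)

    ReadThroughCache(Function<K, V> loader) { this.loader = loader; }

    public V get(K key) {
        // On a miss, the cache itself fetches from the DB and stores the value (R3 + R4 above)
        return store.computeIfAbsent(key, loader);
    }
}

// Application code never talks to the DB directly (R1/R2):
ReadThroughCache<Long, User> users = new ReadThroughCache<>(id -> db.findById(id));
User u = users.get(42L);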
Cache-Aside in Java — the full pattern [JAVA]
class UserService {
    private final Cache cache;
    private final UserDB db;

    public User getUser(long userId) {
        String key = "user:" + userId;
        User cached = cache.get(key);
        if (cached != null) return cached;              // ✅ CACHE HIT
        User user = db.findById(userId);                // CACHE MISS → DB query
        cache.set(key, user, Duration.ofMinutes(30));   // Populate with TTL
        return user;
    }

    public void updateUser(long userId, UserUpdate update) {
        db.update(userId, update);
        cache.delete("user:" + userId);                 // ← DELETE safer than SET (avoids stale-write race)
    }
}
Hit rate target: Cache is only worthwhile if hit rate exceeds ~80%. Monitor your cache hit rate constantly. If it's low: cache size too small, TTL too short, or you're caching long-tail data that's rarely re-read.
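Hit-rate instrumentation sketch [JAVA]
A minimal way to measure the hit rate the note above says to monitor; the wrapper and counter names are illustrative (in practice you would export these to your metrics system).
class InstrumentedCache {
    private final Cache cache;                          // assumed underlying cache interface
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong misses = new AtomicLong();

    InstrumentedCache(Cache cache) { this.cache = cache; }

    public Object get(String key) {
        Object val = cache.get(key);
        if (val != null) hits.incrementAndGet(); else misses.incrementAndGet();
        return val;
    }

    public double hitRate() {                           // target per the note above: > ~0.8
        long h = hits.get(), total = h + misses.get();
        return total == 0 ? 0.0 : (double) h / total;
    }
}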
Eviction Policies
When the cache is full — which entry gets removed?
LRU
Least Recently Used. Evict entry not accessed for longest time. Recently accessed = likely to be re-accessed.
Implementation: DoublyLinkedList + HashMap
O(1) get, O(1) put, O(1) evict
Redis: allkeys-lru eviction policy (note: Redis's default maxmemory-policy is noeviction)
LFU
Least Frequently Used. Evict entry with fewest total accesses. Avoids cache pollution from one-time large scans.
Implementation: HashMap + frequency-bucket lists
O(1) get, put, and evict with frequency buckets (a min-heap variant is O(log n))
Redis allkeys-lfu (since Redis 4.0)
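LFU Cache — frequency buckets sketch [JAVA]
A hedged sketch of the O(1) frequency-bucket approach (HashMap of key → entry, plus freq → insertion-ordered key set); names and structure are illustrative, not from the original text.
class LFUCache {
    private static final class Entry { int val; int freq = 1; Entry(int val) { this.val = val; } }

    private final int capacity;
    private final Map<Integer, Entry> entries = new HashMap<>();
    private final Map<Integer, LinkedHashSet<Integer>> buckets = new HashMap<>();   // freq → keys, oldest first
    private int minFreq = 0;

    LFUCache(int capacity) { this.capacity = capacity; }

    public int get(int key) {
        Entry e = entries.get(key);
        if (e == null) return -1;
        touch(key, e);
        return e.val;
    }

    public void put(int key, int val) {
        if (capacity == 0) return;
        Entry e = entries.get(key);
        if (e != null) { e.val = val; touch(key, e); return; }
        if (entries.size() == capacity) {
            Integer victim = buckets.get(minFreq).iterator().next();   // least frequent, ties → oldest
            buckets.get(minFreq).remove(victim);
            entries.remove(victim);
        }
        entries.put(key, new Entry(val));
        buckets.computeIfAbsent(1, f -> new LinkedHashSet<>()).add(key);
        minFreq = 1;
    }

    private void touch(int key, Entry e) {               // promote key to the next frequency bucket
        LinkedHashSet<Integer> bucket = buckets.get(e.freq);
        bucket.remove(key);
        if (bucket.isEmpty() && e.freq == minFreq) minFreq++;
        e.freq++;
        buckets.computeIfAbsent(e.freq, f -> new LinkedHashSet<>()).add(key);
    }
}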
TTL
Time-To-Live. Each entry has an expiry timestamp. After TTL, entry is invalid and re-fetched on next access.
cache.set(key, val, 3600)
Use TTL jitter to avoid stampede:
TTL = base + random(0, base*0.1)
LRU Cache — DoublyLinkedList + HashMap (O(1) all ops) [JAVA]
class LRUCache {
    private static final class Node {
        int key, val;
        Node prev, next;
        Node(int key, int val) { this.key = key; this.val = val; }
    }

    private final int capacity;
    private final Map<Integer, Node> map = new HashMap<>();
    private final Node head = new Node(0, 0), tail = new Node(0, 0);   // Sentinel nodes
    // get() reorders the list, so it needs exclusive locking; a read lock would
    // allow concurrent mutation of the linked list.
    private final ReentrantLock lock = new ReentrantLock();

    LRUCache(int capacity) {
        this.capacity = capacity;
        head.next = tail;
        tail.prev = head;
    }

    public int get(int key) {
        lock.lock();
        try {
            Node n = map.get(key);
            if (n == null) return -1;
            moveToFront(n);                    // Most-recently-used → head
            return n.val;
        } finally { lock.unlock(); }
    }

    public void put(int key, int val) {
        lock.lock();
        try {
            Node n = map.get(key);
            if (n != null) {
                n.val = val;
                moveToFront(n);
            } else {
                n = new Node(key, val);
                map.put(key, n);
                addToFront(n);
                if (map.size() > capacity) evictTail();   // Remove LRU
            }
        } finally { lock.unlock(); }
    }

    private void addToFront(Node n) {
        n.prev = head; n.next = head.next;
        head.next.prev = n; head.next = n;
    }

    private void unlink(Node n) { n.prev.next = n.next; n.next.prev = n.prev; }

    private void moveToFront(Node n) { unlink(n); addToFront(n); }

    private void evictTail() {
        Node lru = tail.prev;
        unlink(lru);
        map.remove(lru.key);
    }
}
TTL Jitter pattern: If many cache keys share the same TTL, they all expire simultaneously → thundering herd. Always add random jitter: TTL = base_ttl + random(0, base_ttl * 0.1). This spreads expiry events across time and prevents concurrent stampedes.
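TTL jitter helper sketch [JAVA]
A tiny helper implementing the formula above; the function name is illustrative.
long ttlWithJitter(long baseTtlSeconds) {
    // TTL = base + random(0, base * 0.1): spreads expiry so co-created keys don't expire together
    long maxJitter = (long) (baseTtlSeconds * 0.1);
    return baseTtlSeconds + ThreadLocalRandom.current().nextLong(0, maxJitter + 1);
}

// Usage:
cache.set("product:42", value, ttlWithJitter(3600));   // expires somewhere in [3600, 3960]s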
Cache Invalidation
"Only two hard things in CS: cache invalidation and naming things" — Phil Karlton
Three invalidation strategies compared [PATTERNS]
// Strategy 1: Delete on Write (safest, most common)
void updateProduct(id, product) {
    db.update(id, product);
    cache.delete("product:" + id);                            // ← DELETE, not SET
    cache.delete("products:category:" + product.categoryId);  // ← invalidate list too
}
// Next read misses → fetches fresh from DB → repopulates
// Safe: avoids stale-write race condition

// Strategy 2: Version-Based Keys (no explicit invalidation)
// Embed version in key. When data changes, bump version.
"product:42:v8" → "product:42:v9"   // old key naturally expires via TTL
// Application always reads latest version key.
// Great for: config data, feature flags, rarely-changing reference data

// Strategy 3: Event-Based (CDC + Kafka)
// DB → Change Data Capture → Kafka → Cache Invalidation Service → Redis.delete(key)
DB_writes → Debezium/CDC → Kafka.publish(change_event) → CacheConsumer → cache.delete(key)
// Decoupled, real-time, works for distributed caches
// Cost: more infrastructure, event delivery latency, ordering guarantees needed
The Invalidation Race Condition — and how to prevent it [CONCURRENT]
// Timeline of a nasty race:
Thread A: UPDATE product WHERE id=42             // starts write
Thread B: SELECT product WHERE id=42 → MISS      // reads old DB value (before A commits)
Thread A: cache.delete("product:42")             // invalidates
Thread B: cache.set("product:42", stale_val)     // writes OLD value back!
// Result: cache holds stale data indefinitely until TTL expires 😱

Prevention:
1. Use TTL as a safety net — even if stale, it expires eventually
2. Use CAS (Compare-And-Set) — only write to cache if key still absent
   cache.setnx("product:42", freshVal)   // ← only sets if key doesn't exist
3. Use a write lock during update:
   db.beginTransaction();
   db.update(id, product);
   cache.delete(key);        // ← delete INSIDE transaction before commit
   db.commit();
Never "update" cache on write — always delete. If you SET the cache on write and also SET it on read-miss, two concurrent operations can race to write stale vs fresh values. Deleting is atomic and safe — the next read will always get fresh data from the DB.
Cache Stampede (Thundering Herd)
When a popular cache key expires — 10,000 concurrent requests all hit the DB
// STAMPEDE VISUALISED — popular key expires at T=0
DB load: OVERLOAD · Latency: SPIKING · Cache hit rate: RESTORING
⚡ All concurrent misses pile onto DB simultaneously → overload → latency cascade
Three stampede prevention strategies [PATTERNS]
// Strategy 1: Mutex/Lock on Miss (most reliable)
String get(String key) {
    String val = cache.get(key);
    if (val != null) return val;                     // HIT — fast path
    // MISS — exactly ONE thread refreshes; others wait
    boolean locked = cache.setnx("lock:" + key, "1", 10_seconds);
    if (locked) {
        try {
            val = db.get(key);
            cache.set(key, val, TTL);
            return val;
        } finally {
            cache.delete("lock:" + key);
        }
    } else {
        Thread.sleep(50);
        return get(key);                             // Retry — lock holder will populate cache
    }
}

// Strategy 2: Probabilistic Early Expiration (PER)
// Before TTL expires, probabilistically start refreshing
if (ttlRemaining < -Math.log(random()) * recomputeTime) {
    refresh();   // One request triggers early refresh; others get the (slightly stale) cached value
}

// Strategy 3: Background Refresh (keep popular keys always warm)
// Scheduled job: refresh popular keys 30s before TTL expires
// Requires: popularity tracking (hit count per key)
scheduler.scheduleAtFixedRate(() -> hotKeys.forEach(k -> cache.refresh(k)), 30, 30, SECONDS);
CDN stale-while-revalidate: Modern CDNs support Cache-Control: stale-while-revalidate=30 — serve stale content immediately while refreshing in background. This eliminates cache misses for CDN-served content entirely. The same principle applies to application-level caches.
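Stale-while-revalidate at the application layer — sketch [JAVA]
A hedged sketch of the same principle applied to an in-process cache: serve whatever is cached immediately, and if it is past its logical TTL, let one background task refresh it. The class, Entry layout, and pool size are assumptions for illustration.
class SwrCache {
    private static final class Entry {
        final String val; final long freshUntilMs;
        Entry(String val, long freshUntilMs) { this.val = val; this.freshUntilMs = freshUntilMs; }
    }

    private final Map<String, Entry> store = new ConcurrentHashMap<>();
    private final Set<String> refreshing = ConcurrentHashMap.newKeySet();   // one in-flight refresh per key
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    String get(String key, Supplier<String> loader, long ttlMs) {
        Entry e = store.get(key);
        if (e == null) {                                                    // cold miss: load synchronously once
            String val = loader.get();
            store.put(key, new Entry(val, System.currentTimeMillis() + ttlMs));
            return val;
        }
        if (System.currentTimeMillis() > e.freshUntilMs && refreshing.add(key)) {
            pool.submit(() -> {                                             // stale: refresh in the background
                try { store.put(key, new Entry(loader.get(), System.currentTimeMillis() + ttlMs)); }
                finally { refreshing.remove(key); }
            });
        }
        return e.val;                                                       // answer immediately, fresh or stale
    }
}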
Redis Data Structures
Five structures — each optimised for a specific access pattern
String
KEY → VALUE
SET user:42:name "Ajay"
GET user:42:name
INCR view_count:post:123
SETEX session:abc 3600 data
USE: Counters, flags, sessions, rate limit tokens, feature flags
Hash
KEY → {FIELD: VALUE}
HSET user:42 name "Ajay" tier "pro"
HGET user:42 name
HMGET user:42 name tier email
HINCRBY user:42 login_count 1
USE: User profiles, config objects, shopping carts — fetch individual fields
List
ORDERED · DUPLICATES OK
RPUSH queue:email email1 email2
LPOP queue:email
LRANGE feed:user:42 0 49
LLEN queue:email
USE: Job queues (FIFO), activity feeds, recent items, notifications
Set
UNIQUE MEMBERS
SADD active_users 42 88 123
SISMEMBER active_users 42
SUNION tags:post:1 tags:post:2
SINTER followers:A followers:B
USE: Tagging, unique visitor tracking, mutual friends, union/intersection
Sorted Set
SCORE-RANKED MEMBERS
ZADD leaderboard 9800 "alice"
ZRANGE leaderboard 0 9 REV WITHSCORES
ZADD rate:user:42 {ts} {reqId}
ZRANGEBYSCORE rate:user:42 (now-60) now
USE: Leaderboards, sliding window rate limiting, priority queues, geo proximity
Sliding Window Rate Limiter — Redis Sorted Set [REDIS]
// Per-user: 100 requests per 60-second window
// ZADD rate_limit:{userId} {timestamp_ms} {unique_request_id}
// ZRANGEBYSCORE rate_limit:{userId} (now-60000) now → requests in last 60s

boolean isAllowed(String userId, int limit, long windowMs) {
    long now = System.currentTimeMillis();
    long windowStart = now - windowMs;
    String key = "rate:" + userId;
    return redis.pipeline(pipe -> {
        pipe.zremrangebyscore(key, 0, windowStart);          // Remove expired entries
        pipe.zadd(key, now, UUID.randomUUID().toString());   // Add this request
        pipe.zcard(key);                                     // Count in window
        pipe.expire(key, windowMs / 1000);                   // Auto-expire key
    }).getCard() <= limit;
}
// ✅ Exact sliding window · ✅ Multi-server safe (shared Redis)
// Memory: O(requests in window) per user — scale with caution for large limits
CDN — Content Delivery Network
Geographic caching at the edge — serving content from nearest node
// WITHOUT CDN vs WITH CDN

No CDN:
User (Mumbai) ──────────────────────→ Origin (US-East)                 ~200ms every request

With CDN:
User (Mumbai) ──→ CDN Edge (Singapore)           ← HIT                 ~15ms ✅
                  CDN Edge (Singapore) ──MISS──→ Origin (US-East)      ~200ms (first time only)
Cache-Control headers for static vs dynamic vs private [HTTP]
// STATIC content (images, CSS, JS, fonts)
Cache-Control: max-age=31536000, immutable
// CDN caches for 1 year. Use hash-based filenames for cache busting.
app.js → app.8f3d92ab.js
// Content hash in filename → new deploy = new URL = fresh CDN cache

// DYNAMIC content (API responses, partially personalised pages)
Cache-Control: max-age=60, stale-while-revalidate=30
// CDN caches for 60s. During next 30s after expiry, serve stale while fetching fresh.
// Eliminates miss latency for CDN — no thundering herd at origin.

// PRIVATE content (user-specific responses)
Cache-Control: private, no-store
// CDN does NOT cache. Request always reaches origin. Each user's data is unique.

// Vary header — separate cache per encoding/format
Vary: Accept-Encoding
// CDN stores separate versions for gzip and uncompressed.
| QUESTION | STRONG ANSWER |
|---|---|
| "How does cache-aside work?" | Application checks cache; on miss queries DB and populates with TTL. On write, deletes (not updates) cache key. Delete is safer — avoids stale-write race. |
| "What is thundering herd?" | Popular key expires → thousands of concurrent misses all query DB simultaneously → overload. Prevention: mutex lock on miss, probabilistic early expiration, or background refresh. |
| "Write-through vs write-back?" | Write-through: strong consistency, adds write latency. Write-back: fast writes, risk of data loss if cache crashes before flush. Use write-back for non-critical high-frequency counters (view counts). |
| "Redis data structure for rate limiting?" | Sorted Set with score=timestamp, member=requestId. Sliding window: ZRANGEBYSCORE in (now-window, now) + ZCARD = count in window. Atomic with pipeline. |
| "How to handle cache invalidation?" | Delete on write is safest. TTL as safety net. For complex dependencies, event-based invalidation via CDC + Kafka. Never set stale value — always delete and let next read fetch fresh. |
01 · Cache Strategy Selection — 6 Scenarios
For each: choose cache-aside / write-through / write-back / no-cache. Justify your choice, including the consistency trade-off you're accepting.
- E-commerce product page (price + inventory — overselling is bad)
- Social media post like count (500M likes/day)
- Bank account balance before authorising a debit
- User authentication session (JWT + server-side session)
- Trending hashtags computed every 5 minutes
- Real-time stock price feed (traders need latest price)
For each wrong choice: what bad outcome would it cause?
02 · Thread-Safe LRU Cache with TTL
API:
  int get(int key) → -1 if absent or expired
  void put(int key, int value, long ttlMs) → evict LRU if at capacity
Requirements:
- O(1) get and put (DoublyLinkedList + HashMap)
- Thread-safe: concurrent get/put from multiple threads
- TTL per entry: expired entries not returned, evicted lazily on access
- Bonus: background sweeper thread evicts expired entries proactively
Test: 8 threads × 100K ops, verify:
- Zero ConcurrentModificationException
- Eviction order correct (LRU evicted first, not MRU)
- TTL respected: entries expired after their TTL not returned
03 · Cache Stampede Prevention Design
A news website homepage is cached in Redis with TTL=60s. The homepage is computed from 50 DB queries taking 500ms. The site has 1M concurrent users.
Problem: Every 60 seconds, a stampede floods the DB with 50K simultaneous queries.
Design a solution that:
- Eliminates the stampede completely
- Keeps homepage latency <50ms for 99% of requests
- Handles Redis failure gracefully (fallback strategy)
- Explains the consistency trade-off you're making
Pick one of the three strategies (mutex, PER, background refresh) and justify why it's best for this scenario.
★ · Distributed Rate Limiter — 3 Algorithms
Implement a distributed rate limiter: 100 requests per 60-second window per user. Multi-server deployment with shared Redis.
Implement all three and compare:
A) Fixed Window Counter:
INCR rate:{userId}:{minute}
EXPIRE rate:{userId}:{minute} 60
Simple but allows burst at window boundary:
100 at 0:59 + 100 at 1:00 = 200 requests in 2 seconds
B) Sliding Window Log (Sorted Set):
ZADD rate:{userId} {timestamp} {requestId}
ZREMRANGEBYSCORE to remove old entries
ZCARD for count in window
Exact but O(requests) memory per user
C) Sliding Window Counter (hybrid):
Count of previous window × (1 - elapsed/window) + current window count
Approximate but O(1) memory (a hedged sketch follows after this list)
For each:
- Show Redis commands
- Time complexity per request
- Memory per user
- Accuracy at window boundary
- When to use in production
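Sliding Window Counter (hybrid) sketch [JAVA]
A hedged sketch of option C's weighted count; the generic redis object mirrors the pseudocode style used earlier, and getLongOrZero is an assumed helper (a GET that defaults to 0).
boolean isAllowedHybrid(String userId, int limit, long windowMs) {
    long now = System.currentTimeMillis();
    long currentWindow = now / windowMs;                          // e.g. the current minute bucket
    double elapsedFraction = (now % windowMs) / (double) windowMs;

    String curKey  = "rate:" + userId + ":" + currentWindow;
    String prevKey = "rate:" + userId + ":" + (currentWindow - 1);

    long cur  = redis.incr(curKey);                               // exact count for the current window
    redis.expire(curKey, 2 * windowMs / 1000);                    // keep two windows around
    long prev = redis.getLongOrZero(prevKey);                     // assumed helper: GET with default 0

    // prev × (1 - elapsed/window) + current, the formula from C) above
    double estimate = prev * (1 - elapsedFraction) + cur;
    return estimate <= limit;                                     // O(1) memory: two counters per user
}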
MODULE B3 · CACHING · Checklist
Know the latency numbers: RAM vs Redis vs DB — WHY cache works
Cache-aside: full read + write path, why delete is safer than update
Write-through vs write-back: trade-offs and when each is appropriate
LRU: DoublyLinkedList + HashMap implementation, O(1) all ops
LFU vs LRU: when LFU is better (avoids cache pollution from scans)
TTL jitter: why it prevents stampede and how to implement
Cache invalidation race condition: why delete is safe, update is not
Cache stampede: 3 prevention strategies (mutex, PER, background refresh)
All 5 Redis data structures + their use cases
Sliding window rate limiter using Redis Sorted Set
CDN: static vs dynamic caching, cache busting, stale-while-revalidate
✏️ Tasks 1–3: strategy selection, LRU with TTL, stampede design
✏️ Capstone: distributed rate limiter — all 3 algorithms implemented
// NEXT MODULE
B4 — Message Queues
Why async messaging · Kafka architecture + partitions + consumer groups
RabbitMQ vs Kafka · At-least-once vs exactly-once · Dead letter queues
Fan-out patterns · Event sourcing · CQRS · Backpressure