Module B12 — System Design Interview Framework & Mock Interviews

System Design Mastery Course | Track B: HLD | Week 22


🎯 Module Overview

Duration: 1 Week | Track: B — HLD Capstone | Prerequisites: B1–B11 (all HLD modules)

Goal: All the knowledge from B1–B11 means nothing if you can't communicate it under interview pressure. This module teaches the structured framework, time management, and communication patterns, and provides 6 full mock interview problems — one per day — to practice before any real interview.


1. The 7-Step Framework (45-Minute Interview)

STEP 1 — REQUIREMENTS CLARIFICATION (5 min)
  Functional requirements:  What must the system DO?
  Non-functional requirements: How must it perform? (scale, latency, availability)
  Out of scope: What are you explicitly NOT designing?

  Questions to always ask:
  "How many daily active users?"
  "What is the read:write ratio?"
  "Is this globally distributed or single-region?"
  "What is the acceptable latency for the hot path?"
  "Do we need strong consistency, or is eventual consistency OK?"
  "What is the expected data retention period?"

STEP 2 — CAPACITY ESTIMATION (5 min)
  Storage:   daily_writes × 365 × avg_object_size × retention_years
  Throughput: peak_qps = (daily_requests × peak_factor) / 86400
  Bandwidth:  peak_qps × avg_response_size
  Cache:      hot_set = total_data × 0.2 (80/20 rule)

  Key numbers to memorize:
  1 million users × 1 req/day = 12 QPS sustained
  1 billion req/day = ~11,600 QPS = ~12K QPS
  1 KB × 1M rows = 1 GB
  86,400 seconds/day (approx as 100K for rough math)
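
  The Step 2 formulas as a minimal Python sketch (the input numbers are illustrative assumptions, not tied to any specific problem):

    SECONDS_PER_DAY = 86_400

    daily_requests  = 1_000_000_000   # assumed: 1B requests/day
    daily_writes    = 100_000_000     # assumed: 100M writes/day
    avg_object_size = 1_000           # bytes (~1 KB)
    avg_response    = 2_000           # bytes (~2 KB)
    retention_days  = 5 * 365         # assumed 5-year retention
    peak_factor     = 3

    storage_bytes = daily_writes * avg_object_size * retention_days
    peak_qps      = daily_requests * peak_factor / SECONDS_PER_DAY
    bandwidth_bps = peak_qps * avg_response
    hot_set_bytes = storage_bytes * 0.2            # 80/20 rule for cache sizing

    print(f"storage       ≈ {storage_bytes / 1e12:.1f} TB")    # 182.5 TB
    print(f"peak QPS      ≈ {peak_qps:,.0f}")                   # ≈ 34,722
    print(f"bandwidth     ≈ {bandwidth_bps / 1e6:.0f} MB/s")    # ≈ 69 MB/s
    print(f"cache hot set ≈ {hot_set_bytes / 1e12:.1f} TB")     # 36.5 TB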

STEP 3 — HIGH-LEVEL DESIGN (10 min)
  Draw the major components: clients, LB, services, caches, DBs, queues
  Don't over-detail yet — stay at the box-and-arrow level
  Cover: data flow for the primary use case (write path + read path)

STEP 4 — DATA MODEL & API DESIGN (5 min)
  Core tables/documents: fields, primary keys, indexes
  Critical API endpoints: method, URL, request, response
  Don't design ALL tables — only the ones that matter for discussion

STEP 5 — DEEP DIVE (15 min)
  Pick 2–3 components the interviewer cares about most
  Typically: the read hot path, the write bottleneck, or the consistency challenge
  This is where B1–B11 knowledge is applied
  Go deep on trade-offs — show you've thought about alternatives

STEP 6 — BOTTLENECKS & SCALING (4 min)
  Where does your design break as scale increases 10×?
  How would you address each bottleneck?
  Show awareness of: single points of failure, hot spots, thundering herd

STEP 7 — SUMMARY (1 min)
  Restate the key decisions and why
  Mention what you'd do differently with more time

2. Requirements Framework — What to Always Cover

Functional Requirements Template

Write: What does the user CREATE/UPDATE/DELETE?
Read:  What does the user VIEW/SEARCH/QUERY?
The non-obvious ones that impress interviewers:
  - "Should deletes be soft deletes? (data retention compliance)"
  - "Do we need audit logging for GDPR/SOX?"
  - "Should reads be eventually consistent (cheaper) or strongly consistent (expensive)?"

Non-Functional Requirements Template

Scale:        DAU, MAU, writes/sec, reads/sec
Latency:      p50, p99 targets for hot paths
Availability: 99.9% (8.7 hrs/yr downtime) vs 99.99% (52 min/yr)
Durability:   can we lose data? RPO (recovery point objective)
Consistency:  strong vs eventual — and for which operations?
Geo:          single region or multi-region? GDPR constraints?

3. Capacity Estimation Cheat Sheet

Memory units:
  1 char = 1 byte
  UUID = 16 bytes
  int32 = 4 bytes, int64 = 8 bytes
  timestamp = 8 bytes
  average tweet/message = 140 bytes–1 KB
  average photo = 1 MB (thumbnail: 10–50 KB)
  average video = 50–500 MB

Time units:
  1 day = 86,400 seconds ≈ 100K (for estimates)
  1 month ≈ 2.5M seconds
  1 year ≈ 31.5M seconds ≈ 30M

Traffic conversions:
  1M req/day = 12 QPS sustained, 36 QPS peak (3× peak factor)
  10M req/day = 120 QPS sustained, 360 QPS peak
  1B req/day = 12,000 QPS sustained, 36,000 QPS peak

Storage math shortcut:
  1M users × 1 KB/user = 1 GB
  1B users × 1 KB/user = 1 TB
  1B users × 1 MB/user = 1 PB

Server capacity rules of thumb (for estimation):
  1 commodity server: ~10K–20K QPS (simple reads from cache)
  1 commodity server: ~1K–5K QPS (DB reads with disk I/O)
  1 commodity server: ~100–500 QPS (writes with replication)
  1 Redis node: ~100K–500K simple ops/sec
  1 Kafka broker: ~1M messages/sec (1 KB messages)
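
A quick sanity check that applies the numbers above to an assumed scenario (1B requests/day, read-heavy, 90% cache hit rate); the rules of thumb are the ones in the table:

  import math

  peak_qps       = 1_000_000_000 * 3 / 86_400       # ≈ 34,700 peak QPS (3× peak factor)
  cache_hit_rate = 0.9                               # assumed

  cache_qps = peak_qps * cache_hit_rate              # ≈ 31,250 served from cache
  db_qps    = peak_qps * (1 - cache_hit_rate)        # ≈ 3,470 falls through to the DB

  cache_nodes = math.ceil(cache_qps / 100_000)       # Redis: ~100K ops/sec/node  -> 1
  db_replicas = math.ceil(db_qps / 1_000)            # DB reads: ~1K QPS/server   -> 4

  print(cache_nodes, db_replicas)                    # 1 cache node, 4 read replicas (before headroom)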

4. Communication Patterns (What Interviewers Actually Listen For)

State your reasoning before your answer

❌ Bad: "I'll use Cassandra."
✓ Good: "For the write-heavy timeline feed with 50K writes/sec and TTL-based
         expiry, I'd use Cassandra. Its LSM tree handles write throughput well,
         and I can model the timeline as a wide row partitioned by user_id."

Acknowledge trade-offs proactively

❌ Bad: "Redis is the best cache."
✓ Good: "Redis solves the hot-read problem well here, but it adds operational
         complexity — we'll need a cache invalidation strategy and we need to
         size it carefully. The alternative is read replicas, which are simpler
         but higher latency. Given the 10ms SLA, Redis is worth the complexity."

Drive the conversation, don’t wait to be asked

❌ Passive: [waits for interviewer to ask about failures]
✓ Active: "Now let me think about failure modes. The notification service is
           non-critical, so I'll make it async. The payment service is critical
           — I need to think about what happens if it's slow or unavailable..."

5. Common Mistakes (and How to Avoid Them)

MISTAKE 1: Jumping to solutions before clarifying requirements
  Fix: Spend 5 minutes on requirements. Interviewer may reveal constraints
       that completely change the design.

MISTAKE 2: Designing a perfect single-machine system
  Fix: Always think distributed. Ask about scale first.

MISTAKE 3: Avoiding trade-offs
  Fix: Every choice has a cost. Acknowledge trade-offs before being asked.

MISTAKE 4: Not knowing your numbers
  Fix: Do the math. "1B req/day = 12K QPS × 3 peak = 36K peak QPS."

MISTAKE 5: Designing in a vacuum, not collaborating
  Fix: Check in every few minutes. "Does this direction make sense?"

MISTAKE 6: Spending too long on the "obvious" parts
  Fix: Cover basics quickly. Save time for distributed systems problems.

MISTAKE 7: Not handling failures
  Fix: Proactively cover: circuit breakers, retries, dead letter queues.

6. Six Mock Interview Problems

Mock 1 — Design Pastebin / URL Shortener (Warmup)

Scale: 1M pastes/day, 100M reads/day, max paste size 10 MB
Key decisions: ID generation (base62), text in S3 vs DB, expiry, CDN, async analytics
Deep dive: cache tier, ID collision, lazy vs background expiry cleanup
Time: 45 minutes
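
A minimal base62 encoder sketch for the ID-generation decision in Mock 1 (it assumes the numeric ID comes from a distributed counter or snowflake-style generator, which is not shown):

  ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

  def encode_base62(n: int) -> str:
      if n == 0:
          return ALPHABET[0]
      digits = []
      while n > 0:
          n, rem = divmod(n, 62)
          digits.append(ALPHABET[rem])
      return "".join(reversed(digits))

  # 7 base62 characters cover 62**7 ≈ 3.5 trillion IDs, ample headroom at 1M pastes/day.
  print(encode_base62(1_000_000_000))   # "15FTGg"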

Mock 2 — Design a Notification System

Scale: 10M notifications/day across 3 channels, delivery within 30 seconds
Key decisions: channel routing, priority, user preferences, retry/fallback, deduplication
Deep dive: retry/fallback strategy, deduplication, delivery receipts
Time: 45 minutes

Mock 3 — Design a Distributed Job Scheduler (Medium-Hard)

Scale: 1M jobs, 1K jobs/sec triggering at peak, exactly-once execution
Key decisions: storage, polling vs time-wheel, leader election, exactly-once, crash recovery
Deep dive: exactly-once execution, partition assignment, failure recovery
Time: 45 minutes

Mock 4 — Design Google Drive / Dropbox (Hard)

Scale: 50M DAU, 100M uploads/day, avg file 500 KB, total storage 10 PB
Key decisions: chunked upload (4MB + SHA-256 dedup), delta sync, metadata DB + S3, conflict resolution
Deep dive: chunk deduplication, delta sync algorithm, last-write-wins vs OT
Time: 45 minutes
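
A sketch of the client-side chunking and content-hash dedup assumed in the Mock 4 prompt (4 MB chunks, SHA-256). The have_chunk and put_chunk callbacks are hypothetical stand-ins for the metadata-service lookup and the blob-store upload:

  import hashlib

  CHUNK_SIZE = 4 * 1024 * 1024   # 4 MB

  def iter_chunks(path: str):
      with open(path, "rb") as f:
          while chunk := f.read(CHUNK_SIZE):
              yield hashlib.sha256(chunk).hexdigest(), chunk

  def upload(path: str, have_chunk, put_chunk) -> list[str]:
      manifest = []
      for digest, chunk in iter_chunks(path):
          if not have_chunk(digest):        # dedup: skip chunks the server already stores
              put_chunk(digest, chunk)
          manifest.append(digest)
      return manifest                       # ordered chunk list stored as the file's metadata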

Mock 5 — Design a Live Streaming Platform (Hard)

Scale: 1K concurrent streamers, 10M viewers, peak 100K viewers per stream, <10s latency
Key decisions: RTMP ingest → HLS transcode, CDN fan-out, WebSocket chat, HyperLogLog viewer count
Deep dive: transcoding pipeline, chat fan-out at 100K viewers, approximate viewer count
Time: 45 minutes
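
A sketch of the approximate viewer count with a Redis HyperLogLog (≈0.8% error, ~12 KB per key regardless of cardinality). The key naming and per-minute bucketing are assumptions for illustration:

  import redis

  r = redis.Redis()

  def record_heartbeat(stream_id: str, viewer_id: str, minute_bucket: int) -> None:
      # Each viewer's player sends a heartbeat once per minute; one HLL per stream per minute.
      r.pfadd(f"viewers:{stream_id}:{minute_bucket}", viewer_id)

  def approx_viewers(stream_id: str, minute_bucket: int) -> int:
      return r.pfcount(f"viewers:{stream_id}:{minute_bucket}")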

Mock 6 — Design a Search Autocomplete System (Medium)

Scale: 10K QPS autocomplete, 100M unique queries/day in logs
Key decisions: trie vs inverted index, precompute top-N, daily batch update, shard by prefix
Deep dive: trie sharding, pre-computation pipeline, unicode handling
Time: 45 minutes
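
A sketch of the precompute-top-N decision for Mock 6: each trie node stores its top suggestions, so a lookup is O(len(prefix)) with no subtree traversal at query time. Filling the top lists is left to the daily batch job mentioned above:

  class TrieNode:
      def __init__(self):
          self.children: dict[str, "TrieNode"] = {}
          self.top5: list[str] = []                  # precomputed by the batch pipeline

  class AutocompleteTrie:
      def __init__(self):
          self.root = TrieNode()

      def set_suggestions(self, prefix: str, top5: list[str]) -> None:
          node = self.root
          for ch in prefix:
              node = node.children.setdefault(ch, TrieNode())
          node.top5 = top5

      def suggest(self, prefix: str) -> list[str]:
          node = self.root
          for ch in prefix:
              node = node.children.get(ch)
              if node is None:
                  return []
          return node.top5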

7. Quick Answers Cheat Sheet

Q: "How do you handle hot partitions?"
A: "Add suffix (0–9) to partition key to spread across 10× more partitions.
    Or: consistent hashing with virtual nodes. Or: cache hot keys in Redis."
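
A minimal sketch of the key-suffix approach (the 0–9 suffix range matches the answer above): writes spread a hot key across 10 sub-partitions, and reads fan out and merge:

  import random

  def write_key(hot_key: str) -> str:
      return f"{hot_key}#{random.randint(0, 9)}"      # e.g. "user123#7"

  def read_keys(hot_key: str) -> list[str]:
      return [f"{hot_key}#{i}" for i in range(10)]    # query all 10 sub-partitions, merge results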

Q: "How do you handle thundering herd on cache miss?"
A: "Probabilistic early expiration, or mutex/semaphore on cache miss —
    only one thread refreshes, others wait. Redis: SETNX as a lock."
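
A minimal cache-miss mutex sketch with redis-py (key names, TTLs, and the rebuild callback are illustrative assumptions): only the caller that wins the NX lock recomputes the value; everyone else backs off and re-reads the cache:

  import time
  import redis

  r = redis.Redis()

  def get_with_lock(key: str, rebuild, ttl: int = 300):
      val = r.get(key)
      if val is not None:
          return val
      # NX + EX is SETNX with an expiry, so a crashed rebuilder cannot hold the lock forever.
      if r.set(f"lock:{key}", "1", nx=True, ex=30):
          val = rebuild()                   # hypothetical callback that recomputes the value
          r.set(key, val, ex=ttl)
          r.delete(f"lock:{key}")
          return val
      time.sleep(0.05)                      # lost the race: back off briefly, then re-read
      return get_with_lock(key, rebuild, ttl)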

Q: "How do you achieve exactly-once processing?"
A: "Kafka at-least-once + idempotent consumer. Check idempotency key
    before processing. Transactional Kafka (EOS) for stronger guarantees."
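
A minimal idempotent-consumer sketch: at-least-once delivery plus a dedup check on an idempotency key gives effectively-once processing. The in-memory set and the apply_side_effects stub are stand-ins for a DB table with a unique constraint and the real business logic:

  processed: set[str] = set()

  def apply_side_effects(message: dict) -> None:
      ...   # hypothetical business logic (charge card, write row, send email)

  def handle(message: dict) -> None:
      key = message["idempotency_key"]      # producer-assigned UUID
      if key in processed:                  # duplicate delivery: drop it silently
          return
      apply_side_effects(message)
      processed.add(key)                    # record the key only after success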

Q: "How do you handle cascading failures?"
A: "Circuit breaker: open after N failures, fail fast, half-open probe after timeout.
    Bulkhead: isolate thread pools per downstream service."
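
A minimal circuit-breaker sketch matching the answer above (thresholds are illustrative): open after N consecutive failures, fail fast while open, let one half-open probe through after the timeout:

  import time

  class CircuitBreaker:
      def __init__(self, max_failures: int = 5, reset_after: float = 30.0):
          self.max_failures = max_failures
          self.reset_after = reset_after
          self.failures = 0
          self.opened_at = None             # None means the circuit is closed

      def call(self, fn, *args, **kwargs):
          if self.opened_at is not None:
              if time.monotonic() - self.opened_at < self.reset_after:
                  raise RuntimeError("circuit open: failing fast")
              self.opened_at = None         # half-open: allow a single probe through
          try:
              result = fn(*args, **kwargs)
          except Exception:
              self.failures += 1
              if self.failures >= self.max_failures:
                  self.opened_at = time.monotonic()
              raise
          self.failures = 0
          return result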

Q: "How would you design for multi-region?"
A: "Active-active with eventual consistency (DynamoDB global tables).
    Or active-passive with failover. Latency-based routing (Route 53)."

Q: "How do you do schema migrations on a live table?"
A: "Expand-contract: add new column (nullable), dual-write to both,
    backfill old rows, switch reads to new column, drop old column.
    Never ALTER TABLE on a large live table — takes a lock."
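
The expand-contract sequence as an ordered sketch (table and column names are hypothetical; each step ships and bakes before the next one runs):

  EXPAND_CONTRACT_STEPS = [
      "ALTER TABLE users ADD COLUMN email_v2 TEXT NULL",           # 1. expand: new nullable column
      # 2. deploy app code that dual-writes email and email_v2
      "UPDATE users SET email_v2 = email WHERE email_v2 IS NULL",  # 3. backfill (batched in practice)
      # 4. switch reads to email_v2, monitor, then stop writing the old column
      "ALTER TABLE users DROP COLUMN email",                       # 5. contract: drop the old column
  ]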


