SYSTEM DESIGN MASTERY · TRACK B · MODULE B12 · WEEK 22 INTERVIEW FRAMEWORK · MOCK INTERVIEWS · CAPSTONE
TRACK B CAPSTONE · 45-MINUTE FRAMEWORK · 6 MOCK INTERVIEWS

SYSTEM DESIGN INTERVIEW FRAMEWORK

7-STEP FRAMEWORK · TIME MANAGEMENT · CAPACITY MATH
COMMUNICATION PATTERNS · COMMON MISTAKES · MOCK DRILLS
7 FRAMEWORK STEPS
45m INTERVIEW WINDOW
6 MOCK INTERVIEWS
B12 FINAL MODULE
Contents: 7-Step Framework · Requirements · Capacity Estimation · Communication · 7 Mistakes · 6 Mock Problems · Quick Answers
The 7-Step Framework
Use this structure for every system design interview — consistently
1 · Requirements Clarification · 5 min

Never start designing before you understand the problem. Interviewers will give partial information intentionally.

→ "How many daily active users are we targeting?" → "What is the read-to-write ratio?" → "Is this globally distributed or single-region?" → "What's the acceptable latency for the critical read path?" → "Strong consistency or eventual consistency acceptable?" → "What's the data retention period?"
2 · Capacity Estimation · 5 min

Rough numbers that constrain your design choices. Do the math out loud — it shows structured thinking.

→ QPS = daily_requests / 86,400 × peak_multiplier (3×)
→ Storage = writes/day × object_size × retention_years
→ Cache ≈ 20% of daily data (80/20 rule)
→ Bandwidth = peak_QPS × avg_response_size (all four sketched below)
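A runnable sketch of these four formulas in Python. The input values below are taken from the Twitter worked example later in this module, and the 80/20 cache rule is applied to one day of new data; treat it as a back-of-envelope helper, not a sizing tool.

SECONDS_PER_DAY = 86_400

def estimate(daily_reads, daily_writes, object_bytes, response_bytes,
             retention_years=10, peak=3.0):
    read_qps = daily_reads / SECONDS_PER_DAY
    write_qps = daily_writes / SECONDS_PER_DAY
    daily_storage = daily_writes * object_bytes              # bytes/day
    return {
        "read_qps": round(read_qps),
        "peak_read_qps": round(read_qps * peak),
        "write_qps": round(write_qps),
        "storage_tb": round(daily_storage * 365 * retention_years / 1e12),
        "hot_cache_gb": round(daily_storage * 0.2 / 1e9),    # 80/20 rule
        "peak_bandwidth_mb_s": round(read_qps * peak * response_bytes / 1e6),
    }

# Twitter-scale inputs: 15B reads/day, 150M tweets/day, 550 B/tweet, 500 B responses
print(estimate(15_000_000_000, 150_000_000, 550, 500))
# {'read_qps': 173611, 'peak_read_qps': 520833, 'write_qps': 1736,
#  'storage_tb': 301, 'hot_cache_gb': 16, 'peak_bandwidth_mb_s': 260}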
3 · High-Level Design · 10 min

Draw the major components at box-and-arrow level. Cover both write path and read path. Don't over-detail yet.

→ Client → Load Balancer → Service(s) → Cache → DB
→ Identify: where does data enter, where does it get served
→ Mention async paths (queues) vs synchronous paths
4 · Data Model & API Design · 5 min

Only the tables/schemas that matter for your deep dive. Core API endpoints — method, URL, key fields.

→ Don't design ALL tables. 2–3 critical ones only.
→ Show partition key / shard key choice
→ API: GET /timeline/{userId}?cursor=X&limit=20 (handler sketched below)
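A minimal sketch of that cursor-paginated timeline endpoint, assuming Python and an in-memory stand-in for the real store (a production version would read from a cache/DB sharded by user_id):

from dataclasses import dataclass

@dataclass
class Tweet:
    id: int        # monotonically increasing, so it doubles as the cursor
    user_id: int
    text: str

TIMELINES: dict[int, list[Tweet]] = {}   # user_id -> tweets, newest first

def get_timeline(user_id: int, cursor: int | None = None, limit: int = 20) -> dict:
    tweets = TIMELINES.get(user_id, [])
    if cursor is not None:
        tweets = [t for t in tweets if t.id < cursor]   # strictly older than cursor
    page = tweets[:limit]
    next_cursor = page[-1].id if len(page) == limit else None   # None = last page
    return {"tweets": page, "next_cursor": next_cursor}

Cursor pagination (rather than offset) stays correct when new tweets are prepended between page fetches.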
5 · Deep Dive · 15 min

This is where B1–B11 knowledge pays off. Pick 2–3 hard problems in your design and go deep. Show trade-off thinking.

→ Typical: hot read path, write bottleneck, consistency challenge
→ "For the fan-out problem, I see two approaches..."
→ "The cache invalidation here is tricky because..."
6 · Bottlenecks & Scaling · 4 min

Where does your design break at 10× current scale? Address the biggest SPOFs and hot spots.

→ Single DB primary → add replicas, then shard
→ Single cache → Redis Cluster
→ Single region → multi-region with data replication
7 · Summary · 1 min

Restate the 3 key decisions you made and why. Mention what you'd do with more time. Leave a strong final impression.

→ "The three key decisions were: fan-out-on-write for the feed, Redis for the hot cache layer, and Cassandra for write-heavy storage." → "With more time, I'd explore multi-region replication."
45-Minute Time Map
The distribution that works — stick to it even under pressure
5m · REQUIREMENTS CLARIFICATION
5m · CAPACITY ESTIMATION
10m · HIGH-LEVEL DESIGN
5m · DATA MODEL & API
15m · DEEP DIVE ← KEY ★
4m · BOTTLENECKS & SCALING
1m · SUMMARY & CLOSE
The Deep Dive is where you are judged. Steps 1–4 are table stakes — everyone can draw boxes. Steps 5–6 (deep dive + scaling) are where senior candidates separate themselves. Protect those 15 minutes fiercely. If you're still doing requirements at minute 10, cut it short and move on.
Capacity Estimation Cheat Sheet
The numbers you need to have memorized before any interview
CONVERSION | SHORTCUT | EXAMPLE
1M req/day → QPS | ≈ 12 QPS sustained; 36 QPS peak | Instagram: 100M req/day = 1,200 QPS sustained
1B req/day → QPS | ≈ 12,000 QPS sustained; 36,000 peak | Twitter read traffic ~35K QPS
1 day | ≈ 86,400 s (use 100K for rough math) | Never say "there are 1,000 minutes in a day" (a day has 1,440)
1M users × 1KB | = 1 GB | 100M users × 500 bytes = 50 GB
1B users × 1KB | = 1 TB | YouTube metadata: 5B videos × 1KB = 5 TB
1 photo avg | = 1 MB (thumbnail: 50KB) | Instagram: 100M uploads/day = 100 TB/day
1 video avg | = 50–500 MB | YouTube: 500 hr/min uploaded = ~90 TB/day
1 tweet/message | = 140 bytes – 1 KB | Twitter: 500M tweets/day = ~70 GB/day
Worked example: Twitter-scale estimation out loud · TECHNIQUE
// Question: Design Twitter. "Let me estimate scale first."

DAU: 300M users
Tweets written: each user tweets 0.5×/day avg → 150M tweets/day
Reads: 100:1 ratio → 15B reads/day → 15B/86400 ≈ 180K read QPS
Writes: 150M/86400 ≈ 1,750 write QPS ≈ 2K write QPS

Storage per tweet: content 140B + metadata 100B + indices ~300B ≈ 550 bytes
Daily storage: 150M × 550B = 82 GB/day
10 years: 82 × 365 × 10 ≈ 300 TB (just tweet text, no media)

Cache for hot tweets: 80% of reads hit 20% of data (Pareto)
Hot set: cache ~20% of daily tweet data = 20% × 82 GB/day ≈ 16 GB hot set

// Now I know: I need a system handling 180K reads/sec, 2K writes/sec,
// ~80 GB/day new storage, ~16 GB hot cache. This drives my design choices.
Communication Patterns
What interviewers actually listen for — and what signals seniority
❌ JUNIOR PATTERN
"I'll use Kafka."

"Redis is the best option here."

"MySQL for the database."

[silence — drawing without explanation]

[waits for interviewer to ask about failures]
✓ SENIOR PATTERN
"For 50K write QPS with at-least-once delivery, Kafka fits — though it adds operational complexity."

"Redis solves the hot-read problem here. The trade-off is cache invalidation complexity and sizing."

"For this read:write ratio and need for flexible queries, PostgreSQL — we can shard by user_id later."

"I'm adding a cache here because the read path is 100:1 over writes and most reads are for recent data..."

"Let me think about failure modes. If this service goes down, I want the write path to still work..."
The senior-signal pattern: Before committing to any technology, say: "I see two approaches here — [A] and [B]. [A] gives us [benefit] but costs [trade-off]. [B] is simpler but doesn't handle [edge case]. Given [constraint from requirements], I'll go with [A]." This shows that you considered alternatives — which is what senior engineers actually do.
7 Common Mistakes
Every one of these has caused otherwise-qualified candidates to fail
1
JUMPING TO SOLUTIONS WITHOUT REQUIREMENTS
Fix: Force yourself to spend 5 full minutes on requirements. Repeat back: "So I'm building a system for X users, Y QPS, with Z consistency requirement — is that correct?" Get explicit confirmation before drawing anything.
2
DESIGNING FOR ONE SERVER
Fix: Always think distributed by default. Even for "simple" systems, ask about scale first. A system with 10K QPS requires load balancing, connection pooling, and at least 2 servers. Assume you need to scale.
3
AVOIDING TRADE-OFFS — "THIS SOLUTION HANDLES EVERYTHING"
Fix: Every technology choice has costs. If you pick Kafka: mention the latency overhead, operational complexity, at-least-once semantics. If you pick Cassandra: mention you can't do JOINs and that consistency is tunable, not strong by default. Acknowledging trade-offs shows maturity.
4
NOT KNOWING THE NUMBERS
Fix: Memorize the estimation table. Doing math out loud (even approximations) shows discipline. "1B req/day ÷ 100K seconds = 10K QPS × 3 peak = 30K peak QPS" said in 10 seconds is far more impressive than "should be fine."
5
DESIGNING ALONE, NOT COLLABORATING
Fix: Check in every 5–8 minutes. "I'm thinking of separating read and write paths here — does that seem like the right direction to explore?" Interviewing is a collaborative exercise. Interviewers want to see how you work with others.
6
SPENDING 20 MINUTES ON CRUD
Fix: Cover the obvious parts quickly: "Standard REST CRUD, JWT auth, HTTPS everywhere — 2 minutes." Save your time for the hard distributed systems problems: fan-out, consistency, failure handling. That's where the interview is won or lost.
7
NO FAILURE HANDLING IN THE DESIGN
Fix: Proactively discuss failures. "What if the cache is unavailable? I'd add a circuit breaker and fall back to DB reads." "What if the notification service is slow? I'd make it async with a queue and retry." Cover the top 2–3 failure scenarios before being asked.
6 Mock Interview Problems
One per day — timed, 45 minutes, no notes until after
MOCK 1 · DAY 1 · WARMUP
Design Pastebin / URL Shortener
EASY
1M pastes/day · 100M reads/day · max 10 MB paste
Key decisions: ID generation (base62), text in S3 vs DB, expiry strategy, CDN for large pastes, async analytics.
Deep dive: cache tier for popular pastes, ID collision handling, lazy vs background expiry cleanup
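One way to sketch the base62 ID generation in Python: encode a sequential (or snowflake-style) integer, which also sidesteps collisions entirely. The alphabet ordering and 7-character target are illustrative choices, not fixed requirements.

import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # 62 symbols

def base62_encode(n: int) -> str:
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))

# 7 chars cover 62**7 ≈ 3.5 trillion IDs (millennia of pastes at 1M/day)
print(base62_encode(125))   # "21"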
MOCK 2 · DAY 2 · COMPONENT
Design a Notification System
MEDIUM
10M notifications/day · push + email + SMS · <30s delivery
Key decisions: channel routing (FCM/APNs/SES/Twilio), priority queues, user preferences, retry + fallback logic, deduplication.
Deep dive: retry/fallback when push fails → SMS, dedup across channels, delivery receipts
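A hedged sketch of the retry-then-fallback routing in Python; send_push and send_sms are placeholders standing in for real FCM/Twilio calls, and dedup would sit in front of this, keyed by notification ID.

import time

def send_push(user_id: str, msg: str) -> None:
    raise TimeoutError("push provider timed out")   # simulate a failing channel

def send_sms(user_id: str, msg: str) -> None:
    print(f"SMS to {user_id}: {msg}")               # stand-in for a Twilio call

def send_with_fallback(user_id: str, msg: str, channels, retries: int = 2) -> bool:
    for send in channels:                  # channels ordered by user preference
        for attempt in range(retries):
            try:
                send(user_id, msg)
                return True                # delivered; stop trying
            except Exception:
                time.sleep(2 ** attempt)   # simple exponential backoff
    return False                           # exhausted: park in a dead-letter queue

send_with_fallback("u42", "Your order shipped", [send_push, send_sms])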
MOCK 3 · DAY 3 · DISTRIBUTED
Distributed Job Scheduler
MEDIUM-HARD
1M jobs · 1K jobs/sec peak triggering · at-most-once execution
Key decisions: storage (DB partitioned by scheduled_time), time-wheel vs priority queue, leader election, enforcing at-most-once execution, dead job recovery.
Deep dive: preventing two nodes from running same job, crash recovery, cron expression parsing
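A sketch of the "only one node runs the job" guarantee using an atomic conditional UPDATE on a lease column. The jobs schema and column names are assumptions for illustration; because leases expire, jobs owned by crashed nodes become reclaimable, which covers crash recovery too.

import sqlite3, time, uuid

NODE_ID = str(uuid.uuid4())
LEASE_SECONDS = 60

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, owner TEXT, lease_until REAL)")
db.execute("INSERT INTO jobs (id) VALUES (1)")

def try_claim(job_id: int) -> bool:
    now = time.time()
    cur = db.execute(
        """UPDATE jobs SET owner = ?, lease_until = ?
           WHERE id = ? AND (owner IS NULL OR lease_until < ?)""",
        (NODE_ID, now + LEASE_SECONDS, job_id, now),
    )
    db.commit()
    return cur.rowcount == 1   # at most one node wins the claim

print(try_claim(1))   # True on the winning node; False everywhere else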
MOCK 4 · DAY 4 · HARD
Design Google Drive / Dropbox
HARD
50M DAU · 100M uploads/day · avg 500KB · 10 PB total
Key decisions: chunked upload (4MB chunks, SHA-256 dedup), delta sync (changed chunks only), metadata DB + S3, sync protocol, conflict resolution.
Deep dive: chunk deduplication across users, delta sync algorithm, last-write-wins vs OT conflict resolution
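A sketch of the chunk-level dedup in Python: fixed 4 MB chunks keyed by SHA-256, so identical chunks (even across users) are stored once. The dict stands in for the S3 chunk store, and real Dropbox-style sync uses more sophisticated chunking; this only shows the idea.

import hashlib

CHUNK_SIZE = 4 * 1024 * 1024          # 4 MB chunks, per the key decisions above
chunk_store: dict[str, bytes] = {}    # content hash -> chunk bytes

def upload(path: str) -> list[str]:
    manifest = []                     # ordered hashes: the file's metadata record
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in chunk_store:   # dedup: upload only unseen chunks
                chunk_store[digest] = chunk
            manifest.append(digest)
    return manifest

Delta sync then falls out: re-chunk the changed file, diff its manifest against the stored one, and upload only the new hashes.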
MOCK 5 · DAY 5 · HARD
Live Streaming Platform
HARD
1K streamers · 10M viewers · 100K viewers/top stream · <10s latency
Key decisions: RTMP ingest → HLS transcode, CDN fan-out, WebSocket chat, viewer count (HyperLogLog), HLS vs WebRTC latency trade-off.
Deep dive: transcoding pipeline parallelism, chat fan-out at 100K viewers, approximate viewer count
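The approximate viewer count maps directly onto Redis HyperLogLogs (PFADD/PFCOUNT): roughly 0.8% error in about 12 KB per stream, regardless of audience size. A sketch assuming a local Redis and the redis-py client; the key naming is made up.

import redis

r = redis.Redis()

def record_view(stream_id: str, viewer_id: str) -> None:
    r.pfadd(f"viewers:{stream_id}", viewer_id)    # idempotent per unique viewer

def viewer_count(stream_id: str) -> int:
    return r.pfcount(f"viewers:{stream_id}")      # approximate distinct count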
MOCK 6 · DAY 6 · SYNTHESIS
Search Autocomplete
MEDIUM
10K autocomplete QPS · 100M unique queries/day in logs · top-10 suggestions
Key decisions: trie vs inverted index, precompute top-N per prefix, daily batch update pipeline from logs, shard trie by prefix range.
Deep dive: trie sharding, pre-computation vs on-the-fly, unicode + multilingual handling
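A sketch of the precompute idea using a flat prefix-to-top-k map instead of a linked trie (same contract: a read becomes a single lookup with no traversal at query time). The query strings and counts are toy values.

from collections import defaultdict
import heapq

top_k: dict[str, list[tuple[int, str]]] = defaultdict(list)   # prefix -> min-heap

def build(query_counts: dict[str, int], k: int = 10) -> None:
    for query, count in query_counts.items():
        for i in range(1, len(query) + 1):        # every prefix of the query
            heap = top_k[query[:i]]
            heapq.heappush(heap, (count, query))
            if len(heap) > k:
                heapq.heappop(heap)               # evict the least popular

def suggest(prefix: str) -> list[str]:
    return [q for _, q in sorted(top_k.get(prefix, []), reverse=True)]

build({"system design": 900, "systemd": 500, "sysadmin": 300})
print(suggest("sys"))   # ['system design', 'systemd', 'sysadmin']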
Practice protocol: Set a 45-minute timer. Draw on paper or a whiteboard. No notes. After time is up, review against the module notes and identify the 2–3 things you missed. Do NOT review the answer before attempting — the discomfort of not knowing is the practice.
Quick Answer Cheat Sheet
Interviewer probes — have these answers ready in 30 seconds
"How do you handle hot partitions / hot keys?"
Add a random suffix (0–9) to the partition key to spread load across 10× more partitions. Combine results at read time. Alternatively, cache hot keys separately in Redis with a short TTL. For write-heavy hot keys, use a write-behind cache. Consistent hashing with virtual nodes also mitigates hot spots by distributing keys more evenly.
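A sketch of the random-suffix technique for a hot counter, assuming redis-py: writes spread across 10 sub-keys, reads fan out and merge.

import random
import redis

r = redis.Redis()
SHARDS = 10

def incr_hot(key: str) -> None:
    r.incr(f"{key}:{random.randrange(SHARDS)}")   # spread writes over 10 sub-keys

def read_hot(key: str) -> int:
    return sum(int(r.get(f"{key}:{i}") or 0) for i in range(SHARDS))  # merge on read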
"How do you prevent thundering herd on cache expiry?"
Three approaches: (1) Probabilistic early expiration — with probability proportional to time-to-expiry, refresh early before expiry hits all at once. (2) Mutex/lock on cache miss — first thread refreshes, others wait. Use Redis SETNX as a distributed lock with short TTL. (3) Background refresh — proactively refresh popular keys before expiry, keeping them always warm.
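A sketch of approach (2), mutex-on-miss, assuming redis-py. recompute is a placeholder for the real DB query, and the 10-second lock TTL bounds the damage if the refreshing process dies mid-refresh.

import time
import redis

r = redis.Redis()

def get_with_lock(key: str, recompute, ttl: int = 300):
    val = r.get(key)
    if val is not None:
        return val                                    # cache hit
    if r.set(f"lock:{key}", "1", nx=True, ex=10):     # SETNX with a short lock TTL
        val = recompute()
        r.set(key, val, ex=ttl)                       # repopulate the cache
        r.delete(f"lock:{key}")
        return val
    time.sleep(0.05)                                  # lost the race: brief wait
    return get_with_lock(key, recompute, ttl)         # then re-read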
"How do you achieve exactly-once processing in Kafka?"
Kafka delivers at-least-once by default. To achieve effectively-once: use idempotent consumers — check an idempotency key in the DB before processing, and store it atomically with the result. For stricter needs, Kafka Transactions (EOS — exactly-once semantics) enable atomic produce+consume operations. The outbox pattern (B11) combined with idempotent consumers gives effectively-exactly-once end-to-end.
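A sketch of the idempotent-consumer check: the idempotency key and the business result commit in one transaction, so a redelivered message becomes a no-op. Table names are assumptions, and sqlite3 stands in for the real DB.

import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE processed (message_id TEXT PRIMARY KEY);
    CREATE TABLE results (message_id TEXT, output TEXT);
""")

def handle(message_id: str, payload: str) -> None:
    try:
        with db:   # key insert + result commit atomically, or roll back together
            db.execute("INSERT INTO processed (message_id) VALUES (?)", (message_id,))
            db.execute("INSERT INTO results (message_id, output) VALUES (?, ?)",
                       (message_id, payload.upper()))   # stand-in business logic
    except sqlite3.IntegrityError:
        pass       # redelivery: message_id already processed, safely skip

handle("msg-1", "hello")
handle("msg-1", "hello")   # duplicate delivery is a no-op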
"How do you handle cascading failures between services?"
Circuit breaker pattern: after N consecutive failures to a downstream service, the circuit "opens" — requests fail fast without hitting the service. After a timeout, a "half-open" probe is sent; if it succeeds, circuit closes. Bulkhead pattern: isolate thread pools per downstream service so one slow service doesn't exhaust the shared pool. Timeout + retry with exponential backoff for transient failures. Fallback: cached result, degraded response, or graceful error.
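A minimal circuit-breaker sketch in Python; the threshold and cooldown are illustrative, and a production breaker would also limit how many concurrent calls get through in the half-open state.

import time

class CircuitBreaker:
    def __init__(self, threshold: int = 5, cooldown: float = 30.0):
        self.threshold, self.cooldown = threshold, cooldown
        self.failures, self.opened_at = 0, None

    def call(self, fn, *args):
        if self.opened_at is not None:
            if time.time() - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open: failing fast")
            # cooldown elapsed: half-open, let this call through as a probe
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.time()   # trip (or re-trip) the breaker
            raise
        self.failures, self.opened_at = 0, None   # success closes the circuit
        return result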
"How would you design for multi-region?"
Active-active: both regions serve reads and writes, data replicated asynchronously (DynamoDB Global Tables, CockroachDB). Conflict resolution needed. Active-passive: primary region handles writes, secondary is hot standby — failover when primary goes down. Latency-based routing (Route 53) sends users to nearest region. GDPR: EU user data must stay in EU region — use regional data classification. RPO/RTO: async replication has seconds of potential data loss (RPO); failover automation targets minutes (RTO).
"How do you do zero-downtime schema migrations?"
Expand-contract (also called parallel change): Step 1 — Add new column (nullable, no default); existing code ignores it. Step 2 — Dual-write: new code writes to both old and new columns. Step 3 — Backfill: migrate old data to new column via background job. Step 4 — Switch reads: code now reads from new column. Step 5 — Stop writing to old column. Step 6 — Drop old column. Never run a blocking ALTER TABLE on a large live table without this — many schema changes take an exclusive lock and block all queries.
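A sketch of the dual-write step (step 2): new code writes both columns so legacy readers stay correct while the backfill runs. The users schema and the full_name/display_name column names are hypothetical.

import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT, display_name TEXT)")
db.execute("INSERT INTO users (id, full_name) VALUES (1, 'Ada')")

def save_user(user_id: int, name: str) -> None:
    db.execute(
        """UPDATE users
           SET full_name = ?,       -- old column: legacy readers still depend on it
               display_name = ?     -- new column: reads switch here in step 4
           WHERE id = ?""",
        (name, name, user_id),
    )
    db.commit()

save_user(1, "Ada Lovelace")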
"How do you ensure high availability for stateful services?"
Run multiple instances behind a load balancer. For sessions: externalize state to Redis (stateless app servers). For leader election (e.g., Saga Orchestrator): use Zookeeper ephemeral nodes or etcd leases — leader holds a lease, followers compete to acquire it on expiry. Health checks: remove unhealthy instances from rotation within 10–30 seconds. DB: primary-replica with automatic failover (RDS Multi-AZ, Patroni for Postgres).
MODULE B12 · INTERVIEW FRAMEWORK · COMPLETION CHECKLIST
7-step framework memorized — can recite steps + time allocations without notes
6 requirements questions to always ask (DAU, ratio, latency, consistency, geo, retention)
Capacity math: 1B/day = 12K QPS, 1M users × 1KB = 1GB, 1 photo = 1MB
Communication: state reasoning before answer, proactive trade-offs, drive conversation
7 mistakes internalized — know what symptom looks like and how to avoid it
Quick answers: hot partitions, thundering herd, exactly-once, cascading failures, multi-region
✏️ Mock 1 (Pastebin) — 45-min timed session completed
✏️ Mock 2 (Notifications) — 45-min timed session completed
✏️ Mock 3 (Job Scheduler) — 45-min timed session completed
✏️ Mock 4 (Google Drive) — 45-min timed session completed
✏️ Mock 5 (Live Streaming) — 45-min timed session completed
✏️ Mock 6 (Autocomplete) — 45-min timed session completed
All 6 mocks reviewed against module notes — gaps identified and studied
TRACK B — COMPLETE
Track B: HLD Mastered
B1 Fundamentals · B2 Databases · B3 Caching · B4 Message Queues
B5 URL Shortener · B6 Twitter Feed · B7 WhatsApp · B8 YouTube
B9 Rate Limiter · B10 Consistent Hashing · B11 Distributed Tx · B12 Interview Framework
NEXT: Track A (LLD) Complete · Track B (HLD) Complete · Ready for FAANG Interviews