Phase 0 — Foundation Primer

How the Internet Works

Topic 0.1 · DNS · TCP/IP · HTTP · WebSocket

DNS

The Phone Book

Hierarchical distributed cache. TTL controls freshness. Propagation lag matters for zero-downtime deploys.

Browser cache
→ OS cache
→ Recursive Resolver
→ Root NS → TLD NS
→ Authoritative NS
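
The chain above can be sketched as stacked TTL caches. This is a minimal illustration, not a real resolver: `TtlCache`, `authoritative`, `RECORDS`, and the 300-second TTL are all hypothetical names and values chosen for the example, and the IP addresses come from a documentation-only range.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical zone data; 203.0.113.0/24 is reserved for documentation.
RECORDS: Dict[str, str] = {"example.com": "203.0.113.10"}


def authoritative(name: str) -> Tuple[str, float]:
    """Stand-in for the authoritative NS: returns (ip, ttl_seconds)."""
    return RECORDS[name], 300.0


class TtlCache:
    """One caching layer (browser, OS, or recursive resolver) in the chain."""

    def __init__(self, upstream: Callable[[str], Tuple[str, float]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._upstream = upstream          # next hop in the lookup chain
        self._clock = clock                # injectable so tests can control time
        self._entries: Dict[str, Tuple[str, float]] = {}  # name -> (ip, expires_at)
        self.misses = 0

    def resolve(self, name: str) -> Tuple[str, float]:
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and cached[1] > now:
            ip, expires_at = cached
            return ip, expires_at - now    # hit: report the remaining TTL
        self.misses += 1
        ip, ttl = self._upstream(name)     # miss: ask the next layer
        self._entries[name] = (ip, now + ttl)
        return ip, ttl
```

Chaining `TtlCache(resolver.resolve)` on top of `TtlCache(authoritative)` shows the propagation lag: after the record changes, every layer keeps serving the old IP until its cached TTL expires, which is why TTLs are usually lowered ahead of a zero-downtime cutover.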

TCP/IP

The Reliable Pipe

Ordered reliable delivery via 3-way handshake, ACKs, retransmission, flow + congestion control.

SYN → SYN-ACK → ACK
[connected]

TCP: reliable → HTTP, DB
UDP: fast → video, DNS
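
A small loopback sketch of the difference, using only Python's standard `socket` module. The helper names (`recv_exact`, `tcp_round_trip`, `udp_one_shot`) are made up for this example, and it assumes the payload is small enough to fit in kernel buffers because everything runs in one thread.

```python
import socket


def recv_exact(conn: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes; TCP is a byte stream, so one recv() may return less."""
    data = b""
    while len(data) < size:
        chunk = conn.recv(size - len(data))
        if not chunk:
            raise ConnectionError("peer closed before sending all bytes")
        data += chunk
    return data


def tcp_round_trip(payload: bytes) -> bytes:
    """Send payload over a TCP connection on loopback and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
        server.listen(1)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            # connect() returns once the kernel has completed SYN, SYN-ACK, ACK.
            client.connect(server.getsockname())
            conn, _peer = server.accept()  # take the established connection
            with conn:
                client.sendall(payload)
                return recv_exact(conn, len(payload))


def udp_one_shot(payload: bytes, timeout_s: float = 1.0) -> bytes:
    """Send one datagram with no handshake; delivery is not guaranteed in general."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(timeout_s)     # give up instead of waiting forever
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            sender.sendto(payload, receiver.getsockname())
        data, _peer = receiver.recvfrom(65535)
        return data
```

The TCP path pays for a connection setup before any data moves; the UDP path sends immediately but needs its own timeout because nothing retransmits a lost datagram.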

HTTP Methods

Idempotency

GET    idempotent, safe
POST   NOT idempotent
PUT    idempotent
PATCH  NOT idempotent
DELETE idempotent
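
Idempotent means repeating the same request leaves the server in the same state as sending it once. A minimal in-memory sketch follows; `NoteStore` and its methods are hypothetical stand-ins for HTTP handlers, and `patch_append` shows one kind of PATCH (a relative update) that is not idempotent.

```python
from itertools import count
from typing import Dict, Optional


class NoteStore:
    """Tiny in-memory resource store; each method mimics one HTTP method."""

    def __init__(self) -> None:
        self._notes: Dict[int, str] = {}
        self._next_id = count(1)

    def post(self, text: str) -> int:
        # POST creates a new resource on every call, so a retry duplicates it.
        note_id = next(self._next_id)
        self._notes[note_id] = text
        return note_id

    def put(self, note_id: int, text: str) -> None:
        # PUT replaces the whole resource; repeating it leaves the same state.
        self._notes[note_id] = text

    def patch_append(self, note_id: int, suffix: str) -> None:
        # A relative PATCH ("append this") changes state again on every repeat.
        self._notes[note_id] += suffix

    def delete(self, note_id: int) -> None:
        # Deleting an already-deleted resource leaves the store unchanged.
        self._notes.pop(note_id, None)

    def get(self, note_id: int) -> Optional[str]:
        # GET only reads, so it is both safe and idempotent.
        return self._notes.get(note_id)

    def snapshot(self) -> Dict[int, str]:
        return dict(self._notes)
```

This is why clients can retry PUT and DELETE blindly after a timeout, while a retried POST usually needs extra protection against duplicates.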

HTTP Status Codes — Memorize These

2xx SUCCESS

200 OK
201 Created
204 No Content

3xx REDIRECT

301 Moved Permanently
302 Found
304 Not Modified

4xx CLIENT ERROR

400 Bad Request
401 Unauthorized
403 Forbidden
404 Not Found
429 Too Many Requests

5xx SERVER ERROR

500 Internal Server Error
502 Bad Gateway
503 Service Unavailable
504 Gateway Timeout
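
The first digit carries the class, which makes the table easy to encode. The sketch below uses the standard library's `http.HTTPStatus` for reason phrases; `status_class`, `should_retry`, and the `RETRYABLE` set are assumptions for illustration (treating 429, 502, 503, and 504 as retryable is a common convention, not a rule from this guide).

```python
from http import HTTPStatus

# Class names follow the four groups listed above.
_CLASSES = {2: "SUCCESS", 3: "REDIRECT", 4: "CLIENT", 5: "SERVER"}

# Assumption: statuses that usually signal a transient condition worth retrying.
RETRYABLE = frozenset({429, 502, 503, 504})


def status_class(code: int) -> str:
    """Map a status code to its class using the first digit."""
    return _CLASSES.get(code // 100, "OTHER")


def should_retry(code: int) -> bool:
    """True when a later retry (with backoff) has a chance of succeeding."""
    return code in RETRYABLE


def describe(code: int) -> str:
    """Format a code with its standard reason phrase and class.

    HTTPStatus(code) raises ValueError for codes the standard library
    does not know, so this is only meant for registered codes.
    """
    return f"{code} {HTTPStatus(code).phrase} [{status_class(code)}]"
```

For example, `describe(429)` yields the rate-limiting status with its CLIENT class, while `should_retry(404)` is false because repeating the request will not make the resource appear.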

01 When asked "walk me through what happens when you type a URL", DNS is step 1. Interviewers notice when candidates skip it.

02 429 (rate limiting) and 503 (service overload) appear in nearly every SD discussion. Know them cold.

03 Use WebSocket for real-time bidirectional traffic (chat, live scores). Use long-polling for infrequent server push. Never use WebSocket for simple CRUD: a persistent connection is unnecessary overhead there.

OS Fundamentals

Topic 0.2 · Processes · Threads · I/O Models · Memory Latency

Process

+ Independent memory space
+ Crash isolation
+ IPC: pipes, sockets
- Heavy context switch ~1-10μs
- Expensive to spawn

Thread

+ Shared heap memory
+ Fast context switch
+ Cheap to spawn
- One crash kills process
- Needs synchronization

Blocking vs Non-Blocking I/O

❌ Blocking

Thread → I/O request
Thread WAITS (idle)
I/O completes
Thread continues

10K connections
= 10K idle threads
= ~10 GB RAM reserved for stacks (at ~1 MB per thread)

✅ Non-Blocking

Thread → I/O request
Thread continues work
OS notifies on complete
(callback / future)

10K connections
= 1 thread handles all
= minimal RAM used
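
A rough sketch of the comparison with Python's `asyncio`. The names `handle_connection`, `serve_all`, and `blocking_stack_gb` are invented for this example, `asyncio.sleep` stands in for waiting on real network I/O, and the 1 MB per-thread stack is an assumed round number.

```python
import asyncio
import threading
from typing import List, Set, Tuple


async def handle_connection(conn_id: int, seen_threads: Set[int]) -> int:
    """Simulate one connection that spends its time waiting on I/O."""
    seen_threads.add(threading.get_ident())   # record which OS thread ran us
    await asyncio.sleep(0.01)                 # yields to the event loop while "waiting"
    return conn_id


async def serve_all(connections: int) -> Tuple[List[int], Set[int]]:
    """Run every connection concurrently on the event loop's single thread."""
    seen_threads: Set[int] = set()
    results = await asyncio.gather(
        *(handle_connection(i, seen_threads) for i in range(connections))
    )
    return list(results), seen_threads


def blocking_stack_gb(connections: int, stack_mb_per_thread: float = 1.0) -> float:
    """Stack memory reserved by a thread-per-connection design, in GB (decimal)."""
    return connections * stack_mb_per_thread / 1000
```

Running `asyncio.run(serve_all(10_000))` completes all 10K simulated connections while `seen_threads` holds a single thread id; `blocking_stack_gb(10_000)` gives the 10 GB figure for the blocking design under the 1 MB assumption.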

Memory Latency — MEMORIZE THIS

These numbers are your interview ammunition. Know them by heart.

Level	Latency
Registers	~1 ns
L1 Cache	~4 ns
L2 Cache	~12 ns
L3 Cache	~40 ns
RAM ★	~100 ns
SSD	~100 μs
HDD	~10 ms
Network LAN	~1 ms
Network WAN	~100 ms

★ RAM is 1,000× faster than SSD and 100,000× faster than HDD. This single fact justifies every caching layer ever built.
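
The ratios follow directly from the table once everything is in the same unit. A short check, with the approximate figures above converted to nanoseconds (`LATENCY_NS` and `times_slower` are names made up for this sketch):

```python
# Approximate latencies from the table above, all in nanoseconds.
LATENCY_NS = {
    "Registers": 1,
    "L1 Cache": 4,
    "L2 Cache": 12,
    "L3 Cache": 40,
    "RAM": 100,
    "SSD": 100_000,            # ~100 μs
    "Network LAN": 1_000_000,  # ~1 ms
    "HDD": 10_000_000,         # ~10 ms
    "Network WAN": 100_000_000,  # ~100 ms
}


def times_slower(slow: str, fast: str = "RAM") -> float:
    """How many times slower `slow` is than `fast`, using the table values."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]
```

`times_slower("SSD")` is 1,000 and `times_slower("HDD")` is 100,000, matching the starred note; by the same arithmetic a WAN round trip costs about a million RAM accesses.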

Database Basics

Topic 0.3 · ACID · Indexing · Isolation Levels · SQL vs NoSQL

ACID Properties

A	Atomicity	All succeed or ALL fail. No partial writes.
C	Consistency	DB moves between valid states. Constraints hold.
I	Isolation	Concurrent tx don't interfere with each other.
D	Durability	Committed data survives crashes (WAL).

Bank transfer: Debit $100 from A → Credit $100 to B
Without Atomicity:  Debit ✓ + Credit ✕ = $100 disappears
Without Isolation:  Third tx reads mid-transfer = wrong balance
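
The atomicity case can be reproduced with Python's built-in `sqlite3`. This is a sketch under assumptions: the `accounts` table, the helper names, and the choice to fail the transfer when the destination account is missing are all invented for the example.

```python
import sqlite3
from typing import Dict


def open_bank() -> sqlite3.Connection:
    """Create an in-memory bank with A holding $100 and B holding $0."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 0)])
    conn.commit()
    return conn


def balances(conn: sqlite3.Connection) -> Dict[str, int]:
    return dict(conn.execute("SELECT name, balance FROM accounts"))


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Debit src and credit dst as one transaction; return False if rolled back."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            credited = conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            if credited.rowcount != 1:
                # The credit step failed, so the earlier debit must not survive.
                raise LookupError(f"no such account: {dst}")
        return True
    except LookupError:
        return False
```

A transfer to a non-existent account "C" runs the debit, fails on the credit, and rolls back, so A still holds $100. Without the transaction wrapper the debit would have been kept and the money would be gone.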

B-Tree Indexing

No index:
SELECT * FROM users WHERE email='x@y.com'
→ full table scan → O(n)

With B-tree index:
→ tree traversal → O(log n)

Composite index (A,B,C):
✓ query on A
✓ query on (A,B)
✓ query on (A,B,C)
✕ query on B alone
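
The leftmost-prefix rule can be observed with SQLite's `EXPLAIN QUERY PLAN` through Python's `sqlite3`. SQLite is only a convenient stand-in here; the `events` table, the `idx_abc` index, and the helper names are hypothetical, the exact plan wording differs between SQLite versions, and other engines may optimize the B-only case differently.

```python
import sqlite3
from typing import Dict


def query_plan(conn: sqlite3.Connection, sql: str) -> str:
    """Return SQLite's human-readable plan for a query (fourth column of each row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


def composite_index_plans() -> Dict[str, str]:
    """Build a table with a composite index on (a, b, c) and plan three queries."""
    conn = sqlite3.connect(":memory:")
    # payload is not indexed, so SELECT * can never be answered from the index alone.
    conn.execute("CREATE TABLE events (a INTEGER, b INTEGER, c INTEGER, payload TEXT)")
    conn.execute("CREATE INDEX idx_abc ON events (a, b, c)")
    plans = {
        "a": query_plan(conn, "SELECT * FROM events WHERE a = 1"),
        "a_b": query_plan(conn, "SELECT * FROM events WHERE a = 1 AND b = 2"),
        "b_only": query_plan(conn, "SELECT * FROM events WHERE b = 2"),
    }
    conn.close()
    return plans
```

The plans for `a` and `a_b` report a SEARCH using `idx_abc`, while `b_only` falls back to a SCAN of the table: the index is sorted by A first, so rows with a given B are scattered throughout it.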

Isolation Levels

Level	Dirty read?	Non-repeatable read?	Speed
Read Uncommitted	Yes	Yes	Fastest
Read Committed	No	Yes	Fast
Repeatable Read	No	No	Moderate
Serializable	No	No	Slowest

ⓘ Most DBs default to Read Committed (e.g. PostgreSQL, Oracle, SQL Server) or Repeatable Read (e.g. MySQL InnoDB).

SQL vs NoSQL — First Mental Model

Dimension	SQL	NoSQL
ACID	✓ Full	⚠ Varies
Joins	✓ Native	✕ Limited
Horizontal scale	✕ Hard	✓ Native
Schema	Rigid	Flexible
Best for	Finance, accounts, inventory	Feeds, sessions, IoT, search

→ Deep dive in Module B6: sharding, replication, leader election. This is just the mental model.

The SD Interview Framework

Topic 0.4 · 5 steps · 45 minutes · Applied to every system in this course

5 min REQUIREMENTS + 5 min ESTIMATION + 15 min HLD + 15 min DEEP DIVE + 5 min TRADE-OFFS = 45 min TOTAL

Step 1 · Clarify Requirements

~5 min — Never skip this step

Functional: Top 3 features? Who are users? Read/write ratio? Priority flows?

Non-functional: Scale (DAU, QPS, data volume)? Availability SLA (99.9% ≈ 8.76 hr downtime/yr)? Latency target? Consistency model? Global or single-region?

Tip: Interviewers leave requirements vague intentionally. Asking these IS the evaluation.
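
The availability figure is simple arithmetic: the allowed downtime is the unavailable fraction times the hours in a year. A short sketch (the function name is made up, and it assumes a 365-day year):

```python
HOURS_PER_YEAR = 365 * 24  # 8,760


def downtime_hours_per_year(availability_percent: float) -> float:
    """Allowed downtime per year for a given availability SLA, in hours."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * HOURS_PER_YEAR
```

At 99.9% this comes to about 8.76 hours per year; each extra nine divides the budget by ten, so 99.99% leaves roughly 53 minutes.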

Step 2 · Back-of-Envelope Estimation

~5 min — Numbers drive architecture

Write QPS = DAU × writes per user/day / 86,400 | Read QPS = DAU × reads per user/day / 86,400
Peak QPS = Avg QPS × 3 | Storage/day = DAU × writes per user/day × avg write size | Storage/5yr = Storage/day × 1,825

Rule: Every number must drive a design decision. If it doesn't — skip the calculation.
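
The formulas above translate into a few lines of code. This is a sketch, not the course's calculator: `Estimate`, `estimate`, and the parameter names are invented here, and the peak factor of 3 is the rule of thumb quoted above.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400
DAYS_IN_5_YEARS = 5 * 365  # 1,825


@dataclass(frozen=True)
class Estimate:
    write_qps: float
    read_qps: float
    peak_write_qps: float
    peak_read_qps: float
    storage_per_day_bytes: float
    storage_5yr_bytes: float


def estimate(dau: int, writes_per_user_day: float, reads_per_user_day: float,
             avg_write_bytes: float, peak_factor: float = 3.0) -> Estimate:
    """Apply the back-of-envelope formulas to one set of system parameters."""
    writes_per_day = dau * writes_per_user_day
    reads_per_day = dau * reads_per_user_day
    write_qps = writes_per_day / SECONDS_PER_DAY
    read_qps = reads_per_day / SECONDS_PER_DAY
    storage_per_day = writes_per_day * avg_write_bytes
    return Estimate(
        write_qps=write_qps,
        read_qps=read_qps,
        peak_write_qps=write_qps * peak_factor,
        peak_read_qps=read_qps * peak_factor,
        storage_per_day_bytes=storage_per_day,
        storage_5yr_bytes=storage_per_day * DAYS_IN_5_YEARS,
    )
```

For 1M DAU, 1 write and 10 reads per user per day, and 1 KB per write, this gives about 12 write QPS, about 116 read QPS, 1 GB of new data per day, and roughly 1.8 TB over five years.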

Step 3 · High-Level Design

~15 min — System at 30,000 feet

Standard topology: Client → CDN → Load Balancer → App Servers → Cache → Database → Queue → Workers

Rule: Every box you draw must be justifiable. Don't add components for decoration.

Step 4 · Deep Dive (2–3 components)

~15 min — Go deep where it matters

Common targets: DB schema + indexing · Cache eviction/invalidation · Queue delivery guarantees · API + rate limiting · Sharding strategy

Tip: Lead toward your strongest area. "I'd like to deep dive sharding — that's where the interesting trade-offs are."

Step 5 · Trade-offs & Bottlenecks

~5 min — Close every design with this

1. Bottlenecks: SPOFs, hot partitions, slow queries
2. Trade-offs: What you sacrificed (consistency vs availability, cost vs performance)
3. Next steps: Monitoring, alerting, gradual rollout

Tip: Proactively identifying weaknesses signals maturity. Never claim your design is perfect.

Estimation Calculator

Topic 0.5 · Interactive 7-metric tool · Use for every system you design

System Parameters

Daily Active Users

Writes per user/day

Reads per user/day

Avg write size (bytes)

Avg read response (bytes)

Retention years

Mental Math Shortcuts

86,400 sec/day ≈ 100,000   (always round up)
1M DAU × 1 req/day  ≈ 12 QPS
1B DAU × 1 req/day  ≈ 12,000 QPS
1 KB × 1M writes/day = 1 GB/day
1 MB × 1M writes/day = 1 TB/day
Peak ≈ Average × 3
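
It helps to know how far the shortcuts drift from the exact values. In the sketch below (helper names are made up), dividing by 100,000 instead of 86,400 understates QPS by about 14%, which is why the "≈ 12 QPS" line uses the exact divisor while the rounded one is for quick mental math.

```python
SECONDS_PER_DAY = 86_400
ROUNDED_SECONDS_PER_DAY = 100_000


def exact_qps(requests_per_day: float) -> float:
    return requests_per_day / SECONDS_PER_DAY


def shortcut_qps(requests_per_day: float) -> float:
    """Mental-math version: divide by 100,000 (drop five zeros)."""
    return requests_per_day / ROUNDED_SECONDS_PER_DAY


def shortcut_underestimate(requests_per_day: float) -> float:
    """Fraction by which the shortcut falls below the exact QPS."""
    return 1 - shortcut_qps(requests_per_day) / exact_qps(requests_per_day)


def bytes_per_day(write_size_bytes: int, writes_per_day: int) -> int:
    """Daily storage growth, using decimal units (1 KB = 1,000 bytes)."""
    return write_size_bytes * writes_per_day
```

For 1M requests per day the exact figure is about 11.6 QPS and the shortcut gives 10. That gap is small next to the ×3 peak factor, so the shortcut is fine for sizing as long as the rounding direction is kept in mind.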

Trade-off Summary

Core decision frameworks — referenced throughout the entire course

Concept	Option A	Option B	Decision Rule
TCP vs UDP	TCP: reliable, ordered	UDP: fast, unreliable	TCP when correctness > speed. UDP when latency > reliability.
Blocking vs Async I/O	Blocking: simple code	Non-blocking: high throughput	Async when handling 100s+ concurrent connections.
Process vs Thread	Process: crash isolation	Thread: shared memory	Process when stability matters. Thread when performance is critical.
Index vs No Index	Index: fast reads O(log n)	No Index: fast writes	Index WHERE, JOIN, ORDER BY columns. Don't over-index.
SQL vs NoSQL	SQL: ACID, joins	NoSQL: scale, flex	SQL for finance/accounts. NoSQL for feeds/sessions/IoT.
Consistency	Strong: always correct	Eventual: always available	Strong for banking/inventory. Eventual for likes/feeds/counters.

★ The master rule: "It depends" is always the right start. Follow with "on X and Y. Let me clarify the requirements." This single habit separates strong SD candidates from weak ones.

Phase 0 Completion Checklist

Trace URL end-to-end: DNS → TCP handshake → HTTP → Response
TCP vs UDP tradeoffs cold — when to use each without hesitation
Explain blocking vs non-blocking I/O with a concurrency comparison + math
Memorized latency table: RAM (~100ns) → SSD (~100μs) → HDD (~10ms)
Explain all 4 ACID properties with a bank transfer example
Design a DB schema with justified indexes for a simple system
✏ Task 0.4: Applied the 5-step SD Framework to Pastebin
✏ Task 0.5: Estimation sprint — Instagram, WhatsApp, YouTube tables complete
Can fill the 7-metric estimation table for any system in under 5 minutes