Track B · HLD Case Study · Module B5 · Week 15

URL
Shortener

END-TO-END HLD CASE STUDY
300M URLS · 100:1 READ:WRITE · <10ms REDIRECT
3.5T
UNIQUE CODES
150GB
STORAGE
99%
CACHE HIT
B5
MODULE
POST /shorten
GET /{code} → 302
base62 encoding
Redis cache
Kafka analytics
Rate limiting
Expiry / TTL
Requirements
Functional + Non-Functional — establish scope before drawing any boxes
Functional Requirements
POST /shorten → given long URL, return short code
GET /{code} → redirect to original URL (302)
Custom alias → user specifies their own code
TTL / expiry → optional expiration date per URL
Analytics → click count, referrer, geo (optional)

OUT OF SCOPE: user accounts, dashboard, link editing
Non-Functional Requirements
300M URLs stored total
Write: 100 new URLs/sec (peak: 500/sec)
Read: 10,000 redirects/sec (peak: 50,000/sec)
Read:write ratio = 100:1
Redirect p99 latency < 10ms
Shorten p99 latency < 100ms
Availability: 99.99% (52 min downtime/year)
Durability: URLs must never be lost
Key insight to say aloud: "The 100:1 read:write ratio tells me this is a read-heavy system. My entire architecture will be optimised for fast redirects — nearly every redirect should be served from cache without touching the database."
Capacity Estimation
Numbers first — they drive every architecture decision
METRIC              | VALUE       | CALCULATION
Write QPS (avg)     | 100/sec     | given requirement
Write QPS (peak)    | 500/sec     | 5× average peak factor
Read QPS (avg)      | 10,000/sec  | 100:1 ratio × 100 writes/sec
Read QPS (peak)     | 50,000/sec  | 5× average peak factor
Storage per URL     | ~500 bytes  | 7-char code + avg long URL + metadata (URLs capped at 2048 chars)
Total storage       | 150 GB      | 300M × 500 bytes
Storage/year        | ~1.5 TB     | 100 URLs/sec × 86,400 × 365 × 500 B
Short code length   | 7 chars     | 62^7 ≈ 3.5 trillion unique codes
Hot cache size      | ~30 GB      | 20% of 300M URLs × 500 B (80/20 rule)
App servers needed  | 5–10        | 50K peak QPS ÷ 10K QPS per server
Key insight to say aloud: "150 GB fits comfortably on a single DB node — no sharding needed. The hot set (30 GB) fits in a Redis cluster. At this scale, the bottleneck is read latency, not storage — hence the 2-layer cache strategy."
High-Level Architecture
Two distinct paths — read (cache-first) and write (ID generation)
SYSTEM ARCHITECTURE — URL SHORTENER
Client
──→
Load Balancer
L7 / HTTPS / SSL termination
──→
API Servers
stateless × 5–10
                                                         ↙ READ PATH       ↘ WRITE PATH
In-Process Cache
Caffeine · 10K items · 5min TTL
miss↓
ID Generator
auto-increment + base62
Redis Cluster
url:{code} → long_url · LRU · 24h TTL
miss↓
MySQL Primary
durable write · ACID
MySQL Replicas ×3
read traffic · async replication
async↓
Kafka: click-events
fire-and-forget analytics
Read Path (redirect — must be <10ms)
Step-by-step redirect flow — FLOW
GET /abc1234

1. Check in-process cache (Caffeine): ~100ns
   HIT (60%): return HTTP 302 immediately ✅

2. Check Redis cache: ~0.5ms
   HIT (39%): populate L1 cache, return HTTP 302 ✅

3. Cache MISS (1%): query MySQL read replica: ~5ms
   Populate Redis (TTL 24h), populate L1 (TTL 5min), return HTTP 302 ✅

4. Async (does NOT block redirect):
   Publish click event to Kafka → Analytics consumer updates click_count
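
A minimal sketch of this flow as an HTTP handler, assuming a Spring-style controller: the lookup method is the 2-layer cache read shown in the Caching Strategy section, and the Kafka publish is fire-and-forget so it never delays the 302. Class and bean names here are illustrative assumptions, not part of the design above.

Redirect handler sketch (Spring-style, assumed) — JAVA
import java.net.URI;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.web.bind.annotation.*;

@RestController
public class RedirectController {

    /** Assumed lookup service wrapping the L1 → Redis → MySQL-replica read path. */
    interface UrlService { String getLongUrl(String code); }

    private final UrlService urlService;
    private final KafkaTemplate<String, String> kafka;   // assumed async analytics producer

    public RedirectController(UrlService urlService, KafkaTemplate<String, String> kafka) {
        this.urlService = urlService;
        this.kafka = kafka;
    }

    @GetMapping("/{code}")
    public ResponseEntity<Void> redirect(@PathVariable String code) {
        String longUrl = urlService.getLongUrl(code);     // steps 1–3 above
        if (longUrl == null) {
            return ResponseEntity.notFound().build();     // unknown or expired code
        }
        kafka.send("click-events", code, code);            // step 4: fire-and-forget (payload simplified)
        return ResponseEntity.status(HttpStatus.FOUND)     // 302 — see the 301 vs 302 trade-off below
                             .location(URI.create(longUrl))
                             .build();
    }
}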
Short Code Generation
Four strategies — trade-offs in uniqueness, predictability, and complexity
A: MD5 + Truncation
Hash the long URL → take first 7 chars of base62-encoded hash. Same URL always produces same code (natural dedup).
✓ Deterministic (dedup built-in)
✓ Stateless, no DB for ID
✗ Collision possible at 300M URLs
✗ Must detect + retry on collision
✗ Custom alias not supported
B: Auto-Increment + Base62 ★
DB auto-increment ID → encode to base62. ID 123456789 → "8m0Kx". Monotonic → guaranteed unique. Simple.
✓ Guaranteed unique
✓ Simple implementation
✓ Short codes for small IDs
✗ Predictable / enumerable
✗ DB is ID single point of failure
✗ Reveals URL count to scrapers
C: Snowflake-style Distributed ID
64-bit: [41-bit timestamp][10-bit machine][12-bit seq] → base62 encode. ~4096 IDs/ms per machine. No DB dependency (see the sketch after strategy D).
✓ Globally unique, no DB
✓ Scales to any throughput
✓ Sortable by creation time
✗ Requires machine ID management
✗ Clock skew issues
✗ More complex
D: Pre-generated Pool
Background job pre-generates random 7-char codes, stores as "available". API fetches and marks "used" atomically.
✓ Random → not enumerable
✓ No collision risk
✓ Fast fetch (no computation)
✗ Complex pre-generation logic
✗ Pool management overhead
✗ Wasted codes if URLs deleted
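
Strategy C's bit layout is easy to get wrong under pressure, so here is a hedged sketch assuming a fixed custom epoch and an externally assigned machine ID; clock-skew handling and ZooKeeper-style ID assignment are omitted. The resulting long is then base62-encoded exactly as in the snippet below.

Snowflake-style ID generator sketch — JAVA
// 64-bit layout: [41-bit timestamp since custom epoch][10-bit machine ID][12-bit sequence]
public class SnowflakeIdGenerator {
    private static final long CUSTOM_EPOCH = 1704067200000L; // assumed epoch: 2024-01-01 UTC
    private final long machineId;                             // 0..1023 (10 bits), assigned externally
    private long lastTimestampMs = -1L;
    private long sequence = 0L;                                // 0..4095 (12 bits)

    public SnowflakeIdGenerator(long machineId) {
        this.machineId = machineId & 0x3FF;
    }

    public synchronized long nextId() {
        long nowMs = System.currentTimeMillis();
        if (nowMs == lastTimestampMs) {
            sequence = (sequence + 1) & 0xFFF;                 // up to 4096 IDs per ms per machine
            if (sequence == 0) {                               // window exhausted: wait for next ms
                while (nowMs <= lastTimestampMs) nowMs = System.currentTimeMillis();
            }
        } else {
            sequence = 0;
        }
        lastTimestampMs = nowMs;
        return ((nowMs - CUSTOM_EPOCH) << 22) | (machineId << 12) | sequence;
    }
}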
Base62 Implementation
Encode / decode — implement from scratch in the interview — JAVA
private static final String ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
private static final int BASE = 62;

public static String encode(long id) {
    if (id == 0) return "0";
    StringBuilder sb = new StringBuilder();
    while (id > 0) {
        sb.append(ALPHABET.charAt((int)(id % BASE)));
        id /= BASE;
    }
    return sb.reverse().toString();
}

public static long decode(String code) {
    long result = 0;
    for (char c : code.toCharArray()) {
        result = result * BASE + ALPHABET.indexOf(c);
    }
    return result;
}

// encode(0)         → "0"
// encode(62)        → "10"  (1×62 + 0 — carries like "10" in any positional base)
// encode(123456789) → "8m0Kx"
// decode(encode(n)) == n  ✅
// 62^7 = 3,521,614,606,208 ≈ 3.5 trillion unique 7-char codes
Database Schema
MySQL — ACID, B-tree indexes, read replicas for scale
TABLE: urls — MySQL / PostgreSQL

COLUMN       | TYPE         | NOTES
id           | BIGINT       | PRIMARY KEY AUTO_INCREMENT — source of short code
short_code   | VARCHAR(10)  | NOT NULL UNIQUE — the 7-char base62 string
long_url     | TEXT         | NOT NULL — original URL up to 2048 chars
user_id      | BIGINT       | NULL = anonymous; FK to users table if auth added
created_at   | TIMESTAMP    | DEFAULT CURRENT_TIMESTAMP
expires_at   | TIMESTAMP    | NULL = never expires; checked on every redirect
click_count  | BIGINT       | DEFAULT 0 — updated async via Kafka consumer
is_active    | BOOLEAN      | DEFAULT TRUE — soft delete flag
INDEXES
UNIQUE INDEX idx_short_code ON urls(short_code) ← primary lookup, O(log N)
INDEX idx_user ON urls(user_id) ← "my URLs" page
PARTIAL INDEX idx_expires ON urls(expires_at) WHERE expires_at IS NOT NULL ← cleanup job (PostgreSQL syntax; MySQL has no partial indexes — use a plain index there)
Why MySQL over NoSQL? 150GB fits on one node. Access pattern is pure primary-key lookup — no scatter-gather needed. ACID prevents duplicate short_code under concurrent inserts. Read replicas handle the 100:1 read:write ratio. NoSQL would work too (DynamoDB keyed on short_code) but adds operational overhead without benefit at this scale.
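
A hedged sketch of the write path this schema implies: insert the row so MySQL assigns the auto-increment id, base62-encode that id, then store the resulting code (strategy B). Repository method names are assumptions for illustration, and Base62.encode is the encode method from the Base62 Implementation section above, assumed here to live in a Base62 class.

Shorten (write path) sketch — JAVA
// POST /shorten: auto-increment id → base62 → short_code.
public class ShortenService {

    /** Assumed DAO over the `urls` table. */
    interface UrlRepository {
        long insertLongUrl(String longUrl);              // returns the generated id
        void updateShortCode(long id, String shortCode);
    }

    private final UrlRepository urls;

    public ShortenService(UrlRepository urls) {
        this.urls = urls;
    }

    public String shorten(String longUrl) {
        long id = urls.insertLongUrl(longUrl);           // 1. durable ACID insert; id assigned by MySQL
        String code = Base62.encode(id);                 // 2. unique because the id is unique
        urls.updateShortCode(id, code);                  // 3. UNIQUE(short_code) is the safety net
        return code;
    }
}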
Caching Strategy
Two layers — achieving 99%+ cache hit rate for redirects
In-Process (L1)
Caffeine cache in JVM heap. Top 10K URLs per server. Zero network. Sub-microsecond lookup.
~100 ns
~60% hit rate
Redis (L2)
Shared across all app servers. Key: "url:{code}" → long_url. 30 GB hot set. allkeys-lru eviction.
~0.5 ms
~39% hit rate
MySQL Replica
Only 1% of traffic reaches DB. SELECT long_url FROM urls WHERE short_code = ? Populate both cache layers on miss.
~5 ms
~1% (cache miss)
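
The lookup code below assumes an l1Cache handle already exists. A minimal Caffeine construction matching the limits quoted above (10K entries, 5-minute TTL) could look like this; maximumSize and expireAfterWrite are standard Caffeine builder methods.

L1 cache construction sketch — JAVA
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import java.time.Duration;

Cache<String, String> l1Cache = Caffeine.newBuilder()
        .maximumSize(10_000)                         // roughly the top 10K hot codes per server
        .expireAfterWrite(Duration.ofMinutes(5))     // bounds staleness after deletes/updates
        .build();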
Cache read + invalidation pattern — JAVA
public String getLongUrl(String shortCode) {
    // L1: in-process
    String url = l1Cache.getIfPresent(shortCode);
    if (url != null) return url;

    // L2: Redis
    url = redis.get("url:" + shortCode);
    if (url != null) {
        l1Cache.put(shortCode, url);    // populate L1
        return url;
    }

    // DB: only 1% of traffic
    UrlRecord rec = db.findByShortCode(shortCode);
    if (rec == null || rec.isExpired()) return null;
    redis.setex("url:" + shortCode, 86400, rec.longUrl); // 24h TTL
    l1Cache.put(shortCode, rec.longUrl);
    return rec.longUrl;
}

// Invalidation: on URL delete
public void deleteUrl(String shortCode) {
    db.softDelete(shortCode);
    redis.del("url:" + shortCode);
    l1Cache.invalidate(shortCode);  // local eviction only — other instances need a broadcast (see sketch below)
}
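
The final comment above only evicts the local L1; other app servers keep serving the stale entry until their 5-minute TTL expires. One hedged way to close that gap is a Redis pub/sub broadcast — the channel name and the Jedis-style client wiring here are assumptions, not part of the design above.

Cross-instance L1 invalidation via pub/sub sketch — JAVA
// Publisher side — call after the local invalidate in deleteUrl():
public void broadcastInvalidate(String shortCode) {
    redis.publish("l1-invalidate", shortCode);       // assumed channel name
}

// Subscriber side — needs a dedicated connection, because subscribe() blocks it:
public void startInvalidationListener(redis.clients.jedis.Jedis subscriberClient) {
    new Thread(() ->
        subscriberClient.subscribe(new redis.clients.jedis.JedisPubSub() {
            @Override
            public void onMessage(String channel, String shortCode) {
                l1Cache.invalidate(shortCode);       // evict the stale entry locally
            }
        }, "l1-invalidate")
    ).start();
}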
301 vs 302 Redirect
An explicit trade-off — analytics vs server load
301
MOVED PERMANENTLY
Browser caches the redirect. Subsequent visits from same browser skip your server entirely.
✓ Browser caches → lower server load
✓ Reduced latency for repeat visits
✓ CDN can cache the response
✗ Can't track repeat clicks
✗ Can't update long URL after caching
✗ Analytics are incomplete
302
FOUND (TEMPORARY) — RECOMMENDED ★
Browser does NOT cache. Every click hits your server. Full analytics visibility.
✓ Full click analytics (every visit)
✓ Can update long URL at any time
✓ A/B testing possible
✗ Higher server load (every click)
✗ Slightly higher latency
✗ No browser-level caching
Interview answer: "I'll use 302 by default because analytics are a core requirement. If a link creator wants maximum performance and doesn't need analytics, we can expose a '301 mode' as an opt-in. The trade-off is explicit and user-controlled."
Edge Cases & Deep Dives
What separates a good answer from a great one
URL Expiry
URLs with expires_at set should return 410 Gone after expiry.
Lazy check: on redirect, if expires_at < NOW() → 410.
Background job: daily DELETE WHERE expires_at < NOW().
Redis TTL: set same TTL as URL expiry on cache key.
Custom Aliases
User requests /my-campaign. Must be unique, safe, and not clash with system routes.
UNIQUE constraint handles conflict → 409 response.
Blocklist: reserve "api", "health", "admin".
Regex validate: alphanumeric + hyphens only.
Rate Limiting
Prevent URL bombing and redirect abuse from single IPs.
Redis counter: INCR rate:shorten:{ip}:{hour}
EXPIRE 3600 → if >100 → 429 Too Many Requests.
Separate limits for /shorten vs GET /{code} — see the sketch at the end of this section.
Viral URL (Hotkey)
Taylor Swift tweets /abc1234 → 1M req/sec hits one URL.
L1 in-process cache absorbs burst (no Redis roundtrip).
If insufficient: replicate key to N Redis shards.
CDN 301 caching as last resort (lose analytics).
URL Validation
Malicious URLs, redirect chains, oversized inputs.
Parse URL (valid http/https scheme).
Check against phishing/malware blocklist.
Detect redirect loops (sho.rt → sho.rt).
Max long URL: 2048 chars.
Analytics Lag
Kafka consumer falls behind → click_count is stale.
click_count is "approximate" by design — set user expectations accordingly.
Monitor consumer lag: alert if >100K backlog.
Scale consumer group (add instances up to numPartitions).
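
A minimal sketch of the fixed-window limiter from the Rate Limiting card above, using Redis INCR + EXPIRE. The key format and the 100-per-hour limit follow the card; the Jedis-style client is an assumption.

Rate limiter sketch (fixed window) — JAVA
// At most 100 POST /shorten requests per IP per hour; key: rate:shorten:{ip}:{hour}
public boolean allowShorten(String ip) {
    long hourBucket = System.currentTimeMillis() / 3_600_000L;   // current hour window
    String key = "rate:shorten:" + ip + ":" + hourBucket;
    long count = redis.incr(key);                 // atomic; first call creates the key at 1
    if (count == 1) {
        redis.expire(key, 3600);                  // first hit in the window sets the TTL
    }
    return count <= 100;                          // false → respond 429 Too Many Requests
}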
7-Step Framework Applied
How to present URL Shortener in 45 minutes in a FAANG interview
01
REQUIREMENTS (5m)
Shorten + redirect. 300M URLs. 100:1 read:write. <10ms redirect. 99.99% uptime. Analytics optional.
02
ESTIMATION (5m)
150GB storage. 50K peak QPS. 30GB hot set. 3.5T unique codes. No sharding needed at this scale.
03
HLD (10m)
Client→LB→API servers→[L1 cache→Redis→MySQL replica]. Write: API→MySQL primary (auto-ID→base62). Analytics: Kafka async.
04
DEEP DIVE (15m)
Code generation (base62 + auto-increment). 2-layer cache (L1+Redis). Schema with partial index. 302 vs 301 trade-off.
05
BOTTLENECKS (5m)
Cache hit rate (99%+ target). Hotkeys (viral URLs → L1 absorbs). ID generation (MySQL SPOF → use Snowflake at 10× scale).
06
FAILURES (3m)
Redis down: fall through to DB (~5ms, acceptable). DB primary down: serve from cache + replicas. Analytics lag: non-blocking.
07
SCALE UP (2m)
10× URLs → shard by hash(short_code). Multi-region → geo DNS + regional MySQL + cross-region replication. CDN for 301 redirects.
Numbers to say aloud: "150 GB fits on a single node — no sharding." · "62^7 = 3.5 trillion codes — won't exhaust for centuries." · "99% cache hit rate means 1% of redirects touch MySQL." · "302 because analytics is a requirement, with a 301 opt-in for CDN offload."
01
Implement Base62 Encode / Decode
~1 hr

Write base62 encode and decode in Java (or your language of choice):

  • Handle edge cases: encode(0), encode(1), encode(62), encode(62^7 - 1)
  • Verify roundtrip: assert decode(encode(n)) == n for 1000 random values
  • How many chars does encode(300_000_000) produce?
  • Extend: implement MD5 + truncation approach. Write the collision detection retry loop.
02
Extended Schema Design
~1.5 hrs

Extend the base schema to support:

  1. User accounts owning URLs — with "my URLs" listing page (paginated, sorted by created_at)
  2. URL groups / campaigns — tag multiple URLs together (e.g., "summer-sale")
  3. Per-URL hourly analytics — click count broken down by hour-of-day
  4. A/B testing — one short code routes 50% to URL-A, 50% to URL-B

For each: write the CREATE TABLE SQL, necessary indexes, and explain the query pattern.

03
Failure Scenario Design
~1.5 hrs

Design the failure handling for each scenario. What does the system do? What does the user experience?

  1. Redis cluster completely unavailable
  2. MySQL primary goes down mid-write (during shorten request)
  3. ID generator service (if using external Snowflake service) is unreachable
  4. Analytics Kafka topic has 5-minute lag — consumer backlog of 3M events
  5. A single URL (/abc1234) receives 1M requests/sec (DDoS via viral celebrity tweet)
04
Multi-Region Redesign (US + EU + APAC)
~3 hrs · full redesign

Redesign for global deployment with these constraints:

  • Users in each region get <5ms redirect latency
  • URLs created in US accessible from EU within 1 second
  • Unified analytics across all regions
  • No EU user data may leave EU (GDPR compliance)
  • System remains available if one entire region goes down

Design: DNS routing strategy, data replication model, consistency choice for URL creation, analytics aggregation, GDPR compliance approach.

Self-Check — MODULE B5 · URL SHORTENER
Can state all functional + non-functional requirements from memory
Capacity estimation in <5 min: 150GB, 100:1, 30GB hot set, 3.5T codes
Can draw full architecture diagram: LB → API → L1/L2 cache → DB → Kafka
All 4 short code generation strategies and their trade-offs
Can implement base62 encode/decode from scratch in an interview
DB schema: correct columns, UNIQUE index on short_code, partial index on expires_at
2-layer cache: in-process (L1) + Redis (L2), TTLs, hit rates, invalidation
301 vs 302 trade-off: analytics vs CDN offload — say the trade-off explicitly
Async analytics: Kafka click events, fire-and-forget, doesn't block redirect
Edge cases: expiry, custom aliases, rate limiting, URL validation, hotkeys
Failure handling: Redis down → DB fallback; DB primary down → replicas
Can apply full 7-step framework to URL shortener in 45 minutes
✏️ Task 1: base62 implementation with roundtrip test
✏️ Task 2: extended schema (users, campaigns, analytics, A/B)
✏️ Task 4 (capstone): multi-region redesign with GDPR compliance
NEXT MODULE
B6 — Design Twitter/X Feed
Social graph · Fan-out on write vs fan-out on read
Home timeline generation · Celebrity problem · News feed ranking
Push vs pull model · Redis timeline cache · Hybrid approach