M07 — NoSQL: Redis, MongoDB & Cassandra
NoSQL taxonomy, Redis data structures with complexity guarantees, caching patterns, persistence & pub/sub, Lua scripting, rate limiting, MongoDB aggregation pipeline, Cassandra data modeling by access pattern, and hiredis in C.
Phase 2 — Databases & Storage · ~5 hrs

| Category | Model | Flagship | Strength | Weakness | Sweet Spot |
|---|---|---|---|---|---|
| Key-Value | Hash map | Redis, DynamoDB | Sub-ms reads; simple | No relational query | Sessions, caches, counters |
| Document | JSON/BSON tree | MongoDB, Couchbase | Flexible schema; rich queries | Multi-doc transactions costly | Catalogs, user profiles, CMS |
| Wide-Column | Partition → rows | Cassandra, HBase | Write-optimized; linear scale | Query-driven design required | Time-series, IoT, activity feeds |
| Graph | Vertices + edges | Neo4j, Amazon Neptune | Relationship traversal | Doesn't scale as wide | Social graphs, recommendations |
| Time-Series | Timestamped metrics | InfluxDB, TimescaleDB | Compression, retention policies | Poor ad-hoc relational queries | Monitoring, telemetry |
CAP theorem: in a network partition you must choose between Consistency and Availability. You can never have all three (Consistency, Availability, Partition tolerance) simultaneously.
              Consistency
           (every read sees
             latest write)
                  /\
                 /  \
                /    \
               /  CA  \   ← no real-world distributed system
              /--------\
             /  CP | AP \
            /      |     \
Partition ─────────────────── Availability
Tolerance                     (always responds,
(nodes can                     may be stale)
 fail/split)
CP examples: HBase, Zookeeper, Redis Cluster (default)
AP examples: Cassandra (tunable), DynamoDB, CouchDB
CA example: Single-node PostgreSQL (no partition tolerance)
Choose SQL (relational) when:
- Data is naturally relational with many joins
- ACID transactions across multiple entities are required
- Schema is stable and well-understood
- Complex ad-hoc reporting or analytics
- Team is familiar with SQL tooling
Choose NoSQL when:
- Access pattern is known and narrow (read by key)
- Horizontal scale > vertical scale (write throughput)
- Schema evolves rapidly (document stores)
- Geo-distributed with tunable consistency
- Specific data model fits: graph, time-series, cache
Client 1 ──┐
Client 2 ──┤ TCP socket ┌─────────────────────────────────┐
Client 3 ──┼───────────────►│ I/O Multiplexer (epoll/kqueue) │
... │ │ ───────────────────────────── │
└───────────────►│ Command Queue (FIFO) │
│ ───────────────────────────── │
│ Single Worker Thread │
│ executes commands serially │
│ → no locking needed │
│ ───────────────────────────── │
│ In-Memory Data Structures │
│ (dict, quicklist, listpack, │
│ skiplist, rax, stream) │
└─────────────────────────────────┘
│
Background threads:
• AOF fsync
• RDB fork + write
• Lazy free (UNLINK)
Because a single thread executes every command, one slow command (e.g. KEYS * on a large dataset) blocks all other clients. Never run KEYS in production — use SCAN with a cursor instead.

| Type | Key Commands | Complexity | Internal Encoding | Use Case |
|---|---|---|---|---|
| String | SET/GET, INCR, MSET/MGET, SETNX, SETEX | O(1) | SDS (simple dynamic string) | Counters, cache values, distributed lock tokens |
| Hash | HSET/HGET, HMSET, HGETALL, HINCRBY, HDEL | O(1) field ops; O(n) HGETALL | listpack (small) → dict (large) | User profile fields, config objects |
| List | LPUSH/RPUSH, LPOP/RPOP, LRANGE, BLPOP | O(1) push/pop; O(n) LRANGE | listpack (small) → quicklist | Task queues, recent activity feeds |
| Set | SADD/SREM, SISMEMBER, SUNION/SINTER/SDIFF | O(1) SADD/SREM; O(n) SUNION | listpack (small) → intset → dict | Unique visitors, tags, friend lists |
| Sorted Set | ZADD, ZRANGE, ZRANGEBYSCORE, ZRANK, ZINCRBY | O(log n) ZADD/ZRANK; O(log n + m) ZRANGE | listpack (small) → skiplist + dict | Leaderboards, delayed job queues, rate limiting windows |
| Stream | XADD, XREAD, XRANGE, XGROUP CREATE, XACK | O(1) XADD; O(n) XRANGE | listpack + rax (radix tree) | Event log, message bus, audit trail |
| Bitmap | SETBIT/GETBIT, BITCOUNT, BITOP | O(1) SETBIT; O(n) BITCOUNT | String (bit-addressed) | Feature flags, daily active user tracking |
| HyperLogLog | PFADD, PFCOUNT, PFMERGE | O(1), uses ~12KB max | Probabilistic sketch | Unique count estimates (± 0.81% error) |
If you store a profile in a user:42:profile hash, setting a TTL on the hash expires ALL fields at once. There is no per-field TTL.

| Dangerous Command | Why Dangerous | Safe Alternative |
|---|---|---|
| KEYS pattern | O(n) — scans all keys; blocks event loop | SCAN 0 MATCH pattern COUNT 100 |
| FLUSHALL | Deletes every key in all databases | SCAN + targeted DEL; or FLUSHDB ASYNC |
| DEBUG SLEEP | Explicitly blocks the server | Never in production |
| DEL big-key | O(n) — synchronous deletion blocks loop | UNLINK big-key (async lazy-free) |
| LRANGE 0 -1 on huge list | Transfers entire list over network | Paginate with LRANGE 0 99, then next page |
| SMEMBERS big-set | O(n) — returns all members | SSCAN with cursor |
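As a concrete illustration of the SCAN alternative, here is a minimal cursor-iteration sketch, assuming the Node ioredis client (any client that exposes SCAN works the same way):

```ts
import Redis from "ioredis";

const redis = new Redis();

// Iterate the keyspace in batches; each SCAN call is short, so the event loop
// never blocks the way a single KEYS * call would.
async function scanKeys(pattern: string): Promise<string[]> {
  const found: string[] = [];
  let cursor = "0";
  do {
    const [next, keys] = await redis.scan(cursor, "MATCH", pattern, "COUNT", 100);
    found.push(...keys);
    cursor = next;
  } while (cursor !== "0");
  return found;
}
```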
Application Cache (Redis) Database
│ │ │
│──── GET user:42 ────────────►│ │
│ │ MISS │
│◄─── nil ─────────────────────│ │
│ │ │
│──── SELECT * FROM users ───────────────────────────►│
│◄─── row {id:42, name:...} ──────────────────────────│
│ │ │
│──── SET user:42 ... EX 300 ─►│ │
│ │ stored │
│ (later) │ │
│──── GET user:42 ────────────►│ │
│◄─── {id:42, name:...} ───────│ HIT — no DB call │
| Pattern | Who Manages Cache | On Read Miss | On Write | Consistency | Best For |
|---|---|---|---|---|---|
| Cache-Aside | Application | App reads DB, populates cache | App updates DB, deletes/updates cache | Eventual (TTL-bounded) | General-purpose, read-heavy |
| Read-Through | Cache library/proxy | Cache fetches from DB automatically | App writes to cache; cache syncs DB | Eventual | When you can plug in a cache provider |
| Write-Through | Cache library/proxy | Cache fetches from DB | Write to cache AND DB synchronously | Strong (but higher write latency) | Read-heavy, strong consistency needed |
| Write-Behind | Cache library/proxy | Cache fetches from DB | Write to cache only; async flush to DB | Weak (data loss risk on crash) | Write-heavy, can tolerate brief loss |
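A minimal cache-aside sketch matching the sequence diagram above, assuming an ioredis client; queryUserFromDb and updateUserInDb are hypothetical data-access helpers:

```ts
import Redis from "ioredis";

const redis = new Redis();

interface User { id: number; name: string }
declare function queryUserFromDb(id: number): Promise<User>;                      // hypothetical helper
declare function updateUserInDb(id: number, patch: Partial<User>): Promise<void>; // hypothetical helper

async function getUser(id: number): Promise<User> {
  const key = `user:${id}`;
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);                  // HIT: no DB call

  const user = await queryUserFromDb(id);                 // MISS: read the source of truth
  await redis.set(key, JSON.stringify(user), "EX", 300);  // TTL bounds staleness
  return user;
}

async function updateUser(id: number, patch: Partial<User>): Promise<void> {
  await updateUserInDb(id, patch);
  await redis.del(`user:${id}`);                          // invalidate; next read repopulates
}
```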
| Policy | Behavior | When to Use |
|---|---|---|
| noeviction | Return error on write when memory full | When data loss is unacceptable (primary store) |
| allkeys-lru | Evict least-recently-used key across all keys | General cache — frequently accessed items stay hot |
| volatile-lru | Evict LRU key only from keys with TTL set | Mix of persistent + cached keys in same instance |
| allkeys-lfu | Evict least-frequently-used (Redis 4+) | Better than LRU for skewed access distributions |
| volatile-ttl | Evict key with shortest remaining TTL first | When shorter-lived items are more "disposable" |
| allkeys-random | Evict random key — no intelligence | Uniform access patterns (rare) |
Scenario: popular cache key expires at T=0
T=0: 100 requests hit cache → all MISS
→ all 100 simultaneously query DB
→ DB overwhelmed, latency spikes
Fix 1: Probabilistic early re-computation
When remaining TTL < threshold: randomly re-cache
→ one request re-caches while others still get old value
Fix 2: Lock / Mutex (Redis SET NX)
First miss acquires distributed lock → fetches DB
Others wait → then all read from cache (or retry)
Fix 3: Background refresh
Scheduled job refreshes cache before TTL expires
→ cache never actually empty for popular keys
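A sketch of Fix 2, assuming ioredis; recomputeValue stands in for the expensive DB query, and the 10 s lock TTL / 100 ms retry delay are illustrative numbers:

```ts
import Redis from "ioredis";
import { randomUUID } from "node:crypto";

const redis = new Redis();

async function getWithLock(key: string, recomputeValue: () => Promise<string>): Promise<string> {
  const cached = await redis.get(key);
  if (cached !== null) return cached;

  // Only one client wins the NX lock and recomputes; the rest wait and re-read.
  const token = randomUUID();
  const gotLock = await redis.set(`lock:${key}`, token, "EX", 10, "NX");
  if (gotLock === "OK") {
    const fresh = await recomputeValue();          // single trip to the DB
    await redis.set(key, fresh, "EX", 300);
    await redis.del(`lock:${key}`);                // ideally a token-checked Lua delete
    return fresh;
  }

  await new Promise((resolve) => setTimeout(resolve, 100));
  return getWithLock(key, recomputeValue);         // retry: the cache is usually warm by now
}
```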
| | RDB (Redis Database) | AOF (Append-Only File) |
|---|---|---|
| Mechanism | Periodic fork + full memory snapshot to .rdb file | Log every write command; replay on restart |
| Trigger | save 900 1 (after 1 change in 15 min), BGSAVE | Every write; fsync configurable |
| Restart speed | Fast (load binary snapshot) | Slow if AOF is huge (replay all commands) |
| Data loss risk | Up to snapshot interval (minutes) | Up to 1 second (appendfsync everysec) |
| File size | Compact binary | Grows; periodically compacted with BGREWRITEAOF |
| Production rec. | Use for backups / fast restarts | Use for durability (near-zero data loss) |
Production recommendation: enable both. Use RDB for backups and fast restarts, and AOF with appendfsync everysec for durability. Redis docs call this "the best of both worlds."
Publisher Redis Subscribers
│ │ │
│── PUBLISH notifications │ │
│ "{"type":"like",...}" ─►│ │
│ │──► "{"type":"like",...}" ─►│ Sub A
│ │──► "{"type":"like",...}" ─►│ Sub B
│ │ │
│ (publisher doesn't know │ (no message persistence │
│ who is subscribed) │ — if sub is offline, │
│ │ message is LOST) │
If subscribers must not miss messages, use Redis Streams instead (XADD/XREAD/XGROUP).

Redis executes Lua scripts atomically — no other command runs between script operations. This is the safe way to implement read-modify-write patterns without transactions.
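A sketch of an atomic read-modify-write via EVAL, assuming ioredis; the stock:sku-123 key and the reserve-3-units argument are illustrative:

```ts
import Redis from "ioredis";

const redis = new Redis();

// GET and DECRBY run inside one script, so no other client can interleave between them.
const RESERVE_STOCK = `
  local stock = tonumber(redis.call('GET', KEYS[1]) or '0')
  if stock >= tonumber(ARGV[1]) then
    redis.call('DECRBY', KEYS[1], ARGV[1])
    return 1
  end
  return 0
`;

const reserved = await redis.eval(RESERVE_STOCK, 1, "stock:sku-123", 3); // 1 = reserved, 0 = insufficient stock
```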
Pattern 1 — Fixed Window (INCR + EXPIRE)
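A minimal fixed-window sketch, assuming ioredis; the key layout and limits are illustrative:

```ts
import Redis from "ioredis";

const redis = new Redis();

// One counter per (client, window). INCR creates the key; EXPIRE is set on the first
// hit so abandoned windows clean themselves up.
async function isAllowedFixed(ip: string, limit = 100, windowSec = 60): Promise<boolean> {
  const window = Math.floor(Date.now() / 1000 / windowSec);
  const key = `ratelimit:fixed:${ip}:${window}`;
  const count = await redis.incr(key);
  if (count === 1) await redis.expire(key, windowSec);
  return count <= limit;
}
```

Fixed windows can admit up to twice the limit across a window boundary (a burst at the end of one window plus a burst at the start of the next), which is the weakness the sliding-window variant addresses.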
Pattern 2 — Sliding Window (Sorted Set)
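A sliding-window sketch with a sorted set, assuming ioredis; it matches the isAllowedSliding(ip, limit, windowMs) signature used in the lab below:

```ts
import Redis from "ioredis";

const redis = new Redis();

// Members are request timestamps and scores are the same timestamps, so trimming by
// score drops everything that has slid out of the window.
async function isAllowedSliding(ip: string, limit = 10, windowMs = 60_000): Promise<boolean> {
  const key = `ratelimit:sliding:${ip}`;
  const now = Date.now();
  const replies = await redis
    .multi()
    .zremrangebyscore(key, 0, now - windowMs)    // evict timestamps outside the window
    .zadd(key, now, `${now}:${Math.random()}`)   // record this request (unique member)
    .zcard(key)                                  // how many requests remain in the window
    .pexpire(key, windowMs)                      // let idle keys expire
    .exec();
  const count = replies![2][1] as number;        // result of ZCARD
  return count <= limit;
}
```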
Embedding vs Referencing — the core schema decision:
| Embed when… | Reference when… |
|---|---|
| Data is always accessed together (post + author preview) | Data has its own lifecycle independent of parent |
| The embedded array has bounded size (≤ a few hundred items) | Array could grow unbounded (post comments → millions) |
| Update pattern writes the whole document | Many documents share the same sub-document |
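Two hypothetical document shapes illustrating the table above (field names are made up):

```ts
// Embed: addresses are small, bounded, and always read with the user.
const user = {
  _id: 42,
  name: "Ada",
  addresses: [{ label: "home", city: "London" }],
};

// Reference: comments grow without bound and have their own lifecycle,
// so they live in a separate collection keyed by postId.
const post = { _id: "p1", title: "Hello", authorId: 42 };
const comment = { _id: "c1", postId: "p1", body: "Nice post", createdAt: new Date() };
```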
Collection → [$match] → [$lookup] → [$unwind] → [$group] → [$sort] → [$limit] → Result
              filter     join        flatten     aggregate order     paginate
Place $match stages as early as possible in the pipeline to reduce the number of documents flowing through subsequent stages. MongoDB can use indexes for the first $match stage.
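A sketch of such a pipeline using the Node mongodb driver, assuming a posts collection with status, tags, and views fields (the names are assumptions):

```ts
import { MongoClient } from "mongodb";

const client = new MongoClient("mongodb://localhost:27017");
await client.connect();
const posts = client.db("blog").collection("posts");

// Per-tag post counts and total views for published posts.
const topTags = await posts
  .aggregate([
    { $match: { status: "published" } },  // first stage: can use an index on status
    { $unwind: "$tags" },                 // one document per (post, tag) pair
    { $group: { _id: "$tags", count: { $sum: 1 }, totalViews: { $sum: "$views" } } },
    { $sort: { count: -1 } },
    { $limit: 10 },
  ])
  .toArray();
```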
Cassandra Cluster (3 nodes, replication_factor=3)
Write path:
Client → Coordinator Node
→ hash(partition_key) → token ring → target nodes
→ Commit Log (WAL) + Memtable
→ Memtable flush → SSTable on disk
Read path:
Client → Coordinator → target nodes
→ Row Cache (if enabled)
→ Bloom Filter (fast "definitely not here" check)
→ Key Cache → SSTable index → SSTable data
Compaction:
SSTables merge periodically → remove tombstones (deletes)
→ smaller read amplification
Token Ring (consistent hashing):
Each node owns a range of tokens
Replication: each row copied to RF=3 consecutive nodes
Coordinator routes any write to correct nodes
Cassandra schema design is query-driven: design your table for one specific query. Joins do not exist; denormalization is expected.
Partition key:
- Determines which node(s) store the row
- All rows with the same partition key → same partition
- Must appear in every query (no full-table scans)
- Keep partitions balanced — hot partition = hot node
- Partition size limit: ~100 MB recommended
Clustering key:
- Defines sort order within a partition
- Enables range queries on clustering columns, e.g. WHERE created_at > X within a partition
- Cannot skip clustering keys in the WHERE clause
- Choose DESC if you mostly read recent data first
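A table-per-query sketch using the Node cassandra-driver, assuming a hypothetical events_by_device table that serves exactly one query ("latest events for a device, newest first"); the bucket column bounds partition growth:

```ts
import { Client } from "cassandra-driver";

const client = new Client({
  contactPoints: ["127.0.0.1"],
  localDataCenter: "datacenter1",
  keyspace: "iot",
});
await client.connect();

// Partition key: (device_id, bucket). Clustering key: created_at, newest first.
await client.execute(`
  CREATE TABLE IF NOT EXISTS events_by_device (
    device_id  text,
    bucket     text,       -- e.g. '2024-06', keeps any one partition bounded
    created_at timestamp,
    payload    text,
    PRIMARY KEY ((device_id, bucket), created_at)
  ) WITH CLUSTERING ORDER BY (created_at DESC)
`);

// Every query names the full partition key; the range predicate is on the clustering key.
const { rows } = await client.execute(
  "SELECT created_at, payload FROM events_by_device WHERE device_id = ? AND bucket = ? AND created_at > ? LIMIT 100",
  ["sensor-42", "2024-06", new Date(Date.now() - 3600_000)],
  { prepare: true }
);
```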
| Level | Writes to | Reads from | Tradeoff |
|---|---|---|---|
| ONE | 1 node | 1 node | Fastest; may read stale data |
| QUORUM | RF/2+1 nodes | RF/2+1 nodes | Strong consistency (write+read quorum overlap); balanced |
| LOCAL_QUORUM | Quorum in local DC | Quorum in local DC | Strong consistency within DC; avoids cross-DC latency |
| ALL | All RF nodes | All RF nodes | Strongest; unavailable if any node down |
| ANY | At least 1 (hint OK) | N/A (write only) | Highest availability; weakest durability |
Example with RF=3: QUORUM write (2) + QUORUM read (2) = 4 > 3 ✓ → guaranteed to see latest write.
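A per-query consistency sketch (cassandra-driver, continuing the hypothetical events_by_device table from the previous sketch): writing and reading at QUORUM with RF=3 means the two replica sets always overlap in at least one node.

```ts
import { Client, types } from "cassandra-driver";

const client = new Client({ contactPoints: ["127.0.0.1"], localDataCenter: "datacenter1", keyspace: "iot" });
const { consistencies } = types;

// QUORUM write: acknowledged by 2 of the 3 replicas.
await client.execute(
  "INSERT INTO events_by_device (device_id, bucket, created_at, payload) VALUES (?, ?, ?, ?)",
  ["sensor-42", "2024-06", new Date(), '{"temp":21.5}'],
  { prepare: true, consistency: consistencies.quorum }
);

// QUORUM read: answered by 2 of the 3 replicas, so it must include a node that took the write.
const result = await client.execute(
  "SELECT payload FROM events_by_device WHERE device_id = ? AND bucket = ? LIMIT 1",
  ["sensor-42", "2024-06"],
  { prepare: true, consistency: consistencies.quorum }
);
```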
| Anti-Pattern | Why It Fails | Fix |
|---|---|---|
| ALLOW FILTERING in queries | Forces full partition scan; slow at scale | Redesign table for the query; use secondary index carefully |
| Unbounded partition growth | Single partition → single-node bottleneck; partitions over ~2 GB perform badly | Add time bucket to partition key (user_id + year_month) |
| High-cardinality secondary indexes | Distributed index = scatter-gather on every node | Materialize a separate table for each query pattern |
| Large IN queries | Coordinator fans out to many nodes; serial waits | Async parallel queries; smaller batch sizes |
| Logged batches for performance | Batches add coordinator overhead; not for performance, only for atomicity across tables | Use unlogged batches only for same-partition multi-row writes |
| | Redis | MongoDB | Cassandra |
|---|---|---|---|
| Model | Key-value / data structures | Document (BSON) | Wide-column (partitioned rows) |
| Query | Key lookup; limited range | Rich ad-hoc; aggregation pipeline | Query-driven; CQL; no ad-hoc |
| Transactions | MULTI/EXEC; Lua scripts; limited | Multi-document ACID (v4+) | Lightweight transactions (LWT); limited |
| Scale | Cluster mode (hash slots) | Replica sets; sharded clusters | Linear horizontal scale; no master |
| Consistency | Strong within shard | Strong (primary); eventual (secondaries) | Tunable per query (ONE → ALL) |
| Best for | Caching, sessions, rate limiting | Flexible catalogs, CMS, user data | Time-series, activity feeds, IoT |
hiredis is the official, lightweight C client for Redis. It provides a synchronous API for simple use cases and an async API (libevent/libev/libuv adapters) for non-blocking I/O.
Without pipelining (N commands = N round-trips):
  Client ──SET──► Server ──OK──► Client ──INCR──► Server ──1──► Client ...
  RTT: N × (50ms) = 500ms for 10 commands

With pipelining (N commands = 1 round-trip):
  Client ──[SET, INCR, HSET, ...]──► Server
  Server ──[OK, 1, 3, ...]──────────► Client
  RTT: 1 × 50ms = 50ms for 10 commands
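For comparison, here is the same batching idea as a sketch in the Node ioredis client (an assumption, kept in one language with the other examples); hiredis exposes the equivalent through its redisAppendCommand / redisGetReply pair:

```ts
import Redis from "ioredis";

const redis = new Redis();

// All three commands are buffered locally and written to the socket in one flush.
const replies = await redis
  .pipeline()
  .set("user:42:name", "Ada")
  .incr("user:42:logins")
  .hset("user:42:profile", "plan", "pro")
  .exec();   // -> [[null, "OK"], [null, 1], [null, 1]]
```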
Goal: Add cache-aside caching to an Express.js API, measure cache hit rate.
- Start Redis: docker run -p 6379:6379 redis:7-alpine
- Build an endpoint GET /users/:id that hits a PostgreSQL DB.
- Add cache-aside reads; track cache:hits and cache:misses counters in Redis.
- Load test with wrk -t4 -c100 -d30s http://localhost:3000/users/42.
- Compare redis-cli GET cache:hits vs GET cache:misses. Expect >95% hits after warm-up.

Goal: Implement a sliding-window rate limiter in Redis; test boundary behavior.
- Write an isAllowedSliding(ip, limit=10, windowMs=60000) function using Redis sorted sets.
- Return 429 Too Many Requests with a Retry-After header when over the limit.
- Check with ZSCORE ratelimit:sliding:127.0.0.1 — confirm timestamps are in the sorted set.

Goal: Build an aggregation pipeline that computes per-tag post counts and total views.
- Seed posts in mongosh using a seed script. Include varied tags and view counts.
- Build the pipeline: $match (published) → $unwind (tags) → $group (count, sumViews) → $sort → $limit 10.
- Run .explain("executionStats") — confirm the initial $match uses an index.
- Add an index on {status:1, publishedAt:-1} and re-run explain — compare totalDocsExamined.
- Cache the result as report:top-tags with TTL 3600. Serve from cache on repeat requests.

You should be able to:
- Explain the five NoSQL categories and give a use case for each
- Describe CAP theorem and classify Redis, MongoDB, and Cassandra under it
- List all six Redis data types, their O() complexities, and one use case each
- Implement cache-aside pattern with correct TTL invalidation on write
- Contrast cache-aside vs write-through vs write-behind consistency guarantees
- Explain cache stampede and implement the SET NX lock pattern to prevent it
- Configure Redis maxmemory and choose appropriate eviction policy
- Describe RDB vs AOF persistence trade-offs and recommend correct production config
- Implement a fixed-window rate limiter using INCR + EXPIRE
- Implement a sliding-window rate limiter using a Sorted Set
- Use Lua scripting in Redis for an atomic read-modify-write operation
- Explain MongoDB embedding vs referencing decision criteria
- Write a MongoDB aggregation pipeline with $match, $lookup, $group, $sort
- Define Cassandra partition key, clustering key, and explain query-driven design
- Calculate whether QUORUM reads + QUORUM writes guarantee strong consistency for a given RF