Backend Engineering Roadmap
A structured, hands-on path from web fundamentals to production-grade distributed systems — with C/C++ examples, concept checklists, and interactive progress tracking.
8 Phases · 80 Concepts · 8 Code Examples
Phase 0: How the Web Works
Key Concepts
- DNS resolution: recursive vs iterative queries, TTL, caching chain (browser → OS → recursive resolver → root nameserver → TLD → authoritative)
- TCP three-way handshake: SYN → SYN-ACK → ACK; connection teardown FIN/FIN-ACK/ACK; RST for abrupt close
- TLS 1.3 handshake: ClientHello (supported ciphers + key_share), ServerHello + Certificate + CertificateVerify, Finished; ECDHE forward secrecy
- HTTP/1.1: persistent connections (Keep-Alive), pipelining, head-of-line blocking at TCP layer
- HTTP/2: binary framing, multiplexing (multiple streams over single TCP), HPACK header compression, server push; still has TCP HOL blocking
- HTTP/3 + QUIC: runs over UDP, built-in TLS 1.3, independent streams (no HOL blocking), 0-RTT resumption
- Web server accept loop: listen socket with `SO_REUSEADDR`/`SO_REUSEPORT`, `accept()` blocks until a client connects; thread-per-request (Apache) vs event loop (Nginx/epoll)
- Backend request lifecycle: accept → parse HTTP → route to handler → middleware chain → business logic → DB query → serialize response → send
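The "parse HTTP" step of the request lifecycle can be sketched as a tiny request-line parser; `parse_request_line` is a hypothetical helper written for illustration, not a library API:

```c
#include <string.h>

/* Split "GET /users/42 HTTP/1.1" into method and path.
   Returns 0 on success, -1 on a malformed or oversized line. */
int parse_request_line(const char *line,
                       char *method, size_t mlen,
                       char *path, size_t plen) {
    const char *sp1 = strchr(line, ' ');      /* after method */
    if (!sp1) return -1;
    const char *sp2 = strchr(sp1 + 1, ' ');   /* after path */
    if (!sp2) return -1;
    size_t m = (size_t)(sp1 - line);
    size_t p = (size_t)(sp2 - sp1 - 1);
    if (m >= mlen || p >= plen) return -1;    /* would not fit */
    memcpy(method, line, m); method[m] = '\0';
    memcpy(path, sp1 + 1, p); path[p] = '\0';
    return 0;
}
```

A real server would then route on `path` and dispatch to a handler.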
Technologies & Tools
TCP/IP
TLS 1.3
HTTP/1.1
HTTP/2
HTTP/3
QUIC
DNS
Wireshark
```c
/* Minimal TCP server skeleton — illustrates accept loop */
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <string.h>

int main(void) {
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    int opt = 1;
    setsockopt(srv, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(8080);
    addr.sin_addr.s_addr = INADDR_ANY;
    bind(srv, (struct sockaddr *)&addr, sizeof(addr));
    listen(srv, SOMAXCONN); /* SOMAXCONN = OS backlog limit */

    while (1) {
        int client = accept(srv, NULL, NULL); /* blocks until client connects */
        /* hand off: thread-per-request -> pthread_create() */
        /* event loop -> epoll_ctl(ADD) */
        close(client);
    }
}
```
Concept Checklist
| ✓ | # | Concept | Category |
|---|---|---|---|
|   | 1 | DNS resolution and caching chain (resolver → root → TLD → authoritative) | Network |
|   | 2 | TCP 3-way handshake and connection teardown (FIN/RST) | Network |
|   | 3 | TLS 1.3 handshake: ECDHE key exchange and forward secrecy | Security |
|   | 4 | HTTP/1.1: persistent connections, pipelining, HOL blocking | HTTP |
|   | 5 | HTTP/2: multiplexing, binary framing, HPACK compression | HTTP |
|   | 6 | HTTP/3 + QUIC: UDP-based, independent streams, 0-RTT | HTTP |
|   | 7 | Web server accept loop: thread-per-request vs event loop | Server |
|   | 8 | Backend request lifecycle: accept → route → middleware → handler → DB → respond | Server |
Phase 1: API Design & Contracts
Key Concepts
- REST principles: resources as nouns (not verbs), stateless client-server, uniform interface, cacheable responses, layered system, optional HATEOAS
- URL & versioning: plural nouns (`/users` not `/user`), nested resources (`/users/42/orders`), versioning strategies — URI prefix (`/v1/`), Accept header (`application/vnd.api+json;version=1`), query param (`?version=1`)
- HTTP method semantics: GET/HEAD (safe + idempotent), PUT/DELETE (idempotent, not safe), POST (neither), PATCH (partial update, should be idempotent in practice)
- Status codes: 200 OK, 201 Created, 204 No Content, 301/302/304 redirects, 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 409 Conflict, 422 Unprocessable Entity, 429 Too Many Requests, 500/502/503/504
- Request/response shaping: direct payload vs envelope (`{data, meta, links}`), consistent error objects, snake_case vs camelCase field naming
- Pagination: offset+limit (simple, but skips/duplicates on concurrent writes), cursor-based (stable, no skips), keyset pagination (most scalable); include `total_count`, `next_cursor` in response
- Error standard: RFC 7807 Problem Details — `type` (URI), `title`, `status`, `detail`, `instance` fields; consistent error envelope across all endpoints
- OpenAPI/Swagger: spec-first design philosophy, YAML schema, `$ref` for reusable components, code generation for servers (stub) and clients (SDK)
- gRPC: Protocol Buffers IDL (`syntax = "proto3"`), service + rpc definitions, unary vs client-streaming vs server-streaming vs bidirectional-streaming; when to prefer (internal services, streaming, strong typing)
- GraphQL: schema-first (SDL), resolvers, queries (read) vs mutations (write) vs subscriptions (realtime), N+1 problem (DataLoader batching solution)
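As a sketch of the RFC 7807 shape, the helper below serializes a Problem Details object; the field names come from the RFC, while `problem_json` itself is an illustrative assumption (a real service would also escape the strings):

```c
#include <stdio.h>

/* Serialize an RFC 7807 Problem Details body into buf.
   Returns the number of characters written (snprintf semantics). */
int problem_json(char *buf, size_t buflen,
                 const char *type, const char *title,
                 int status, const char *detail, const char *instance) {
    return snprintf(buf, buflen,
        "{\"type\":\"%s\",\"title\":\"%s\",\"status\":%d,"
        "\"detail\":\"%s\",\"instance\":\"%s\"}",
        type, title, status, detail, instance);
}
```

Serve the result with `Content-Type: application/problem+json` so clients can rely on one error envelope across endpoints.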
Technologies & Tools
REST
gRPC
GraphQL
OpenAPI
Protobuf
Swagger
RFC 7807
```protobuf
// user.proto — gRPC service definition
syntax = "proto3";

service UserService {
  rpc GetUser (GetUserRequest) returns (UserResponse);          // unary
  rpc WatchUser (GetUserRequest) returns (stream UserResponse); // server-streaming
}

message GetUserRequest { string user_id = 1; }

message UserResponse {
  string user_id = 1;
  string username = 2;
  string email = 3;
  int64 created_at = 4; // Unix timestamp
}
```
```c
/* Minimal HTTP/1.1 response builder in C */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Reason phrase must track the status code (not always "OK") */
static const char *reason_phrase(int status) {
    switch (status) {
    case 200: return "OK";
    case 201: return "Created";
    case 204: return "No Content";
    case 400: return "Bad Request";
    case 404: return "Not Found";
    default:  return ""; /* reason phrase may be empty per RFC 7230 */
    }
}

void send_json(int fd, int status, const char *body) {
    char header[512];
    int body_len = (int)strlen(body);
    snprintf(header, sizeof(header),
             "HTTP/1.1 %d %s\r\n"
             "Content-Type: application/json\r\n"
             "Content-Length: %d\r\n"
             "Connection: close\r\n"
             "\r\n",
             status, reason_phrase(status), body_len);
    write(fd, header, strlen(header));
    write(fd, body, body_len);
}
```
Concept Checklist
| ✓ | # | Concept | Category |
|---|---|---|---|
|   | 1 | REST constraints: statelessness, uniform interface, resource naming | REST |
|   | 2 | URL design: plural nouns, nesting, versioning strategies (/v1/, header, query) | REST |
|   | 3 | HTTP method idempotency: GET/PUT/DELETE vs POST/PATCH semantics | REST |
|   | 4 | HTTP status code families and when to use each (2xx/3xx/4xx/5xx) | REST |
|   | 5 | Pagination: offset vs cursor vs keyset — tradeoffs for each | REST |
|   | 6 | RFC 7807 Problem Details: type, title, status, detail, instance | API Design |
|   | 7 | OpenAPI spec-first design and code generation workflow | API Design |
|   | 8 | gRPC: Protobuf IDL, service definition, 4 streaming modes | gRPC |
|   | 9 | When to choose gRPC over REST (internal, streaming, strong typing) | gRPC |
|   | 10 | GraphQL: schema, resolvers, N+1 problem and DataLoader solution | GraphQL |
Phase 2: Databases & Storage
Key Concepts
- Relational schema design: normalization (1NF removes repeating groups, 2NF removes partial dependencies, 3NF removes transitive dependencies), ERD, foreign key constraints, check constraints
- Indexes: B-tree (default, ordered, range queries), hash (equality only), composite (left-prefix rule), covering (index-only scan), partial/filtered; index selectivity; write amplification tradeoff
- Query plans: `EXPLAIN ANALYZE` (actual rows, actual time), sequential scan vs index scan vs index-only scan, join algorithms (nested loop, hash join, merge join), planner statistics
- Transactions & ACID: Atomicity (all-or-nothing), Consistency (invariants preserved), Isolation (concurrent txns don't interfere), Durability (committed = persisted to WAL/disk)
- Isolation levels: Read Uncommitted (dirty reads), Read Committed (default PostgreSQL), Repeatable Read (no phantom in MySQL InnoDB via MVCC), Serializable (SSI in PostgreSQL); phenomena: dirty read, non-repeatable read, phantom read
- Deadlocks: detection (wait-for graph cycle), prevention (lock ordering — always acquire locks in same order), lock timeout (`lock_timeout = '2s'` in PostgreSQL), `SKIP LOCKED` for queue patterns
- NoSQL taxonomy: document (MongoDB — flexible schema, nested objects), key-value (Redis — sub-ms latency), wide-column (Cassandra — write-optimized, partitioned by key), time-series (InfluxDB), graph (Neo4j); choose by access pattern
- Redis data structures: string (counters, cache), hash (object fields), list (queues, stacks), set (unique members), sorted set (leaderboards, rate limiting), stream (event log); each with O() complexity
- Caching patterns: cache-aside/lazy loading (app reads cache first, on miss reads DB and populates cache), read-through (cache fetches from DB), write-through (write to cache + DB sync), write-behind (async DB write)
- Redis advanced: persistence modes (RDB — snapshot at intervals, AOF — append-only log, both for durability), pub/sub (fire-and-forget), Lua scripting (atomic multi-command), rate limiting with `INCR` + `EXPIRE`
- Connection pooling: why (TCP + auth handshake cost per connection), Little's Law (avg connections = arrival_rate × avg_latency), pgBouncer modes (session/transaction/statement), pool exhaustion and backpressure
- Database migrations: versioned sequential scripts (Flyway/Liquibase pattern), forward-only vs rollback scripts, zero-downtime techniques: expand-contract (add column nullable → backfill → add NOT NULL → drop old column)
Technologies & Tools
PostgreSQL
MySQL
Redis
MongoDB
Cassandra
pgBouncer
hiredis
libpq
```c
/* PostgreSQL via libpq — parameterized query (prevents SQL injection) */
#include <libpq-fe.h>
#include <stdio.h>

void fetch_user(PGconn *conn, const char *user_id) {
    const char *params[1] = { user_id };
    PGresult *res = PQexecParams(conn,
        "SELECT id, name, email FROM users WHERE id = $1",
        1,    /* nParams */
        NULL, /* paramTypes (let server infer) */
        params, NULL, NULL,
        0     /* result format: text */
    );
    if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) > 0) {
        printf("id=%-6s name=%-20s email=%s\n",
               PQgetvalue(res, 0, 0),
               PQgetvalue(res, 0, 1),
               PQgetvalue(res, 0, 2));
    }
    PQclear(res);
}
```
```c
/* Redis cache-aside via hiredis */
#include <hiredis/hiredis.h>
#include <libpq-fe.h>
#include <stdio.h>
#include <string.h>

/* Returns cached JSON string or NULL (caller must free reply) */
redisReply *get_user_cached(redisContext *rc, PGconn *pg,
                            const char *user_id)
{
    char key[64];
    snprintf(key, sizeof(key), "user:%s", user_id);
    redisReply *r = redisCommand(rc, "GET %s", key);
    if (r && r->type == REDIS_REPLY_STRING)
        return r; /* cache HIT */
    if (r)
        freeReplyObject(r);
    /* cache MISS — query DB, then SET with 5-min TTL */
    /* fetch_user(pg, user_id) -> serialize to JSON -> */
    /* redisCommand(rc, "SET %s %s EX 300", key, json_val) */
    (void)pg; /* pg is used by the DB fallback sketched above */
    return NULL;
}
```
Concept Checklist
| ✓ | # | Concept | Category |
|---|---|---|---|
|   | 1 | Relational schema normalization: 1NF, 2NF, 3NF and when to denormalize | SQL |
|   | 2 | Index types: B-tree, hash, composite, covering, partial; left-prefix rule | SQL |
|   | 3 | Query plans: EXPLAIN ANALYZE, sequential scan vs index scan | SQL |
|   | 4 | ACID properties and what each guarantees | Transactions |
|   | 5 | Isolation levels: RC, RR, Serializable; dirty/phantom/non-repeatable reads | Transactions |
|   | 6 | Deadlocks: detection, lock ordering prevention, SKIP LOCKED | Transactions |
|   | 7 | NoSQL taxonomy: document, key-value, wide-column, time-series, graph — when to use | NoSQL |
|   | 8 | Redis data structures and time complexity of each | Redis |
|   | 9 | Caching patterns: cache-aside, read-through, write-through, write-behind | Caching |
|   | 10 | Redis persistence: RDB vs AOF; pub/sub; Lua atomicity | Redis |
|   | 11 | Connection pooling: Little's Law, pgBouncer modes, pool exhaustion | Performance |
|   | 12 | Zero-downtime migrations: expand-contract pattern | Migrations |
Phase 3: Authentication & Authorization
Key Concepts
- Session-based auth: server stores session state (in-memory or Redis), session ID in HttpOnly+Secure cookie, session fixation attack (regenerate session ID on login), CSRF protection (SameSite=Strict or CSRF token)
- JWT structure: three base64url-encoded sections — header (alg, typ), payload (claims), signature; standard claims: `iss` (issuer), `sub` (subject), `aud` (audience), `exp` (expiry), `nbf` (not before), `iat` (issued at), `jti` (JWT ID for revocation)
- JWT signing algorithms: HS256 (HMAC-SHA256, shared secret — symmetric, all services need secret), RS256 (RSA — private key signs, public key verifies — asymmetric, safe to distribute public key), ES256 (ECDSA, smaller keys than RSA)
- Access + refresh token pattern: short-lived access token (15min–1hr, stateless validation), long-lived refresh token (7–30 days, stored in DB, one-time-use rotation, allows revocation)
- OAuth2 flows: Authorization Code + PKCE (for SPAs and mobile — code verifier/challenge prevents interception), Client Credentials (machine-to-machine, no user), Device Code (CLI/TV apps — user visits URL on phone)
- API Keys: generation (crypto/rand CSPRNG → hex or base62 encoding), never store plaintext (store SHA-256 hash + prefix for lookup), scoping to specific resources/operations, key rotation strategy
- RBAC vs ABAC: Role-Based (user has role, role has permissions — simple, coarse-grained), Attribute-Based (policy: ALLOW if subject.dept == resource.dept AND action == "read" — flexible, complex); hybrid (RBAC for coarse, ABAC for fine-grained)
- Password storage: why fast hashes are wrong (MD5/SHA256: billions/sec on GPU), bcrypt (configurable cost factor, ~100ms target), Argon2id (OWASP recommended — memory-hard, time-hard, side-channel resistant), always timing-safe comparison (constant-time memcmp)
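A minimal sketch of CSPRNG key generation, assuming a Linux-style `/dev/urandom`; `generate_api_key` is a hypothetical helper, and a real service would store only a hash (e.g. SHA-256) of the key plus a short prefix for lookup:

```c
#include <stdio.h>

/* Fill hex with 64 lowercase hex chars (32 random bytes) + NUL.
   Returns 0 on success, -1 on failure. */
int generate_api_key(char *hex, size_t hexlen) {
    unsigned char raw[32];
    if (hexlen < sizeof(raw) * 2 + 1) return -1;
    FILE *f = fopen("/dev/urandom", "rb");   /* kernel CSPRNG */
    if (!f) return -1;
    size_t n = fread(raw, 1, sizeof(raw), f);
    fclose(f);
    if (n != sizeof(raw)) return -1;
    for (size_t i = 0; i < sizeof(raw); i++)
        snprintf(hex + i * 2, 3, "%02x", raw[i]);
    return 0;
}
```

The hex string is shown to the user once; only its hash ever touches the database, so a leaked table cannot be replayed as credentials.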
Technologies & Tools
JWT
OAuth2
OpenSSL
bcrypt
Argon2
Redis (sessions)
PKCE
```c
/* JWT HMAC-SHA256 signature verification (OpenSSL) */
#include <openssl/hmac.h>
#include <openssl/evp.h>
#include <string.h>
#include <stdio.h>

/* Compare two byte arrays in constant time to prevent timing attacks */
static int const_time_cmp(const unsigned char *a,
                          const unsigned char *b, size_t len) {
    unsigned char diff = 0;
    for (size_t i = 0; i < len; i++)
        diff |= a[i] ^ b[i];
    return diff == 0;
}

/* Verify HS256: header_payload = "base64url(hdr).base64url(payload)" */
int jwt_verify_hs256(const char *header_payload,
                     const unsigned char *expected_sig, size_t sig_len,
                     const unsigned char *secret, size_t secret_len)
{
    unsigned char digest[EVP_MAX_MD_SIZE];
    unsigned int digest_len = 0;
    HMAC(EVP_sha256(),
         secret, (int)secret_len,
         (const unsigned char *)header_payload, strlen(header_payload),
         digest, &digest_len);
    if (digest_len != sig_len) return 0;
    return const_time_cmp(digest, expected_sig, digest_len);
}
```
Concept Checklist
| ✓ | # | Concept | Category |
|---|---|---|---|
|   | 1 | Session-based auth: HttpOnly cookie, Redis-backed sessions, session fixation | Auth |
|   | 2 | JWT structure: header.payload.signature, standard claims (iss, sub, exp, jti) | JWT |
|   | 3 | JWT signing: HS256 vs RS256 vs ES256 — symmetric vs asymmetric tradeoffs | JWT |
|   | 4 | Access + refresh token pattern: rotation, revocation, short-lived access tokens | JWT |
|   | 5 | OAuth2 flows: Authorization Code + PKCE, Client Credentials, Device Code | OAuth2 |
|   | 6 | API Keys: CSPRNG generation, hashing at rest, scoping, rotation | API Security |
|   | 7 | RBAC vs ABAC: coarse-grained roles vs attribute-based policy evaluation | Authorization |
|   | 8 | Password storage: bcrypt cost factor, Argon2id memory-hardness, timing-safe compare | Security |
Phase 4: Concurrency & Performance
Key Concepts
- Threading models: thread-per-request (simple, high memory — 8KB stack × 10K = 80MB+), thread pool with bounded queue (Apache worker MPM), event loop + I/O multiplexing (Nginx, Node.js), green threads/goroutines (M:N userspace scheduling)
- Synchronization primitives: mutex (exclusive lock, binary), RW lock (multiple concurrent readers OR single writer), semaphore (counting lock, rate limiting), condition variable (wait for predicate — always pair with mutex), spinlock (busy-wait, only for very short critical sections on multi-core)
- Lock-free programming: compare-and-swap (CAS) — atomically: `if (*ptr == expected) { *ptr = desired; return true; }`, ABA problem (use versioned pointers), GCC `__atomic` builtins (`__atomic_compare_exchange_n`), C11 `stdatomic.h`
- I/O multiplexing evolution: `select` (FD_SET bitmap, 1024 fd limit), `poll` (no fd limit, linear scan), `epoll` (Linux — O(1) notification, edge-triggered ET vs level-triggered LT, `epoll_create1`/`epoll_ctl`/`epoll_wait`), `io_uring` (Linux 5.1+ — async submit+complete ring buffers, zero-copy, no syscall per I/O)
- C10K problem: 10,000 concurrent connections — why thread-per-request fails (OS scheduling overhead, stack memory), how epoll event loop solves it (single thread handles thousands of FDs)
- In-process caching: LRU eviction (doubly-linked list + hash map = O(1) get/put), LFU (min-heap of frequency buckets), cache capacity planning (hot data << cold data)
- Distributed caching with Redis: cache stampede (thundering herd when TTL expires simultaneously — mutex lock, probabilistic early expiry, background refresh), hotspot key sharding, client-side consistent hashing
- Connection pool management: pool exhaustion (queue vs reject vs timeout), health checks on idle connections (keepalive probe or validation query), backpressure signals upstream
- Load balancing algorithms: round-robin (uniform distribution), weighted round-robin (heterogeneous backends), least-connections (for variable request durations), IP hash (session stickiness, avoid with horizontal scaling), consistent hashing (minimal key redistribution when nodes added/removed)
- Horizontal scaling design: stateless services (no server-side session), shared-nothing architecture, externalizing state (Redis, DB), idempotent operations (safe to retry), eventual consistency tradeoffs
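The CAS loop named above can be shown with C11 `stdatomic.h`; `bounded_add` is an illustrative helper (for example, a lock-free connection or rate budget):

```c
#include <stdatomic.h>

/* Atomically add `delta` to *ctr, but never exceed `cap`.
   Returns 1 on success, 0 if the cap would be exceeded. */
int bounded_add(atomic_long *ctr, long delta, long cap) {
    long cur = atomic_load(ctr);
    do {
        if (cur + delta > cap) return 0;   /* budget exhausted */
        /* on CAS failure, `cur` is reloaded with the current value */
    } while (!atomic_compare_exchange_weak(ctr, &cur, cur + delta));
    return 1;
}
```

The loop retries only when another thread raced in between the load and the swap, which is exactly the contention window a mutex would serialize.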
Technologies & Tools
epoll
io_uring
pthreads
stdatomic.h
Redis
pgBouncer
```c
/* epoll edge-triggered event loop skeleton (Linux) */
#include <sys/epoll.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

#define MAX_EVENTS 128

static void set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

void run_event_loop(int listen_fd) {
    int epfd = epoll_create1(EPOLL_CLOEXEC);
    struct epoll_event ev;
    ev.events = EPOLLIN;
    ev.data.fd = listen_fd;
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);
    struct epoll_event events[MAX_EVENTS];
    while (1) {
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1 /* block forever */);
        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == listen_fd) {
                /* New connection */
                int client = accept(listen_fd, NULL, NULL);
                set_nonblocking(client);
                ev.events = EPOLLIN | EPOLLET; /* edge-triggered */
                ev.data.fd = client;
                epoll_ctl(epfd, EPOLL_CTL_ADD, client, &ev);
            } else {
                /* Data available — handle_client(events[i].data.fd) */
                /* With ET: must read until EAGAIN */
            }
        }
    }
}
```
Concept Checklist
| ✓ | # | Concept | Category |
|---|---|---|---|
|   | 1 | Threading models: thread-per-request vs thread pool vs event loop vs green threads | Concurrency |
|   | 2 | Mutex, RW lock, semaphore, condition variable — when to use each | Concurrency |
|   | 3 | Lock-free CAS: compare-and-swap, ABA problem, GCC __atomic builtins | Concurrency |
|   | 4 | I/O multiplexing: select → poll → epoll (ET vs LT) → io_uring evolution | I/O |
|   | 5 | C10K problem: why threads fail at scale, how epoll solves it | I/O |
|   | 6 | In-process caching: LRU (linked list + hash map), LFU, eviction policies | Caching |
|   | 7 | Cache stampede: thundering herd, mutex lock, probabilistic early expiry | Caching |
|   | 8 | Connection pool exhaustion: queue vs reject, backpressure, health checks | Performance |
|   | 9 | Load balancing algorithms: round-robin, least-conn, consistent hashing | Scaling |
|   | 10 | Stateless design: externalizing state for horizontal scaling | Scaling |
Phase 5: Event-Driven Architecture
Key Concepts
- Why event-driven: temporal decoupling (producer/consumer run independently), fanout (one event → many consumers), audit log (full history replayable), reduces synchronous blocking chains, enables eventual consistency
- Message queues vs event streams: RabbitMQ (work queue model — message consumed and deleted, at-most-once or at-least-once via acks, competing consumers, dead-letter exchange) vs Kafka (persistent log — messages retained, consumer groups replay from offset, unlimited retention)
- Kafka internals: topic partitioned across brokers, each partition is an ordered immutable log; leader partition + replicas (In-Sync Replicas ISR); producer assigns partition (key hash or round-robin); consumer group — each partition consumed by exactly one consumer in group; offset committed by consumer
- Kafka delivery semantics: at-most-once (`acks=0`, fire-and-forget), at-least-once (`acks=all` + retry — may duplicate), exactly-once (idempotent producer + transactions — `enable.idempotence=true` + `transactional.id`)
- RabbitMQ patterns: direct exchange (routing key match), topic exchange (routing key pattern `*.error`), fanout exchange (broadcast to all bound queues), headers exchange; dead-letter exchange (DLX) for failed messages; message TTL; priority queues
- Saga pattern: managing distributed transactions without 2PC; orchestration (central Saga Orchestrator sends commands, receives events, handles compensations), choreography (each service reacts to events and emits new events); compensating transactions roll back completed steps
- Outbox pattern: write event to outbox table in same DB transaction as business data (atomicity), separate Relay/CDC process polls outbox and publishes to broker, mark as published; prevents lost events on crash between DB write and broker publish
- CQRS (Command Query Responsibility Segregation): write side (commands mutate state, normalized DB optimized for writes), read side (queries return projections, denormalized read model optimized for reads); sync via domain events or CDC; eventual consistency between models
- Event Sourcing: system state = ordered log of immutable domain events (not current state snapshot); reconstruct any past state by replaying events; snapshots for performance (don't replay full history); projections for derived read models; event schema versioning challenge
- Idempotent consumers: natural idempotency (PUT/DELETE — repeated calls have same effect), deduplication table (store processed event IDs, reject duplicates), atomic check-and-process with DB transaction; combine with outbox for exactly-once end-to-end
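A minimal sketch of consumer-side deduplication, using an in-memory set of processed event IDs; `claim_event` and the fixed-size table are illustrative assumptions (a production consumer would use a dedup table checked inside the same DB transaction that processes the event):

```c
#include <string.h>

#define MAX_SEEN 1024

typedef struct {
    char ids[MAX_SEEN][37];  /* processed event IDs (UUID strings) */
    int count;
} dedup_set;

/* Returns 1 if the event is new (and records it), 0 if a duplicate. */
int claim_event(dedup_set *s, const char *event_id) {
    for (int i = 0; i < s->count; i++)
        if (strcmp(s->ids[i], event_id) == 0)
            return 0;                    /* already processed: skip */
    if (s->count < MAX_SEEN) {
        strncpy(s->ids[s->count], event_id, 36);
        s->ids[s->count][36] = '\0';
        s->count++;
    }
    return 1;
}
```

Combined with the outbox pattern on the producer side, this check-then-record step is what turns at-least-once delivery into effectively exactly-once processing.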
Technologies & Tools
Kafka
RabbitMQ
librdkafka
Apache Pulsar
NATS
```c
/* Kafka producer using librdkafka (C client) */
#include <librdkafka/rdkafka.h>
#include <string.h>
#include <stdio.h>

static void delivery_cb(rd_kafka_t *rk, const rd_kafka_message_t *msg,
                        void *opaque) {
    (void)rk; (void)opaque;
    if (msg->err)
        fprintf(stderr, "Delivery failed: %s\n",
                rd_kafka_err2str(msg->err));
}

void produce_event(const char *brokers, const char *topic,
                   const char *key, const char *value) {
    char errstr[512];
    rd_kafka_conf_t *conf = rd_kafka_conf_new();
    rd_kafka_conf_set(conf, "bootstrap.servers", brokers,
                      errstr, sizeof(errstr));
    rd_kafka_conf_set_dr_msg_cb(conf, delivery_cb);
    rd_kafka_t *rk = rd_kafka_new(RD_KAFKA_PRODUCER, conf,
                                  errstr, sizeof(errstr));
    rd_kafka_topic_t *rkt = rd_kafka_topic_new(rk, topic, NULL);
retry:
    if (rd_kafka_produce(rkt,
            RD_KAFKA_PARTITION_UA, /* auto-select partition by key hash */
            RD_KAFKA_MSG_F_COPY,   /* copy payload into rdkafka */
            (void *)value, strlen(value),
            key, strlen(key),
            NULL) == -1) {
        if (rd_kafka_last_error() == RD_KAFKA_RESP_ERR__QUEUE_FULL) {
            rd_kafka_poll(rk, 100); /* drain delivery queue */
            goto retry;
        }
    }
    rd_kafka_flush(rk, 10000); /* wait up to 10s for delivery */
    rd_kafka_topic_destroy(rkt);
    rd_kafka_destroy(rk);
}
```
Concept Checklist
| ✓ | # | Concept | Category |
|---|---|---|---|
|   | 1 | Why events: temporal decoupling, audit log, fanout, replay capability | Architecture |
|   | 2 | Message queue vs event stream: RabbitMQ (work queue, delete on consume) vs Kafka (log, retain+replay) | Architecture |
|   | 3 | Kafka internals: topics, partitions, offsets, ISR, consumer group rebalancing | Kafka |
|   | 4 | Delivery semantics: at-most-once, at-least-once, exactly-once (idempotent producer) | Kafka |
|   | 5 | RabbitMQ patterns: exchange types, DLX, message TTL | Events |
|   | 6 | Saga pattern: orchestration vs choreography, compensating transactions | Patterns |
|   | 7 | Outbox pattern: atomic write to outbox table, relay publishes to broker | Patterns |
|   | 8 | CQRS: separate write model (commands) from read model (projections) | Patterns |
|   | 9 | Event Sourcing: state as event log, snapshots, projections, schema versioning | Patterns |
|   | 10 | Idempotent consumers: dedup table, atomic check-and-process | Patterns |
Phase 6: Microservices & Infrastructure
Key Concepts
- Monolith vs microservices decision: start with modular monolith, split when team topology demands it (Conway's Law), bounded contexts (DDD) define service boundaries; microservices add operational complexity — don't split prematurely
- Strangler Fig pattern: incrementally replace monolith — route specific URL paths to new service at API gateway, coexist with monolith during migration, deprecate monolith module by module; avoids big-bang rewrite risk
- Inter-service communication strategy: sync REST/gRPC (simple, tight coupling, propagates latency) vs async events (loose coupling, eventual consistency, harder to debug); request-reply over async via correlation ID in message header
- API Gateway responsibilities: single entry point, path-based routing to backend services, authentication/authorization offload (validate JWT before forwarding), rate limiting and throttling, SSL termination, request aggregation (backend for frontend pattern), canary traffic splitting
- Service discovery: client-side (service queries registry like Consul/Eureka, client chooses instance — more control), server-side (load balancer queries registry — simpler client), DNS-based (Kubernetes Services use kube-dns)
- Circuit breaker: closed state (normal, count failures), open state (fail fast immediately — no calls to unhealthy service, prevents cascade), half-open state (allow probe requests to test recovery); bulkhead pattern (isolate resource pools per service)
- Docker best practices for C/C++: multi-stage build (Stage 1: gcc:13 builder compiles binary, Stage 2: debian:slim runtime copies binary — minimal image size), non-root user (`useradd -r`), `.dockerignore` (exclude build artifacts), pin base image versions, `ENTRYPOINT` vs `CMD`
- Kubernetes fundamentals: Pod (smallest deployable unit, co-located containers), Deployment (manages ReplicaSet, rolling updates, rollback), Service (stable DNS name + ClusterIP load balancing), Ingress (HTTP/S routing + TLS termination), ConfigMap (non-secret config), Secret (base64-encoded credentials), liveness probe (restart if unhealthy), readiness probe (remove from Service endpoints if not ready)
- CI/CD pipeline stages: lint → unit test → integration test → build OCI image → push to registry → deploy to staging → smoke test → deploy to production; blue-green (two identical environments, instant cutover); canary (route 5% → 20% → 100% traffic to new version)
- 12-Factor App: I-Codebase (one repo, many deploys), II-Dependencies (explicitly declared), III-Config (env vars, not hardcoded), IV-Backing services (attached resources, swap without code change), V-Build/release/run (strict separation), VI-Processes (stateless, share nothing), VII-Port binding, VIII-Concurrency (scale out via process model), IX-Disposability (fast startup, graceful shutdown), X-Dev/prod parity, XI-Logs (stdout, not files), XII-Admin processes
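The circuit-breaker states above can be sketched as a small state machine; `breaker`, `cb_allow`, and `cb_record` are hypothetical names, and the thresholds are illustrative:

```c
#include <time.h>

typedef enum { CB_CLOSED, CB_OPEN, CB_HALF_OPEN } cb_state;

typedef struct {
    cb_state state;
    int failures;        /* consecutive failures seen while closed */
    int fail_threshold;  /* failures before tripping open */
    time_t opened_at;
    int cooldown_sec;    /* how long to fail fast before probing */
} breaker;

/* Ask permission before calling the downstream service. */
int cb_allow(breaker *b, time_t now) {
    if (b->state == CB_OPEN) {
        if (now - b->opened_at >= b->cooldown_sec) {
            b->state = CB_HALF_OPEN;  /* let one probe request through */
            return 1;
        }
        return 0;                     /* fail fast, protect the callee */
    }
    return 1;                         /* CLOSED or HALF_OPEN probe */
}

/* Report the call's outcome. */
void cb_record(breaker *b, int success, time_t now) {
    if (success) {
        b->failures = 0;
        b->state = CB_CLOSED;
    } else if (++b->failures >= b->fail_threshold ||
               b->state == CB_HALF_OPEN) {
        b->state = CB_OPEN;           /* trip (or re-trip after bad probe) */
        b->opened_at = now;
    }
}
```

Pairing this with per-dependency resource pools (the bulkhead pattern) keeps one slow downstream from exhausting every caller thread.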
Technologies & Tools
Docker
Kubernetes
Consul
Nginx
Helm
GitHub Actions
ArgoCD
```dockerfile
# -- Stage 1: Build (fat image with full toolchain) --
FROM gcc:13 AS builder
WORKDIR /src
# Copy source and build system first (layer caching)
COPY Makefile ./
COPY src/ ./src/
# Build release binary (strip debug symbols)
RUN make release CFLAGS="-O2 -DNDEBUG" && strip bin/server

# -- Stage 2: Minimal runtime --
FROM debian:bookworm-slim
# curl is installed because the HEALTHCHECK below shells out to it
RUN apt-get update \
    && apt-get install -y --no-install-recommends libpq5 ca-certificates curl \
    && rm -rf /var/lib/apt/lists/*
# Non-root user for security
RUN useradd -r -u 1001 -s /sbin/nologin appuser
WORKDIR /app
COPY --from=builder /src/bin/server .
USER appuser
EXPOSE 8080
HEALTHCHECK --interval=30s --timeout=3s \
    CMD curl -f http://localhost:8080/health || exit 1
ENTRYPOINT ["./server"]
```
Concept Checklist
| ✓ | # | Concept | Category |
|---|---|---|---|
|   | 1 | Monolith vs microservices: Conway's Law, bounded contexts, modular monolith first | Architecture |
|   | 2 | Strangler Fig: incremental migration via API gateway routing | Architecture |
|   | 3 | Sync vs async inter-service communication: tradeoffs, correlation ID pattern | Communication |
|   | 4 | API Gateway: routing, auth offload, rate limiting, BFF pattern | Infra |
|   | 5 | Service discovery: client-side (Consul) vs server-side vs DNS-based (K8s) | Infra |
|   | 6 | Circuit breaker: closed/open/half-open states, bulkhead pattern | Reliability |
|   | 7 | Docker multi-stage build for C/C++: builder → slim runtime, non-root user | Docker |
|   | 8 | Kubernetes: Pod, Deployment, Service, Ingress, liveness vs readiness probe | K8s |
|   | 9 | CI/CD: pipeline stages, blue-green deployment, canary traffic splitting | CI/CD |
|   | 10 | 12-Factor App: config via env, stateless processes, stdout logs | Best Practices |
Phase 7: Observability & Hardening
Key Concepts
- 3 pillars of observability: logs (discrete events — what happened), metrics (aggregated numeric data — how many/how fast), traces (causal chains across services — why it was slow); each answers different questions; together give full system visibility
- Structured logging: JSON lines format (one JSON object per line), mandatory fields (timestamp ISO-8601, level, service, trace_id, span_id, message), log levels (DEBUG verbose, INFO normal, WARN degraded, ERROR unexpected failure, FATAL unrecoverable), never log secrets or PII, use correlation/trace IDs to link logs across services
- Metrics types: counter (monotonically increasing, e.g., `http_requests_total` — use `rate()`), gauge (point-in-time value, e.g., `memory_usage_bytes`, `active_connections`), histogram (bucketed distribution, e.g., `request_duration_seconds` — use `histogram_quantile()` for p99)
- RED method (for services): Rate (requests/second), Errors (error rate %), Duration (latency percentiles p50/p95/p99); USE method (for resources): Utilization (% time busy), Saturation (queue depth, wait time), Errors (device error rate)
- Prometheus: pull-based scraping (Prometheus polls `/metrics` endpoint on services), exposition format (`# HELP`, `# TYPE`, `metric_name{labels} value timestamp`), PromQL (`rate(http_requests_total[5m])`, `histogram_quantile(0.99, ...)`, `by(service)`), AlertManager for alerting rules and routing
- Distributed tracing: trace (end-to-end request chain, unique trace_id), span (single operation within trace, span_id + parent_span_id), W3C `traceparent` header for cross-service propagation, OpenTelemetry SDK (language-agnostic instrumentation, OTLP export to Jaeger/Tempo/Zipkin)
- Health check endpoints: `GET /health/live` — liveness probe (is process alive? if fails, Kubernetes restarts container), `GET /health/ready` — readiness probe (is service ready to serve traffic? if fails, removed from Service endpoints); startup probe for slow-starting containers
- Rate limiting algorithms: token bucket (bucket refills at rate r, allow bursts up to capacity b — bursty traffic ok), sliding window log (store timestamps of all requests, exact but memory O(requests)), sliding window counter (approximate, memory O(1), compromise); implement at API gateway (global) and per-service (defense in depth)
- OWASP Top 10 for backends: SQL injection (parameterized queries only — never string concat), command injection (avoid `shell=True` / `system()`, use `execv`), SSRF (Server-Side Request Forgery — allowlist outbound URLs), broken access control (check authorization on every request, not just auth), security misconfiguration (disable debug endpoints in prod, no default credentials), insecure deserialization (validate and sanitize all deserialized input)
- Input validation: allowlist over denylist (define what is allowed, reject everything else), validate at trust boundaries (never trust client input), size limits (prevent DoS via large payloads — max body size), type checking, sanitize before SQL/shell/HTML context
- Secrets management: never in source code or Docker images (scan with truffleHog/gitleaks), environment variables (basic, visible in `/proc/PID/environ` — acceptable for containers), HashiCorp Vault (dynamic secrets with TTL + auto-rotation, audit log, fine-grained policies), AWS Secrets Manager / GCP Secret Manager; secret rotation strategy
- Graceful shutdown: catch SIGTERM (Kubernetes sends this before SIGKILL after `terminationGracePeriodSeconds`), stop accepting new connections (close listen socket or remove from load balancer), drain in-flight requests (atomic counter), close DB connection pool, deregister from service discovery, log completion; target: shutdown in < `terminationGracePeriodSeconds` (default 30s)
Technologies & Tools
Prometheus
Grafana
Jaeger
OpenTelemetry
HashiCorp Vault
AlertManager
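The token-bucket algorithm from the rate-limiting concept above, as a minimal sketch; the struct layout and refill math are illustrative assumptions:

```c
/* Token bucket: refill at `rate` tokens/sec up to `capacity`;
   each admitted request spends one token, so bursts up to
   `capacity` pass while sustained traffic is capped at `rate`. */
typedef struct {
    double tokens;    /* tokens currently available */
    double capacity;  /* burst size b */
    double rate;      /* refill rate r, tokens per second */
    double last;      /* timestamp of last refill, in seconds */
} bucket;

/* Returns 1 if the request is admitted at time `now`, 0 if throttled. */
int bucket_allow(bucket *b, double now) {
    double elapsed = now - b->last;
    b->tokens += elapsed * b->rate;              /* refill */
    if (b->tokens > b->capacity) b->tokens = b->capacity;
    b->last = now;
    if (b->tokens >= 1.0) {
        b->tokens -= 1.0;                        /* spend one token */
        return 1;
    }
    return 0;
}
```

The same structure works per client key at a gateway or per endpoint inside a service; only where the bucket state lives (memory vs Redis) changes.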
```c
/* Graceful shutdown via SIGTERM — C implementation */
#include <signal.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static atomic_int in_flight = 0;
static atomic_bool shutdown_req = false;

static void handle_sigterm(int sig) {
    (void)sig;
    atomic_store(&shutdown_req, true);
}

void register_signals(void) {
    struct sigaction sa = { .sa_handler = handle_sigterm };
    sigemptyset(&sa.sa_mask);
    sigaction(SIGTERM, &sa, NULL);
    sigaction(SIGINT, &sa, NULL); /* also handle Ctrl-C */
}

/* Called at start of each request handler */
void request_begin(void) { atomic_fetch_add(&in_flight, 1); }
/* Called at end of each request handler */
void request_end(void) { atomic_fetch_sub(&in_flight, 1); }

int main(void) {
    register_signals();
    /* ... start server, accept connections ... */

    /* Main loop — stop accepting when shutdown requested */
    while (!atomic_load(&shutdown_req)) {
        /* accept() new connections */
    }

    /* Drain: wait for all in-flight requests to complete */
    fprintf(stderr, "[shutdown] draining %d in-flight requests\n",
            atomic_load(&in_flight));
    while (atomic_load(&in_flight) > 0)
        usleep(5000); /* poll every 5ms */

    /* Close DB pools, deregister from service discovery */
    fprintf(stderr, "[shutdown] clean exit\n");
    return 0;
}
```
Concept Checklist
| ✓ | # | Concept | Category |
|---|---|---|---|
|   | 1 | 3 pillars: logs (events), metrics (aggregates), traces (causal chains) — what each answers | Observability |
|   | 2 | Structured logging: JSON lines, mandatory fields, log levels, trace_id correlation | Observability |
|   | 3 | Metric types: counter (rate()), gauge, histogram (histogram_quantile p99) | Metrics |
|   | 4 | RED method (Rate, Errors, Duration) and USE method (Utilization, Saturation, Errors) | Metrics |
|   | 5 | Prometheus: pull-based scraping, exposition format, PromQL, AlertManager | Metrics |
|   | 6 | Distributed tracing: trace/span model, W3C traceparent header, OpenTelemetry | Tracing |
|   | 7 | Health checks: liveness (restart) vs readiness (remove from LB) vs startup probe | Reliability |
|   | 8 | Rate limiting: token bucket, sliding window log, sliding window counter | Performance |
|   | 9 | OWASP Top 10: SQL injection, SSRF, broken access control, security misconfiguration | Security |
|   | 10 | Input validation: allowlist, trust boundaries, size limits, sanitization | Security |
|   | 11 | Secrets management: Vault dynamic secrets, never in code/images, rotation | Security |
|   | 12 | Graceful shutdown: SIGTERM handler, drain in-flight, close pools, deregister | Reliability |