
Backend Engineering Roadmap

A structured, hands-on path from web fundamentals to production-grade distributed systems — with C/C++ examples, concept checklists, and interactive progress tracking.

8 Phases · 80 Concepts · 8 Code Examples

Overall Progress: 0 of 80 concepts checked (0%)
Ph0
How the Web Works
Prerequisite No Prereqs 📄 M01 Notes 📄 M02 Notes
  • DNS resolution: recursive vs iterative queries, TTL, caching chain (browser → OS → recursive resolver → root nameserver → TLD → authoritative)
  • TCP three-way handshake: SYN → SYN-ACK → ACK; connection teardown FIN/FIN-ACK/ACK; RST for abrupt close
  • TLS 1.3 handshake: ClientHello (supported ciphers + key_share), ServerHello + Certificate + CertificateVerify, Finished; ECDHE forward secrecy
  • HTTP/1.1: persistent connections (Keep-Alive), pipelining, head-of-line blocking at TCP layer
  • HTTP/2: binary framing, multiplexing (multiple streams over single TCP), HPACK header compression, server push; still has TCP HOL blocking
  • HTTP/3 + QUIC: runs over UDP, built-in TLS 1.3, independent streams (no HOL blocking), 0-RTT resumption
  • Web server accept loop: listen socket + SO_REUSEADDR/SO_REUSEPORT, accept() blocks until client connects; thread-per-request (Apache) vs event loop (Nginx/epoll)
  • Backend request lifecycle: accept → parse HTTP → route to handler → middleware chain → business logic → DB query → serialize response → send
TCP/IP TLS 1.3 HTTP/1.1 HTTP/2 HTTP/3 QUIC DNS Wireshark
/* Minimal TCP server skeleton — illustrates accept loop */
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <string.h>

int main(void) {
    int srv = socket(AF_INET, SOCK_STREAM, 0);

    int opt = 1;
    setsockopt(srv, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family      = AF_INET;
    addr.sin_port        = htons(8080);
    addr.sin_addr.s_addr = INADDR_ANY;

    bind(srv, (struct sockaddr *)&addr, sizeof(addr));
    listen(srv, SOMAXCONN);   /* SOMAXCONN = OS backlog limit */

    while (1) {
        int client = accept(srv, NULL, NULL);  /* blocks until client connects */
        /* hand off: thread-per-request -> pthread_create()  */
        /*           event loop          -> epoll_ctl(ADD)   */
        close(client);
    }
}
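The request lifecycle bullet starts with "parse HTTP" right after accept. A minimal sketch of parsing the request line (the `parse_request_line` name and its buffer-based signature are illustrative, not from any library; a real parser must also handle headers, chunked bodies, and malformed input far more defensively):

```c
/* Minimal HTTP request-line parser: "GET /users/42 HTTP/1.1" */
#include <string.h>

/* Splits the first line of a request into method, path, version.
   Returns 1 on success, 0 on malformed input. Buffers are
   caller-provided; sizes are hard limits to avoid overflow. */
int parse_request_line(const char *line,
                       char *method,  size_t mlen,
                       char *path,    size_t plen,
                       char *version, size_t vlen)
{
    const char *sp1 = strchr(line, ' ');
    if (!sp1) return 0;
    const char *sp2 = strchr(sp1 + 1, ' ');
    if (!sp2) return 0;

    size_t m = (size_t)(sp1 - line);
    size_t p = (size_t)(sp2 - sp1 - 1);
    size_t v = strcspn(sp2 + 1, "\r\n");   /* strip trailing CRLF */
    if (m >= mlen || p >= plen || v >= vlen) return 0;

    memcpy(method,  line,    m); method[m]  = '\0';
    memcpy(path,    sp1 + 1, p); path[p]    = '\0';
    memcpy(version, sp2 + 1, v); version[v] = '\0';
    return 1;
}
```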
# | Concept | Category
1 | DNS resolution and caching chain (resolver → root → TLD → authoritative) | Network
2 | TCP 3-way handshake and connection teardown (FIN/RST) | Network
3 | TLS 1.3 handshake: ECDHE key exchange and forward secrecy | Security
4 | HTTP/1.1: persistent connections, pipelining, HOL blocking | HTTP
5 | HTTP/2: multiplexing, binary framing, HPACK compression | HTTP
6 | HTTP/3 + QUIC: UDP-based, independent streams, 0-RTT | HTTP
7 | Web server accept loop: thread-per-request vs event loop | Server
8 | Backend request lifecycle: accept → route → middleware → handler → DB → respond | Server
Ph1
API Design & Contracts
Foundational Requires Ph0 📄 M03 Notes 📄 M04 Notes 📄 M05 Notes
  • REST principles: resources as nouns (not verbs), stateless client-server, uniform interface, cacheable responses, layered system, optional HATEOAS
  • URL & versioning: plural nouns (/users not /user), nested resources (/users/42/orders), versioning strategies — URI prefix (/v1/), Accept header (application/vnd.api+json;version=1), query param (?version=1)
  • HTTP method semantics: GET/HEAD (safe + idempotent), PUT/DELETE (idempotent, not safe), POST (neither), PATCH (partial update; not idempotent per spec, though handlers are often written to be)
  • Status codes: 200 OK, 201 Created, 204 No Content, 301/302/304 redirects, 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 409 Conflict, 422 Unprocessable Entity, 429 Too Many Requests, 500/502/503/504
  • Request/response shaping: direct payload vs envelope ({data, meta, links}), consistent error objects, snake_case vs camelCase field naming
  • Pagination: offset+limit (simple, but skips/duplicates on concurrent writes), cursor-based (stable, no skips), keyset pagination (most scalable); include total_count, next_cursor in response
  • Error standard: RFC 7807 Problem Details — type (URI), title, status, detail, instance fields; consistent error envelope across all endpoints
  • OpenAPI/Swagger: spec-first design philosophy, YAML schema, $ref for reusable components, code generation for servers (stub) and clients (SDK)
  • gRPC: Protocol Buffers IDL (syntax = "proto3"), service + rpc definitions, unary vs client-streaming vs server-streaming vs bidirectional-streaming; when to prefer (internal services, streaming, strong typing)
  • GraphQL: schema-first (SDL), resolvers, queries (read) vs mutations (write) vs subscriptions (realtime), N+1 problem (DataLoader batching solution)
REST gRPC GraphQL OpenAPI Protobuf Swagger RFC 7807
/* user.proto — gRPC service definition */
syntax = "proto3";

service UserService {
  rpc GetUser (GetUserRequest)  returns (UserResponse);          // unary
  rpc WatchUser (GetUserRequest) returns (stream UserResponse);  // server-streaming
}

message GetUserRequest { string user_id = 1; }

message UserResponse {
  string user_id    = 1;
  string username   = 2;
  string email      = 3;
  int64  created_at = 4;   // Unix timestamp
}

/* Minimal HTTP/1.1 response builder in C */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static const char *reason_phrase(int status) {
    switch (status) {
    case 200: return "OK";
    case 201: return "Created";
    case 204: return "No Content";
    case 400: return "Bad Request";
    case 404: return "Not Found";
    case 500: return "Internal Server Error";
    default:  return "";   /* reason-phrase may be empty */
    }
}

void send_json(int fd, int status, const char *body) {
    char   header[512];
    size_t body_len = strlen(body);
    int n = snprintf(header, sizeof(header),
        "HTTP/1.1 %d %s\r\n"
        "Content-Type: application/json\r\n"
        "Content-Length: %zu\r\n"
        "Connection: close\r\n"
        "\r\n",
        status, reason_phrase(status), body_len);
    write(fd, header, (size_t)n);
    write(fd, body,   body_len);
}
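The error-standard bullet calls for RFC 7807 Problem Details on every endpoint; a minimal body builder sketch (the `problem_json` helper is hypothetical, and it assumes the inputs are already JSON-safe — no quote escaping is done here):

```c
/* Build an RFC 7807 Problem Details JSON body */
#include <stdio.h>
#include <string.h>

/* Writes {"type":...,"title":...,"status":N,"detail":...} into buf.
   Returns bytes written, or -1 if buf is too small or snprintf fails.
   Inputs must already be JSON-safe (no escaping performed). */
int problem_json(char *buf, size_t buflen,
                 const char *type, const char *title,
                 int status, const char *detail)
{
    int n = snprintf(buf, buflen,
        "{\"type\":\"%s\",\"title\":\"%s\","
        "\"status\":%d,\"detail\":\"%s\"}",
        type, title, status, detail);
    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}
```

The result can then be sent with a helper like `send_json` with `Content-Type: application/problem+json`.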
# | Concept | Category
1 | REST constraints: statelessness, uniform interface, resource naming | REST
2 | URL design: plural nouns, nesting, versioning strategies (/v1/, header, query) | REST
3 | HTTP method idempotency: GET/PUT/DELETE vs POST/PATCH semantics | REST
4 | HTTP status code families and when to use each (2xx/3xx/4xx/5xx) | REST
5 | Pagination: offset vs cursor vs keyset — tradeoffs for each | REST
6 | RFC 7807 Problem Details: type, title, status, detail, instance | API Design
7 | OpenAPI spec-first design and code generation workflow | API Design
8 | gRPC: Protobuf IDL, service definition, 4 streaming modes | gRPC
9 | When to choose gRPC over REST (internal, streaming, strong typing) | gRPC
10 | GraphQL: schema, resolvers, N+1 problem and DataLoader solution | GraphQL
Ph2
Databases & Storage
Core Requires Ph1 📄 M06 Notes 📄 M07 Notes
  • Relational schema design: normalization (1NF removes repeating groups, 2NF removes partial dependencies, 3NF removes transitive dependencies), ERD, foreign key constraints, check constraints
  • Indexes: B-tree (default, ordered, range queries), hash (equality only), composite (left-prefix rule), covering (index-only scan), partial/filtered; index selectivity; write amplification tradeoff
  • Query plans: EXPLAIN ANALYZE (actual rows, actual time), sequential scan vs index scan vs index-only scan, join algorithms (nested loop, hash join, merge join), planner statistics
  • Transactions & ACID: Atomicity (all-or-nothing), Consistency (invariants preserved), Isolation (concurrent txns don't interfere), Durability (committed = persisted to WAL/disk)
  • Isolation levels: Read Uncommitted (dirty reads), Read Committed (default PostgreSQL), Repeatable Read (no phantom in MySQL InnoDB via MVCC), Serializable (SSI in PostgreSQL); phenomena: dirty read, non-repeatable read, phantom read
  • Deadlocks: detection (wait-for graph cycle), prevention (lock ordering — always acquire locks in same order), lock timeout (lock_timeout = '2s' in PostgreSQL), SKIP LOCKED for queue patterns
  • NoSQL taxonomy: document (MongoDB — flexible schema, nested objects), key-value (Redis — sub-ms latency), wide-column (Cassandra — write-optimized, partitioned by key), time-series (InfluxDB), graph (Neo4j); choose by access pattern
  • Redis data structures: string (counters, cache), hash (object fields), list (queues, stacks), set (unique members), sorted set (leaderboards, rate limiting), stream (event log); each with O() complexity
  • Caching patterns: cache-aside/lazy loading (app reads cache first, on miss reads DB and populates cache), read-through (cache fetches from DB), write-through (write to cache + DB sync), write-behind (async DB write)
  • Redis advanced: persistence modes (RDB — snapshot at intervals, AOF — append-only log, both for durability), pub/sub (fire-and-forget), Lua scripting (atomic multi-command), rate limiting with INCR+EXPIRE
  • Connection pooling: why (TCP + auth handshake cost per connection), Little's Law (avg connections = arrival_rate × avg_latency), pgBouncer modes (session/transaction/statement), pool exhaustion and backpressure
  • Database migrations: versioned sequential scripts (Flyway/Liquibase pattern), forward-only vs rollback scripts, zero-downtime techniques: expand-contract (add column nullable → backfill → add NOT NULL → drop old column)
PostgreSQL MySQL Redis MongoDB Cassandra pgBouncer hiredis libpq
/* PostgreSQL via libpq — parameterized query (prevents SQL injection) */
#include <libpq-fe.h>
#include <stdio.h>

void fetch_user(PGconn *conn, const char *user_id) {
    const char *params[1] = { user_id };
    PGresult *res = PQexecParams(conn,
        "SELECT id, name, email FROM users WHERE id = $1",
        1,    /* nParams */
        NULL, /* paramTypes  (let server infer) */
        params, NULL, NULL,
        0     /* result format: text */
    );
    if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) > 0) {
        printf("id=%-6s  name=%-20s  email=%s\n",
            PQgetvalue(res, 0, 0),
            PQgetvalue(res, 0, 1),
            PQgetvalue(res, 0, 2));
    }
    PQclear(res);
}

/* Redis cache-aside via hiredis */
#include <hiredis/hiredis.h>
#include <string.h>

/* Returns cached JSON string or NULL (caller must free reply) */
redisReply *get_user_cached(redisContext *rc, PGconn *pg,
                             const char *user_id)
{
    char key[64];
    snprintf(key, sizeof(key), "user:%s", user_id);

    redisReply *r = redisCommand(rc, "GET %s", key);
    if (r && r->type == REDIS_REPLY_STRING)
        return r;   /* cache HIT */

    freeReplyObject(r);

    /* cache MISS — query DB, then SET with 5-min TTL */
    /* fetch_user(pg, user_id) -> serialize to JSON ->  */
    /* redisCommand(rc, "SET %s %s EX 300", key, json_val) */
    return NULL;
}
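The connection-pooling bullet applies Little's Law (avg connections = arrival_rate × avg_latency); a tiny helper making the arithmetic concrete (the `suggest_pool_size` name and the headroom factor are illustrative, not a pgBouncer API):

```c
/* Pool sizing via Little's Law:
   connections_in_use = arrival_rate * avg_latency            */

/* rps:           requests per second hitting the DB
   avg_latency_s: mean query latency in seconds
   headroom:      burst multiplier (e.g. 1.5); use 1.0 for none
   Returns a suggested pool size, rounded up, minimum 1.      */
int suggest_pool_size(double rps, double avg_latency_s, double headroom) {
    double in_use = rps * avg_latency_s * headroom;
    int size = (int)in_use;
    if ((double)size < in_use) size++;   /* round up without libm */
    return size < 1 ? 1 : size;
}
```

For example, 100 rps at 250 ms average latency keeps ~25 connections busy; sizing the pool well below that starves requests, well above it wastes backend memory.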
# | Concept | Category
1 | Relational schema normalization: 1NF, 2NF, 3NF and when to denormalize | SQL
2 | Index types: B-tree, hash, composite, covering, partial; left-prefix rule | SQL
3 | Query plans: EXPLAIN ANALYZE, sequential scan vs index scan | SQL
4 | ACID properties and what each guarantees | Transactions
5 | Isolation levels: RC, RR, Serializable; dirty/phantom/non-repeatable reads | Transactions
6 | Deadlocks: detection, lock ordering prevention, SKIP LOCKED | Transactions
7 | NoSQL taxonomy: document, key-value, wide-column, time-series, graph — when to use | NoSQL
8 | Redis data structures and time complexity of each | Redis
9 | Caching patterns: cache-aside, read-through, write-through, write-behind | Caching
10 | Redis persistence: RDB vs AOF; pub/sub; Lua atomicity | Redis
11 | Connection pooling: Little's Law, pgBouncer modes, pool exhaustion | Performance
12 | Zero-downtime migrations: expand-contract pattern | Migrations
Ph3
Authentication & Authorization
Core Requires Ph1 📄 M09 Notes
  • Session-based auth: server stores session state (in-memory or Redis), session ID in HttpOnly+Secure cookie, session fixation attack (regenerate session ID on login), CSRF protection (SameSite=Strict or CSRF token)
  • JWT structure: three base64url-encoded sections — header (alg, typ), payload (claims), signature; standard claims: iss (issuer), sub (subject), aud (audience), exp (expiry), nbf (not before), iat (issued at), jti (JWT ID for revocation)
  • JWT signing algorithms: HS256 (HMAC-SHA256, shared secret — symmetric, all services need secret), RS256 (RSA — private key signs, public key verifies — asymmetric, safe to distribute public key), ES256 (ECDSA, smaller keys than RSA)
  • Access + refresh token pattern: short-lived access token (15min–1hr, stateless validation), long-lived refresh token (7–30 days, stored in DB, one-time-use rotation, allows revocation)
  • OAuth2 flows: Authorization Code + PKCE (for SPAs and mobile — code verifier/challenge prevents interception), Client Credentials (machine-to-machine, no user), Device Code (CLI/TV apps — user visits URL on phone)
  • API Keys: generation (crypto/rand CSPRNG → hex or base62 encoding), never store plaintext (store SHA-256 hash + prefix for lookup), scoping to specific resources/operations, key rotation strategy
  • RBAC vs ABAC: Role-Based (user has role, role has permissions — simple, coarse-grained), Attribute-Based (policy: ALLOW if subject.dept == resource.dept AND action == "read" — flexible, complex); hybrid (RBAC for coarse, ABAC for fine-grained)
  • Password storage: why fast hashes are wrong (MD5/SHA256: billions/sec on GPU), bcrypt (configurable cost factor, ~100ms target), Argon2id (OWASP recommended — memory-hard, time-hard, side-channel resistant), always timing-safe comparison (constant-time memcmp)
JWT OAuth2 OpenSSL bcrypt Argon2 Redis (sessions) PKCE
/* JWT HMAC-SHA256 signature verification (OpenSSL) */
#include <openssl/hmac.h>
#include <openssl/evp.h>
#include <string.h>
#include <stdio.h>

/* Compare two byte arrays in constant time to prevent timing attacks */
static int const_time_cmp(const unsigned char *a,
                           const unsigned char *b, size_t len) {
    unsigned char diff = 0;
    for (size_t i = 0; i < len; i++)
        diff |= a[i] ^ b[i];
    return diff == 0;
}

/* Verify HS256: header_payload = "base64url(hdr).base64url(payload)" */
int jwt_verify_hs256(const char    *header_payload,
                     const unsigned char *expected_sig, size_t sig_len,
                     const unsigned char *secret,       size_t secret_len)
{
    unsigned char digest[EVP_MAX_MD_SIZE];
    unsigned int  digest_len = 0;

    HMAC(EVP_sha256(),
         secret,   (int)secret_len,
         (const unsigned char *)header_payload, strlen(header_payload),
         digest,  &digest_len);

    if (digest_len != sig_len) return 0;
    return const_time_cmp(digest, expected_sig, digest_len);
}
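`jwt_verify_hs256` expects the raw signature bytes, but a JWT arrives with each section base64url-encoded without padding (RFC 4648 §5: `-` and `_` replace `+` and `/`). A minimal decoder sketch (`b64url_decode` is an illustrative helper, not an OpenSSL API; the linear `strchr` table lookup is simple rather than fast):

```c
/* base64url decoding (RFC 4648 §5, unpadded) — JWT segments use this */
#include <string.h>

/* Decodes base64url `in` (inlen chars) into `out`.
   Returns decoded byte count, or -1 on invalid input / short buffer. */
int b64url_decode(const char *in, size_t inlen,
                  unsigned char *out, size_t outlen)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";
    int acc = 0, bits = 0;
    size_t n = 0;

    for (size_t i = 0; i < inlen; i++) {
        const char *p = strchr(tbl, in[i]);
        if (!p || in[i] == '\0') return -1;     /* reject non-alphabet */
        acc = (acc << 6) | (int)(p - tbl);
        bits += 6;
        if (bits >= 8) {                        /* a full byte is ready */
            bits -= 8;
            if (n >= outlen) return -1;
            out[n++] = (unsigned char)((acc >> bits) & 0xFF);
        }
    }
    return (int)n;
}
```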
# | Concept | Category
1 | Session-based auth: HttpOnly cookie, Redis-backed sessions, session fixation | Auth
2 | JWT structure: header.payload.signature, standard claims (iss, sub, exp, jti) | JWT
3 | JWT signing: HS256 vs RS256 vs ES256 — symmetric vs asymmetric tradeoffs | JWT
4 | Access + refresh token pattern: rotation, revocation, short-lived access tokens | JWT
5 | OAuth2 flows: Authorization Code + PKCE, Client Credentials, Device Code | OAuth2
6 | API Keys: CSPRNG generation, hashing at rest, scoping, rotation | API Security
7 | RBAC vs ABAC: coarse-grained roles vs attribute-based policy evaluation | Authorization
8 | Password storage: bcrypt cost factor, Argon2id memory-hardness, timing-safe compare | Security
Ph4
Concurrency & Performance
Intermediate Requires Ph0, Ph2 📄 M11 Notes
  • Threading models: thread-per-request (simple, but memory-heavy: default pthread stacks reserve ~8MB of virtual address space each, so 10K threads reserve ~80GB; even trimmed to 1MB via pthread_attr_setstacksize, 10K threads still need ~10GB), thread pool with bounded queue (Apache worker MPM), event loop + I/O multiplexing (Nginx, Node.js), green threads/goroutines (M:N userspace scheduling)
  • Synchronization primitives: mutex (exclusive lock, binary), RW lock (multiple concurrent readers OR single writer), semaphore (counting lock, rate limiting), condition variable (wait for predicate — always pair with mutex), spinlock (busy-wait, only for very short critical sections on multi-core)
  • Lock-free programming: compare-and-swap (CAS) — atomically: if (*ptr == expected) { *ptr = desired; return true; }, ABA problem (use versioned pointers), GCC __atomic builtins (__atomic_compare_exchange_n), C11 stdatomic.h
  • I/O multiplexing evolution: select (FD_SET bitmap, 1024 fd limit), poll (no fd limit, linear scan), epoll (Linux — O(1) notification, edge-triggered ET vs level-triggered LT, epoll_create1/epoll_ctl/epoll_wait), io_uring (Linux 5.1+ — async submit+complete ring buffers, zero-copy, no syscall per I/O)
  • C10K problem: 10,000 concurrent connections — why thread-per-request fails (OS scheduling overhead, stack memory), how epoll event loop solves it (single thread handles thousands of FDs)
  • In-process caching: LRU eviction (doubly-linked list + hash map = O(1) get/put), LFU (min-heap of frequency buckets), cache capacity planning (hot data << cold data)
  • Distributed caching with Redis: cache stampede (thundering herd when TTL expires simultaneously — mutex lock, probabilistic early expiry, background refresh), hotspot key sharding, client-side consistent hashing
  • Connection pool management: pool exhaustion (queue vs reject vs timeout), health checks on idle connections (keepalive probe or validation query), backpressure signals upstream
  • Load balancing algorithms: round-robin (uniform distribution), weighted round-robin (heterogeneous backends), least-connections (for variable request durations), IP hash (session stickiness, avoid with horizontal scaling), consistent hashing (minimal key redistribution when nodes added/removed)
  • Horizontal scaling design: stateless services (no server-side session), shared-nothing architecture, externalizing state (Redis, DB), idempotent operations (safe to retry), eventual consistency tradeoffs
epoll io_uring pthreads stdatomic.h Redis pgBouncer
/* epoll edge-triggered event loop skeleton (Linux) */
#include <sys/epoll.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

#define MAX_EVENTS 128

static void set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

void run_event_loop(int listen_fd) {
    int epfd = epoll_create1(EPOLL_CLOEXEC);

    struct epoll_event ev;
    ev.events  = EPOLLIN;
    ev.data.fd = listen_fd;
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event events[MAX_EVENTS];

    while (1) {
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1 /* block forever */);

        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == listen_fd) {
                /* New connection */
                int client = accept(listen_fd, NULL, NULL);
                set_nonblocking(client);

                ev.events  = EPOLLIN | EPOLLET;  /* edge-triggered */
                ev.data.fd = client;
                epoll_ctl(epfd, EPOLL_CTL_ADD, client, &ev);
            } else {
                /* Data available — handle_client(events[i].data.fd) */
                /* With ET: must read until EAGAIN */
            }
        }
    }
}
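The in-process caching bullet claims O(1) get/put from a doubly-linked list plus hash map; a toy sketch makes the mechanics concrete (int keys, fixed demo capacity, chained hash buckets; all names are illustrative, and allocation failures are not handled):

```c
/* LRU cache sketch: hash map for O(1) lookup, doubly-linked list
   for O(1) recency updates. head = most recently used.           */
#include <stdlib.h>

#define LRU_CAP     4   /* demo capacity */
#define LRU_BUCKETS 8

typedef struct Node {
    int key, val;
    struct Node *prev, *next;   /* recency list */
    struct Node *hnext;         /* hash-bucket chain */
} Node;

typedef struct {
    Node *head, *tail;
    Node *buckets[LRU_BUCKETS];
    int   count;
} LRU;

static unsigned slot(int key) { return (unsigned)key % LRU_BUCKETS; }

static Node *find(LRU *c, int key) {
    for (Node *n = c->buckets[slot(key)]; n; n = n->hnext)
        if (n->key == key) return n;
    return NULL;
}

static void unlink_node(LRU *c, Node *n) {
    if (n->prev) n->prev->next = n->next; else c->head = n->next;
    if (n->next) n->next->prev = n->prev; else c->tail = n->prev;
}

static void push_front(LRU *c, Node *n) {
    n->prev = NULL; n->next = c->head;
    if (c->head) c->head->prev = n; else c->tail = n;
    c->head = n;
}

/* Returns 1 and writes *val on hit (promoting the entry), 0 on miss */
int lru_get(LRU *c, int key, int *val) {
    Node *n = find(c, key);
    if (!n) return 0;
    unlink_node(c, n); push_front(c, n);
    *val = n->val;
    return 1;
}

void lru_put(LRU *c, int key, int val) {
    Node *n = find(c, key);
    if (n) { n->val = val; unlink_node(c, n); push_front(c, n); return; }

    if (c->count == LRU_CAP) {            /* evict least recent (tail) */
        Node *victim = c->tail;
        unlink_node(c, victim);
        Node **pp = &c->buckets[slot(victim->key)];
        while (*pp != victim) pp = &(*pp)->hnext;
        *pp = victim->hnext;
        free(victim);
        c->count--;
    }
    n = calloc(1, sizeof *n);             /* sketch: no OOM handling */
    n->key = key; n->val = val;
    n->hnext = c->buckets[slot(key)];
    c->buckets[slot(key)] = n;
    push_front(c, n);
    c->count++;
}
```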
# | Concept | Category
1 | Threading models: thread-per-request vs thread pool vs event loop vs green threads | Concurrency
2 | Mutex, RW lock, semaphore, condition variable — when to use each | Concurrency
3 | Lock-free CAS: compare-and-swap, ABA problem, GCC __atomic builtins | Concurrency
4 | I/O multiplexing: select → poll → epoll (ET vs LT) → io_uring evolution | I/O
5 | C10K problem: why threads fail at scale, how epoll solves it | I/O
6 | In-process caching: LRU (linked list + hash map), LFU, eviction policies | Caching
7 | Cache stampede: thundering herd, mutex lock, probabilistic early expiry | Caching
8 | Connection pool exhaustion: queue vs reject, backpressure, health checks | Performance
9 | Load balancing algorithms: round-robin, least-conn, consistent hashing | Scaling
10 | Stateless design: externalizing state for horizontal scaling | Scaling
Ph5
Event-Driven Architecture
Intermediate Requires Ph2, Ph4 📄 M13 Notes
  • Why event-driven: temporal decoupling (producer/consumer run independently), fanout (one event → many consumers), audit log (full history replayable), reduces synchronous blocking chains, enables eventual consistency
  • Message queues vs event streams: RabbitMQ (work queue model — message consumed and deleted, at-most-once or at-least-once via acks, competing consumers, dead-letter exchange) vs Kafka (persistent log — messages retained, consumer groups replay from offset, unlimited retention)
  • Kafka internals: topic partitioned across brokers, each partition is an ordered immutable log; leader partition + replicas (In-Sync Replicas ISR); producer assigns partition (key hash or round-robin); consumer group — each partition consumed by exactly one consumer in group; offset committed by consumer
  • Kafka delivery semantics: at-most-once (acks=0, fire-and-forget), at-least-once (acks=all + retry — may duplicate), exactly-once (idempotent producer + transactions — enable.idempotence=true + transactional.id)
  • RabbitMQ patterns: direct exchange (routing key match), topic exchange (routing key pattern *.error), fanout exchange (broadcast to all bound queues), headers exchange; dead-letter exchange (DLX) for failed messages; message TTL; priority queues
  • Saga pattern: managing distributed transactions without 2PC; orchestration (central Saga Orchestrator sends commands, receives events, handles compensations), choreography (each service reacts to events and emits new events); compensating transactions roll back completed steps
  • Outbox pattern: write event to outbox table in same DB transaction as business data (atomicity), separate Relay/CDC process polls outbox and publishes to broker, mark as published; prevents lost events on crash between DB write and broker publish
  • CQRS (Command Query Responsibility Segregation): write side (commands mutate state, normalized DB optimized for writes), read side (queries return projections, denormalized read model optimized for reads); sync via domain events or CDC; eventual consistency between models
  • Event Sourcing: system state = ordered log of immutable domain events (not current state snapshot); reconstruct any past state by replaying events; snapshots for performance (don't replay full history); projections for derived read models; event schema versioning challenge
  • Idempotent consumers: natural idempotency (PUT/DELETE — repeated calls have same effect), deduplication table (store processed event IDs, reject duplicates), atomic check-and-process with DB transaction; combine with outbox for exactly-once end-to-end
Kafka RabbitMQ librdkafka Apache Pulsar NATS
/* Kafka producer using librdkafka (C client) */
#include <librdkafka/rdkafka.h>
#include <string.h>
#include <stdio.h>

static void delivery_cb(rd_kafka_t *rk, const rd_kafka_message_t *msg,
                         void *opaque) {
    (void)rk; (void)opaque;
    if (msg->err)
        fprintf(stderr, "Delivery failed: %s\n",
                rd_kafka_err2str(msg->err));
}

void produce_event(const char *brokers, const char *topic,
                   const char *key,    const char *value) {
    char errstr[512];

    rd_kafka_conf_t *conf = rd_kafka_conf_new();
    rd_kafka_conf_set(conf, "bootstrap.servers", brokers,
                      errstr, sizeof(errstr));
    rd_kafka_conf_set_dr_msg_cb(conf, delivery_cb);

    rd_kafka_t *rk = rd_kafka_new(RD_KAFKA_PRODUCER, conf,
                                  errstr, sizeof(errstr));
    rd_kafka_topic_t *rkt = rd_kafka_topic_new(rk, topic, NULL);

retry:
    if (rd_kafka_produce(rkt,
            RD_KAFKA_PARTITION_UA,    /* auto-select partition by key hash */
            RD_KAFKA_MSG_F_COPY,      /* copy payload into rdkafka */
            (void *)value, strlen(value),
            key, strlen(key),
            NULL) == -1) {
        if (rd_kafka_last_error() == RD_KAFKA_RESP_ERR__QUEUE_FULL) {
            rd_kafka_poll(rk, 100);   /* drain delivery queue */
            goto retry;
        }
    }

    rd_kafka_flush(rk, 10000);        /* wait up to 10s for delivery */
    rd_kafka_topic_destroy(rkt);
    rd_kafka_destroy(rk);
}
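The idempotent-consumer bullet describes a dedup table of processed event IDs. An in-memory sketch of the check-and-mark step (the `check_and_mark` helper is illustrative; in production the seen-set lives in the database and is checked and inserted in the same transaction as the side effect, pairing with the outbox pattern for end-to-end exactly-once):

```c
/* Idempotent consumer sketch: dedup of processed event IDs.
   A real implementation persists this set transactionally.   */
#include <string.h>

#define MAX_SEEN 1024
#define ID_LEN   64

static char seen[MAX_SEEN][ID_LEN];
static int  seen_count = 0;

/* Returns 0 the first time an event_id is seen (caller processes it),
   1 on any redelivery (caller skips — the effect already happened). */
int check_and_mark(const char *event_id) {
    for (int i = 0; i < seen_count; i++)     /* sketch: linear scan */
        if (strcmp(seen[i], event_id) == 0)
            return 1;                        /* duplicate delivery */
    if (seen_count < MAX_SEEN) {
        strncpy(seen[seen_count], event_id, ID_LEN - 1);
        seen[seen_count][ID_LEN - 1] = '\0';
        seen_count++;
    }
    return 0;                                /* first delivery */
}
```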
# | Concept | Category
1 | Why events: temporal decoupling, audit log, fanout, replay capability | Architecture
2 | Message queue vs event stream: RabbitMQ (work queue, delete on consume) vs Kafka (log, retain+replay) | Architecture
3 | Kafka internals: topics, partitions, offsets, ISR, consumer group rebalancing | Kafka
4 | Delivery semantics: at-most-once, at-least-once, exactly-once (idempotent producer) | Kafka
5 | RabbitMQ patterns: exchange types, DLX, message TTL | Events
6 | Saga pattern: orchestration vs choreography, compensating transactions | Patterns
7 | Outbox pattern: atomic write to outbox table, relay publishes to broker | Patterns
8 | CQRS: separate write model (commands) from read model (projections) | Patterns
9 | Event Sourcing: state as event log, snapshots, projections, schema versioning | Patterns
10 | Idempotent consumers: dedup table, atomic check-and-process | Patterns
Ph6
Microservices & Infrastructure
Advanced Requires Ph3, Ph5 📄 M15 Notes
  • Monolith vs microservices decision: start with modular monolith, split when team topology demands it (Conway's Law), bounded contexts (DDD) define service boundaries; microservices add operational complexity — don't split prematurely
  • Strangler Fig pattern: incrementally replace monolith — route specific URL paths to new service at API gateway, coexist with monolith during migration, deprecate monolith module by module; avoids big-bang rewrite risk
  • Inter-service communication strategy: sync REST/gRPC (simple, tight coupling, propagates latency) vs async events (loose coupling, eventual consistency, harder to debug); request-reply over async via correlation ID in message header
  • API Gateway responsibilities: single entry point, path-based routing to backend services, authentication/authorization offload (validate JWT before forwarding), rate limiting and throttling, SSL termination, request aggregation (backend for frontend pattern), canary traffic splitting
  • Service discovery: client-side (service queries registry like Consul/Eureka, client chooses instance — more control), server-side (load balancer queries registry — simpler client), DNS-based (Kubernetes Services use kube-dns)
  • Circuit breaker: closed state (normal, count failures), open state (fail fast immediately — no calls to unhealthy service, prevents cascade), half-open state (allow probe requests to test recovery); bulkhead pattern (isolate resource pools per service)
  • Docker best practices for C/C++: multi-stage build (Stage 1: gcc:13 builder compiles binary, Stage 2: debian:slim runtime copies binary — minimal image size), non-root user (useradd -r), .dockerignore (exclude build artifacts), pin base image versions, ENTRYPOINT vs CMD
  • Kubernetes fundamentals: Pod (smallest deployable unit, co-located containers), Deployment (manages ReplicaSet, rolling updates, rollback), Service (stable DNS name + ClusterIP load balancing), Ingress (HTTP/S routing + TLS termination), ConfigMap (non-secret config), Secret (base64-encoded credentials), liveness probe (restart if unhealthy), readiness probe (remove from Service endpoints if not ready)
  • CI/CD pipeline stages: lint → unit test → integration test → build OCI image → push to registry → deploy to staging → smoke test → deploy to production; blue-green (two identical environments, instant cutover); canary (route 5% → 20% → 100% traffic to new version)
  • 12-Factor App: I-Codebase (one repo, many deploys), II-Dependencies (explicitly declared), III-Config (env vars, not hardcoded), IV-Backing services (attached resources, swap without code change), V-Build/release/run (strict separation), VI-Processes (stateless, share nothing), VII-Port binding, VIII-Concurrency (scale out via process model), IX-Disposability (fast startup, graceful shutdown), X-Dev/prod parity, XI-Logs (stdout, not files), XII-Admin processes
Docker Kubernetes Consul Nginx Helm GitHub Actions ArgoCD
# -- Stage 1: Build (fat image with full toolchain) --
FROM gcc:13 AS builder
WORKDIR /src

# Copy source and build system first (layer caching)
COPY Makefile ./
COPY src/     ./src/

# Build release binary (strip debug symbols)
RUN make release CFLAGS="-O2 -DNDEBUG" && strip bin/server

# -- Stage 2: Minimal runtime --
FROM debian:bookworm-slim
# curl is needed at runtime by the HEALTHCHECK below
RUN apt-get update \
 && apt-get install -y --no-install-recommends libpq5 ca-certificates curl \
 && rm -rf /var/lib/apt/lists/*

# Non-root user for security
RUN useradd -r -u 1001 -s /sbin/nologin appuser
WORKDIR /app
COPY --from=builder /src/bin/server .
USER appuser

EXPOSE 8080
HEALTHCHECK --interval=30s --timeout=3s \
  CMD curl -f http://localhost:8080/health || exit 1

ENTRYPOINT ["./server"]
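The circuit-breaker states from the bullets (closed → open → half-open) fit a small state machine. A sketch in C, with timestamps injected rather than read from time() so the transitions are easy to test (all names are illustrative; a production breaker would also track a rolling error rate and be thread-safe):

```c
/* Circuit breaker sketch:
   CLOSED -> OPEN after `threshold` consecutive failures,
   OPEN -> HALF_OPEN after `cooldown_s`,
   HALF_OPEN -> CLOSED on success, back to OPEN on failure. */
typedef enum { CB_CLOSED, CB_OPEN, CB_HALF_OPEN } cb_state;

typedef struct {
    cb_state state;
    int      failures;    /* consecutive failures while CLOSED */
    int      threshold;   /* failures before tripping */
    long     opened_at;   /* when the breaker tripped (seconds) */
    long     cooldown_s;  /* how long to stay OPEN */
} breaker;

/* Should this request be attempted at time `now`? */
int cb_allow(breaker *b, long now) {
    if (b->state == CB_OPEN) {
        if (now - b->opened_at >= b->cooldown_s) {
            b->state = CB_HALF_OPEN;   /* let one probe through */
            return 1;
        }
        return 0;                      /* fail fast, protect the callee */
    }
    return 1;                          /* CLOSED or HALF_OPEN probe */
}

/* Record the outcome of an attempted call */
void cb_report(breaker *b, int success, long now) {
    if (success) {
        b->failures = 0;
        b->state = CB_CLOSED;
        return;
    }
    if (b->state == CB_HALF_OPEN || ++b->failures >= b->threshold) {
        b->state = CB_OPEN;
        b->failures = 0;
        b->opened_at = now;
    }
}
```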
# | Concept | Category
1 | Monolith vs microservices: Conway's Law, bounded contexts, modular monolith first | Architecture
2 | Strangler Fig: incremental migration via API gateway routing | Architecture
3 | Sync vs async inter-service communication: tradeoffs, correlation ID pattern | Communication
4 | API Gateway: routing, auth offload, rate limiting, BFF pattern | Infra
5 | Service discovery: client-side (Consul) vs server-side vs DNS-based (K8s) | Infra
6 | Circuit breaker: closed/open/half-open states, bulkhead pattern | Reliability
7 | Docker multi-stage build for C/C++: builder → slim runtime, non-root user | Docker
8 | Kubernetes: Pod, Deployment, Service, Ingress, liveness vs readiness probe | K8s
9 | CI/CD: pipeline stages, blue-green deployment, canary traffic splitting | CI/CD
10 | 12-Factor App: config via env, stateless processes, stdout logs | Best Practices
Ph7
Observability & Hardening
Production Requires Ph6 📄 M17 Notes
  • 3 pillars of observability: logs (discrete events — what happened), metrics (aggregated numeric data — how many/how fast), traces (causal chains across services — why it was slow); each answers different questions; together give full system visibility
  • Structured logging: JSON lines format (one JSON object per line), mandatory fields (timestamp ISO-8601, level, service, trace_id, span_id, message), log levels (DEBUG verbose, INFO normal, WARN degraded, ERROR unexpected failure, FATAL unrecoverable), never log secrets or PII, use correlation/trace IDs to link logs across services
  • Metrics types: counter (monotonically increasing, e.g., http_requests_total — use rate()), gauge (point-in-time value, e.g., memory_usage_bytes, active_connections), histogram (bucketed distribution, e.g., request_duration_seconds — use histogram_quantile() for p99)
  • RED method (for services): Rate (requests/second), Errors (error rate %), Duration (latency percentiles p50/p95/p99); USE method (for resources): Utilization (% time busy), Saturation (queue depth, wait time), Errors (device error rate)
  • Prometheus: pull-based scraping (Prometheus polls /metrics endpoint on services), exposition format (# HELP, # TYPE, metric_name{labels} value timestamp), PromQL (rate(http_requests_total[5m]), histogram_quantile(0.99, ...), by(service)), AlertManager for alerting rules and routing
  • Distributed tracing: trace (end-to-end request chain, unique trace_id), span (single operation within trace, span_id + parent_span_id), W3C traceparent header for cross-service propagation, OpenTelemetry SDK (language-agnostic instrumentation, OTLP export to Jaeger/Tempo/Zipkin)
  • Health check endpoints: GET /health/live — liveness probe (is process alive? if fails, Kubernetes restarts container), GET /health/ready — readiness probe (is service ready to serve traffic? if fails, removed from Service endpoints); startup probe for slow-starting containers
  • Rate limiting algorithms: token bucket (bucket refills at rate r, allow bursts up to capacity b — bursty traffic ok), sliding window log (store timestamps of all requests, exact but memory O(requests)), sliding window counter (approximate, memory O(1), compromise); implement at API gateway (global) and per-service (defense in depth)
  • OWASP Top 10 for backends: SQL injection (parameterized queries only — never string concat), command injection (avoid shell=True / system(), use execv), SSRF (Server-Side Request Forgery — allowlist outbound URLs), broken access control (check authorization on every request, not just auth), security misconfiguration (disable debug endpoints in prod, no default credentials), insecure deserialization (validate and sanitize all deserialized input)
  • Input validation: allowlist over denylist (define what is allowed, reject everything else), validate at trust boundaries only (never trust client input), size limits (prevent DoS via large payloads — max body size), type checking, sanitize before SQL/shell/HTML context
  • Secrets management: never in source code or Docker images (scan with truffleHog/gitleaks), environment variables (basic, visible in /proc/PID/environ — acceptable for containers), HashiCorp Vault (dynamic secrets with TTL + auto-rotation, audit log, fine-grained policies), AWS Secrets Manager / GCP Secret Manager; secret rotation strategy
  • Graceful shutdown: catch SIGTERM (Kubernetes sends this before SIGKILL after terminationGracePeriodSeconds), stop accepting new connections (close listen socket or remove from load balancer), drain in-flight requests (atomic counter), close DB connection pool, deregister from service discovery, log completion; target: shutdown in < terminationGracePeriodSeconds (default 30s)
Prometheus Grafana Jaeger OpenTelemetry HashiCorp Vault AlertManager
/* Graceful shutdown via SIGTERM — C implementation */
#include <signal.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

static atomic_int  in_flight        = 0;
static atomic_bool shutdown_req     = false;

static void handle_sigterm(int sig) {
    (void)sig;
    atomic_store(&shutdown_req, true);
}

void register_signals(void) {
    struct sigaction sa = { .sa_handler = handle_sigterm };
    sigemptyset(&sa.sa_mask);
    sigaction(SIGTERM, &sa, NULL);
    sigaction(SIGINT,  &sa, NULL);   /* also handle Ctrl-C */
}

/* Called at start of each request handler */
void request_begin(void) { atomic_fetch_add(&in_flight, 1); }

/* Called at end of each request handler */
void request_end(void)   { atomic_fetch_sub(&in_flight, 1); }

int main(void) {
    register_signals();
    /* ... start server, accept connections ... */

    /* Main loop — stop accepting when shutdown requested */
    while (!atomic_load(&shutdown_req)) {
        /* accept() new connections */
    }

    /* Drain: wait for all in-flight requests to complete */
    fprintf(stderr, "[shutdown] draining %d in-flight requests\n",
            atomic_load(&in_flight));
    while (atomic_load(&in_flight) > 0)
        usleep(5000);   /* poll every 5ms */

    /* Close DB pools, deregister from service discovery */
    fprintf(stderr, "[shutdown] clean exit\n");
    return 0;
}
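The token-bucket algorithm from the rate-limiting bullet takes only a few lines: refill proportionally to elapsed time, cap at capacity, spend one token per request. A sketch with injected timestamps for testability (names are illustrative; a real limiter would guard the bucket with a mutex or run it atomically in Redis via Lua):

```c
/* Token bucket sketch: refills at `rate` tokens/sec up to `capacity`;
   each request consumes one token. Bursts up to `capacity` are allowed. */
typedef struct {
    double tokens;        /* current fill level */
    double capacity;      /* burst ceiling */
    double rate;          /* refill rate, tokens per second */
    double last_refill;   /* timestamp of last refill (seconds) */
} bucket;

/* Returns 1 if the request is allowed at time `now`, 0 if rate-limited */
int bucket_allow(bucket *b, double now) {
    double elapsed = now - b->last_refill;
    b->tokens += elapsed * b->rate;                  /* continuous refill */
    if (b->tokens > b->capacity) b->tokens = b->capacity;
    b->last_refill = now;

    if (b->tokens >= 1.0) {
        b->tokens -= 1.0;
        return 1;
    }
    return 0;
}
```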
# | Concept | Category
1 | 3 pillars: logs (events), metrics (aggregates), traces (causal chains) — what each answers | Observability
2 | Structured logging: JSON lines, mandatory fields, log levels, trace_id correlation | Observability
3 | Metric types: counter (rate()), gauge, histogram (histogram_quantile p99) | Metrics
4 | RED method (Rate, Errors, Duration) and USE method (Utilization, Saturation, Errors) | Metrics
5 | Prometheus: pull-based scraping, exposition format, PromQL, AlertManager | Metrics
6 | Distributed tracing: trace/span model, W3C traceparent header, OpenTelemetry | Tracing
7 | Health checks: liveness (restart) vs readiness (remove from LB) vs startup probe | Reliability
8 | Rate limiting: token bucket, sliding window log, sliding window counter | Performance
9 | OWASP Top 10: SQL injection, SSRF, broken access control, security misconfiguration | Security
10 | Input validation: allowlist, trust boundaries, size limits, sanitization | Security
11 | Secrets management: Vault dynamic secrets, never in code/images, rotation | Security
12 | Graceful shutdown: SIGTERM handler, drain in-flight, close pools, deregister | Reliability