SYSTEM DESIGN MASTERY · TRACK B · MODULE B11 · WEEK 21
ACID · 2PC · SAGA · OUTBOX · IDEMPOTENCY
Distributed Systems Consistency · Saga Pattern · Compensation

ACID, Distributed Transactions & Saga

ISOLATION LEVELS · 2-PHASE COMMIT · SAGA PATTERN
CHOREOGRAPHY vs ORCHESTRATION · OUTBOX · IDEMPOTENCY
Contents: ACID · Isolation Levels · 2PC Problems · Saga Pattern · Choreography · Orchestration · Outbox Pattern · Idempotency
ACID Properties
The four guarantees of a reliable database transaction
A
ATOMICITY
All operations succeed together, or all fail together. Money cannot leave account A without arriving at account B.
Mechanism:
Write-ahead log (WAL)
Rollback on failure
C
CONSISTENCY
Every transaction takes the DB from one valid state to another. Constraints and rules are never violated during a transaction.
Mechanism:
CHECK constraints
Foreign key rules
Application logic
I
ISOLATION
Concurrent transactions don't see each other's in-flight changes. Each transaction sees a consistent snapshot.
Mechanism:
MVCC (Postgres)
Locking (MySQL)
See: isolation levels
D
DURABILITY
Committed data survives crashes. Once COMMIT returns, it's on disk. Server crash immediately after cannot lose the commit.
Mechanism:
fsync to WAL
Battery-backed cache
Replication
The one that matters most in interviews: Isolation — specifically the trade-offs between isolation levels. Most bugs in distributed systems come from incorrect assumptions about isolation, not from atomicity or durability failures.
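The atomicity and consistency mechanisms above fit in a few lines of Python's stdlib sqlite3 — a minimal sketch with illustrative table and account names: a CHECK constraint (consistency) aborts an invalid transfer, and the transaction rolls back both legs together (atomicity), so money never leaves one account without arriving at the other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE accounts (
    name TEXT PRIMARY KEY,
    balance INTEGER CHECK (balance >= 0))""")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

def transfer(amount):
    try:
        with conn:  # one transaction: commit on success, rollback on any error
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name='alice'", (amount,))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name='bob'", (amount,))
    except sqlite3.IntegrityError:
        pass  # CHECK fired → the whole transfer rolled back, not just one leg

transfer(150)  # would drive alice to -50 → CHECK violation → full rollback
print(conn.execute("SELECT name, balance FROM accounts ORDER BY name").fetchall())
# [('alice', 100), ('bob', 0)] — money neither left alice nor reached bob
```

A valid `transfer(40)` commits both updates atomically; the invalid one leaves no trace.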
Isolation Levels
Four anomalies, four levels — each level prevents a subset of anomalies
Four isolation anomalies — know these cold · CONCURRENCY BUGS
// 1. DIRTY READ: read uncommitted data that later rolls back
T1: UPDATE balance = 50 WHERE user='alice'  (not yet committed)
T2: SELECT balance FROM users WHERE user='alice'  → 50  (dirty!)
T1: ROLLBACK  → balance never changed to 50
T2 read data that never existed.

// 2. NON-REPEATABLE READ: same row returns different values in same transaction
T1: SELECT balance FROM users WHERE user='alice'  → 100
T2: UPDATE balance = 200 WHERE user='alice'; COMMIT
T1: SELECT balance FROM users WHERE user='alice'  → 200  (changed!)

// 3. PHANTOM READ: new rows appear in a repeated range query
T1: SELECT COUNT(*) FROM orders WHERE status='pending'  → 5
T2: INSERT INTO orders (status) VALUES ('pending'); COMMIT
T1: SELECT COUNT(*) FROM orders WHERE status='pending'  → 6  (phantom!)

// 4. LOST UPDATE: two transactions overwrite each other's changes
T1: READ balance = 100
T2: READ balance = 100
T1: WRITE balance = 110  (+10)
T2: WRITE balance = 120  (+20, but only saw original 100 — lost T1's update!)
// Should be 130. T2 overwrote T1. 10 dollars vanished.
ISOLATION LEVEL     DIRTY READ     NON-REPEAT READ   PHANTOM        LOST UPDATE     DEFAULT FOR
Read Uncommitted    ✗ allowed      ✗ allowed         ✗ allowed      ✗ allowed       Rarely used
Read Committed      ✓ prevented    ✗ allowed         ✗ allowed      ✗ allowed       PostgreSQL
Repeatable Read     ✓ prevented    ✓ prevented       ✗ allowed      ✓ prevented*    MySQL InnoDB
Serializable        ✓ prevented    ✓ prevented       ✓ prevented    ✓ prevented     Max consistency

* Snapshot-based Repeatable Read (e.g. PostgreSQL) aborts the second writer; MySQL InnoDB's
  Repeatable Read does not prevent lost updates unless you lock with SELECT ... FOR UPDATE.
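The lost-update row from the table above can be simulated in plain Python — a sketch with a dict standing in for the row and the interleaving hand-ordered, showing exactly how T1's write disappears and how an atomic read-modify-write fixes it:

```python
balance = {"alice": 100}

# Both transactions read BEFORE either writes — the classic interleaving:
t1_read = balance["alice"]        # T1: READ balance = 100
t2_read = balance["alice"]        # T2: READ balance = 100 (before T1 writes)
balance["alice"] = t1_read + 10   # T1: WRITE balance = 110
balance["alice"] = t2_read + 20   # T2: WRITE balance = 120 — overwrites T1
assert balance["alice"] == 120    # should be 130; T1's +10 was lost

# The fix databases use: make the increment a single atomic statement
# (UPDATE ... SET balance = balance + ?) or lock the row first
# (SELECT ... FOR UPDATE), so each write sees the previous result:
balance["alice"] = 100
balance["alice"] += 10            # T1's increment applied to the current value
balance["alice"] += 20            # T2's increment sees T1's result
assert balance["alice"] == 130
```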
The Distributed Transaction Problem
A single business operation spans multiple databases — no ACID across them
Order placement: 4 services, 4 databases, 1 business operation · THE PROBLEM
// A customer places an order. This requires:
// 1. Order Service     → INSERT order  INTO orders_db
// 2. Payment Service   → UPDATE balance IN payment_db
// 3. Inventory Service → UPDATE stock   IN inventory_db
// 4. Notification      → send email     via external SMTP

// These are FOUR SEPARATE DATABASES. There is NO single transaction spanning them.
// What happens if Step 3 fails after Steps 1 and 2 succeed?

BEGIN TRANSACTION on orders_db:
  INSERT INTO orders ...    ✓ committed
  
BEGIN TRANSACTION on payment_db:
  UPDATE balance - $100 ... ✓ committed  ← alice is charged
  
BEGIN TRANSACTION on inventory_db:
  UPDATE stock - 1 ...      ✗ FAILS  ← item out of stock!

// Alice was charged $100 but cannot receive her item.
// Order DB shows order created. Payment DB shows deduction. Inventory unchanged.
// System is in an INCONSISTENT state across services.
The fundamental issue: You cannot have a single ACID transaction that spans two separate database servers. Network partitions make it impossible to guarantee atomicity across DBs. Every distributed system must choose: 2-Phase Commit (consistency, but blocks), or Saga (eventual consistency, but available).
Two-Phase Commit (2PC)
Classic distributed transaction — correct but fragile
// 2PC HAPPY PATH
Phase 1 — PREPARE
Coordinator
──PREPARE──→
Order Service, Payment Service, Inventory Service
Each Participant
Executes transaction locally (does NOT commit). Writes PREPARE to WAL. Replies READY.
Coordinator
←──READY───
All 3 reply READY → proceed to Phase 2
Phase 2 — COMMIT
Coordinator
──COMMIT──→
All participants
Each Participant
Commits local transaction. Releases locks. Replies DONE.
// 2PC FAILURE — COORDINATOR CRASHES AFTER PREPARE
Coordinator
──PREPARE──→
All participants reply READY. Coordinator writes COMMIT decision to log.
💥 COORDINATOR CRASHES before sending COMMIT
Participants
Stuck in PREPARED state — holding locks, cannot commit OR rollback.
System
FROZEN until coordinator recovers. All participants block indefinitely.
2PC's fatal flaw: It is a blocking protocol. If the coordinator crashes after sending PREPARE but before sending COMMIT, all participants are in an "uncertain" state — they hold locks and cannot proceed without hearing from the coordinator. Recovery requires coordinator restart, which may take minutes. During that time, the system is frozen.
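The two phases above can be sketched as a toy in-memory protocol — no network, WAL, or crash recovery; class and method names are illustrative, not any real library. Note that the dangerous coordinator-crash window sits exactly between the two loops in `two_phase_commit`:

```python
class Participant:
    def __init__(self, name, will_prepare=True):
        self.name = name
        self.will_prepare = will_prepare
        self.state = "INIT"

    def prepare(self):
        # Phase 1: execute locally, write PREPARE to WAL, hold locks — no commit yet
        self.state = "PREPARED" if self.will_prepare else "ABORTED"
        return self.state == "PREPARED"

    def commit(self):    # Phase 2, on unanimous READY
        self.state = "COMMITTED"

    def rollback(self):  # Phase 2, on any NO vote
        self.state = "ABORTED"

def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]   # Phase 1 — PREPARE
    # ← a coordinator crash HERE leaves every PREPARED participant blocked
    if all(votes):
        for p in participants:                    # Phase 2 — COMMIT
            p.commit()
        return "COMMITTED"
    for p in participants:                        # Phase 2 — global ROLLBACK
        p.rollback()
    return "ABORTED"

services = [Participant("order"), Participant("payment"), Participant("inventory")]
print(two_phase_commit(services))  # COMMITTED

services = [Participant("order"), Participant("payment"),
            Participant("inventory", will_prepare=False)]
print(two_phase_commit(services))  # ABORTED — the prepared participants roll back
```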
The Saga Pattern
Local transactions + compensating transactions — no distributed locks
// SAGA HAPPY PATH — Order placement
1
Order Service
Create order (status: PENDING)
→ publishes "OrderCreated"
2
Payment Service
Deduct $100 from alice
→ publishes "PaymentProcessed"
3
Inventory Service
Reserve item (stock - 1)
→ publishes "InventoryReserved"
4
Order Service
Update order (status: CONFIRMED)
→ publishes "OrderConfirmed"
5
Notification
Send confirmation email
→ done (pivot transaction)
// SAGA COMPENSATION PATH — Inventory fails at step 3
1✓
Order Service
Order created — done
2✓
Payment Service
Payment deducted — done
3✗
Inventory Service
FAILS — item out of stock
→ publishes "InventoryFailed"
C2
Payment Service
COMPENSATE: refund $100 to alice
→ publishes "PaymentRefunded"
C1
Order Service
COMPENSATE: cancel order (status: CANCELLED)
→ publishes "OrderCancelled"
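The happy path and compensation path above can be condensed into one minimal orchestration sketch — `run_saga` and the step tuples are illustrative stand-ins, not a framework API. Each step carries its compensation event, and a failure replays compensations in reverse order (C2 before C1):

```python
def run_saga(steps, fail_at=None):
    """steps: list of (event, compensation_event); returns the event log."""
    log, completed = [], []
    for event, compensation in steps:
        if event == fail_at:
            log.append(event + "Failed")
            for _, comp in reversed(completed):   # compensate in reverse order
                log.append(comp)
            return log
        log.append(event)
        completed.append((event, compensation))
    return log

steps = [("OrderCreated",      "OrderCancelled"),
         ("PaymentProcessed",  "PaymentRefunded"),
         ("InventoryReserved", "InventoryReleased"),
         ("OrderConfirmed",    None)]

print(run_saga(steps))
# ['OrderCreated', 'PaymentProcessed', 'InventoryReserved', 'OrderConfirmed']
print(run_saga(steps, fail_at="InventoryReserved"))
# ['OrderCreated', 'PaymentProcessed', 'InventoryReservedFailed',
#  'PaymentRefunded', 'OrderCancelled']
```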
Choreography vs Orchestration
Two ways to coordinate a Saga — choose based on complexity and observability needs
Choreography
EVENT-DRIVEN · NO COORDINATOR
Services react to events independently. Each service subscribes to relevant events, performs its action, publishes the next event. No central conductor.
✓ Loose coupling — services don't know each other
✓ No SPOF coordinator service
✓ Easy to add new steps
✗ No single view of saga state
✗ Hard to debug distributed event chains
✗ Cyclic dependencies can emerge
Orchestration ★
CENTRAL COORDINATOR · COMMAND-DRIVEN
A dedicated Saga Orchestrator commands each service step-by-step. Orchestrator tracks state, handles failures, issues compensations. One service to reason about.
✓ Clear saga state in one place
✓ Easy to debug and add observability
✓ Handles complex conditional flows
✗ Orchestrator is potential SPOF
✗ More coupling to orchestrator
✗ Can become a God service
Interview recommendation: "For simple linear flows with few services, choreography is elegant. For complex conditional flows or anything requiring strong observability (e.g., payment processing), I'd use orchestration — the ability to answer 'what state is this saga in?' is invaluable in production."
Compensating Transactions
Forward-moving corrections — not rollbacks
Key insight: Compensating transactions are NOT rollbacks. You cannot un-send an email or un-charge a credit card. Compensation is a new, forward-moving transaction that corrects the previous one (refund, cancel, unreserve).
Compensation design — every step must have a compensating step · DESIGN
// For every Saga step, design its compensation BEFORE building the step:

Step 1: Create Order        → Compensation: Cancel Order
Step 2: Deduct Payment      → Compensation: Issue Refund
Step 3: Reserve Inventory   → Compensation: Release Reservation
Step 4: Book Shipping Slot  → Compensation: Cancel Booking
Step 5: Send Email          → Compensation: Send Cancellation Email
// (cannot un-send; notify user of cancellation instead)

// Pivot transaction: the step after which compensation is impossible/impractical
// Design: put pivot transaction AS LATE AS POSSIBLE in the saga
// Notifications, external API calls = typically pivot transactions

// Idempotency requirement:
// Compensation may run multiple times (network retry, at-least-once delivery)
// REFUND must be idempotent: cannot refund twice for one order
// Check: INSERT INTO refunds (order_id, amount) ON CONFLICT DO NOTHING
// Or: idempotency key = order_id → check before processing
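The ON CONFLICT idea above, sketched with stdlib sqlite3 (`INSERT OR IGNORE` is SQLite's equivalent of Postgres's `ON CONFLICT DO NOTHING`; the `refunds` table is illustrative). Retrying the compensation is harmless — the second insert is a no-op:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE refunds (order_id TEXT PRIMARY KEY, amount INTEGER)")

def refund(order_id, amount):
    # PRIMARY KEY on order_id makes a second refund attempt a silent no-op
    cur = db.execute("INSERT OR IGNORE INTO refunds VALUES (?, ?)", (order_id, amount))
    db.commit()
    return cur.rowcount == 1  # True only on the first attempt for this order

print(refund("order-123", 100))  # True  — refund recorded
print(refund("order-123", 100))  # False — at-least-once redelivery, deduplicated
```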
The Outbox Pattern
Atomic DB update + event publication — without distributed transactions
The bug without outbox — dual write problem · BUG
// Payment Service receives "ProcessPayment" command

// Step 1: update DB
UPDATE accounts SET balance = balance - 100 WHERE user='alice';
COMMIT;  ← succeeds

// 💥 SERVER CRASHES HERE

// Step 2: publish event
kafka.publish("PaymentProcessed", {...});  ← NEVER RUNS

// Result: alice's balance is deducted, but no "PaymentProcessed" event was published.
// The saga is stuck. Alice pays but gets nothing.
// Dual write problem: two separate systems (DB + Kafka) cannot be updated atomically.
Outbox pattern — atomic DB + event in single local transaction · SOLUTION
// Within a SINGLE LOCAL DB TRANSACTION:
BEGIN TRANSACTION;
  UPDATE accounts SET balance = balance - 100 WHERE user='alice';
  INSERT INTO outbox (id, event_type, payload, sent, created_at)
  VALUES (uuid(), 'PaymentProcessed', '{"order":"123","amount":100}', false, NOW());
COMMIT;  ← both update AND outbox row are atomic

// Separate "Outbox Relay" process (runs continuously):
while (true) {
  rows = db.query("SELECT * FROM outbox WHERE sent = false ORDER BY created_at LIMIT 100")
  for (row of rows) {
    kafka.publish(row.event_type, row.payload)  // at-least-once
    db.update("UPDATE outbox SET sent=true WHERE id=?", row.id)
  }
  sleep(100ms)
}

// Guarantee: if DB committed → event row exists → relay will publish it
// Relay may publish twice (retry on crash) → consumer must be idempotent
// Alternative: CDC (Debezium) watches DB transaction log → publishes to Kafka
TABLE: outbox — per-service, same DB as business tables

COLUMN       TYPE       PURPOSE
id           UUID       Unique event ID — used for idempotency on consumer side
event_type   VARCHAR    'PaymentProcessed', 'OrderCreated', etc.
payload      JSONB      Full event data to publish to Kafka
sent         BOOLEAN    false = pending publication, true = published
created_at   TIMESTAMP  Ordering — relay publishes in creation order
sent_at      TIMESTAMP  NULL until published — monitoring lag
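The relay loop above can be made runnable with stdlib sqlite3 and a stub broker — a sketch in which a plain list stands in for Kafka and a single relay pass replaces the `while (true)` loop; table and column names follow the outbox schema described above:

```python
import sqlite3, uuid

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE outbox (
    id TEXT PRIMARY KEY, event_type TEXT, payload TEXT,
    sent INTEGER DEFAULT 0, created_at INTEGER)""")
db.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
db.execute("INSERT INTO accounts VALUES ('alice', 200)")
db.commit()

published = []  # stand-in for kafka.publish

# Business update + outbox row in ONE local transaction — the whole point:
with db:
    db.execute("UPDATE accounts SET balance = balance - 100 WHERE name='alice'")
    db.execute("INSERT INTO outbox (id, event_type, payload, created_at) VALUES (?,?,?,?)",
               (str(uuid.uuid4()), "PaymentProcessed", '{"order":"123","amount":100}', 1))

def relay_once(batch=100):
    rows = db.execute("SELECT id, event_type, payload FROM outbox "
                      "WHERE sent = 0 ORDER BY created_at LIMIT ?", (batch,)).fetchall()
    for row_id, event_type, payload in rows:
        published.append((event_type, payload))  # may repeat on crash → at-least-once
        db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    db.commit()
    return len(rows)

print(relay_once())  # 1 — the pending event is published
print(relay_once())  # 0 — already marked sent; nothing to do
```

If the process dies between `published.append` and the `UPDATE`, the next pass republishes — which is exactly why consumers must be idempotent.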
Idempotency
Performing an operation multiple times = same result as once
Idempotency key pattern — Stripe's approach · PATTERN
// Client sends payment with idempotency key
POST /payments
Headers: Idempotency-Key: "order-123-payment-attempt-1"
Body:    {"amount": 100, "currency": "USD", "user": "alice"}

// Server logic:
function processPayment(idempotencyKey, amount, user) {
  // Check if already processed
  existing = db.query("SELECT result FROM idempotency_cache WHERE key = ?", idempotencyKey)
  if (existing) return existing.result  // return same result, don't charge again

  // Process payment
  result = stripe.charge(amount, user)

  // Store result with key. Note: the external charge cannot share a DB
  // transaction — production implementations INSERT the key row (unique
  // constraint) BEFORE charging, so two concurrent retries cannot both
  // reach stripe.charge.
  db.insert("INSERT INTO idempotency_cache (key, result, expires_at) VALUES (?,?,?)",
    idempotencyKey, result, now + 24h)

  return result
}

// If client retries (network timeout, didn't receive response):
// POST /payments with same Idempotency-Key → returns cached result, no double charge

// Idempotency key design:
// order_id + step_name = "order-123-payment"  (scoped to specific operation)
// UUID per attempt = allows retry after timeout, prevents replay after success
// TTL: 24 hours (after which key expires; client should create new order)
Why idempotency is non-negotiable in Sagas: Kafka delivers at-least-once. Your outbox relay may publish the same event twice. Network retries happen. Every step in a Saga must be idempotent — running it twice must produce the same outcome as running it once. This is not optional.
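A minimal consumer-side sketch of that rule — dedupe by event id (the outbox row's UUID) so a twice-delivered event is applied once; the in-memory set and handler name are illustrative:

```python
processed_ids = set()     # in production: a dedup table or Redis set with TTL
balance = {"alice": 100}

def handle_payment_event(event):
    if event["id"] in processed_ids:
        return "skipped"             # duplicate delivery — already applied
    balance["alice"] -= event["amount"]
    processed_ids.add(event["id"])   # in production: same txn as the update
    return "applied"

event = {"id": "evt-1", "amount": 30}
print(handle_payment_event(event))   # applied
print(handle_payment_event(event))   # skipped — deducted only once
print(balance["alice"])              # 70, not 40
```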
I
Isolation Level Analysis
~1 hr

For each scenario: name the isolation anomaly and the minimum isolation level to prevent it.

  1. Banking app: user checks balance while an incoming transfer is in-flight (UPDATE not yet committed)
  2. Ticket booking: two users simultaneously book the last seat — both see 1 seat available
  3. Report: SELECT SUM(revenue) while revenue rows are being inserted by other transactions
  4. Two users both read a stock price, both decide to buy, both increment a counter — final count is wrong

For each: explain whether you'd use MVCC (snapshot) isolation or pessimistic locking, and why.

II
Design a Payment Saga
~2 hrs

E-commerce checkout: Order → Payment (Stripe) → Inventory → Shipping → Notification

  1. Draw the happy path event/command sequence
  2. Draw the compensation path when Inventory fails at step 3
  3. Choose choreography or orchestration for this scenario — justify your choice
  4. Design the outbox table schema for the Payment Service
  5. Design the idempotency key for the "DeductPayment" step. What makes a good key?
III
2PC vs Saga: Money Transfer
~1.5 hrs
  1. Design a bank transfer ($100 from Alice to Bob) using 2PC. What are the 4 network roundtrips?
  2. What happens if the coordinator crashes after sending PREPARE but before COMMIT? Walk through exactly.
  3. Design the same transfer as a Saga. What is the compensating transaction for "debit Alice"?
  4. During the Saga, there's a brief window where Alice has been debited but Bob hasn't been credited. How do you handle this in the UI?
  5. For money transfer specifically — do you recommend 2PC or Saga? Justify for this exact use case.
IV
Full "Place Order" Saga Design
~3 hrs

Design the complete "Place Order" saga for an e-commerce platform. 5 services: Order, Payment, Inventory, Shipping, Notification.

  1. Choose orchestration or choreography. Draw the full state machine for the order entity.
  2. Design the Saga Orchestrator's state table schema (how does it track saga progress?)
  3. Outbox + Kafka: show how Payment Service publishes events atomically with DB updates.
  4. Idempotency throughout: design keys for each step. What happens if PaymentService receives the same "ProcessPayment" command twice?
  5. Failure scenarios:
    a) Payment network timeout (client doesn't know if charge succeeded)
    b) InventoryService crashes mid-reservation
    c) Orchestrator itself crashes with order in step 3
  6. How do you make the Orchestrator itself highly available?
MODULE B11 · ACID & SAGA
ACID: all 4 properties with concrete examples
4 isolation anomalies: dirty read, non-repeatable, phantom, lost update
Isolation levels: Read Committed (PG default), Repeatable Read (MySQL default)
Distributed transaction problem: no single ACID transaction across DBs
2PC: prepare phase + commit phase — correct but blocking
2PC fatal flaw: coordinator crash after PREPARE = participants frozen
Saga pattern: local transactions + events/commands + compensation
Saga happy path and compensation path — can draw both
Choreography: event-driven, no coordinator, loose coupling
Orchestration: central coordinator, clear state, conditional logic
Compensation: forward-moving (refund), not rollback — always idempotent
Pivot transaction: point of no return — place notifications last
Outbox pattern: DB update + event in same local transaction
Outbox relay: polls outbox, publishes to Kafka, marks sent — at-least-once
Idempotency key: check before processing, store result, TTL 24h
BASE vs ACID: "ACID within services, BASE between services"
✏️ Tasks I–III: isolation analysis, payment saga, 2PC vs Saga
✏️ Task IV (capstone): full Place Order saga design
§ NEXT MODULE
B12 — System Design Interview Framework & Mock Interviews
The 7-step framework in detail · Time allocation (45 min)
Back-of-envelope estimation practice · 6 full mock interviews
Common mistakes · What interviewers actually look for