Module 11 — Multi-lcore RX/TX Pipeline

Reference code — requires DPDK installed.

What you learn

How to wire together EAL (Module 08), mempool (Module 09), and port init (Module 10) into a complete multi-lcore packet processing pipeline — the skeleton of the DP application at runtime.

This is the structural blueprint. The RX lcore polls the NIC and distributes packets to worker lcores via rte_ring. Each worker parses the packet, applies policy, and forwards the packets that survive to a shared TX ring. The TX lcore drains that ring to the NIC.


Pipeline topology

NIC port 0
  │  (DMA fills mbufs from pool — Module 09)
  ↓
RX lcore (lcore 1)
  rte_eth_rx_burst(port=0, queue=0, mbufs, BURST_SIZE=32)
  │
  │  Round-robin distribute across worker rx_rings
  │
  ├── rx_ring[0] ──► Worker lcore 2
  ├── rx_ring[1] ──► Worker lcore 3
  ├── rx_ring[2] ──► Worker lcore 4
  └── rx_ring[3] ──► Worker lcore 5
                          │
           For each mbuf: │
             parse ETH/IP/UDP-TCP          (Module 05)
             DNS: dns_parse_message()       (Module 06)
             TLS: tls_extract_sni()         (Module 07)
             policy lookup                  (Module 17)
               ALLOW    → tx_ring
               DROP     → rte_pktmbuf_free()
               SINKHOLE → modify in-place → tx_ring  (Module 18)
                          │
                       tx_ring (shared MPSC)
                          │
TX lcore (lcore 6)
  rte_ring_dequeue_burst(tx_ring, mbufs, BURST_SIZE)
  rte_eth_tx_burst(port=0, queue=0, mbufs, nb)
  free unsent mbufs
  │
  ↓
NIC port 0 → wire
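
The round-robin step in the diagram can be modelled in plain C without DPDK. This is a minimal sketch, not the code in pipeline.c: struct rr_state and distribute_burst() are hypothetical names, and the per-worker counter stands in for what would be an rte_ring_enqueue() onto rx_ring[w] in the real pipeline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NB_WORKERS 4   /* matches the four rx_rings in the diagram */

/* Hypothetical distributor state kept by the RX lcore. */
struct rr_state {
    uint32_t next;               /* worker that receives the next packet */
    uint32_t sent[NB_WORKERS];   /* packets handed to each worker so far */
};

/* Hand out nb_rx packets one at a time, rotating across workers.
 * The rotation position survives across bursts, so load stays even
 * when a burst is not a multiple of NB_WORKERS. */
static void distribute_burst(struct rr_state *st, uint16_t nb_rx)
{
    assert(st != NULL);
    for (uint16_t i = 0; i < nb_rx; i++) {
        uint32_t w = st->next;
        st->sent[w]++;                    /* real code: enqueue onto rx_ring[w] */
        st->next = (w + 1) % NB_WORKERS;
    }
}
```

Keeping the rotation index in persistent state, rather than restarting at worker 0 for every burst, avoids biasing the load toward the low-numbered workers when bursts are short.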

Files

File        Purpose
pipeline.c  Full pipeline: RX/worker/TX lcore functions, ring setup, stats, shutdown
Makefile    DPDK build

Key concepts in the code

1. Each lcore has exactly one job

RX lcore:     ONLY calls rte_eth_rx_burst() + rte_ring_enqueue()
Worker lcore: ONLY does parse + policy + ring enqueue/dequeue
TX lcore:     ONLY calls rte_ring_dequeue() + rte_eth_tx_burst()

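If the RX lcore also runs policy logic, it polls less often, the NIC RX descriptor ring fills, and the NIC drops incoming packets. The only trace is the imissed counter in struct rte_eth_stats, so the loss is silent unless you monitor that counter.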

2. rte_pause() — never sleep()

if (nb_rx == 0) {
    rte_pause();   /* x86 PAUSE: hints CPU it's in a spin loop */
    continue;
}

sleep() and usleep() yield the thread to the OS scheduler, and the lcore may not resume for milliseconds. In that time the NIC RX descriptor ring fills and incoming packets are dropped.
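
The busy-poll pattern can be tried without DPDK. The sketch below is an illustration under assumptions: cpu_relax() is a hypothetical stand-in for rte_pause(), built on the compiler's x86 PAUSE builtin with simple fallbacks elsewhere, and fake_poll() plays the role of rte_eth_rx_burst() by returning packets only on every third call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for rte_pause(): a spin-wait hint that never enters the kernel. */
static inline void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();                    /* x86 PAUSE instruction */
#elif defined(__aarch64__)
    __asm__ volatile("yield" ::: "memory");
#else
    __asm__ volatile("" ::: "memory");         /* compiler barrier only */
#endif
}

/* Hypothetical poll callback: returns the number of packets received. */
typedef uint16_t (*poll_fn)(void *ctx);

/* Fake NIC: delivers 4 packets on every third poll, nothing otherwise. */
static uint16_t fake_poll(void *ctx)
{
    uint32_t *calls = ctx;
    (*calls)++;
    return (*calls % 3 == 0) ? 4 : 0;
}

/* Poll max_polls times without ever sleeping; count the empty polls. */
static uint32_t poll_loop(poll_fn poll, void *ctx, uint32_t max_polls,
                          uint32_t *idle)
{
    assert(poll != NULL && idle != NULL);
    uint32_t total = 0;
    *idle = 0;
    for (uint32_t i = 0; i < max_polls; i++) {
        uint16_t nb_rx = poll(ctx);
        if (nb_rx == 0) {
            cpu_relax();       /* stay on the core, just ease the spin */
            (*idle)++;
            continue;
        }
        total += nb_rx;
    }
    return total;
}
```

The loop stays on its core during idle polls, so the next packet is picked up on the very next iteration instead of after a scheduler wake-up.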

3. SPSC ring for RX→worker, MPSC for worker→TX

/* rx_ring: single producer (RX lcore), single consumer (one worker) */
rx_rings[i] = rte_ring_create(name, RING_SIZE, socket,
                               RING_F_SP_ENQ | RING_F_SC_DEQ);

/* tx_ring: multiple producers (all workers), single consumer (TX lcore) */
tx_ring = rte_ring_create("tx_ring", RING_SIZE, socket,
                          RING_F_SC_DEQ); /* MP enqueue is the default; flags = 0 would be MPMC */

SPSC is faster than MPMC (no CAS atomic needed — just head/tail with memory barriers). Use the most restrictive type you can justify.
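
To see why a single-producer/single-consumer ring needs no compare-and-swap, here is a minimal SPSC ring in C11 atomics. It is a teaching sketch, not DPDK's rte_ring implementation: struct spsc_ring, spsc_enqueue() and spsc_dequeue() are hypothetical names, the size is fixed at compile time, and all SPSC_SIZE slots are usable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SPSC_SIZE 8u                 /* must be a power of two */
#define SPSC_MASK (SPSC_SIZE - 1u)

struct spsc_ring {
    _Atomic uint32_t head;           /* written only by the producer */
    _Atomic uint32_t tail;           /* written only by the consumer */
    void *slots[SPSC_SIZE];
};

/* Producer side. Returns 0 on success, -1 if the ring is full. */
static int spsc_enqueue(struct spsc_ring *r, void *obj)
{
    assert(r != NULL);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == SPSC_SIZE)
        return -1;
    r->slots[head & SPSC_MASK] = obj;
    /* Release: the slot write becomes visible before the new head. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 0;
}

/* Consumer side. Returns 0 on success, -1 if the ring is empty. */
static int spsc_dequeue(struct spsc_ring *r, void **obj)
{
    assert(r != NULL && obj != NULL);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return -1;
    *obj = r->slots[tail & SPSC_MASK];
    /* Release: the slot read completes before the slot is handed back. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 0;
}
```

Each index has exactly one writer, so plain loads and stores with acquire/release ordering are enough. Once a second producer can race on head, the update has to become a compare-and-swap loop, which is the extra cost the multi-producer modes pay.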

4. unlikely() — branch prediction hints

if (unlikely(mbuf->nb_segs > 1))    /* rarely true at MTU=1500 */
    return DECISION_DROP;

if (unlikely(rte_ring_enqueue(...) != 0))  /* rarely true if ring sized correctly */
    rte_pktmbuf_free(m);

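unlikely(x) expands to __builtin_expect(!!(x), 0). The hint does not change the result of the test; it tells the compiler to lay out the code so that the expected path falls straight through. On the hot path (millions of packets/sec), branch mispredictions cost ~15 cycles each.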
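
A standalone version of the pattern, for experimenting outside DPDK. The macros are guarded so they do not clash with the ones DPDK provides in rte_branch_prediction.h. struct pkt, classify(), DECISION_ALLOW and the 14-byte minimum (one Ethernet header) are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* !!(x) normalizes any non-zero value to 1 before the hint is applied. */
#ifndef unlikely
#define unlikely(x) __builtin_expect(!!(x), 0)
#endif
#ifndef likely
#define likely(x)   __builtin_expect(!!(x), 1)
#endif

enum decision { DECISION_ALLOW, DECISION_DROP };

/* Hypothetical packet descriptor with only the fields this sketch needs. */
struct pkt {
    uint16_t nb_segs;   /* number of chained segments */
    uint16_t len;       /* total packet length in bytes */
};

/* Rare error cases are marked unlikely so the common ALLOW path
 * is the straight-line fall-through. */
static enum decision classify(const struct pkt *p)
{
    assert(p != NULL);
    if (unlikely(p->nb_segs > 1))   /* multi-segment: not handled here */
        return DECISION_DROP;
    if (unlikely(p->len < 14))      /* shorter than an Ethernet header */
        return DECISION_DROP;
    return DECISION_ALLOW;
}
```

The hints only pay off when the condition really is rare. Marking a branch unlikely when it is taken half the time moves the common code out of line and can make things slower.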

5. Free unsent TX mbufs — the leak you’ll hit

uint16_t nb_tx = rte_eth_tx_burst(port, queue, mbufs, nb);

/* MANDATORY: free what the NIC didn't send */
for (uint16_t i = nb_tx; i < nb; i++)
    rte_pktmbuf_free(mbufs[i]);

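rte_eth_tx_burst() may return nb_tx < nb when the NIC TX descriptor ring is full. The driver takes ownership of only the first nb_tx mbufs and frees them once they have been transmitted; the remaining nb - nb_tx still belong to the caller. If you don’t free them (or retry them), the pool slowly drains to zero and the pipeline stalls silently.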

This is one of the most common bugs in new DPDK code.
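
The size of the leak is easy to demonstrate with a mock. This sketch assumes nothing about the real driver: struct mock_pool counts free buffers, mock_tx_burst() is a hypothetical stand-in for rte_eth_tx_burst() that accepts at most tx_room packets, and send_burst() optionally applies the free loop shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock mempool: only tracks how many buffers are still free. */
struct mock_pool {
    uint32_t avail;
};

/* Stand-in for rte_eth_tx_burst(): accepts at most tx_room packets and
 * returns the accepted buffers to the pool, as a driver eventually would. */
static uint16_t mock_tx_burst(struct mock_pool *p, uint16_t tx_room,
                              uint16_t nb)
{
    uint16_t nb_tx = nb < tx_room ? nb : tx_room;
    p->avail += nb_tx;
    return nb_tx;
}

/* One burst: take nb buffers from the pool, try to send them, and
 * (if free_unsent is set) give back the ones the mock NIC refused. */
static void send_burst(struct mock_pool *p, uint16_t nb, uint16_t tx_room,
                       int free_unsent)
{
    assert(p != NULL && p->avail >= nb);
    p->avail -= nb;
    uint16_t nb_tx = mock_tx_burst(p, tx_room, nb);
    if (free_unsent)
        p->avail += (uint32_t)(nb - nb_tx);   /* the mandatory free loop */
}
```

With a 1024-buffer pool, bursts of 32 and room for only 24 per burst, the leaky variant loses 8 buffers per burst and would empty the pool after 128 bursts, while the variant that frees unsent buffers keeps the pool level constant.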

6. Shutdown ordering — drain the rings

Stop workers → wait for workers to exit →
Stop TX lcore (TX drains tx_ring before exiting) →
wait for TX →
rte_eth_dev_stop() → rte_eal_cleanup()

Never stop the TX lcore before workers. Workers may enqueue a final burst to tx_ring after the stop signal. If TX exits first, those packets are stuck in the ring — the client’s sinkhole response is never delivered.
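
The effect of the ordering can be shown with a single-threaded model. This is a deliberately simplified sketch with hypothetical names (struct shutdown_model, stop_workers(), stop_tx(), run_shutdown()): real lcores run concurrently and would be coordinated with stop flags and rte_eal_wait_lcore(), but the bookkeeping is the same.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct shutdown_model {
    uint32_t tx_ring_count;   /* packets sitting in tx_ring */
    uint32_t delivered;       /* packets the TX lcore handed to the NIC */
    int      tx_alive;        /* 1 while the TX lcore can still drain */
};

/* Workers see the stop flag, push one last burst, then exit. */
static void stop_workers(struct shutdown_model *m, uint32_t final_burst)
{
    m->tx_ring_count += final_burst;
}

/* TX sees its stop flag, drains the ring completely, then exits. */
static void stop_tx(struct shutdown_model *m)
{
    m->delivered += m->tx_ring_count;
    m->tx_ring_count = 0;
    m->tx_alive = 0;
}

/* Run the shutdown in the chosen order and return how many packets
 * are left stranded in tx_ring with nobody to drain them. */
static uint32_t run_shutdown(struct shutdown_model *m, int workers_first,
                             uint32_t final_burst)
{
    assert(m != NULL && m->tx_alive);
    if (workers_first) {
        stop_workers(m, final_burst);
        stop_tx(m);
    } else {
        stop_tx(m);
        stop_workers(m, final_burst);
    }
    return m->tx_ring_count;
}
```

Starting with 3 packets in the ring and a final worker burst of 5, the workers-first order delivers all 8, while the TX-first order delivers 3 and strands 5.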


Next module

Module 12 — rte_hash CRUD: Deep dive into DPDK’s rte_hash table operations — the exact API used for domain_details_table in each group_struct.

