VPP MASTERY · PHASE 2A · WEEKS 4–5
🧱 vppinfra - Core Library
vec · pool · bihash · clib_mem · format/unformat · ring buffers · timers
src/vppinfra/ C macros pool.h bihash_8_8.h vec.h

WHY VPPINFRA EXISTS

📚

vppinfra is VPP's Standard Library

FOUNDATION

vppinfra replaces the C standard library for VPP's dataplane. It provides memory management, data structures, I/O formatting, and timers that are specifically designed for the demands of high-performance packet processing: deterministic allocation, cache-line awareness, zero-copy design, and macro-heavy APIs for maximum inlining.

Every plugin and node you write will use vppinfra primitives. Understanding these data structures deeply - not just their API but their memory layout - is what separates engineers who write correct VPP code from those who write fast, correct VPP code.

vec - Dynamic Array · vec.h

Heap-allocated array with a hidden header storing length and capacity. The pointer points to element 0, not the header - fully compatible with C array indexing. Use everywhere you'd use std::vector or realloc-based arrays.

vec_add1 · vec_add · vec_len · vec_free · vec_foreach

pool - Fixed-Size Object Allocator · pool.h

Pre-allocated array of fixed-size objects with a free-list bitmap. O(1) alloc/free. Objects are addressed by index - never store pointers to pool elements (pool can realloc). This is how sessions, interfaces, and FIB entries are stored.

pool_get · pool_put · pool_elt_at_index · pool_foreach

bihash - Bounded-Index Hash · bihash_8_8.h

Two-level hash table with bounded worst-case lookup: a bucket array (L1) pointing to fixed-size pages (L2) of key-value pairs. Designed for concurrent read with a single writer. Used in NAT, ACL, FIB, session tables - essentially every fast-path lookup.

BV(clib_bihash_add_del) · BV(clib_bihash_search)

clib_mem - Memory Management · mem.h

Wrapper around dlmalloc with NUMA awareness and heap introspection. Supports multiple heaps (main heap, per-NUMA heaps). clib_mem_alloc_aligned guarantees cache-line alignment. Never use malloc/free in VPP code.

clib_mem_alloc · clib_mem_free · clib_mem_alloc_aligned

format/unformat · format.h

Extensible printf/scanf replacement. format returns a u8 * vec (not a fixed buffer). Custom format functions registered via %U. All VPP CLI output and packet trace use this - learn it to write readable trace functions.

format(0, "%U", format_ip4_address, &addr)

clib_fifo / clib_ring · fifo.h

FIFO and ring-buffer primitives built on top of vec. Used for work queues, event rings, and inter-thread communication at the framework level. Power-of-2 sized for fast modulo via bitmask.

clib_fifo_add1 · clib_fifo_sub1 · clib_ring_new

⚙️ DPDK PARALLEL - Data Structure Mapping
  • vec ≈ rte_malloc + manual realloc tracking - but with automatic growth and a type-safe foreach macro
  • pool ≈ rte_mempool for fixed-size objects - but pool objects stay in-place (no dequeue/enqueue) and are addressed by index, not pointer. VPP sessions live in pools much as DPDK mbufs live in a mempool
  • bihash ≈ a hash table you'd build over rte_hash - but bihash is specifically designed for read-mostly concurrent access without locking on the read path
  • clib_mem ≈ rte_malloc - both support NUMA-local allocation and cache-line alignment. VPP uses clib_mem everywhere; never mix with rte_malloc inside VPP

VEC - DYNAMIC ARRAY (src/vppinfra/vec.h)

📐

Memory Layout - The Hidden Header

INTERNALS

The key to understanding vec is its memory layout. The header lives before the data in memory, so the pointer you hold points directly to element[0]. This makes vec transparent to any C code expecting a plain array.

/* Memory layout of a vec_t */
+─────────────────+──────────────+──────────────+──────+
|  vec_header_t   |  element[0]  |  element[1]  | ...  |
|  len   (u32)    |              |              |      |
|  dlmalloc_hdr   |              |              |      |
+─────────────────+──────────────+──────────────+──────+
                  ↑
            your pointer lives here

/* The pointer IS the array - C-array compatible */
u32 *my_vec = 0;          /* NULL == empty vec, NOT uninitialised */
vec_add1(my_vec, 42);     /* grows by 1, may realloc */
vec_add1(my_vec, 99);
/* my_vec[0] == 42, my_vec[1] == 99 - plain array access */
u32 len = vec_len(my_vec); /* == 2 */
  • NULL is a valid empty vec - always initialise to 0, never to an uninitialised pointer
  • Never hold pointers to vec elements - vec_add1 may realloc, invalidating all element addresses. Hold indices instead
  • vec_free does not NULL the pointer - use vec_free(v); v = 0; to be safe
🛠️

Complete vec API Reference

API
  • vec_add1(v, e) - append single element e to vec v. May realloc; v is updated in-place (macro takes its address)
  • vec_add(v, p, n) - append n elements from array p. Bulk append - faster than N × vec_add1
  • vec_add2(v, p, n) - reserve space for n elements, return pointer to the first. Use when you want to write directly into the vec
  • vec_len(v) - returns u32 element count; 0 for a NULL vec. Safe to call on NULL - does not crash
  • vec_bytes(v) - total byte size of the vec data region: vec_len(v) * sizeof(v[0])
  • vec_free(v) - free vec memory. Does NOT set v = 0 - do that manually
  • vec_reset_length(v) - set length to 0 without freeing memory. Reuses the allocation - faster than free + realloc
  • vec_foreach(var, v) - iterate: vec_foreach(ep, entries) { ... }. var is a pointer to each element
  • vec_foreach_index(i, v) - iterate by index: i goes 0..vec_len(v)-1. Use when you need the index inside the loop
  • vec_dup(v) - return a heap-allocated copy of the vec with the same content
  • vec_validate(v, i) - ensure the vec has at least i+1 elements, zero-filling new slots. Use to grow to a known index safely
  • vec_validate_init_empty(v, i, val) - like vec_validate but fills new slots with val instead of 0. Useful for flag arrays initialised to ~0
  • vec_insert(v, n, i) - insert n zero elements at position i. O(n) - shifts elements right
  • vec_del1(v, i) - delete element at i by replacing it with the last element. O(1) - order NOT preserved
  • _vec_len(v) - direct header field access, no NULL check. Use only when v is guaranteed non-NULL
  • vec_set_len(v, n) - force-set the length field. Advanced: use after manual direct writes to vec memory
/* Typical plugin usage: building a list of sw_if_index values */
u32 *sw_if_indices = 0;   /* NULL = empty vec */

/* Collect all interfaces matching a condition */
pool_foreach(hw, im->hw_interfaces) {
    if (hw->flags & VNET_HW_INTERFACE_FLAG_LINK_UP)
        vec_add1(sw_if_indices, hw->sw_if_index);
}

/* Process them */
u32 *si;
vec_foreach(si, sw_if_indices) {
    vnet_sw_interface_t *swif = vnet_get_sw_interface(vnm, *si);
    /* ... do something with swif */
}

/* Reuse without realloc */
vec_reset_length(sw_if_indices);

/* Or free completely */
vec_free(sw_if_indices);
sw_if_indices = 0;

⚠️ Critical pitfall - storing pointers to vec elements: Any operation that can grow the vec (vec_add1, vec_add, vec_validate) may call realloc, which moves the entire array to a new address. Any pointer you saved to an element is now a dangling pointer. Always store the index into the vec, not a pointer to the element.

POOL - FIXED-SIZE OBJECT ALLOCATOR (src/vppinfra/pool.h)

🏊

Pool Memory Layout and Design

INTERNALS

A pool is a pre-allocated contiguous array of fixed-size objects. It maintains a free-list as a bitmap of free slots. Allocation (pool_get) finds the first free bit and marks it used - O(1). Free (pool_put) marks the slot free again - O(1). pool_put never moves other elements, but pool_get may grow (and realloc) the backing store, moving everything - which is why pool elements are referenced by index, never by stored pointer.

/* Pool memory model */
pool = [  obj[0]  |  obj[1]  |  obj[2]  |  obj[3]  | ... ]
         (in use)    (FREE)     (in use)    (FREE)

free_bitmap = 0b...1010   (bits set = free slots)

/* pool_get: find lowest set bit, clear it, return pointer */
/* pool_put: set the bit at this index                     */

/* Declaration */
typedef struct {
    u32  conn_id;
    u32  sw_if_index;
    ip4_address_t src, dst;
    u8   state;
} my_session_t;

my_session_t *session_pool = 0;   /* NULL = empty pool */
🛠️

Complete pool API

API
  • pool_get(P, E) - allocate one element from pool P, setting pointer E. Memory is NOT zeroed - use pool_get_zero(P, E) for a zero-filled element
  • pool_get_aligned(P, E, align) - pool_get with an alignment guarantee. Use for SIMD-aligned structs
  • pool_put(P, E) - return the element pointed to by E to the pool. Marks the slot free; E still points to valid memory until the next pool_get
  • pool_put_index(P, i) - free by index rather than pointer. Equivalent to pool_put(P, pool_elt_at_index(P, i))
  • pool_elt_at_index(P, i) - return a pointer to the element at index i. No bounds check - undefined if slot i is free
  • pool_is_free_index(P, i) - return 1 if slot i is free. Always check before dereferencing by index
  • pool_elts(P) - count of currently in-use elements
  • pool_len(P) - capacity: total allocated slots (used + free)
  • pool_foreach(E, P) - iterate over all in-use elements; E is a pointer to each live element
  • pool_foreach_index(i, P) - iterate by index over all in-use elements; i is the index of each live element
  • pool_free(P) - free the entire pool's backing store. Does not set P = 0
  • pool_validate_index(P, i) - assert that index i is valid (not free). Debug helper - crashes on invalid access
  • pool_alloc(P, n) - pre-allocate capacity for n objects. Avoids repeated realloc during initial population
/* Complete example: per-flow session pool */
my_session_t *sessions = 0;        /* pool */
uword *session_by_key = 0;         /* hash: key → pool index */

/* Create a session */
my_session_t *s;
pool_get_zero(sessions, s);        /* allocate + zero-fill */
u32 session_index = s - sessions;  /* derive index from pointer arithmetic */
s->conn_id = next_conn_id++;
s->state = SESSION_STATE_INIT;

/* Store in hash by key for fast lookup */
hash_set(session_by_key, flow_key, session_index);

/* Fast-path lookup: key → index → pointer */
uword *val = hash_get(session_by_key, flow_key);
if (val) {
    s = pool_elt_at_index(sessions, *val);
    /* s is now valid - use it */
}

/* Destroy a session */
hash_unset(session_by_key, flow_key);
pool_put(sessions, s);            /* marks slot free */

/* Walk all active sessions (e.g., for timeout sweep) */
pool_foreach(s, sessions) {
    if (now - s->last_seen > SESSION_TIMEOUT)
        expire_session(sessions, session_by_key, s);
}

💡 Deriving an index from a pointer: The idiom index = element_ptr - pool_base_ptr is idiomatic in VPP. It works because pool elements are contiguous. The index is stable even if the pool grows and reallocs - the element's address may change, but its index never does. Always store the index in inter-subsystem references (e.g., in a buffer's opaque field), never the pointer.

BIHASH - BOUNDED-INDEX EXTENSIBLE HASHING (src/vppinfra/bihash_*.h)

🔍

Bihash Architecture - Two-Level Lookup

INTERNALS

Bihash is VPP's primary hash table for dataplane lookups. Its design is optimised for the read-heavy, write-rare workload of packet forwarding: millions of lookups per second with occasional control-plane insertions.

/* Two-level structure */

Level 1 - Bucket Array (always in memory, fits in L2 cache):
  bucket[0]  → page pointer + lock bit
  bucket[1]  → page pointer + lock bit
  ...
  bucket[N-1]→ page pointer + lock bit

  hash(key) & (N-1) → selects bucket index

Level 2 - KV Pages (per-bucket, allocated on demand):
  page = [ kvp[0] | kvp[1] | kvp[2] | kvp[3] | ... ]
         (BIHASH_KVP_PER_PAGE entries, default 4 or 8)

Lookup:
  1. bucket_idx = hash(key) & (N-1)          O(1) - bitmask
  2. page = bucket[bucket_idx].page           O(1) - pointer deref
  3. linear scan page for matching key        O(1) - ≤8 compares

When a bucket fills (all KVP slots taken), it overflows to a chain of pages. The chain length is bounded by the load factor at init time. This gives worst-case O(chain_len) lookup - not O(n) like a chaining hash table.

The lock bit in the bucket enables a single-writer, multi-reader protocol: readers spin on the lock bit; the writer sets it, modifies the page, clears it. Readers detect inconsistency via the version counter and retry.

📦

Bihash Variants - Choosing the Right One

VARIANTS

Bihash is a template implemented via macros. The type name encodes key_size_value_size in bytes:

  • bihash_8_8 - key 8 bytes (u64), value 8 bytes (u64). NAT44 simple lookup, sw_if_index tables, ARP cache
  • bihash_16_8 - key 16 bytes, value 8 bytes. NAT44 endpoint-dependent (src_ip + src_port + protocol + vrf)
  • bihash_24_8 - key 24 bytes, value 8 bytes. MPLS, L2 FIB with bridge domain
  • bihash_40_8 - key 40 bytes, value 8 bytes. VXLAN tunnel tables (src + dst + vni)
  • bihash_48_8 - key 48 bytes, value 8 bytes. NAT66, IPv6 session tables, IPv6 5-tuple flow tables
/* Include the specific variant you need */
#include <vppinfra/bihash_8_8.h>       /* defines BV(a) as a##_8_8 */
#include <vppinfra/bihash_template.h>  /* inline template code */

/* BV() appends the type suffix: BV(clib_bihash_init) → clib_bihash_init_8_8 */
typedef clib_bihash_8_8_t my_hash_t;

/* Key-value pair type */
clib_bihash_kv_8_8_t kv;   /* kv.key (u64), kv.value (u64) */

/* Initialise (call once, control plane) */
clib_bihash_8_8_t h;
u32 nbuckets = 64 * 1024;   /* power of 2, tuned to expected entries */
u32 mem_bytes = 128 * 1024 * 1024;  /* 128 MB backing store */
clib_bihash_init_8_8(&h, "my-flow-table", nbuckets, mem_bytes);

Fast-Path Lookup and Insert Patterns

DATAPLANE
/* ── LOOKUP (fast path - called per packet) ── */
clib_bihash_kv_8_8_t kv;

/* Pack key: for 5-tuple flows you'd pack into 8 bytes */
kv.key = ((u64)src_addr << 32) | dst_addr;   /* example: src+dst IP */

if (PREDICT_TRUE(
    clib_bihash_search_8_8(&h, &kv, &kv) == 0))
{
    /* Hit - kv.value is the stored value (e.g. session pool index) */
    u32 session_idx = kv.value;
    my_session_t *s = pool_elt_at_index(sessions, session_idx);
    /* ... process packet using s */
}
else {
    /* Miss - new flow, create session */
    goto slow_path;
}

/* ── INSERT (slow path / control plane) ── */
clib_bihash_kv_8_8_t kv;
kv.key   = ((u64)src_addr << 32) | dst_addr;
kv.value = session_idx;                       /* pool index */
clib_bihash_add_del_8_8(&h, &kv, 1 /* is_add */);

/* ── DELETE ── */
kv.key = ((u64)src_addr << 32) | dst_addr;
kv.value = 0;                                 /* value irrelevant for delete */
clib_bihash_add_del_8_8(&h, &kv, 0 /* is_add=0 means delete */);
⚙️ DPDK PARALLEL - rte_hash vs bihash
  • rte_hash uses Cuckoo hashing with SIMD key comparison - excellent for fixed-size lookups but requires explicit locking for concurrent writers
  • bihash uses a per-bucket lock bit so the read path is lock-free - readers never acquire a lock, they just check the version counter. This matters at 10 Mpps where lock contention would be catastrophic
  • Both are pre-allocated with fixed backing memory. If bihash fills beyond capacity, lookups degrade (longer page chains) but do not crash - rte_hash returns -ENOSPC
  • For your session table work: bihash_16_8 fits a full IPv4 5-tuple (src_ip 4 B + dst_ip 4 B + src_port 2 B + dst_port 2 B + proto 1 B = 13 B, padded to 16 B); reserve bihash_48_8 for IPv6 flows, whose tuple doesn't fit in 16 bytes

MEMORY, FORMAT/UNFORMAT, TIMERS

🖨️

format / unformat - Extensible I/O

FORMAT

format is VPP's printf replacement. Instead of writing to a fixed buffer, it appends to a u8 * vec, growing as needed. The %U specifier allows any function with the right signature to be used as a format directive - this is how VPP achieves composable trace output.

/* format signature: u8 *format(u8 *s, const char *fmt, ...); */
/* Returns the u8-vec with formatted output appended */

u8 *s = 0;   /* start with empty vec */
s = format(s, "Interface %d IP: %U\n",
           sw_if_index,
           format_ip4_address, &my_addr);  /* %U calls format_ip4_address */
vlib_cli_output(vm, "%v", s);             /* %v = print u8-vec */
vec_free(s);

/* Writing your own format function */
static u8 * format_my_flow(u8 *s, va_list *args) {
    my_flow_t *f = va_arg(*args, my_flow_t *);
    s = format(s, "[%U:%d → %U:%d proto %d]",
               format_ip4_address, &f->src, f->src_port,
               format_ip4_address, &f->dst, f->dst_port,
               f->proto);
    return s;
}

/* Use it anywhere */
s = format(0, "Flow: %U\n", format_my_flow, &my_flow);

/* unformat - parsing */
unformat_input_t input;
unformat_init_string(&input, "192.168.1.1", strlen("192.168.1.1"));
ip4_address_t addr;
if (unformat(&input, "%U", unformat_ip4_address, &addr))
    vlib_cli_output(vm, "Parsed: %U\n", format_ip4_address, &addr);
  • format(0, ...) allocates a new vec - caller must vec_free it
  • format(existing_vec, ...) appends to existing vec
  • %v - print a u8 * vec as a string
  • %U - call a custom format function
  • Every packet trace function uses format - learn this before writing your first plugin
⏱️

clib_time and Timing Primitives

TIMERS
/* High-resolution time - based on TSC (rdtsc) */
clib_time_t ct;
clib_time_init(&ct);

f64 now = clib_time_now(&ct);        /* seconds since init, f64 */
u64 cycles = clib_cpu_time_now();    /* raw TSC cycles */

/* In graph nodes: use vlib_time_now() which is pre-computed per dispatch */
f64 now = vlib_time_now(vm);  /* preferred in node functions */

/* Timer wheel (tw_timer_*.h) for protocol timeouts */
/* Used for TCP retransmit timers, NAT session expiry */
#include "vppinfra/tw_timer_2t_1w_2048sl.h"
/* Parameters: 2 timers/object, 1 wheel, 2048 slots */
TWT(tw_timer_wheel) tw;
tw_timer_wheel_init_2t_1w_2048sl(&tw, expired_cb, 1.0, ~0);

For session timeouts in your Stateful Connection Tracker project (Mini-Project 7), use tw_timer_* - it handles expiry callbacks at O(1) per tick regardless of the number of active timers.

🧮

clib_mem - Memory Allocation

MEMORY
/* Always use clib_mem_*, never malloc/free in VPP code */

/* Basic allocation */
void *p = clib_mem_alloc(size);
void *p = clib_mem_alloc_aligned(size, CLIB_CACHE_LINE_BYTES);  /* 64-byte aligned */
clib_mem_free(p);

/* NUMA-aware allocation */
clib_mem_set_numa_affinity(numa_node);  /* set before alloc */
void *p = clib_mem_alloc_aligned(size, CLIB_CACHE_LINE_BYTES);
clib_mem_set_default_numa_affinity();   /* reset */

/* Heap introspection */
clib_mem_usage_t usage;
clib_mem_get_heap_usage(clib_mem_get_heap(), &usage);
/* usage.bytes_used, usage.bytes_free */

/* For per-worker allocations: vlib sets up per-thread heaps
 * automatically (vlib_worker_thread_t's thread_mheap) */
void *old_heap = clib_mem_set_heap(w->thread_mheap); /* w: vlib_worker_thread_t * */
/* allocate on worker-local heap */
clib_mem_set_heap(old_heap);   /* restore */

CLIB_CACHE_LINE_BYTES is 64 on x86_64. Always align per-worker data structures to cache lines to avoid false sharing between worker threads - a common source of hidden performance problems in multi-threaded VPP plugins.

P2A COMPLETION CHECKLIST

✅ When complete: move to P2B - vlib. You now know the data structures. vlib is where you learn how those structures are used in the graph dispatcher - the engine that drives VPP.
