NETWORKING MASTERY · PHASE 1 · MODULE 02 · WEEKS 1–2
🔌 Ethernet and Layer 2
MAC addressing · Ethernet frames · ARP · Switching · VLANs · STP · RSTP
Beginner · Prerequisite: M01 · IEEE 802.3 · 802.1Q VLANs · 802.1D STP · 3 Labs

WHAT IS ETHERNET AND WHY IT DOMINATES

📜

Ethernet — A Brief History

BACKGROUND

Ethernet was invented at Xerox PARC in 1973 by Robert Metcalfe and David Boggs. It was standardised as IEEE 802.3 in 1983 and has since become the dominant wired networking technology on the planet. Today it runs at speeds from 10 Mbps (historical) to 400 Gbps (data centre), over copper cable, optical fibre, and even backplane connections inside chassis switches.

Why has Ethernet survived for 50+ years? Because it is simple, cheap, and extensible. The core frame format has barely changed since 1983. The same Ethernet frame that worked on a 10 Mbps coaxial cable in 1985 is structurally identical to the one flying over a 100 Gbps fibre link today.

  • Ubiquity: Every laptop, server, router, switch, and data-plane NIC speaks Ethernet natively
  • Scalability: Speed has scaled 40,000× (10 Mbps → 400 Gbps) without changing the fundamental frame format
  • Cost: Commodity Ethernet hardware (NICs, switches) is extremely cheap compared to alternatives
  • Simplicity: The protocol is well-understood and easy to implement and debug

Layer 2 — What It Does in the Stack

ROLE IN STACK

As you learned in M01, Layer 2 (Data Link) handles node-to-node delivery on the same network segment. The key word is "same network" — Layer 2 only moves frames between devices that are directly connected (or connected through switches). When a packet needs to cross to a different network, Layer 3 (IP routing) takes over.

A useful mental model: Layer 2 is the local delivery van, Layer 3 is the national courier. The van moves parcels within a city (your LAN). When a parcel needs to go cross-country, the national courier (IP routing) takes it to the destination city, then a local van (another L2 network) makes the final delivery.

Layer 2 devices on a typical network:

  • NIC (Network Interface Card) — every host has one; generates and receives Ethernet frames
  • Switch — forwards frames between ports using MAC address learning; operates entirely at L2
  • Bridge — an older device connecting two network segments; conceptually the same as a 2-port switch
  • Access Point (WiFi) — bridges WiFi (802.11) frames to Ethernet (802.3) frames
📡

Ethernet Speed Standards

STANDARDS
Standard | Speed | Medium | Max Distance | Common Use
10BASE-T | 10 Mbps | Cat3/Cat5 copper | 100 m | Legacy, almost extinct
100BASE-TX | 100 Mbps | Cat5/Cat5e copper | 100 m | Old office networks
1000BASE-T | 1 Gbps | Cat5e/Cat6 copper | 100 m | Desktops, home networks
10GBASE-T | 10 Gbps | Cat6A copper | 100 m | Server uplinks, data centres
10GBASE-SR | 10 Gbps | Multi-mode fibre | 300 m | Data centre racks
25GBASE-SR | 25 Gbps | Multi-mode fibre | 100 m | Server NICs (your Mellanox)
100GBASE-SR4 | 100 Gbps | Multi-mode fibre | 100 m | Spine switches, DPDK servers
400GBASE-DR4 | 400 Gbps | Single-mode fibre | 500 m | Hyperscale data centres

For your DPDK and VPP work, you're most likely working with 10G, 25G, or 100G Ethernet over SFP+ (10G), SFP28 (25G), or QSFP28 (100G) optical modules on Mellanox ConnectX NICs.

ETHERNET FRAME FORMAT — BYTE BY BYTE

📦

The Ethernet II Frame

FRAME FORMAT

There are two Ethernet frame formats in use: Ethernet II (DIX) and IEEE 802.3. Ethernet II is dominant on modern networks: it carries practically all IP and ARP traffic you will see in a packet capture, although a few control protocols (STP BPDUs, for example) still use 802.3 framing with an LLC header. The key difference is the 2-byte field after the MAC addresses: Ethernet II uses it as an EtherType (identifies the L3 protocol), while 802.3 uses it as a Length field. Since EtherType values are always ≥ 1536 (0x0600) and length values are ≤ 1500, a receiver can tell them apart instantly.
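The EtherType-versus-Length rule can be sketched in a few lines of C. The enum and function names below are illustrative, not taken from any particular library, and the value is assumed to be already converted to host byte order.

```c
#include <stdint.h>

/* Possible meanings of the 2-byte field that follows the source MAC. */
enum type_len_kind { FIELD_LENGTH, FIELD_ETHERTYPE, FIELD_UNDEFINED };

/* Classify the field value (host byte order). */
enum type_len_kind classify_type_len(uint16_t value)
{
    if (value <= 1500)
        return FIELD_LENGTH;       /* IEEE 802.3: payload length */
    if (value >= 0x0600)
        return FIELD_ETHERTYPE;    /* Ethernet II: EtherType (>= 1536) */
    return FIELD_UNDEFINED;        /* 1501..1535 has no assigned meaning */
}
```

For example, classify_type_len(0x0800) yields FIELD_ETHERTYPE, while classify_type_len(46) yields FIELD_LENGTH.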

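On wire (fields in transmission order):
Field | Size
Preamble | 7 bytes
SFD | 1 byte
Destination MAC | 6 bytes
Source MAC | 6 bytes
EtherType | 2 bytes
Payload (Data) | 46–1500 bytes
CRC/FCS | 4 bytes
Total: 64–1518 bytes, counted from Destination MAC through FCS (preamble and SFD are excluded). The 64-byte minimum exists for collision detection; the 1518-byte maximum is the standard 1500-byte MTU plus 14 bytes of header and 4 bytes of FCS.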
🔍

Every Field Explained

FIELD REFERENCE

Preamble (7 bytes) + SFD (1 byte)

The preamble is 7 bytes of alternating 1s and 0s (10101010...). It allows the receiver's clock to synchronise with the sender's clock before actual data arrives — like a "ready?" signal. The Start Frame Delimiter (SFD) is 10101011 — the final bit breaks the alternating pattern to signal "frame starts NOW". These 8 bytes are added and stripped by the NIC hardware and never appear in software packet buffers.

Destination MAC Address (6 bytes)

The hardware address of the intended recipient. The switch uses this to decide which port to forward the frame to. Three special cases:

  • Unicast: sent to one specific device (LSB of first byte = 0)
  • Broadcast: FF:FF:FF:FF:FF:FF, received by all devices on the segment
  • Multicast: 01:00:5E:xx:xx:xx for IPv4 multicast, sent to a group of devices

Source MAC Address (6 bytes)

The hardware address of the sender. Switches read this field to learn which MAC address is reachable on which port and build their MAC address table.

EtherType (2 bytes)

Identifies the Layer 3 protocol carried in the payload. Most important values:

  • 0x0800 — IPv4 payload
  • 0x0806 — ARP payload
  • 0x86DD — IPv6 payload
  • 0x8100 — 802.1Q VLAN tag (frame is VLAN-tagged)
  • 0x88CC — LLDP (Link Layer Discovery Protocol)
  • 0x8847 — MPLS unicast

In VPP (typically running on top of DPDK), the EtherType field is the first thing the ethernet-input graph node reads to dispatch the frame to the correct next node (ip4-input, ip6-input, arp-input, etc.).
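As a rough sketch of that dispatch step, the C below parses the 14-byte header from a raw buffer and maps the EtherType to the name of a next node. This is not VPP's actual code: struct eth_hdr_view, parse_eth_header, and next_node_for are hypothetical names, and the "drop" label for unknown types is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_HDR_LEN 14

/* Parsed view of an untagged Ethernet II header. */
struct eth_hdr_view {
    uint8_t  dst[6];
    uint8_t  src[6];
    uint16_t ethertype;   /* converted to host byte order */
};

/* Returns 0 on success, -1 if the buffer is shorter than a header. */
int parse_eth_header(const uint8_t *buf, size_t len, struct eth_hdr_view *out)
{
    if (len < ETH_HDR_LEN)
        return -1;
    memcpy(out->dst, buf, 6);          /* bytes 0-5  */
    memcpy(out->src, buf + 6, 6);      /* bytes 6-11 */
    /* bytes 12-13 are big-endian on the wire */
    out->ethertype = (uint16_t)((buf[12] << 8) | buf[13]);
    return 0;
}

/* Map an EtherType to an illustrative next-node name. */
const char *next_node_for(uint16_t ethertype)
{
    switch (ethertype) {
    case 0x0800: return "ip4-input";
    case 0x0806: return "arp-input";
    case 0x86DD: return "ip6-input";
    default:     return "drop";        /* assumption: unknown types are dropped */
    }
}
```

Reading the two EtherType bytes individually avoids unaligned 16-bit loads and any dependence on host endianness.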

Payload / Data (46–1500 bytes)

The IP packet (or ARP message, or other L3 PDU) carried by the frame. The minimum payload is 46 bytes — if the IP packet is smaller, it gets padded with zeros to reach 46 bytes. This ensures the total frame is at least 64 bytes, which is required for collision detection in half-duplex Ethernet (legacy).
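The padding rule is simple arithmetic, sketched here in C. The macro and function names are made up for illustration; the lengths assume an untagged frame.

```c
#include <stddef.h>

#define ETH_MIN_PAYLOAD 46   /* minimum payload of an untagged frame */
#define ETH_HDR_LEN     14   /* dst MAC + src MAC + EtherType */
#define ETH_FCS_LEN      4

/* Zero bytes that must follow an L3 PDU of pdu_len bytes. */
size_t eth_pad_bytes(size_t pdu_len)
{
    return pdu_len < ETH_MIN_PAYLOAD ? ETH_MIN_PAYLOAD - pdu_len : 0;
}

/* Frame length from destination MAC through FCS (no preamble/SFD). */
size_t eth_frame_len(size_t pdu_len)
{
    return ETH_HDR_LEN + pdu_len + eth_pad_bytes(pdu_len) + ETH_FCS_LEN;
}
```

A 28-byte ARP message, for instance, needs 18 pad bytes and produces exactly a 64-byte frame.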

FCS / CRC (4 bytes)

Frame Check Sequence — a 32-bit CRC (Cyclic Redundancy Check) computed over all frame fields from Destination MAC through Payload. The receiver recomputes the CRC and compares to the transmitted value. If they differ, the frame is silently dropped (no error is sent back — error recovery is TCP's job at L4). NICs typically handle CRC computation in hardware, and most OSes strip the FCS before passing the frame to software — so you won't see it in Wireshark captures from a NIC in normal mode.
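For reference, the CRC-32 used by the Ethernet FCS can be computed in software with the bit-by-bit routine below. This is a minimal, unoptimised sketch (real NICs do it in hardware, and software implementations normally use lookup tables); the function name eth_crc32 is an assumption for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-32 as used by the Ethernet FCS: reflected polynomial 0xEDB88320,
 * initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF. */
uint32_t eth_crc32(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

To check a frame you would pass the bytes from Destination MAC through the end of the payload (including any padding). The standard check value for this CRC is 0xCBF43926 for the nine ASCII bytes "123456789".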

MTU — Maximum Transmission Unit

The maximum payload size is 1500 bytes — this is the standard Ethernet MTU. If an IP packet is larger, it must be fragmented at the IP layer (or the application told to send smaller chunks via Path MTU Discovery). Many data-centre networks use Jumbo Frames with MTU 9000 bytes to reduce CPU overhead for large transfers — your DPDK/VPP setup likely uses jumbo frames.

💡 Why minimum 64 bytes? In classic CSMA/CD Ethernet (before full-duplex switches), a sending station needed to keep transmitting long enough that if a collision occurred at the far end of the network, the collision signal could travel back and reach the sender while it was still transmitting. At 10 Mbps the standard budgeted 512 bit times (51.2 µs) for that round trip across the largest permitted network (roughly 2,500 m of cable plus repeaters), and 512 bits is 64 bytes. Modern switched full-duplex Ethernet has no collisions, but the 64-byte minimum is kept for backwards compatibility.

📊

Frame Overhead Calculation

PERFORMANCE

Understanding Ethernet overhead is essential for data-plane performance engineering:

/* Ethernet frame overhead breakdown */
Preamble + SFD :  8 bytes  (NIC hardware only — not in software buffer)
Ethernet header: 14 bytes  (dst MAC 6 + src MAC 6 + EtherType 2)
FCS/CRC        :  4 bytes  (usually stripped by NIC)
Interframe Gap :  12 bytes (minimum idle time between frames — layer 1)
─────────────────────────
Wire overhead  :  38 bytes per frame (preamble + header + FCS + IFG)

/* Efficiency at minimum frame size (64 bytes) */
Payload bytes  : 46 bytes (64 - 14 header - 4 FCS = 46)
Wire bytes     : 64 + 8 preamble + 12 IFG = 84 bytes total
Efficiency     : 46 / 84 = 54.8%  ← terrible! lots of overhead for small pkts

/* Efficiency at maximum frame size (1518 bytes) */
Payload bytes  : 1500 bytes
Wire bytes     : 1518 + 8 + 12 = 1538 bytes
Efficiency     : 1500 / 1538 = 97.5%  ← much better

/* This is WHY jumbo frames (MTU 9000) help in data centres */
/* Fewer frames per byte = less header processing overhead */
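The same arithmetic can be wrapped in two small C helpers, which is handy when estimating packet rates for a given link speed. The names are illustrative; payload_len is assumed to be at least 46 (already padded) and frame_len is the length from Destination MAC through FCS.

```c
#define ETH_HDR_FCS_LEN 18u   /* 14-byte header + 4-byte FCS */
#define ETH_WIRE_EXTRA  20u   /* 8-byte preamble/SFD + 12-byte interframe gap */

/* Fraction of wire time that carries payload bytes. */
double eth_efficiency(unsigned payload_len)
{
    unsigned wire = payload_len + ETH_HDR_FCS_LEN + ETH_WIRE_EXTRA;
    return (double)payload_len / (double)wire;
}

/* Maximum frames per second for a given link speed and frame length. */
double eth_max_frames_per_sec(double link_bps, unsigned frame_len)
{
    return link_bps / (8.0 * (double)(frame_len + ETH_WIRE_EXTRA));
}
```

With these definitions, eth_max_frames_per_sec(10e9, 64) comes out at roughly 14.88 million frames per second, the familiar line-rate figure for 64-byte frames on 10G, and eth_efficiency(9000) is about 99.6%.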

MAC ADDRESSES — HARDWARE IDENTITY

🏷️

What is a MAC Address?

CORE CONCEPT

A MAC (Media Access Control) address is a 48-bit (6-byte) hardware identifier assigned to every network interface. Unlike IP addresses which are logical and can be changed by software, MAC addresses are (traditionally) burned into the NIC's hardware at manufacture and intended to be globally unique. In practice, modern OSes allow MAC address spoofing in software.

MAC address notation: Written as 6 pairs of hexadecimal digits, separated by colons or hyphens:

  • aa:bb:cc:dd:ee:ff — colon-separated (Linux, most tools)
  • AA-BB-CC-DD-EE-FF — hyphen-separated (Windows)
  • aabb.ccdd.eeff — dot-separated groups of 4 (Cisco)
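The three notations above are only different renderings of the same six bytes, as this small C sketch shows. The function names are hypothetical; each output buffer must be large enough for the string plus its terminating NUL.

```c
#include <stdint.h>
#include <stdio.h>

/* "aa:bb:cc:dd:ee:ff" (17 characters + NUL) */
void mac_fmt_colon(const uint8_t m[6], char out[18])
{
    snprintf(out, 18, "%02x:%02x:%02x:%02x:%02x:%02x",
             m[0], m[1], m[2], m[3], m[4], m[5]);
}

/* "AA-BB-CC-DD-EE-FF" (17 characters + NUL) */
void mac_fmt_hyphen(const uint8_t m[6], char out[18])
{
    snprintf(out, 18, "%02X-%02X-%02X-%02X-%02X-%02X",
             m[0], m[1], m[2], m[3], m[4], m[5]);
}

/* "aabb.ccdd.eeff" (14 characters + NUL) */
void mac_fmt_cisco(const uint8_t m[6], char out[15])
{
    snprintf(out, 15, "%02x%02x.%02x%02x.%02x%02x",
             m[0], m[1], m[2], m[3], m[4], m[5]);
}
```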
🗂️

MAC Address Structure — OUI and NIC-Specific

STRUCTURE

A MAC address has a precise internal structure:

OUI (Organisationally Unique Identifier) | NIC-Specific
aa:bb:cc (bytes 1–3) | dd:ee:ff (bytes 4–6)
  • OUI (bytes 1–3): Assigned by IEEE to each NIC manufacturer. Identifies the vendor. Examples: 00:00:0C = Cisco, 24:8A:07 = Mellanox/NVIDIA, 3C:FD:FE = Intel. You can look up any OUI at https://regauth.standards.ieee.org/
  • NIC-specific (bytes 4–6) — Assigned by the manufacturer to uniquely identify the specific interface within all their products

Two special bits in byte 1:

  • Bit 0 (LSB) — I/G bit (Individual/Group): 0 = unicast (sent to one device), 1 = multicast/broadcast (sent to a group)
  • Bit 1 — U/L bit (Universal/Local): 0 = globally unique (burned-in OUI), 1 = locally administered (manually assigned or randomly generated)
/* Reading MAC address bits in C (bytes in transmission order) */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    uint8_t mac[6] = {0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff};

    /* I/G bit (bit 0 of byte 1): unicast or multicast/broadcast */
    if (mac[0] & 0x01)
        printf("Multicast or broadcast\n");
    else
        printf("Unicast\n");

    /* U/L bit (bit 1 of byte 1): globally or locally administered */
    if (mac[0] & 0x02)
        printf("Locally administered MAC\n");
    else
        printf("Globally unique (OUI assigned)\n");

    /* Broadcast check: all bytes == 0xFF */
    if (memcmp(mac, "\xff\xff\xff\xff\xff\xff", 6) == 0)
        printf("Broadcast frame\n");

    return 0;
}
🔧

Working with MAC Addresses on Linux

PRACTICAL
# Show MAC address of all interfaces
ip link show
# Output: link/ether aa:bb:cc:dd:ee:ff brd ff:ff:ff:ff:ff:ff

# Show just eth0's MAC
ip link show eth0 | awk '/ether/ {print $2}'

# Show MAC in /sys filesystem (useful in scripts)
cat /sys/class/net/eth0/address

# Temporarily spoof/change MAC address
ip link set eth0 down
ip link set eth0 address 02:00:00:00:00:01
ip link set eth0 up
# Note: bit 1 of first byte = 1 (locally administered)

# Show neighbour (ARP) table — maps IP → MAC
ip neigh show

# Show ARP table with arp command (older)
arp -n

# In Wireshark: filter by MAC
# eth.dst == aa:bb:cc:dd:ee:ff
# eth.src == aa:bb:cc:dd:ee:ff
# eth.addr == aa:bb:cc:dd:ee:ff  (src OR dst)

ARP — ADDRESS RESOLUTION PROTOCOL

🔍

The Problem ARP Solves

MOTIVATION

Layer 3 (IP) routes packets using logical IP addresses. Layer 2 (Ethernet) delivers frames using physical MAC addresses. When your computer wants to send data to another device on the same local network, it knows the destination's IP address (from DNS or configuration), but the Ethernet hardware needs a MAC address to build the frame. ARP bridges this gap.

ARP's job in one sentence: Given an IP address on the local network, tell me the MAC address of the device that owns it.

📢 Analogy — Shouting in a Room

Imagine you're in a room full of people. You know your friend's name ("10.0.0.5") but not their face (MAC address). You shout: "Hey everyone — I'm looking for 10.0.0.5, please tell me who you are!" Only the person with that name raises their hand and says "That's me! My face looks like aa:bb:cc:dd:ee:ff". Everyone else ignores your shout. You now know their face and can walk up and talk directly. This is exactly how ARP works — the broadcast is the shout, the ARP reply is the hand raised.

📋

ARP Packet Format

PACKET FORMAT

ARP is carried directly in an Ethernet frame with EtherType 0x0806. The ARP message itself has a fixed format:

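ARP message (fields in transmission order):
Field | Size | Value
Hardware Type | 2 bytes | 1 = Ethernet
Protocol Type | 2 bytes | 0x0800 = IPv4
HLen | 1 byte | 6
PLen | 1 byte | 4
Operation | 2 bytes | 1 = request, 2 = reply
Sender MAC | 6 bytes |
Sender IP | 4 bytes |
Target MAC | 6 bytes | all zeros in a request
Target IP | 4 bytes |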

The Operation field distinguishes requests (1) from replies (2). In a request, the Target MAC is 00:00:00:00:00:00 (unknown — that's what we're asking for). The entire request is sent as an Ethernet broadcast (FF:FF:FF:FF:FF:FF).

🔄

ARP Exchange — Step by Step

PROCESS

Scenario: Host A (10.0.0.5 / aa:aa:aa:aa:aa:aa) wants to send a packet to Host B (10.0.0.10). It doesn't know B's MAC address yet.

Step 1: A checks its ARP cache
Before sending an ARP request, the OS checks its in-memory ARP cache (a table of IP→MAC mappings from recent exchanges). If there's a valid entry for 10.0.0.10, skip directly to step 5. Cache lifetimes are implementation-specific: on Linux an entry stays REACHABLE for roughly 15–45 seconds by default (a randomised interval around base_reachable_time_ms = 30000), then becomes STALE and is re-verified the next time it is used.
$ ip neigh show | grep 10.0.0.10

Step 2: A sends ARP Request (broadcast)
A constructs an ARP request: Operation=1 (request), Sender MAC=aa:aa:aa:aa:aa:aa, Sender IP=10.0.0.5, Target MAC=00:00:00:00:00:00, Target IP=10.0.0.10. It wraps the request in an Ethernet frame with dst MAC=FF:FF:FF:FF:FF:FF (broadcast). Every device on the segment receives this frame.
Ethernet: dst=FF:FF:FF:FF:FF:FF src=aa:aa:aa:aa:aa:aa type=0x0806

Step 3: All hosts receive the broadcast, only B responds
Every device on the segment receives the broadcast frame and reads the ARP message. Each device checks whether its IP matches the Target IP (10.0.0.10). Only Host B matches; all others silently discard the ARP request.

Step 4: B sends ARP Reply (unicast)
B constructs an ARP reply: Operation=2 (reply), Sender MAC=bb:bb:bb:bb:bb:bb, Sender IP=10.0.0.10, Target MAC=aa:aa:aa:aa:aa:aa, Target IP=10.0.0.5. This is sent as a unicast Ethernet frame directly to A (not broadcast).
Ethernet: dst=aa:aa:aa:aa:aa:aa src=bb:bb:bb:bb:bb:bb type=0x0806

Step 5: A caches the result and sends the original packet
A stores 10.0.0.10 → bb:bb:bb:bb:bb:bb in its ARP cache. It now builds the IP packet with Ethernet dst=bb:bb:bb:bb:bb:bb and transmits it. All subsequent packets to 10.0.0.10 use the cached entry without another ARP exchange (until it expires).
Ethernet: dst=bb:bb:bb:bb:bb:bb src=aa:aa:aa:aa:aa:aa type=0x0800 → [IP packet]
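The cache check at the start of the exchange and the cache update at the end can be sketched as a tiny fixed-size table in C. This is a simplification under stated assumptions: a single fixed lifetime (ARP_TTL_SECONDS, an arbitrary illustrative value) stands in for the more elaborate neighbour state machine real kernels use, the caller supplies the current time in seconds, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARP_CACHE_SIZE  8
#define ARP_TTL_SECONDS 60u   /* assumed fixed lifetime, for illustration only */

struct arp_entry {
    uint32_t ip;          /* IPv4 address as a 32-bit number */
    uint8_t  mac[6];
    uint32_t learned_at;  /* time the mapping was stored, in seconds */
    int      in_use;
};

struct arp_cache {
    struct arp_entry e[ARP_CACHE_SIZE];
};

/* Store or refresh a mapping (what A does after receiving B's reply). */
void arp_cache_put(struct arp_cache *c, uint32_t ip,
                   const uint8_t mac[6], uint32_t now)
{
    struct arp_entry *slot = NULL;

    for (int i = 0; i < ARP_CACHE_SIZE; i++) {
        if (c->e[i].in_use && c->e[i].ip == ip) {
            slot = &c->e[i];           /* existing entry: refresh it */
            break;
        }
        if (!c->e[i].in_use && slot == NULL)
            slot = &c->e[i];           /* remember the first free slot */
    }
    if (slot == NULL)
        slot = &c->e[0];               /* table full: overwrite slot 0 */

    slot->ip = ip;
    memcpy(slot->mac, mac, 6);
    slot->learned_at = now;
    slot->in_use = 1;
}

/* Return the cached MAC, or NULL if absent or expired (send an ARP request). */
const uint8_t *arp_cache_lookup(const struct arp_cache *c,
                                uint32_t ip, uint32_t now)
{
    for (int i = 0; i < ARP_CACHE_SIZE; i++) {
        const struct arp_entry *e = &c->e[i];
        if (e->in_use && e->ip == ip &&
            now - e->learned_at < ARP_TTL_SECONDS)
            return e->mac;
    }
    return NULL;
}
```

A NULL result from arp_cache_lookup corresponds to the "send an ARP request" branch; a non-NULL result lets the sender go straight to building the data frame.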
⚠️

ARP Security Issues — Gratuitous ARP and ARP Spoofing

SECURITY

ARP has no authentication: any device can send an ARP reply claiming to own any IP address. NGFW engineers must understand both a legitimate mechanism that can be abused (gratuitous ARP) and the attack built on it (ARP spoofing):

Gratuitous ARP

A gratuitous ARP is an unsolicited ARP announcement, sent as a request or a reply in which the sender IP and target IP are both the sender's own address: a device announces its own IP→MAC mapping without being asked. This is used legitimately by OSes at startup (to update neighbour caches) and by failover systems (to redirect traffic to a new IP owner after failover). But an attacker can send a gratuitous ARP to poison every device's ARP cache on the segment.

ARP Spoofing / ARP Poisoning

An attacker sends forged ARP replies claiming "I am the gateway (10.0.0.1) and my MAC is de:ad:be:ef:00:01". Every host that accepts this updates its ARP cache. Now all traffic intended for the gateway is sent to the attacker. The attacker can forward it on (man-in-the-middle) or drop it (denial of service).

NGFW mitigation techniques:

  • Dynamic ARP Inspection (DAI) — switch feature that validates ARP packets against a DHCP snooping binding table
  • Static ARP entries — manually configure critical IP→MAC mappings on sensitive hosts
  • ARP rate limiting — limit the rate of ARP requests per port to detect scanning
  • IPSG (IP Source Guard) — validates that source IP and MAC match the DHCP binding table
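The core check behind Dynamic ARP Inspection can be illustrated with a short C sketch: compare the sender IP and MAC in an ARP packet, and the port it arrived on, against a table of known bindings. Real implementations vary by vendor and add concepts such as trusted ports and static ACLs; struct binding and dai_permit are hypothetical names, and denying unknown IPs is an assumption of this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One row of a DHCP-snooping-style binding table. */
struct binding {
    uint32_t ip;       /* IPv4 address as a 32-bit number */
    uint8_t  mac[6];
    int      port;     /* switch port where the host was learned */
};

/* Returns 1 if the ARP sender (ip, mac) seen on `port` matches a binding,
 * 0 otherwise. */
int dai_permit(const struct binding *tbl, size_t n,
               uint32_t sender_ip, const uint8_t sender_mac[6], int port)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].ip == sender_ip)
            return tbl[i].port == port &&
                   memcmp(tbl[i].mac, sender_mac, 6) == 0;
    }
    return 0;   /* assumption: an IP with no binding is denied */
}
```

In the spoofing scenario above, a reply claiming 10.0.0.1 with the attacker's MAC fails the comparison against the gateway's binding and would be dropped.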

ETHERNET SWITCHING AND MAC ADDRESS LEARNING

🔄

How a Switch Works

INTERNALS

A switch is a Layer 2 device that connects multiple Ethernet devices and forwards frames between them intelligently — sending each frame only to the port where the destination MAC address is reachable, rather than flooding to all ports like an old hub.

Switches maintain a MAC Address Table (also called the CAM table — Content-Addressable Memory). This table maps MAC addresses to switch ports. It is built dynamically through MAC address learning.

MAC Learning — How the Table Gets Built

When a switch receives a frame on port X from source MAC aa:bb:cc:dd:ee:ff, it records: "MAC aa:bb:cc:dd:ee:ff is reachable on port X". It does this for every frame it receives — gradually building a complete map of which MAC address is behind which port.

Frame Forwarding Decision

When a switch receives a frame, it makes one of three decisions based on the destination MAC:

  • Known unicast — destination MAC is in the MAC table → forward ONLY to the listed port
  • Unknown unicast — destination MAC is NOT in the table → flood to ALL ports except the port it arrived on (this is how new MACs get discovered)
  • Broadcast/Multicast — destination is FF:FF:FF:FF:FF:FF or multicast → flood to ALL ports except the arrival port
📋

MAC Address Table — Worked Example

EXAMPLE

A switch has 4 ports. Three hosts are connected. The table starts empty.

Topology (4-port switch):
Host | MAC | Switch Port
Host A | aa:aa:aa:aa:aa:aa | Port 1
Host B | bb:bb:bb:bb:bb:bb | Port 2
Host C | cc:cc:cc:cc:cc:cc | Port 3

Step 1: Host A sends a frame to Host B. Switch receives frame on port 1:

MAC Address | Port | Age | Action
aa:aa:aa:aa:aa:aa | 1 | 0s | Learned from frame source

Destination bb:bb:bb:bb:bb:bb is unknown → flood to every port except port 1 (ports 2, 3, and 4; nothing is attached to port 4). Host B receives it (port 2). Host C receives it but discards it (not its MAC).

Step 2: Host B replies to Host A. Switch receives on port 2:

MAC Address | Port | Age
aa:aa:aa:aa:aa:aa | 1 | 5s
bb:bb:bb:bb:bb:bb | 2 | 0s

Destination aa:aa:aa:aa:aa:aa is NOW known → forward only to port 1. Host C receives nothing.

After a few more exchanges — full table, all forwarding is unicast:

MAC Address | Port | Notes
aa:aa:aa:aa:aa:aa | 1 | Host A: all frames for A go to port 1 only
bb:bb:bb:bb:bb:bb | 2 | Host B: all frames for B go to port 2 only
cc:cc:cc:cc:cc:cc | 3 | Host C: all frames for C go to port 3 only

Entries age out (typically 300 seconds) to handle moved devices. If a device is physically moved to a different port, the switch learns the new port when it next sends a frame, overwriting the old entry.

🔗

Hub vs Switch vs Router — Key Differences

COMPARISON
Device | OSI Layer | Forwarding Logic | Collision Domain | Broadcast Domain
Hub | L1 | Repeats all bits to all ports, no intelligence | All ports share one | All ports share one
Switch | L2 | Forwards frames by MAC address, per port | One per port (full-duplex) | All ports share one (unless VLANs used)
Router | L3 | Routes packets by IP address between networks | One per port | One per port, breaks broadcast domains

💡 Broadcast domain matters for performance. Every ARP request and every DHCP broadcast floods the entire broadcast domain. A single broadcast domain with 1000 hosts means every host must process every broadcast from all 999 others. Splitting the broadcast domain is critical for large networks, and it is exactly why VLANs exist.

VLANs — VIRTUAL LOCAL AREA NETWORKS (IEEE 802.1Q)

🧱

What Problem VLANs Solve

MOTIVATION

A VLAN (Virtual LAN) allows you to logically divide a single physical switch into multiple isolated broadcast domains. Without VLANs, all ports on a switch share one broadcast domain — every ARP, DHCP, and broadcast packet hits every port. With VLANs, each VLAN is its own isolated segment; broadcasts in VLAN 10 never reach VLAN 20.

Key benefits:

  • Security isolation — HR hosts in VLAN 10 cannot communicate at L2 with Engineering in VLAN 20 (even on the same physical switch)
  • Broadcast control — reduces broadcast noise and scales large networks
  • Simplified management — move a host between VLANs by reconfiguring the switch port, not physically moving cables
  • Traffic segmentation — critical for NGFW deployments (separate VLAN per security zone: inside, DMZ, outside)
🏷️

802.1Q VLAN Tagging

FRAME FORMAT

IEEE 802.1Q adds a 4-byte VLAN tag to the Ethernet frame between the Source MAC and the EtherType field. The EtherType 0x8100 signals that a VLAN tag follows:

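Tagged frame (fields in transmission order):
Field | Size | Notes
Dst MAC | 6 bytes |
Src MAC | 6 bytes |
TPID | 2 bytes | 0x8100
PCP + DEI | 3 + 1 bits |
VLAN ID | 12 bits | 0–4095
EtherType | 2 bytes | e.g. 0x0800
Payload | 42–1500 bytes | minimum drops from 46 to 42 because the 4-byte tag counts toward the 64-byte minimum frame
CRC | 4 bytes |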

VLAN tag fields:

  • TPID (Tag Protocol Identifier, 0x8100) — identifies this as an 802.1Q tagged frame
  • PCP (Priority Code Point, 3 bits) — 802.1p QoS priority 0–7 (7=highest). Used by switches to prioritise traffic
  • DEI (Drop Eligible Indicator, 1 bit) — marks frames that can be dropped under congestion
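  • VID (VLAN Identifier, 12 bits): values 0–4095. VID 0 marks a priority-tagged frame (the tag carries only PCP/DEI and the frame is treated as belonging to the port's own VLAN), VLAN 1 is the default on most switches, and 4095 is reserved. Usable range: 1–4094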

The VLAN tag increases the maximum frame size from 1518 to 1522 bytes.
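The tag layout translates directly into shifts and masks. The C sketch below unpacks and packs the 16-bit Tag Control Information (PCP, DEI, VID) and pulls the VLAN ID out of a raw frame; all names are illustrative, and only a single 0x8100 tag is handled (no stacked tags).

```c
#include <stddef.h>
#include <stdint.h>

struct vlan_tci {
    uint8_t  pcp;   /* priority, 0-7 */
    uint8_t  dei;   /* drop eligible indicator, 0 or 1 */
    uint16_t vid;   /* VLAN ID, 0-4095 */
};

/* Split a host-order TCI value into its three fields. */
struct vlan_tci vlan_tci_unpack(uint16_t tci)
{
    struct vlan_tci t;
    t.pcp = (uint8_t)(tci >> 13);
    t.dei = (uint8_t)((tci >> 12) & 0x1);
    t.vid = (uint16_t)(tci & 0x0FFF);
    return t;
}

/* Build a host-order TCI value from its three fields. */
uint16_t vlan_tci_pack(uint8_t pcp, uint8_t dei, uint16_t vid)
{
    return (uint16_t)(((pcp & 0x7) << 13) | ((dei & 0x1) << 12) | (vid & 0x0FFF));
}

/* Return the VLAN ID of a frame starting at the destination MAC,
 * or -1 if the frame is untagged or too short to hold a tag. */
int frame_vlan_id(const uint8_t *frame, size_t len)
{
    if (len < 18)                                   /* 12 MAC + 4 tag + 2 EtherType */
        return -1;
    if (((frame[12] << 8) | frame[13]) != 0x8100)   /* TPID at bytes 12-13 */
        return -1;
    return ((frame[14] << 8) | frame[15]) & 0x0FFF; /* TCI at bytes 14-15 */
}
```

For instance, priority 6 on VLAN 10 packs to 0xC00A, and the encapsulated EtherType of a tagged frame sits at bytes 16–17 instead of 12–13.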

🔌

Access Ports vs Trunk Ports

PORT TYPES

Access Port

Connects an end device (server, PC) to the switch. The device sends untagged frames — it doesn't know or care about VLANs. The switch adds a VLAN tag when the frame enters (based on port configuration) and strips it before sending back to the device. The device never sees the VLAN tag.

Used for: Server NICs, workstation Ethernet ports, printer ports, access-layer switch ports

Trunk Port

Connects a switch to another switch (or a router, or a NIC configured for trunking). Carries frames for multiple VLANs simultaneously, each identified by its 802.1Q tag. Both sides see and process the tags.

Used for: Switch-to-switch uplinks, switch-to-router links, server NICs where the OS needs multiple VLANs, VPP/DPDK setups with VLAN subinterfaces

VLAN | Name | Access Ports | Trunk Port
VLAN 10 | Engineering | Port 1 (Server 1), Port 2 (Server 2) | Port 8 (carries VLAN 10 + 20 + 30)
VLAN 20 | HR | Port 3 (HR PC 1), Port 4 (HR PC 2) | Port 8 (carries VLAN 10 + 20 + 30)

Hosts in VLAN 10 and VLAN 20 cannot communicate at L2 even though they're on the same physical switch. To route between VLANs you need a Layer 3 device: either a router attached to the switch by a single trunk link ("router on a stick") or a Layer 3 switch.

🖥️

VLANs in Linux and DPDK

PRACTICAL
# Create VLAN subinterface on Linux (eth0 = trunk port)
ip link add link eth0 name eth0.10 type vlan id 10
ip link add link eth0 name eth0.20 type vlan id 20
ip link set eth0.10 up
ip link set eth0.20 up
ip addr add 10.10.0.1/24 dev eth0.10
ip addr add 10.20.0.1/24 dev eth0.20

# Show VLAN info
cat /proc/net/vlan/config
ip -d link show eth0.10   # shows vlan id, proto 802.1Q

# In VPP — create VLAN subinterface on a DPDK port
# vppctl: create sub-interfaces GigabitEthernet0/8/0 10
# vppctl: set interface state GigabitEthernet0/8/0.10 up
# vppctl: set interface ip address GigabitEthernet0/8/0.10 10.10.0.1/24

# Wireshark VLAN filter
# vlan.id == 10         — show only VLAN 10 frames
# vlan.priority == 6    — show high-priority tagged frames

SPANNING TREE PROTOCOL (STP) AND RSTP — LOOP PREVENTION

⚠️

The Broadcast Storm Problem

THE PROBLEM

Networks need redundant links for fault tolerance — if one cable fails, traffic should route around it automatically. But redundant L2 links create loops, and loops in a switched network are catastrophic:

  • A broadcast frame (e.g., ARP request) sent into the loop never dies — switches forward it in circles forever (unlike IP packets which have TTL)
  • The loop multiplies the frame — it arrives on multiple ports simultaneously, triggering additional floods
  • Within milliseconds, the network saturates at 100% utilisation — nothing else can pass
  • Switch MAC tables thrash constantly — the same MAC appears on different ports simultaneously, causing incorrect forwarding

This is called a broadcast storm. It will take down an entire network in seconds. STP prevents this by blocking redundant links unless a primary link fails.

🌲

STP — How It Works

MECHANISM

STP (IEEE 802.1D) prevents loops by automatically blocking redundant ports while keeping one active path between any two network points — a loop-free logical tree topology. STP runs between switches automatically using special messages called BPDUs (Bridge Protocol Data Units).

STP Election Process — 3 Steps

  1. Elect a Root Bridge — All switches exchange BPDUs and elect one switch as the "root" of the spanning tree. The switch with the lowest Bridge ID wins. Bridge ID = Priority (default 32768) + MAC address. You can manually set priority lower to control which switch becomes root.
  2. Elect Root Ports — Each non-root switch selects one root port: the port with the lowest-cost path to the root bridge. Cost is based on link speed (higher speed = lower cost: 10 Gbps = cost 2, 1 Gbps = cost 4, 100 Mbps = cost 19).
  3. Elect Designated Ports and Block Others — For each network segment, one switch port is elected as the designated port (the one closest to root). All other ports on that segment are put into blocking state — they receive but do not forward frames. This breaks all loops.
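The first step, root bridge election, boils down to finding the numerically lowest Bridge ID. The C sketch below does that comparison centrally over an array, which is a simplification: real STP reaches the same result in a distributed way by exchanging BPDUs. The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bridge ID: priority first, MAC address as the tie-breaker. */
struct bridge_id {
    uint16_t priority;   /* default 32768 */
    uint8_t  mac[6];
};

/* Negative if a is the better (lower) ID, positive if b is, 0 if equal. */
int bridge_id_cmp(const struct bridge_id *a, const struct bridge_id *b)
{
    if (a->priority != b->priority)
        return a->priority < b->priority ? -1 : 1;
    return memcmp(a->mac, b->mac, 6);   /* byte-wise, most significant first */
}

/* Index of the bridge that would become root (n must be at least 1). */
size_t elect_root(const struct bridge_id *ids, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (bridge_id_cmp(&ids[i], &ids[best]) < 0)
            best = i;
    return best;
}
```

With every switch left at the default priority, the one with the lowest MAC address wins, which is often an old access switch; lowering the priority on the intended core switch makes the outcome deterministic.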
🔄

STP Port States

PORT STATES
Blocking: Receives BPDUs. Does NOT forward frames. Does NOT learn MACs. Prevents loops. Duration: indefinite while loop exists.
Listening: Transitioning to forwarding. Processes BPDUs. Does NOT forward frames. Does NOT learn MACs. Duration: 15 seconds (Forward Delay).
Learning: Processes BPDUs. Does NOT forward frames. DOES learn MAC addresses (builds table without forwarding). Duration: 15 seconds (Forward Delay).
Forwarding: Fully active. Processes BPDUs. Forwards frames. Learns MACs. This is the normal operational state for active ports.
Disabled: Administratively shut down. Does nothing. No BPDUs, no learning, no forwarding.

⚠️ STP convergence takes up to 50 seconds. When a link fails, STP must detect it (20s Max Age) and transition blocked ports through Listening (15s) and Learning (15s) states before forwarding. This 50-second outage is unacceptable for modern networks — which is why RSTP was invented.

RSTP — Rapid Spanning Tree Protocol (IEEE 802.1w)

RSTP

RSTP (802.1w, later incorporated into 802.1D-2004) reduces convergence time from 50 seconds to under 1 second in most cases. Key improvements:

  • Proposal/Agreement mechanism — switches negotiate directly with neighbours to synchronise port roles quickly without waiting for timers
  • Edge ports (PortFast) — ports connected to end devices bypass Listening/Learning states entirely and go straight to Forwarding. Reduces startup delay for servers and workstations
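  • Reworked port roles and states: the roles are Root, Designated, Alternate (backup for root port), and Backup (backup for designated), and the Disabled, Blocking, and Listening states collapse into a single Discarding state, leaving Discarding, Learning, and Forwarding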
  • Backward compatible — RSTP falls back to STP mode when it detects an old STP switch

BPDU Guard — a security feature on edge ports: if a BPDU is received on an edge port (someone plugged in a switch), the port is immediately shut down. Essential for NGFW deployments to prevent unauthorised switch insertion.

LAB 1

Dissect an Ethernet Frame with Wireshark

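Objective: Capture real Ethernet frames and identify every field that reaches software: dst/src MAC, EtherType, and payload. The preamble/SFD never appear in a capture and the FCS is usually stripped by the NIC, so treat both as conceptual here. Also observe a VLAN-tagged frame if your environment supports it.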

  1. Open Wireshark. Start a capture on your active interface. In another terminal: ping -c 5 8.8.8.8. Stop the capture.
  2. Filter for ARP traffic: type arp in the filter bar. Find an ARP request. Expand the "Ethernet II" section in the packet detail pane. Record: Destination MAC, Source MAC, EtherType value.
  3. Now filter for icmp. Find a ping packet. Expand Ethernet II again and note that EtherType is 0x0800 (IPv4). Then expand the IPv4 section, then ICMP. This shows the full encapsulation: Ethernet → IP → ICMP.
  4. In the hex dump at the bottom of Wireshark, click on different bytes of the Ethernet header. Wireshark highlights the corresponding field. Identify the exact byte offsets of: dst MAC (bytes 0–5), src MAC (bytes 6–11), EtherType (bytes 12–13).
  5. Bonus (read MAC OUI): enable View > Name Resolution > Resolve Physical Addresses. Wireshark will show the vendor name next to each MAC address (e.g., "Intel_xx:xx:xx" or "Mellanox_xx:xx:xx").
  6. Bonus (VLAN tag): If you have a trunk interface configured, run tcpdump -i eth0 -e -nn vlan. The -e flag shows L2 headers including VLAN tags. You should see output like: vlan 10, ethertype IPv4.
LAB 2

Observe the ARP Exchange

Objective: Capture and decode a complete ARP request/reply exchange. Understand every field in each message.

  1. First, flush your ARP cache to force a fresh exchange: sudo ip neigh flush all. This removes all cached IP→MAC mappings.
  2. Start a Wireshark capture filtered to ARP: arp in the filter bar. Then in a terminal: ping -c 1 10.0.0.1 (use your actual default gateway IP; find it with ip route | grep default).
  3. You should see two ARP packets: the request (broadcast) and the reply (unicast). For the request, record: Sender MAC, Sender IP, Target MAC (should be all zeros), Target IP. For the reply, record: Sender MAC (this is your gateway's MAC!), Sender IP, Target MAC, Target IP.
  4. Verify your ARP cache was populated: ip neigh show. You should see your gateway IP with its MAC address and state "REACHABLE".
  5. Bonus (ARP spoofing demonstration, on your own VM only): Use Scapy to send a gratuitous ARP: from scapy.all import *; sendp(Ether(dst="ff:ff:ff:ff:ff:ff")/ARP(op=2, pdst="10.0.0.1", hwdst="ff:ff:ff:ff:ff:ff", psrc="10.0.0.1", hwsrc="de:ad:be:ef:00:01"), iface="eth0", count=3). Check ip neigh show on another VM: its cache entry for the gateway (10.0.0.1) now points to de:ad:be:ef:00:01.
LAB 3

Build an Ethernet Frame from Scratch in C

Objective: Use a raw socket in C to construct and transmit an Ethernet frame manually — no IP or TCP involved. This gives you a deep understanding of frame structure and prepares you for DPDK raw packet work.

  1. Create a file send_arp.c. You will manually build an ARP request using a raw socket (AF_PACKET, SOCK_RAW) and send it on an Ethernet interface. A VM NIC or a veth pair works well; the loopback interface is not suitable because it does not use ARP.
  2. The program structure: open a raw socket → get interface index → build the Ethernet header (dst=broadcast, src=your MAC, type=0x0806) → build the ARP payload (op=1, sender MAC/IP, target IP) → call sendto(). Compile with: gcc -o send_arp send_arp.c && sudo ./send_arp eth0 10.0.0.1.
  3. Capture the result: run sudo tcpdump -i eth0 -e arp in a parallel terminal before running your program. Verify that your handcrafted ARP request appears in the capture with the exact fields you set in code.
  4. Extend it: modify your program to read the ARP reply from the socket. When a reply arrives, parse the Ethernet header (bytes 0–13), then the ARP payload (bytes 14–41) and print the answering device's MAC address. You've just built a minimal ARP resolver in C.
  5. Starter code skeleton:
#include <stdint.h>        /* uint8_t, uint16_t */
#include <stdio.h>
#include <string.h>
#include <unistd.h>        /* close() */
#include <sys/socket.h>
#include <linux/if_ether.h>   /* ETH_P_ALL, ETH_P_ARP */
#include <linux/if_packet.h>
#include <net/ethernet.h>
#include <net/if.h>
#include <arpa/inet.h>

/* Multi-byte fields must be stored in network byte order (use htons()). */
struct arp_frame {
    /* Ethernet header */
    uint8_t  eth_dst[6];   /* 6 bytes */
    uint8_t  eth_src[6];   /* 6 bytes */
    uint16_t eth_type;     /* 0x0806 for ARP */
    /* ARP payload */
    uint16_t hw_type;      /* 0x0001 = Ethernet */
    uint16_t proto_type;   /* 0x0800 = IPv4 */
    uint8_t  hw_len;       /* 6 */
    uint8_t  proto_len;    /* 4 */
    uint16_t operation;    /* 1 = request */
    uint8_t  sender_mac[6];
    uint8_t  sender_ip[4];
    uint8_t  target_mac[6];
    uint8_t  target_ip[4];
} __attribute__((packed));

int main(int argc, char *argv[]) {
    /* Raw L2 socket: requires root or CAP_NET_RAW */
    int sock = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (sock < 0) {
        perror("socket");
        return 1;
    }
    /* TODO: fill frame, sendto(), receive reply */
    close(sock);
    return 0;
}

M02 MASTERY CHECKLIST

When complete: Move to M03 - IPv4 Deep Dive. You've now seen the Ethernet frame wrapper — M03 goes inside the payload and dissects the IP header byte-by-byte: addressing, subnetting, fragmentation, TTL, ICMP, and how routers make forwarding decisions.
