NETWORKING MASTERY · PHASE 2 · MODULE 06 · WEEK 5
📦 UDP and ICMP
UDP header · Use cases · IGMP · Multicast · ICMP deep dive · Traceroute internals · NGFW policy
Beginner · Prerequisite: M03 IPv4, M05 TCP · RFC 768 · RFC 792 · DNS / VoIP / Gaming · 2 Labs

UDP — SIMPLICITY BY DESIGN

🚀

What UDP Is — and Why It Exists

OVERVIEW

UDP (User Datagram Protocol, RFC 768) is the other Layer 4 transport alongside TCP. Where TCP spends its 20+ byte header providing reliability, ordering, and flow control, UDP provides just enough to identify sender and receiver: source port, destination port, length, and checksum — 8 bytes total. That's it.

UDP offers:

  • No connection setup — send a datagram immediately, no handshake overhead
  • No reliability — datagrams can be lost, duplicated, or reordered; UDP won't notice or care
  • No ordering — datagrams arrive in whatever order the network delivers them
  • No flow control — sender can transmit as fast as it wants
  • Message-oriented — one write() = one datagram = one recv(). Unlike TCP's byte stream, UDP preserves message boundaries
  • Low latency — no head-of-line blocking, no retransmit delays

UDP's "limitations" are actually features for the right use cases. A DNS lookup is one request and one reply; TCP's 3-way handshake would add a full extra round-trip before the query could even be sent, roughly doubling lookup latency. Real-time video works better with the occasional dropped frame than with a stutter caused by TCP retransmission. Gaming needs the most recent position, not a reliable stream of every old position.

📮 Analogy — Postcards vs Registered Letters

TCP is a registered letter: you get confirmation of delivery, the post office retransmits if it gets lost, and letters arrive in order. UDP is a postcard: you write it, drop it in the postbox, and move on. You don't know if it arrived, you don't get a receipt, and if you send ten postcards they might arrive in any order. For a love letter you want confirmation. For a party invite where you're sending hundreds — a lost postcard doesn't matter, and the savings in overhead (no tracking, no confirmation wait) let you send far more, far faster.

⚖️

UDP vs TCP — The Full Comparison

COMPARISON
Property | UDP | TCP
Connection | Connectionless: no setup, no teardown | Connection-oriented: 3-way handshake + 4-way teardown
Reliability | None: send and forget | Guaranteed delivery with retransmission
Ordering | Not guaranteed; the app handles it if needed | Guaranteed in-order delivery
Message boundaries | Preserved: 1 send = 1 recv | Stream: application must frame messages
Header overhead | 8 bytes | 20–60 bytes
Latency | Minimal: no handshake, no wait | At least 1 RTT for handshake before first data
Head-of-line blocking | No: each datagram is independent | Yes: a retransmit stalls all subsequent data
Congestion control | None built-in; app responsible | Built-in: cwnd, slow start, etc.
Broadcast/Multicast | Supported natively | Not supported
Use cases | DNS, DHCP, TFTP, VoIP, video streaming, gaming, NTP, QUIC, SNMP, RADIUS | HTTP/HTTPS, SSH, SMTP, FTP, database connections

💡 Application-level reliability over UDP: Many protocols build their own reliability on top of UDP — QUIC (HTTP/3), DTLS (datagram TLS), game engines (custom ACK systems), and WebRTC all run over UDP but implement their own packet ordering, loss detection, and retransmission tailored to their specific needs. This gives them the best of both worlds: the low overhead and control of UDP plus the reliability features they actually need.

UDP HEADER — 8 BYTES, THE SMALLEST TRANSPORT HEADER

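 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Length             |           Checksum            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Data (UDP payload)                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Row 1: Source Port (16 bits, 0 if unused) · Destination Port (16 bits)
Row 2: Length (16 bits, header + data) · Checksum (16 bits, optional in IPv4)
Data: UDP payload, 0 to 65,527 bytes (65,535 − 8-byte header). Over IPv4 the practical ceiling is 65,507 bytes, because the 20-byte IP header also counts against the 65,535-byte IP packet limit.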
🔍

Every Field Explained

FIELD REFERENCE

Source Port (16 bits)

Identifies the sending application's port. Optional in UDP: it can be set to 0 if the sender doesn't need a reply (broadcast announcements, one-way telemetry). When set, the receiver can use it to send a reply to the correct client port. Client applications use ephemeral source ports assigned by the OS (IANA suggests 49152–65535; Linux defaults to 32768–60999), just like TCP.

Destination Port (16 bits)

Identifies the target application. Standard UDP ports: 53=DNS, 67/68=DHCP, 123=NTP, 161=SNMP, 500=IKEv2 (IPsec key exchange), 4500=IPsec NAT-T, 5060=SIP (VoIP), 443=QUIC (HTTP/3).

Length (16 bits)

Total length of the UDP datagram including the 8-byte header. Minimum value: 8 (header only, zero payload). Maximum: 65,535. In practice, a payload larger than 1472 bytes (MTU 1500 minus 20-byte IP header minus 8-byte UDP header) forces IP fragmentation on a standard Ethernet path, which is generally undesirable. DNS historically limited UDP responses to 512 bytes; EDNS0 lets the client advertise a larger buffer (4096 bytes is a common value).

Checksum (16 bits)

Computed over a pseudo-header (same as TCP: IP src, IP dst, Protocol=17, UDP length) plus the entire UDP header and payload. In IPv4, the checksum is optional: a value of 0x0000 means "no checksum computed", so a computed checksum that happens to be zero is transmitted as 0xFFFF. In IPv6, it is mandatory (IPv6 has no IP header checksum, so the UDP checksum is the only protection above the link layer). Modern NICs offload UDP checksum computation to hardware.
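The checksum rule described here can be sketched in a few lines of C. This is a minimal illustration for IPv4 only, not a production implementation; the function names `sum16` and `udp4_checksum` are made up for this example, and the caller is assumed to pass the UDP header + payload with the checksum field already zeroed.

```c
#include <stddef.h>
#include <stdint.h>

/* Add a byte range to a running sum, reading it as big-endian 16-bit words. */
static uint32_t sum16(uint32_t acc, const uint8_t *p, size_t len) {
    while (len > 1) {
        acc += (uint32_t)((p[0] << 8) | p[1]);
        p += 2;
        len -= 2;
    }
    if (len == 1)
        acc += (uint32_t)(p[0] << 8);   /* odd trailing byte is padded with zero */
    return acc;
}

/* Hypothetical helper: UDP checksum over the IPv4 pseudo-header + datagram.
 * src_ip, dst_ip: 4 bytes each, in network byte order.
 * udp: UDP header + payload, with the checksum field (bytes 6-7) set to zero.
 * Returns the checksum in host order; store it big-endian in bytes 6-7. */
uint16_t udp4_checksum(const uint8_t src_ip[4], const uint8_t dst_ip[4],
                       const uint8_t *udp, uint16_t udp_len) {
    uint32_t acc = 0;
    acc = sum16(acc, src_ip, 4);
    acc = sum16(acc, dst_ip, 4);
    acc += 17;                      /* zero byte + protocol number (UDP) */
    acc += udp_len;                 /* UDP length field of the pseudo-header */
    acc = sum16(acc, udp, udp_len);
    while (acc >> 16)               /* fold carries back into the low 16 bits */
        acc = (acc & 0xFFFF) + (acc >> 16);
    uint16_t csum = (uint16_t)~acc;
    return csum == 0 ? 0xFFFF : csum;   /* 0x0000 is reserved for "no checksum" */
}
```

A receiver can reuse the same routine for verification: running it over a datagram that already carries a correct checksum leaves a folded sum of 0xFFFF, which this sketch reports as 0xFFFF.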

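/* UDP socket programming in C: minimal server that echoes each datagram */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int sock = socket(AF_INET, SOCK_DGRAM, 0);   /* SOCK_DGRAM for UDP */
    if (sock < 0) { perror("socket"); return 1; }

    struct sockaddr_in addr = {0};
    addr.sin_family      = AF_INET;
    addr.sin_port        = htons(53);            /* DNS port; ports below 1024 need root */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind"); close(sock); return 1;
    }

    for (;;) {
        /* Receive a datagram: one call = one complete message */
        char buf[512];
        struct sockaddr_in client;
        socklen_t clen = sizeof(client);
        ssize_t n = recvfrom(sock, buf, sizeof(buf), 0,
                             (struct sockaddr *)&client, &clen);
        if (n < 0) { perror("recvfrom"); break; }
        /* n = bytes in this datagram (cut off at sizeof(buf) if it was larger); no framing needed */

        /* Send reply to same client; a real server would build a response here */
        sendto(sock, buf, (size_t)n, 0,
               (struct sockaddr *)&client, clen);
    }
    close(sock);
    return 0;
}

/* Key: no connect(), no accept(), no listen(); the server keeps no per-client state */
/* One socket can handle multiple clients simultaneously */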

💡 recvfrom vs recv: UDP uses recvfrom() to get both the datagram AND the sender's address in one call. With TCP you call accept() once per connection and get a dedicated socket. With UDP a single socket handles all clients — you use the sender's address from recvfrom to send replies to the right client.

💻

UDP Header in C — Parsing Raw Packets

CODE
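#include <arpa/inet.h>    /* ntohs */
#include <netinet/udp.h>  /* struct udphdr */
#include <stdint.h>
#include <stdio.h>

/* Upper-layer handlers, assumed to be defined elsewhere */
void handle_dns(const uint8_t *payload, uint16_t len);
void handle_dhcp(const uint8_t *payload, uint16_t len);
void handle_ntp(const uint8_t *payload, uint16_t len);
void handle_unknown(const uint8_t *payload, uint16_t len);

/* Parse UDP header from raw packet bytes (ip_payload assumed 2-byte aligned) */
void parse_udp(const uint8_t *ip_payload, uint16_t ip_payload_len) {
    if (ip_payload_len < sizeof(struct udphdr))
        return;                                 /* truncated header */
    const struct udphdr *udp = (const struct udphdr *)ip_payload;

    uint16_t src_port = ntohs(udp->uh_sport);   /* or source */
    uint16_t dst_port = ntohs(udp->uh_dport);   /* or dest */
    uint16_t length   = ntohs(udp->uh_ulen);    /* total datagram length */
    uint16_t checksum = ntohs(udp->uh_sum);     /* 0 = disabled (IPv4 only) */

    /* Length must cover the header and fit inside what IP delivered */
    if (length < 8 || length > ip_payload_len)
        return;

    uint16_t data_len = length - 8;             /* subtract header */
    const uint8_t *payload = ip_payload + 8;    /* payload after 8-byte header */

    printf("UDP: %u → %u  len=%u  cksum=0x%04x\n",
           src_port, dst_port, length, checksum);

    /* Dispatch to upper-layer handlers */
    switch (dst_port) {
        case 53:  handle_dns(payload, data_len);  break;
        case 67:
        case 68:  handle_dhcp(payload, data_len); break;
        case 123: handle_ntp(payload, data_len);  break;
        default:  handle_unknown(payload, data_len);
    }
}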

/* In VPP: UDP header accessed via vlib_buffer_get_current() */
/* after ip4-input has advanced past IP header */
udp_header_t *udp = vlib_buffer_get_current(b0);
u16 dst_port = clib_net_to_host_u16(udp->dst_port);

KEY UDP-BASED PROTOCOLS — WHY EACH CHOSE UDP

DNS (UDP 53): A query and its response usually fit in one datagram each, and skipping the handshake lowers latency. Falls back to TCP when a response is truncated (classically above 512 bytes; EDNS0 raises the UDP limit, commonly to 4096 bytes). TCP is also used for zone transfers.
DHCP (UDP 67/68): The client has no IP address yet, so it cannot complete a TCP handshake. The client sends from port 68 to the server on port 67, using broadcast (255.255.255.255) to reach the server before an address is assigned.
NTP (UDP 123): Time sync needs a single packet exchange. TCP overhead and retransmit delays would distort the precision timing calculation. Each packet carries timestamps.
SNMP (UDP 161/162): Simple polling protocol. The manager queries agents on 161; agents send traps to the manager on 162. Low overhead for network monitoring. SNMPv3 adds authentication and encryption, still over UDP.
TFTP (UDP 69): Trivial FTP, used in PXE boot and router firmware upgrades. Deliberately simple: implements its own stop-and-wait reliability over UDP. No authentication.
SIP / VoIP (UDP 5060): Voice packets are time-sensitive. A retransmitted voice packet arrives too late to be useful, so it is better to drop it and let the codec conceal the gap. Real-time media uses RTP over UDP.
QUIC / HTTP/3 (UDP 443): QUIC implements its own reliability, ordering, and congestion control over UDP, giving each stream TCP-like guarantees without TCP's head-of-line blocking across streams.
IKEv2 / IPsec (UDP 500 / 4500): IPsec key exchange (IKEv2) runs over UDP 500. When NAT is present, it switches to port 4500 (NAT-T, NAT Traversal). ESP traffic is then also encapsulated in UDP for NAT compatibility.
📺

Real-Time Media — Why UDP Fits Video and Audio

MEDIA STREAMING

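Real-time video and audio (calls, conferencing, low-latency live streams) have requirements that make UDP a much better fit than TCP; buffered on-demand streaming such as HLS or DASH is the exception and normally runs over HTTP. For real-time media: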

  • Timeliness over completeness — a voice packet that arrives 300ms late is worse than a dropped packet. Modern codecs (Opus, H.264) handle packet loss with error concealment — the quality degrades gracefully. TCP's retransmission would cause a stutter heard by the user.
  • No head-of-line blocking — with TCP, if one packet is lost, all subsequent packets are held in the buffer until the missing one arrives (or is retransmitted). For video this means the entire stream freezes. With UDP each packet is independent — a loss is just a momentary artefact.
  • Sender controls pacing — video encoders produce frames at a known rate. With UDP the sender decides exactly when to send each packet, matching the media timing. TCP's window management can cause bursts and gaps.

RTP (Real-time Transport Protocol) — the standard protocol for audio/video over UDP. Adds: sequence numbers (for ordering/loss detection), timestamp (for playback synchronisation), SSRC (identifies the media source). RTCP (RTP Control Protocol) provides quality feedback — packet loss rate, jitter, round-trip delay — used to adapt codec bitrate.

/* RTP Header (12 bytes) over UDP */
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |       Sequence Number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Synchronization Source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            Media payload (audio/video encoded data)           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

/* PT = Payload Type: 0=PCMU audio, 8=PCMA, 96-127=dynamic (H.264, Opus) */
/* Stack: Ethernet → IP → UDP → RTP → H.264 video frames */
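
The 12-byte layout above maps directly onto a small parser. The following is a minimal sketch, assuming the buffer starts at the first RTP byte (that is, just after the UDP header); the `rtp_info_t` type and `rtp_parse` name are invented for this example and do not come from any particular library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the decoded fixed header fields. */
typedef struct {
    uint8_t  version, padding, extension, csrc_count, marker, payload_type;
    uint16_t seq;
    uint32_t timestamp, ssrc;
    size_t   payload_offset;   /* where the media payload starts in the buffer */
} rtp_info_t;

/* Returns 0 on success, -1 if the buffer is too short or not RTP version 2. */
int rtp_parse(const uint8_t *buf, size_t len, rtp_info_t *out) {
    if (len < 12)
        return -1;
    out->version      = buf[0] >> 6;
    out->padding      = (buf[0] >> 5) & 1;
    out->extension    = (buf[0] >> 4) & 1;
    out->csrc_count   = buf[0] & 0x0F;
    out->marker       = buf[1] >> 7;
    out->payload_type = buf[1] & 0x7F;
    if (out->version != 2)
        return -1;
    out->seq       = (uint16_t)((buf[2] << 8) | buf[3]);
    out->timestamp = ((uint32_t)buf[4] << 24) | ((uint32_t)buf[5] << 16) |
                     ((uint32_t)buf[6] << 8)  |  (uint32_t)buf[7];
    out->ssrc      = ((uint32_t)buf[8] << 24) | ((uint32_t)buf[9] << 16) |
                     ((uint32_t)buf[10] << 8) |  (uint32_t)buf[11];

    size_t off = 12 + 4u * out->csrc_count;       /* skip optional CSRC list */
    if (off > len)
        return -1;
    if (out->extension) {                         /* skip one header extension */
        if (off + 4 > len)
            return -1;
        size_t ext_words = ((size_t)buf[off + 2] << 8) | buf[off + 3];
        off += 4 + 4 * ext_words;
        if (off > len)
            return -1;
    }
    out->payload_offset = off;
    return 0;
}
```

A receiver would typically compare `seq` against the previous packet to detect loss or reordering and use `timestamp` to schedule playout; both of those steps are left out of this sketch.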

UDP IN AN NGFW — STATELESS TRACKING AND COMMON THREATS

🛡️

How NGFWs Handle UDP — Pseudo-Stateful Tracking

STATEFUL UDP

UDP has no handshake and no connection state — every datagram is independent. So how does a stateful NGFW track UDP "sessions"? By treating a group of UDP datagrams between the same 5-tuple within a timeout window as a session, even though UDP itself has no concept of sessions.

/* UDP pseudo-session in NGFW conntrack */
typedef struct {
    ip4_address_t  src_ip, dst_ip;
    uint16_t       src_port, dst_port;
    uint8_t        proto;                /* 17 = UDP */

    uint64_t       first_seen;           /* timestamp of first datagram */
    uint64_t       last_seen;            /* updated on each datagram */
    uint64_t       bytes_fwd;            /* client → server bytes */
    uint64_t       bytes_rev;            /* server → client bytes */
    uint8_t        state;                /* NEW / ESTABLISHED / TIMEOUT */
    uint32_t       timeout_sec;          /* idle timeout */
} udp_session_t;

/* UDP session lifecycle */
First datagram from client → create entry (state=NEW), apply policy
Reply datagram from server  → find entry by reversed 5-tuple, state=ESTABLISHED
No more datagrams for 30s  → sweep timer removes entry (default UDP timeout)

/* Different timeouts for different UDP protocols */
DNS:          5  seconds   /* DNS is one query + one reply */
DHCP:         30 seconds
NTP:          30 seconds
VoIP/RTP:    180 seconds   /* ongoing media stream */
Generic UDP:  30 seconds   /* catch-all default */

💡 The reply problem: When a client sends a DNS query, the NGFW sees the outbound UDP packet and creates a session entry. When the DNS server replies, the NGFW sees a datagram with reversed 5-tuple — it must allow this even though no "connection" was established. This is handled by matching the reversed 5-tuple against the existing session table entry. Without this, return traffic would be blocked.
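
The reversed 5-tuple match and idle timeout can be sketched as a tiny lookup routine. This is an illustration under stated assumptions, not how any specific NGFW does it: the table is a fixed array with a linear scan, only inside-initiated flows may create entries, the protocol field is implied (UDP), and all names (`udp_table_t`, `udp_track`, the constants) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SESSIONS     64
#define UDP_IDLE_TIMEOUT 30          /* seconds; the generic default quoted above */

enum { ST_FREE = 0, ST_NEW = 1, ST_ESTABLISHED = 2 };
enum { UDP_DROP = 0, UDP_ALLOW = 1 };

typedef struct {
    uint32_t src_ip, dst_ip;         /* initiator and responder, host byte order */
    uint16_t src_port, dst_port;
    uint64_t last_seen;              /* seconds */
    uint8_t  state;
} udp_sess_t;

typedef struct { udp_sess_t s[MAX_SESSIONS]; } udp_table_t;

/* Decide whether a datagram is allowed; `now` must never go backwards. */
int udp_track(udp_table_t *t, uint32_t src_ip, uint16_t src_port,
              uint32_t dst_ip, uint16_t dst_port,
              int from_inside, uint64_t now) {
    udp_sess_t *free_slot = NULL;
    for (int i = 0; i < MAX_SESSIONS; i++) {
        udp_sess_t *e = &t->s[i];
        if (e->state != ST_FREE && now - e->last_seen > UDP_IDLE_TIMEOUT)
            e->state = ST_FREE;                       /* idle entry expires */
        if (e->state == ST_FREE) {
            if (!free_slot) free_slot = e;
            continue;
        }
        int fwd = e->src_ip == src_ip && e->src_port == src_port &&
                  e->dst_ip == dst_ip && e->dst_port == dst_port;
        int rev = e->src_ip == dst_ip && e->src_port == dst_port &&
                  e->dst_ip == src_ip && e->dst_port == src_port;
        if (fwd || rev) {
            if (rev) e->state = ST_ESTABLISHED;       /* reply seen */
            e->last_seen = now;
            return UDP_ALLOW;
        }
    }
    if (!from_inside || !free_slot)
        return UDP_DROP;              /* unsolicited inbound, or table full */
    *free_slot = (udp_sess_t){ .src_ip = src_ip, .dst_ip = dst_ip,
                               .src_port = src_port, .dst_port = dst_port,
                               .last_seen = now, .state = ST_NEW };
    return UDP_ALLOW;
}
```

A real conntrack table would hash the 5-tuple instead of scanning, run policy before creating the entry, and pick the timeout per protocol as in the list above.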

⚠️

UDP-Based Attacks and NGFW Defences

SECURITY
Attack | Mechanism | NGFW Defence
UDP Flood | Attacker sends a massive volume of UDP datagrams to random ports, saturating bandwidth and forcing the server to send ICMP Port Unreachable for each | Rate-limit UDP per source IP per second. Drop excessive datagrams. BPF/XDP-based ingress rate limiting
UDP Amplification (DRDoS) | Attacker sends small spoofed requests to DNS/NTP/SSDP servers with the victim's IP as source. The servers send much larger replies to the victim: a DNS query of well under 100 bytes can draw a response of around 3000 bytes, amplifying traffic by tens of times | Block spoofed source IPs (BCP38/uRPF). Rate-limit DNS response size. Disable open resolvers. Block the NTP monlist command
DNS Amplification | Specific case of amplification using ANY queries to open resolvers | Block ANY query responses >512B. Respond with the TC (truncated) flag set to force TCP fallback for large answers
UDP Port Scan | Attacker sends UDP datagrams to all ports; closed ports return ICMP Port Unreachable, open ports return nothing or a response | Rate-limit ICMP Port Unreachable generation. Track scan patterns (many different dst ports from the same src)
Fragmented UDP | Attacker sends fragmented UDP to hide payload content from stateless inspection | Reassemble all IP fragments before L4/L7 inspection
TFTP Abuse | TFTP has no authentication: arbitrary file read/write if exposed | Block UDP 69 at the internet perimeter. Only allow internally for PXE boot from specific subnets

ICMP — THE NETWORK'S DIAGNOSTIC AND ERROR SYSTEM

📨

What ICMP Is — The Network's Nervous System

OVERVIEW

ICMP (Internet Control Message Protocol, RFC 792) is IP's built-in error reporting and diagnostic protocol. It travels inside IP packets (Protocol number = 1) but is not a transport layer protocol — it has no ports, no concept of connections or streams. Every network device generates and consumes ICMP messages.

ICMP enables network troubleshooting tools like ping and traceroute, and also carries critical error notifications that the network depends on for correct operation (Path MTU Discovery, Redirect, etc.).

ICMP message structure: Every ICMP message has an 8-byte fixed header:

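 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |           Checksum            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Type-specific data                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Payload (variable) ...                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

ICMP header: Type (8 bits) · Code (8 bits) · Checksum (16 bits) · Type-specific data (32 bits)
Payload: variable data (depends on Type; often includes the original IP header + 8 bytes of the original payload)
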
  • Type — identifies the ICMP message category (0=Echo Reply, 3=Unreachable, 8=Echo Request, 11=Time Exceeded, etc.)
  • Code — sub-type within the Type. Type 3 has 16 different codes (0=Net Unreach, 1=Host Unreach, 3=Port Unreach, 4=Frag Needed...)
  • Checksum — covers entire ICMP message
  • Type-specific data — varies: Echo uses Identifier+Sequence, Unreachable has unused field, Redirect has gateway address
  • Payload — error messages include the original IP header + first 8 bytes of original payload (so sender can identify which packet caused the error)
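
The header fields listed above are enough to assemble an Echo Request by hand. The sketch below builds one into a caller-supplied buffer and fills in the checksum; it is a minimal illustration, the names `inet_checksum` and `icmp_build_echo` are made up for this example, and actually sending the bytes (for instance over a raw socket) is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Internet checksum (RFC 1071 style) over a byte buffer; host-order result. */
uint16_t inet_checksum(const uint8_t *data, size_t len) {
    uint32_t acc = 0;
    while (len > 1) {
        acc += (uint32_t)((data[0] << 8) | data[1]);
        data += 2;
        len -= 2;
    }
    if (len == 1)
        acc += (uint32_t)(data[0] << 8);      /* pad odd byte with zero */
    while (acc >> 16)
        acc = (acc & 0xFFFF) + (acc >> 16);   /* fold carries */
    return (uint16_t)~acc;
}

/* Build an ICMP Echo Request (Type 8, Code 0) into `out`.
 * Returns the total message length, or 0 if `out` is too small. */
size_t icmp_build_echo(uint8_t *out, size_t cap, uint16_t id, uint16_t seq,
                       const uint8_t *payload, size_t payload_len) {
    size_t total = 8 + payload_len;
    if (cap < total)
        return 0;
    out[0] = 8;                       /* Type: Echo Request */
    out[1] = 0;                       /* Code */
    out[2] = 0;                       /* Checksum is zero while summing */
    out[3] = 0;
    out[4] = (uint8_t)(id >> 8);      /* Identifier */
    out[5] = (uint8_t)(id & 0xFF);
    out[6] = (uint8_t)(seq >> 8);     /* Sequence number */
    out[7] = (uint8_t)(seq & 0xFF);
    if (payload_len)
        memcpy(out + 8, payload, payload_len);
    uint16_t c = inet_checksum(out, total);   /* covers the entire message */
    out[2] = (uint8_t)(c >> 8);
    out[3] = (uint8_t)(c & 0xFF);
    return total;
}
```

Verification uses the same function: running `inet_checksum` over a complete, correct message yields 0.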
🔬

Ping — How It Works Internally

PING

Ping is the simplest network diagnostic: send an ICMP Echo Request, receive an ICMP Echo Reply, measure round-trip time. Simple — but the implementation details matter.

/* ICMP Echo Request (Type 8, Code 0) */
Type:       8
Code:       0
Checksum:   [computed]
Identifier: [process ID — matches request to reply if multiple pings running]
Sequence:   [increments with each ping — 1, 2, 3...]
Data:       [arbitrary payload — default 56 bytes on Linux = 64B ICMP total]

/* ICMP Echo Reply (Type 0, Code 0) */
Type:       0
Code:       0
Checksum:   [computed]
Identifier: [same as request]
Sequence:   [same as request]
Data:       [same bytes echoed back]

/* ping command usage */
ping -c 4 8.8.8.8           # send 4 pings
ping -s 1400 8.8.8.8        # send 1400-byte payload (test MTU)
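sudo ping -f -s 1472 <lab-host>   # flood ping at the largest unfragmented size; needs root, use only on hosts you control
ping -M do -s 1473 8.8.8.8  # force DF=1; 1473 + 28 = 1501 bytes exceeds a 1500 MTU, so expect "Frag needed" or a local "message too long" error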
ping6 2001:4860:4860::8888  # IPv6 ping (ICMPv6 Type 128/129)

/* Interpreting ping output */
64 bytes from 8.8.8.8: icmp_seq=1 ttl=117 time=12.4 ms
# ttl=117   → TTL at receiver (started at some initial value, decremented once per hop)
# time=12.4 → round-trip time in ms

/* TTL tricks */
# TTL=117 → probably started at 128 (typical Windows default) → 11 hops away
# TTL=52  → probably started at 64  (typical Linux default)   → 12 hops away
# TTL=245 → probably started at 255 (typical router default)  → 10 hops away
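
The TTL trick above is simple arithmetic and can be written down directly. This sketch assumes the sender used one of the three common initial values (64, 128, 255) and picks the smallest one that is not below the observed TTL; the function names are hypothetical and the result is only a guess, since any host can be configured with a different starting TTL.

```c
#include <stdint.h>

/* Guess the initial TTL: the smallest common default >= the observed value. */
int ttl_guess_initial(uint8_t observed) {
    if (observed <= 64)
        return 64;
    if (observed <= 128)
        return 128;
    return 255;
}

/* Estimated number of router hops between the sender and us. */
int ttl_estimate_hops(uint8_t observed) {
    return ttl_guess_initial(observed) - observed;
}
```

For the three examples in the listing, `ttl_estimate_hops` returns 11, 12, and 10.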

ICMP TYPES — COMPLETE REFERENCE WITH CONTEXT

📋

All ICMP Types — What Each Does

REFERENCE
Type | Code | Name | Generated By | Purpose and Details
0 | 0 | Echo Reply | Destination host | Response to Echo Request (ping). Echoes the exact Identifier, Sequence Number, and data payload. Round-trip time is calculated from the timestamp carried in the data.
3 | 0 | Net Unreachable | Router | No route to the destination network in the routing table.
3 | 1 | Host Unreachable | Router | Route to the network exists but the specific host cannot be reached, for example because ARP for the host failed.
3 | 2 | Protocol Unreachable | Destination | Host doesn't support the L4 protocol (IP Protocol field) in the packet.
3 | 3 | Port Unreachable | Destination | No process is listening on the destination UDP port. Key for UDP port scanning detection.
3 | 4 | Fragmentation Needed | Router | Critical for PMTUD. Packet too large for the outgoing link and DF=1 set. Message includes the MTU of the outgoing link. Must never be filtered at the NGFW.
3 | 5 | Source Route Failed | Router | Strict source routing failed: the specified route is not available.
3 | 9 | Dest Net Admin Prohibited | Router/FW | Firewall/ACL denied the packet to this destination network. Sent when the firewall wants to inform the sender.
3 | 10 | Dest Host Admin Prohibited | Router/FW | Firewall denied the packet to a specific host.
3 | 13 | Communication Admin Prohibited | Router/FW | Generic "blocked by admin policy"; the most common NGFW rejection response.
4 | 0 | Source Quench | Router/Host | Deprecated (RFC 6633). Originally used to signal congestion ("slow down"). Replaced by ECN and TCP congestion control. Drop these if received.
5 | 0 | Redirect (Network) | Router | A better route exists for this destination network via a different gateway on the same segment.
5 | 1 | Redirect (Host) | Router | Better route for this specific host.
5 | 2 | Redirect (TOS+Network) | Router | Better route for this TOS+network combination.
5 | 3 | Redirect (TOS+Host) | Router | Better route for this TOS+host combination.
8 | 0 | Echo Request | Any host | The "ping" packet. Destination should respond with Type 0 Echo Reply. Contains Identifier and Sequence Number for tracking.
9 | 0 | Router Advertisement | Router | Part of the ICMP Router Discovery Protocol (IRDP). Router announces itself to hosts. Rarely used today; hosts normally learn their default gateway via DHCP.
10 | 0 | Router Solicitation | Host | Host asks "any routers out there?". Triggers a Router Advertisement response. Mostly replaced by DHCP for gateway discovery.
11 | 0 | TTL Exceeded in Transit | Router | The traceroute mechanism. Router decremented TTL to 0 and discarded the packet. Returns the original IP header + first 8 bytes of the original payload so the sender knows which packet was dropped.
11 | 1 | Fragment Reassembly Timeout | Destination | Not all fragments arrived before the reassembly timer expired. All collected fragments are discarded.
12 | 0–2 | Parameter Problem | Router/Host | IP header field has an invalid value. Code 0: pointer to the offending byte. Code 1: missing required option. Code 2: bad length.
13 | 0 | Timestamp | Any host | Used for clock synchronisation: requests timestamps from the target (the reply is Type 14). Largely replaced by NTP (UDP 123).
17 | 0 | Address Mask Request | Host | Host asks for its subnet mask. Deprecated; use DHCP or manual configuration.
🗺️

Traceroute — Complete Internal Mechanism

TRACEROUTE

Traceroute is one of the most elegant network tools — it discovers every router hop between you and a destination using nothing but ICMP Type 11 and TTL manipulation. Understanding it deeply tells you a lot about how routing works in practice.

/* Linux traceroute algorithm (UDP mode, default) */
for ttl in 1..max_hops:
    send 3 UDP packets:
        IP: TTL = ttl, dst = target
        UDP: dst_port = 33434 + (ttl-1)*3 + probe  # probe = 0..2, so the port increments with every probe
    wait for response:
        ICMP Type 11 Code 0 → TTL expired at THIS router
        ICMP Type 3 Code 3  → Port Unreachable from TARGET (destination reached)
        no reply within timeout → print "* * *"

    print: ttl, router_ip (from ICMP source), 3 RTTs

/* Windows tracert algorithm (ICMP mode) */
for ttl in 1..max_hops:
    send 3 ICMP Echo Request: TTL = ttl
    ICMP Type 11 Code 0 → intermediate router
    ICMP Type 0 → destination replied (done)

/* mtr (my traceroute) — continuous real-time version */
mtr --report --report-cycles 10 8.8.8.8

/* Interpreting traceroute anomalies */
Hop 5: * * *           # ICMP blocked or rate-limited; does NOT mean a broken path
                        # subsequent hops may show fine
Hop 7: 192.168.x.x    # private address: NAT, a misconfigured router, or a provider using private space internally
RTT spike at hop 8     # congestion at or beyond hop 8, but only if later hops show the same increase
RTT lower at hop 9     # often hop 8 merely deprioritised its ICMP replies; an asymmetric return path can also cause this
Same IP twice          # routing loop (rare with modern routing protocols)
Hop 3 → Hop 5 jump    # some hops don't respond to ICMP and are skipped

💡 Why UDP for Linux traceroute? By using UDP to high port numbers (33434+), Linux traceroute gets a reliable "destination reached" signal — when the packet finally arrives at the target with a valid TTL, the host returns ICMP Port Unreachable (nobody listens on port 33434+). If ICMP Echo Requests were used, the target might silently discard them if ping is blocked — giving a false "not reached" result.
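
The probe numbering and the reply handling described in this section can be sketched as two small helpers. This is a simplified model of the algorithm as presented here, not the source of any real traceroute; the names `tr_probe_port`, `tr_classify`, and the `TR_*` constants are invented for the example, and treating the "administratively prohibited" codes as a separate outcome is an added assumption.

```c
#include <stdint.h>

#define TR_BASE_PORT      33434
#define TR_PROBES_PER_HOP 3

/* Destination port for probe number `probe` (0..2) sent with TTL `ttl` (1..). */
uint16_t tr_probe_port(int ttl, int probe) {
    return (uint16_t)(TR_BASE_PORT + (ttl - 1) * TR_PROBES_PER_HOP + probe);
}

typedef enum {
    TR_HOP,            /* intermediate router answered: keep increasing TTL */
    TR_DEST_REACHED,   /* target answered: stop */
    TR_BLOCKED,        /* a filter rejected the probe */
    TR_OTHER           /* anything else: ignore or report as-is */
} tr_result_t;

/* Interpret the ICMP type/code of a reply to a UDP probe. */
tr_result_t tr_classify(uint8_t icmp_type, uint8_t icmp_code) {
    if (icmp_type == 11 && icmp_code == 0)
        return TR_HOP;                       /* TTL Exceeded in Transit */
    if (icmp_type == 3 && icmp_code == 3)
        return TR_DEST_REACHED;              /* Port Unreachable from target */
    if (icmp_type == 3 &&
        (icmp_code == 9 || icmp_code == 10 || icmp_code == 13))
        return TR_BLOCKED;                   /* administratively prohibited */
    return TR_OTHER;
}
```

Because every probe uses a distinct destination port, the UDP header quoted inside the ICMP error is enough to match a reply to the probe that caused it.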

IGMP AND IP MULTICAST — ONE-TO-MANY EFFICIENT DELIVERY

📡

What Multicast Is and Why It Matters

MULTICAST CONCEPT

Multicast allows one sender to efficiently deliver to multiple receivers without sending a separate copy to each — the network itself handles replication. Compare with:

Without multicast (unicast to N receivers)

Sender sends N identical copies. Network carries N×traffic. At 1000 receivers watching a live video: 1000 separate streams. Server bandwidth: 1000 × 4 Mbps = 4 Gbps.

With multicast (one multicast group)

Sender sends 1 copy to multicast address. Routers replicate only where paths diverge. At 1000 receivers: 1 stream until last router, then per-branch copies. Server bandwidth: 1 × 4 Mbps = 4 Mbps.

IP multicast address range: 224.0.0.0/4 (Class D — first 4 bits are 1110). Routers forward multicast packets only to interfaces with interested receivers. Ethernet multicast uses a MAC prefix of 01:00:5E:xx:xx:xx (lower 23 bits of IP multicast address mapped to MAC).
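
The IP-to-MAC mapping can be made concrete with a short helper. This is a minimal sketch; `mcast_ip_to_mac` is a hypothetical name, and the group address is assumed to be passed in host byte order.

```c
#include <stdint.h>

/* Map an IPv4 multicast group to its Ethernet multicast MAC address.
 * group: IPv4 address in host byte order, e.g. 0xE0010203 for 224.1.2.3.
 * Returns 0 and fills mac[6], or -1 if the address is not in 224.0.0.0/4. */
int mcast_ip_to_mac(uint32_t group, uint8_t mac[6]) {
    if ((group >> 28) != 0xE)               /* first 4 bits must be 1110 */
        return -1;
    mac[0] = 0x01;
    mac[1] = 0x00;
    mac[2] = 0x5E;
    mac[3] = (uint8_t)((group >> 16) & 0x7F);   /* only the low 23 bits survive */
    mac[4] = (uint8_t)((group >> 8) & 0xFF);
    mac[5] = (uint8_t)(group & 0xFF);
    return 0;
}
```

Since 5 of the 28 group bits are discarded, 32 different groups share each MAC address: 224.1.2.3 and 225.129.2.3 both map to 01:00:5E:01:02:03, so a host may still need to filter unwanted groups at the IP layer.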

Important multicast addresses:

  • 224.0.0.1 — All Hosts on this subnet (local link only)
  • 224.0.0.2 — All Routers on this subnet
  • 224.0.0.5 — OSPF All Routers
  • 224.0.0.6 — OSPF Designated Routers
  • 224.0.0.9 — RIPv2 routers
  • 224.0.0.18 — VRRP
  • 239.0.0.0/8 (administratively scoped, RFC 2365): private multicast; 239.192.0.0/14 within it is the organisation-local scope
📋

IGMP — Internet Group Management Protocol

IGMP

IGMP (RFC 3376, version 3) is how hosts tell their local router "I want to receive traffic for multicast group 224.x.x.x". Routers use this to decide which interfaces need multicast traffic forwarded to them.

IGMP Version | Key Feature | Message Types
IGMPv1 (RFC 1112) | Basic group membership. Leave by timeout only. | Membership Query, Membership Report
IGMPv2 (RFC 2236) | Adds explicit Leave Group message. Faster leave processing. | + Leave Group, Group-Specific Query
IGMPv3 (RFC 3376) | Source filtering, which enables source-specific multicast (SSM). Receiver can specify which sources to accept from. | + Group-and-Source-Specific Query

/* IGMP exchange — host joins multicast group */
1. Host wants to join 224.1.2.3:
   sends IGMP Membership Report → dst IP: 224.1.2.3 (the group itself) in IGMPv1/v2; IGMPv3 reports go to 224.0.0.22
   Router sees report → starts forwarding 224.1.2.3 to this interface

2. Router sends periodic Membership Query → dst IP: 224.0.0.1 (all hosts)
   "Who still wants which groups?"
   Hosts reply with their active groups

3. Host wants to leave:
   sends IGMP Leave Group (IGMPv2) → dst IP: 224.0.0.2 (all routers); IGMPv3 signals the leave with a report to 224.0.0.22
   Router sends Group-Specific Query to confirm no remaining members
   If no reply → stops forwarding to this interface

/* IGMP Snooping — switches track IGMP to avoid flooding */
# Without IGMP snooping: multicast = flood to all ports (like broadcast)
# With IGMP snooping: switch tracks which ports have interested hosts
#   → forwards multicast only to ports with IGMP reports
#   → dramatically reduces unnecessary traffic on switched networks

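/* Linux: join a multicast group from a socket (needs <arpa/inet.h>, <netinet/in.h>, <stdio.h>) */
struct ip_mreq mreq;
mreq.imr_multiaddr.s_addr = inet_addr("224.1.2.3");   /* group to join */
mreq.imr_interface.s_addr = htonl(INADDR_ANY);        /* let the kernel pick the interface */
if (setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq)) < 0)
    perror("IP_ADD_MEMBERSHIP");
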
Source (224.1.2.3 stream)
  └─ Core Router (replicates at branch)
       ├─ Edge Router A (2 members)
       ├─ Edge Router B (3 members)
       └─ Edge Router C (0 members) ← no traffic here (no members)

ICMP IN AN NGFW — WHAT TO ALLOW, WHAT TO BLOCK, AND WHY

🛡️

The Wrong Way: Block All ICMP

COMMON MISTAKE

Many firewall administrators, in an attempt to "harden" the network, block all ICMP traffic. This is a mistake that causes subtle, hard-to-diagnose problems:

  • Broken PMTUD: blocking ICMP Type 3 Code 4 (Fragmentation Needed) breaks Path MTU Discovery. TCP connections complete the handshake and exchange small packets normally, but silently stall as soon as they send full-sized segments. Users see "web pages partially load" or "large file downloads hang at X%".
  • No traceroute — blocks network troubleshooting, makes diagnosing outages much harder for your team and your customers.
  • Broken IPv6 — ICMPv6 is fundamental to IPv6 operation (NDP, RA, Packet Too Big). Blocking all ICMPv6 breaks IPv6 connectivity entirely.

Correct NGFW ICMP Policy

BEST PRACTICE
ICMP Type/Code | Direction | Action | Reason
Type 3, Code 4 (Frag Needed) | Both | ALWAYS ALLOW | PMTUD; blocking breaks large TCP connections silently
Type 11 (TTL Exceeded) | Inbound | ALLOW | Return traffic for traceroute, debugging
Type 0 (Echo Reply) | Inbound | Allow (stateful) | Return traffic for outbound pings from internal hosts
Type 3, Code 0–3 (Unreachable) | Inbound | Allow (stateful) | Error responses for established connections
Type 8 (Echo Request) | Inbound from internet | Block or rate-limit | Reduces attack surface, prevents network mapping. Allow from trusted sources for monitoring.
Type 5 (Redirect) | Inbound from internet | BLOCK | ICMP Redirect attacks can reroute traffic through attacker's host
Type 9 (Router Advert) | Inbound from internet | BLOCK | Rogue router advertisement attacks
Type 4 (Source Quench) | Both | Drop | Deprecated (RFC 6633); no modern implementation uses this
All ICMP types | Outbound | Allow | Internal users need full diagnostic capability
Type 3, Code 3 (Port Unreachable) | Outbound | Allow | Legitimate response to UDP packets on closed ports
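
The policy table can be read as a small decision function. The sketch below encodes the rows above; the enum and function names are hypothetical, "inbound" stands in for "inbound from internet", and two choices are assumptions rather than part of the table: inbound types the table does not list are dropped by default, and Echo Request is rate-limited rather than blocked.

```c
#include <stdint.h>

typedef enum { DIR_INBOUND, DIR_OUTBOUND } icmp_dir_t;

typedef enum {
    ACT_ALLOW,            /* pass unconditionally */
    ACT_ALLOW_STATEFUL,   /* pass only if it matches an existing session */
    ACT_RATE_LIMIT,       /* pass subject to a per-source rate limit */
    ACT_DROP
} icmp_action_t;

icmp_action_t icmp_policy(uint8_t type, uint8_t code, icmp_dir_t dir) {
    if (type == 3 && code == 4)
        return ACT_ALLOW;                 /* Frag Needed: both directions */
    if (type == 4)
        return ACT_DROP;                  /* Source Quench: deprecated */
    if (dir == DIR_OUTBOUND)
        return ACT_ALLOW;                 /* full diagnostics for internal users */

    switch (type) {                       /* inbound from here on */
    case 11: return ACT_ALLOW;            /* TTL Exceeded: traceroute replies */
    case 0:  return ACT_ALLOW_STATEFUL;   /* Echo Reply */
    case 3:  return code <= 3 ? ACT_ALLOW_STATEFUL : ACT_DROP;
    case 8:  return ACT_RATE_LIMIT;       /* Echo Request */
    case 5:                               /* Redirect */
    case 9:  return ACT_DROP;             /* Router Advertisement */
    default: return ACT_DROP;             /* assumption: default-deny inbound */
    }
}
```

Checking the Type 3 Code 4 and Type 4 rows before the direction test keeps the "Both" rows from being shadowed by the blanket outbound allow.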

⚠️ ICMP rate limiting is better than blocking. Rather than blocking ICMP Type 8 (Echo Request) entirely, rate-limit it: allow 10 pings per second from any source. This allows legitimate connectivity testing and monitoring while preventing ICMP flood attacks and network mapping. Most enterprise NGFWs implement rate limiting per source IP.
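
One common way to implement such a limit is a token bucket, sketched below for a single source. This is an illustrative choice rather than a description of any vendor's mechanism; the `token_bucket_t` type and `tb_*` names are hypothetical, time is passed in as seconds by the caller, and a real firewall would keep one bucket per source IP.

```c
/* Token bucket: `rate` tokens are added per second, up to `burst` tokens. */
typedef struct {
    double tokens;   /* tokens currently available */
    double rate;     /* refill rate, tokens per second */
    double burst;    /* bucket capacity */
    double last;     /* time of the last refill, in seconds */
} token_bucket_t;

void tb_init(token_bucket_t *tb, double rate, double burst, double now) {
    tb->tokens = burst;          /* start full so a short burst is accepted */
    tb->rate   = rate;
    tb->burst  = burst;
    tb->last   = now;
}

/* Returns 1 if the packet may pass (one token consumed), 0 if it is dropped. */
int tb_allow(token_bucket_t *tb, double now) {
    double elapsed = now - tb->last;
    if (elapsed > 0) {
        tb->tokens += elapsed * tb->rate;     /* refill for the time passed */
        if (tb->tokens > tb->burst)
            tb->tokens = tb->burst;
        tb->last = now;
    }
    if (tb->tokens >= 1.0) {
        tb->tokens -= 1.0;
        return 1;
    }
    return 0;
}
```

With `tb_init(&tb, 10.0, 10.0, now)`, a source can send 10 Echo Requests back to back, after which further pings pass only at the sustained rate of 10 per second.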

🔬

ICMP-Based Attacks

ATTACK TYPES
Attack | ICMP Type Used | Mechanism | NGFW Defence
Ping Flood | Type 8 (Echo Request) | Attacker sends thousands of pings/second, overwhelming the target's CPU and bandwidth as it processes replies | Rate-limit ICMP per source IP. Block large ICMP payloads from the internet
Smurf Attack | Type 8 spoofed to broadcast | Attacker sends Echo Requests to a broadcast address with the victim's IP as source. All hosts on the segment reply to the victim. Amplification ×N hosts | Block directed broadcasts (RFC 2644). BCP38 anti-spoofing
Ping of Death | Type 8 oversized | Sends ICMP fragments that reassemble to more than 65,535 bytes. The reassembly overflow crashed old OSes. Mostly historical. | Modern OSes are immune. Still filter at the NGFW for defence-in-depth
ICMP Redirect Attack | Type 5 | Forged Redirect message tricks a host into routing traffic through the attacker's host (man-in-the-middle) | Block ICMP Type 5 from external sources
ICMP Tunnelling | Type 8/0 (Echo) | Data exfiltration by encoding payload in the "data" field of ping packets. Bypasses DNS/HTTP-based content filters | Deep inspect the ICMP data field. Detect non-standard ICMP payloads (e.g., data that does not match the usual ping pattern, large payloads, high frequency)
OS Fingerprinting | Type 8 + responses | Different OSes have slightly different ICMP behaviours (TTL starting values, flags and quoted data in unreachable messages). Used to identify the OS without opening a connection | Normalise ICMP responses (strip OS-identifying quirks)
LAB 1

Build a UDP Echo Server and Analyse Traffic

Objective: Write a UDP echo server in C, send datagrams to it, capture the traffic, and compare the overhead profile against TCP. Understand message-boundary preservation and stateless operation.

1. Write a UDP echo server in C: create an AF_INET SOCK_DGRAM socket, bind to port 9000, loop on recvfrom() and sendto() the data back. Compile and run: gcc -o udp_echo server.c && ./udp_echo.
2. Send test datagrams with netcat: echo "Hello UDP" | nc -u 127.0.0.1 9000 (add -w 1 if your netcat does not exit on its own). Send multiple: for i in $(seq 1 5); do echo "msg $i" | nc -u 127.0.0.1 9000; done. Capture with Wireshark filter: udp.port == 9000.
3. Compare overhead: Count bytes per message. A "Hello UDP" message (9 bytes, or 10 with the newline that echo appends) over UDP: 8B UDP + 20B IP + 14B Ethernet = 42B overhead + 9B data = 51B total. The same message over TCP would need: 3-way handshake (3 packets × ~60B each = ~180B) + data segment + FIN sequence (~240B). For a single short message, UDP is vastly more efficient.
4. Message boundary test: In your server, call recvfrom() once. Send three messages rapidly from the client. Observe that recvfrom() returns exactly one message per call, because each sendto() is a distinct datagram. Compare: with TCP read(), you'd need to implement your own message framing (length prefix, newline delimiter, etc.).
5. Packet loss simulation: Add simulated packet loss with netem: sudo tc qdisc add dev lo root netem loss 30%. Run your test again. Some messages are lost, and neither client nor server notices or retransmits. This is UDP's behaviour by design. Remove with: sudo tc qdisc del dev lo root.
6. Bonus (QUIC comparison): Install the quiche or ngtcp2 library, or simply observe that QUIC (HTTP/3) runs on UDP 443. Use: curl --http3 https://cloudflare.com (if your curl supports HTTP/3). Capture with Wireshark, filter udp.port == 443. You'll see QUIC's own reliability and multiplexing running on top of raw UDP datagrams.
LAB 2

ICMP Deep Analysis — Ping, Traceroute, and PMTUD

Objective: Capture and fully decode ICMP messages. Understand every field in Echo Request, Time Exceeded, and Destination Unreachable. Test PMTUD with DF=1 pings. Detect ICMP tunnelling.

1. Echo Request/Reply decode: Start Wireshark with filter icmp. Run ping -c 3 8.8.8.8. For each Echo Request packet: find Type (8), Code (0), Identifier, Sequence Number, payload bytes. For each Echo Reply: verify the same Identifier and Sequence. Measure RTT from Wireshark timestamps vs ping output; they should match closely.
2. Traceroute decode: Run sudo traceroute -n 8.8.8.8 while capturing with filter icmp or udp.port >= 33434. For each TTL-Exceeded reply: expand the ICMP payload and find the embedded original IP header and first 8 bytes of the original UDP datagram. This is how the sender knows which probe triggered the error.
3. PMTUD test: Try ping -M do -s 1473 8.8.8.8 (DF=1, payload 1473 bytes = 1501B IP packet, exceeds 1500 MTU). If your own interface MTU is 1500, Linux rejects the packet locally ("message too long") and nothing reaches the wire. To see a real ICMP Type 3 Code 4 "Frag needed" reply, send a packet that fits your interface MTU but exceeds a smaller MTU further along the path (for example a VPN or PPPoE link, or a lab router with a lowered MTU). Capture it. In the ICMP message, find the "Next-Hop MTU" field: it tells you the MTU of the problematic link.
4. Port Unreachable (UDP probe): Send a UDP packet to a closed port on a host you control, for example nc -u 127.0.0.1 9999 while capturing on lo, then type anything and press Enter. (Public servers such as 8.8.8.8 usually drop such probes silently.) Capture the ICMP Type 3 Code 3 response. Expand it: find the original UDP header embedded in the ICMP payload and verify src_port, dst_port=9999.
5. ICMP tunnelling demo: Use hping3 to put arbitrary data in ICMP packets: sudo hping3 -1 --icmptype 8 --data 64 -e "SECRET DATA" 127.0.0.1. Capture with Wireshark. In the hex dump of the ICMP payload, find your "SECRET DATA" string. This is exactly how ICMP tunnelling tools (like icmptunnel or ptunnel) exfiltrate data, which is why the NGFW must inspect the ICMP data field.
6. Scapy ICMP crafting: from scapy.all import *; send(IP(dst="127.0.0.1")/ICMP(type=5, code=1, gw="10.0.0.254")/IP(dst="8.8.8.8")/UDP()). This crafts an ICMP Redirect message. Observe what the Linux kernel does with it (it may update the routing cache). This is the ICMP Redirect attack vector; your NGFW should block Type 5 from external sources.

M06 MASTERY CHECKLIST

When complete: Move to M07 - DNS. DNS is one of the most important protocols for NGFW — DNS-based filtering, sinkholing, and exfiltration detection are major NGFW features. DNS runs over UDP (primarily) but uses TCP for large responses, and its query/response format is a common DPI target.
