THE INTERNET'S PHONEBOOK — AND WHY IT'S CRITICAL FOR NGFW
What DNS Does
DNS (Domain Name System) translates human-readable domain names (google.com, api.jio.com) into IP addresses that computers can route to. It is the first step in almost every network connection: before a browser can fetch a web page, a mail client can send email, or your firewall can inspect the resulting traffic, a DNS lookup runs.
DNS is foundational to NGFW for four reasons:
- URL/domain filtering: blocking access to malware-c2.com by refusing to resolve it (DNS sinkholing) is one of the cheapest and most effective ways to block large numbers of threats
- Threat intelligence correlation: DNS queries reveal which hosts are communicating with which domains, enabling detection of command-and-control (C2) beaconing, data exfiltration, and lateral movement
- DPI target: classic DNS is plaintext over UDP/TCP, so your DPI engine must parse every DNS query and response to classify, filter, and log traffic
- Evasion vector: DNS tunnelling (encoding data in DNS queries) is a major exfiltration channel; DoH (DNS-over-HTTPS) bypasses DNS inspection entirely unless your NGFW performs TLS inspection
Before smartphones, if you wanted to call a business, you called directory enquiries and asked "What's the number for Jio Platforms in Navi Mumbai?". They looked it up and told you. You then called the number. DNS works identically: your computer asks "What's the IP for google.com?" and a DNS server looks it up and replies. The key insight is that your computer caches the answer (like writing the number down) so it doesn't have to ask again for a while — this is DNS caching with TTL. And just like a directory can have incorrect entries, or someone can give you a wrong number to trick you, DNS can be poisoned — which is what DNSSEC protects against.
DNS Hierarchy — Zones, Authoritative Servers, Resolvers
DNS is a globally distributed, hierarchical, delegated database. No single server knows all DNS records: the information is distributed across millions of servers worldwide, each authoritative for a specific portion (zone) of the namespace.
    /* DNS hierarchy */
    .                            # Root zone: 13 root server identities (a.root-servers.net through m.)
    ├── com.                     # TLD (Top-Level Domain), managed by Verisign
    │   ├── google.com.          # Second-level domain: Google's zone
    │   │   ├── www.google.com.  # subdomain record
    │   │   └── mail.google.com. # subdomain record
    │   └── amazon.com.          # Different zone, different servers
    ├── in.                      # Country-code TLD (India)
    │   └── jio.in.              # Jio's zone under .in
    └── io.                      # Another TLD

    /* Three types of DNS servers */
    1. Recursive Resolver (Recursor)
       - Your network's DNS server (DHCP-assigned: 8.8.8.8, 1.1.1.1, or your ISP's)
       - Does the work: queries root → TLD → authoritative on behalf of clients
       - Caches results for the TTL duration
    2. Authoritative Name Server
       - Owns the actual DNS records for a zone
       - Configured by the domain owner (Google manages google.com's NS)
       - Returns definitive answers: not forwarding, not caching
    3. Root Name Servers
       - 13 identities (a through m), each replicated globally via anycast
       - Know only which TLD servers to ask; do NOT know final answers
       - Queried only when no cached TLD pointer exists (rare after warmup)
DNS RESOLUTION — FROM QUERY TO IP ADDRESS, STEP BY STEP
Full Iterative Resolution Walkthrough
When your browser opens www.google.com, here is what happens, assuming a cold cache (nothing cached):
1. Browser cache: if www.google.com was recently visited and the TTL hasn't expired, use the cached IP. Done, no query sent.
2. OS cache: the operating system checks /etc/hosts first (static overrides).
3. Recursive resolver: the query goes to the configured recursive resolver, which very likely has the answer cached, since www.google.com is queried billions of times per day. If cached: return immediately. If not cached, proceed with iteration.
4. Root: the resolver asks a root server "who handles .com?" Root replies with NS records for the .com TLD servers (a.gtld-servers.net etc.) and their glue records (A records for those NS servers). Each of the 13 root server addresses is anycast, so the query lands on the nearest instance of whichever one the resolver picks.
5. TLD: the resolver asks a .com TLD server "who handles google.com?" TLD server replies with the NS records for google.com's authoritative servers (ns1.google.com, ns2.google.com etc.) plus their glue A records.
6. Authoritative: the resolver asks "what is the A record for www.google.com?" Authoritative server has the definitive answer. Returns: A record with IP address(es) and the TTL.
7. Return and cache: the resolver caches the answer for its TTL and returns it to the client.

💡 Recursive vs Iterative: The client → resolver step is recursive (client asks, resolver does all the work and returns a final answer). The resolver → root → TLD → authoritative steps are iterative (each server returns a referral, resolver must follow up). The client never talks to root or authoritative servers directly in the normal flow.
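The referral-following loop in the walkthrough can be sketched with a toy in-memory namespace. This is only an illustration, not a real resolver: the `ZONES` table, the server labels, and the documentation address `192.0.2.10` are made-up stand-ins, and no network traffic is sent.

```python
# Toy model of iterative resolution. All server names and records are hypothetical.
ZONES = {
    "root": {"referrals": {"com.": "tld-com"}},
    "tld-com": {"referrals": {"example.com.": "auth-example"}},
    "auth-example": {"answers": {"www.example.com.": ("192.0.2.10", 300)}},
}

def resolve_iteratively(qname, start="root", max_hops=8):
    """Follow referrals from the root until a server returns an answer."""
    server, path = start, []
    for _ in range(max_hops):
        path.append(server)
        zone = ZONES[server]
        if qname in zone.get("answers", {}):
            ip, ttl = zone["answers"][qname]
            return ip, ttl, path
        # Choose the referral whose zone name is the longest suffix of qname.
        matches = [z for z in zone.get("referrals", {})
                   if qname == z or qname.endswith("." + z)]
        if not matches:
            raise LookupError(f"{server} has no answer or referral for {qname}")
        server = zone["referrals"][max(matches, key=len)]
    raise LookupError("too many referrals")

ip, ttl, path = resolve_iteratively("www.example.com.")
print(ip, ttl, " -> ".join(path))
```

Each hop either answers or hands back a narrower referral, which is the same shape as the root → TLD → authoritative sequence; a real resolver would also cache every referral it learns along the way.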
DNS Resolution in Code — getaddrinfo()
    #include <stdio.h>
    #include <stdint.h>
    #include <netdb.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <arpa/nameser.h>
    #include <resolv.h>

    /* High-level: getaddrinfo() handles DNS + IPv4/IPv6 */
    int resolve_host(void)
    {
        struct addrinfo hints = {0}, *res;
        hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
        hints.ai_socktype = SOCK_STREAM;  /* TCP */

        int rc = getaddrinfo("www.google.com", "443", &hints, &res);
        if (rc != 0) {
            fprintf(stderr, "DNS error: %s\n", gai_strerror(rc));
            return -1;
        }

        /* Iterate through returned addresses (may have both A and AAAA) */
        for (struct addrinfo *p = res; p; p = p->ai_next) {
            char ipstr[INET6_ADDRSTRLEN];
            void *addr;
            if (p->ai_family == AF_INET) {
                struct sockaddr_in *s = (struct sockaddr_in *)p->ai_addr;
                addr = &s->sin_addr;
            } else {
                struct sockaddr_in6 *s = (struct sockaddr_in6 *)p->ai_addr;
                addr = &s->sin6_addr;
            }
            inet_ntop(p->ai_family, addr, ipstr, sizeof(ipstr));
            printf("Resolved: %s\n", ipstr);
        }
        freeaddrinfo(res);
        return 0;
    }

    /* Low-level: res_query() for custom DNS queries
       (older glibc versions need -lresolv at link time) */
    int query_mx(void)
    {
        uint8_t answer[512];
        int n = res_query("google.com", ns_c_in, ns_t_mx, answer, sizeof(answer));
        if (n < 0)
            return -1;  /* lookup failed; h_errno holds the reason */
        /* Parse answer manually using ns_initparse / ns_parserr */
        return n;
    }
DNS PACKET FORMAT — HEADER, QUESTION, AND RESOURCE RECORDS
DNS Message Structure
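Every DNS message, query or response, uses the same format: a fixed 12-byte header followed by the question, answer, authority, and additional sections. The same structure is used over UDP (most queries) and TCP (large responses, zone transfers); over TCP each message is additionally prefixed with a 2-byte length field.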
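To make the layout concrete, here is a minimal sketch that packs a query by hand. The function name `build_query` and the default transaction ID `0x1234` are illustrative choices, not part of any library; the field order follows the 12-byte header described above (ID, flags, then the four section counts).

```python
import struct

def build_query(qname, qtype=1, txid=0x1234, recursion_desired=True):
    """Return a DNS query message: 12-byte header plus one question."""
    flags = 0x0100 if recursion_desired else 0x0000   # only the RD bit is set
    # ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0 (all big-endian)
    header = struct.pack("!HHHHHH", txid, flags, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    )
    # QNAME ends with a zero byte, then QTYPE and QCLASS (1 = IN)
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)
    return header + question

msg = build_query("www.google.com")
print(len(msg), msg.hex())
```

The same bytes could be sent with `socket.sendto()` to a resolver on UDP port 53; a response reuses the header layout with the QR bit set and non-zero answer counts.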
Header Flags — The Control Word
| Bit(s) | Name | Values | Meaning |
|---|---|---|---|
| Bit 15 | QR | 0=Query, 1=Response | Is this a question or an answer? |
| Bits 14–11 | Opcode | 0=QUERY, 1=IQUERY, 2=STATUS, 4=NOTIFY, 5=UPDATE | Type of DNS operation |
| Bit 10 | AA | 0/1 | Authoritative Answer — set if the responding server owns this zone |
| Bit 9 | TC | 0/1 | TrunCated — response was too large for UDP, retry with TCP |
| Bit 8 | RD | 0/1 | Recursion Desired — client requests recursive resolution |
| Bit 7 | RA | 0/1 | Recursion Available — server supports recursion |
| Bit 6 | Z | 0 | Reserved — must be 0 |
| Bit 5 | AD | 0/1 | Authentic Data — DNSSEC: all data is validated |
| Bit 4 | CD | 0/1 | Checking Disabled — DNSSEC: don't validate, I'll check myself |
| Bits 3–0 | RCODE | 0=NOERROR, 1=FORMERR, 2=SERVFAIL, 3=NXDOMAIN, 4=NOTIMP, 5=REFUSED | Response code — 3 (NXDOMAIN) = domain doesn't exist |
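The bit positions in the table translate directly into shifts and masks. The sketch below is one way to split the flags word; `decode_flags` and the `RCODES` lookup are names chosen for this example.

```python
def decode_flags(flags):
    """Split the 16-bit DNS flags word into its named fields."""
    return {
        "qr":     (flags >> 15) & 0x1,
        "opcode": (flags >> 11) & 0xF,
        "aa":     (flags >> 10) & 0x1,
        "tc":     (flags >> 9) & 0x1,
        "rd":     (flags >> 8) & 0x1,
        "ra":     (flags >> 7) & 0x1,
        "z":      (flags >> 6) & 0x1,
        "ad":     (flags >> 5) & 0x1,
        "cd":     (flags >> 4) & 0x1,
        "rcode":  flags & 0xF,
    }

RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}

# 0x8183 is a response (QR=1) with RD=1, RA=1 and RCODE=3.
fields = decode_flags(0x8183)
print(fields, RCODES[fields["rcode"]])
```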
Label Format — How Domain Names Are Encoded
DNS doesn't send domain names as plain ASCII strings. It uses a length-prefixed label encoding where each label (component between dots) is preceded by its length byte, and the sequence ends with a zero byte:
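    /* Wire format of "www.google.com" in DNS */
    \x03 w w w          # length=3, then "www"
    \x06 g o o g l e    # length=6, then "google"
    \x03 c o m          # length=3, then "com"
    \x00                # zero-length label = root
    Total: 1+3 + 1+6 + 1+3 + 1 = 16 bytes

    /* DNS compression: avoid repeating names */
    /* A pointer (2 bytes whose top two bits are 11) points to a prior occurrence */
    \xc0 \x0c           # 0xC0 = 11000000 (pointer marker), 0x0c = offset 12 in message
    /* "The name at offset 12 in this message" */
    /* Greatly reduces packet size when multiple RRs share domain names */

    /* Parse domain name in C.
       out must hold at least 256 bytes. Returns the offset just past the
       name at its original position, or -1 on malformed input. */
    #include <stdint.h>
    #include <string.h>

    int parse_name(const uint8_t *msg, int msg_len, int offset, char *out) {
        int out_pos = 0;
        int end = -1;    /* offset after the first pointer, if one was followed */
        int jumps = 0;   /* guards against pointer loops */

        for (;;) {
            if (offset < 0 || offset >= msg_len) return -1;
            uint8_t len = msg[offset++];
            if (len == 0) break;                        /* root label: done */
            if ((len & 0xC0) == 0xC0) {                 /* compression pointer */
                if (offset >= msg_len || ++jumps > 16) return -1;
                if (end < 0) end = offset + 1;          /* where the caller resumes */
                offset = ((len & 0x3F) << 8) | msg[offset];
                continue;
            }
            if (len > 63 || offset + len > msg_len) return -1;
            if (out_pos + len + 2 > 256) return -1;     /* dot + label + NUL */
            if (out_pos > 0) out[out_pos++] = '.';
            memcpy(out + out_pos, msg + offset, len);
            out_pos += len;
            offset += len;
        }
        out[out_pos] = '\0';
        return end >= 0 ? end : offset;
    }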
UDP vs TCP for DNS
DNS uses both UDP and TCP on port 53, with specific rules governing when each is used:
- UDP (most queries): all normal queries and responses up to 512 bytes (the traditional limit) or up to the buffer size advertised with EDNS0. Preferred for speed: single round-trip, no connection setup
- EDNS0 (Extension Mechanisms for DNS): RFC 6891 lets the requester advertise a larger UDP payload size via an OPT pseudo-RR in the additional section. 4096 bytes was long the common value; many resolvers now advertise 1232 bytes to avoid IP fragmentation. Enables DNSSEC responses (which are large), DNS cookies, and other extensions
- TC flag = 1 → retry with TCP: if a response is larger than the advertised UDP buffer, the server sets TC=1 in the truncated response. The client must re-send the query over TCP to get the full answer
- TCP only: AXFR zone transfers always use TCP, and IXFR normally does. Responses too large for the UDP buffer are fetched over TCP, where the 2-byte length prefix caps a message at 65,535 bytes. DNS over TLS (DoT) always runs over TCP port 853
    # Observe DNS in action
    tcpdump -i eth0 -n 'port 53' -v            # capture all DNS, verbose
    tcpdump -i eth0 -n 'port 53 and tcp'       # only TCP DNS (large responses)

    # Query DNS manually
    dig www.google.com                         # A query, default resolver
    dig @8.8.8.8 www.google.com A              # specify resolver and type
    dig @8.8.8.8 google.com MX                 # MX record
    dig +tcp @8.8.8.8 google.com DNSKEY        # force TCP
    dig +dnssec @8.8.8.8 google.com            # request DNSSEC
    dig -x 8.8.8.8                             # reverse DNS (PTR lookup)
    nslookup -type=NS google.com               # NS records
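The UDP-to-TCP fallback comes down to two small operations: checking the TC bit in a UDP response, and adding the two-byte length prefix that RFC 1035 requires in front of every message sent over TCP. The helper names below are illustrative.

```python
import struct

def needs_tcp_retry(udp_response):
    """True if the TC bit (bit 9 of the flags word) is set."""
    if len(udp_response) < 12:
        raise ValueError("shorter than a DNS header")
    (flags,) = struct.unpack("!H", udp_response[2:4])
    return bool(flags & 0x0200)

def frame_for_tcp(message):
    """Prefix a DNS message with its 2-byte big-endian length for TCP."""
    if len(message) > 0xFFFF:
        raise ValueError("DNS message exceeds 65535 bytes")
    return struct.pack("!H", len(message)) + message

# A header-only response with QR=1, TC=1, RD=1, RA=1 (flags 0x8380).
truncated = struct.pack("!HHHHHH", 0x1234, 0x8380, 1, 0, 0, 0)
print(needs_tcp_retry(truncated), frame_for_tcp(truncated)[:2].hex())
```

A DPI engine reading DNS over TCP has to do the reverse: read two bytes, then read exactly that many bytes before parsing, since TCP segments do not align with message boundaries.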
DNS RECORD TYPES — THE COMPLETE REFERENCE
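- A: maps a name to an IPv4 address. www.google.com → 142.250.x.x. RDATA = 4 bytes (IPv4 address).
- AAAA: maps a name to an IPv6 address. www.google.com → an address in the 2a00:1450::/32 prefix. RDATA = 16 bytes (IPv6 address). "Quad-A" record.
- CNAME: alias to another name. www.jio.com → jio.com. RDATA = another domain name. Resolver must follow the chain until it hits an A/AAAA.
- NS: names the authoritative servers for a zone. google.com NS ns1.google.com. Essential for the delegation chain.
- MX: mail exchanger with a preference value. google.com MX 10 smtp.google.com.
- PTR: reverse lookup (IP → name) in the in-addr.arpa zone, with the octets written in reverse order. 93.184.216.34.in-addr.arpa → ec2-34-216-184-93.compute-1.amazonaws.com.
- SRV: service location (priority, weight, port, target). _sip._tcp.example.com SRV 10 5 5060 sip.example.com. Used by SIP, XMPP, Minecraft, Kubernetes.

Email DNS Records — SPF, DKIM, DMARC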
Email authentication (SPF, DKIM, DMARC) publishes its policies and public keys as DNS TXT records. These are critical for NGFW email inspection and anti-phishing detection:
    /* SPF: Sender Policy Framework (RFC 7208) */
    /* TXT record listing authorised mail servers for a domain */
    google.com.  TXT  "v=spf1 include:_spf.google.com ~all"
    # ~all = softfail (mark but deliver), -all = fail (reject), +all = pass all
    # NGFW checks: does sending server's IP match SPF? If not → suspicious

    /* DKIM: DomainKeys Identified Mail (RFC 6376) */
    /* TXT record holding public key for email signature verification */
    google._domainkey.google.com.  TXT  "v=DKIM1; k=rsa; p=MIIBIjANBgkqh..."
    # Sending server signs email header/body with private key
    # Receiver verifies signature using public key from DNS
    # NGFW can verify DKIM signatures on inbound email

    /* DMARC: Domain-based Message Authentication, Reporting and Conformance (RFC 7489) */
    _dmarc.google.com.  TXT  "v=DMARC1; p=reject; rua=mailto:dmarc@google.com"
    # p=none/quarantine/reject: what to do with SPF/DKIM failures
    # rua = aggregate report destination
    # NGFW enforces DMARC policy for inbound email from external domains
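Once the TXT strings have been fetched, pulling out the policy is plain string handling. The sketch below covers only the pieces shown in the records above (DMARC/DKIM tag lists and the trailing SPF `all` mechanism); the function names are invented for this example, and a production parser would follow the full grammars in the RFCs.

```python
def parse_tag_list(record):
    """Parse a 'tag=value; tag=value' TXT payload (DMARC, DKIM) into a dict."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip()] = value.strip()
    return tags

def spf_default_action(record):
    """Return the result name for the 'all' mechanism of an SPF record."""
    qualifiers = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}
    for term in record.split()[1:]:              # skip the leading v=spf1
        if term.lstrip("+-~?") == "all":
            return qualifiers.get(term[0], "pass")   # bare 'all' means +all
    return None

dmarc = parse_tag_list("v=DMARC1; p=reject; rua=mailto:dmarc@google.com")
print(dmarc["p"], dmarc["rua"])
print(spf_default_action("v=spf1 include:_spf.google.com ~all"))
```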
DNS CACHING, TTL, AND NEGATIVE CACHING
How DNS Caching Works
DNS caching is what makes the internet fast. Without caching, every DNS query would traverse the full resolution chain (root → TLD → authoritative) for every single connection. Caching stores resolved answers for their TTL (Time To Live) duration, set by the domain owner in the authoritative DNS records.
    /* TTL field in DNS Resource Records */
    www.google.com.  300  IN  A  142.250.x.x
    #                ↑ TTL in seconds: cached for 300s (5 minutes)

    /* When a recursive resolver returns a cached answer */
    Original TTL: 300 seconds
    Query at T=0:   resolver caches, returns TTL=300
    Query at T=60:  resolver returns from cache, TTL=240 (remaining)
    Query at T=300: cache expired, resolver re-queries authoritative

    /* TTL strategy tradeoffs */
    Low TTL (60–300s):      fast failover on IP change, but more DNS queries
    High TTL (3600–86400s): fewer queries, but changes take longer to propagate
    TTL=0:                  no caching, every query goes to authoritative (rare, special cases)

    /* Checking cached DNS on Linux */
    resolvectl query www.google.com   # resolve through systemd-resolved
    resolvectl statistics             # cache hit/miss stats (older releases: systemd-resolve --statistics)

    /* Flush DNS cache */
    sudo resolvectl flush-caches      # systemd-resolved
    sudo systemctl restart nscd       # nscd
    ipconfig /flushdns                # Windows
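The countdown behaviour in the T=0 / T=60 / T=300 example can be reproduced with a very small cache. This is a sketch under simplifying assumptions: `TtlCache` is a made-up class, it stores one value per (name, type), and it takes an injectable clock so the timeline can be simulated without sleeping.

```python
import time

class TtlCache:
    """Tiny resolver-style cache: entries expire after their TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}   # (name, rtype) -> (value, expires_at)

    def put(self, name, rtype, value, ttl):
        self._entries[(name, rtype)] = (value, self._clock() + ttl)

    def get(self, name, rtype):
        """Return (value, remaining_ttl), or None if missing or expired."""
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None
        value, expires_at = entry
        remaining = int(expires_at - self._clock())
        if remaining <= 0:
            del self._entries[(name, rtype)]
            return None
        return value, remaining

now = [0.0]                                   # simulated clock, in seconds
cache = TtlCache(clock=lambda: now[0])
cache.put("www.example.com.", "A", "192.0.2.10", 300)
now[0] = 60.0
print(cache.get("www.example.com.", "A"))     # remaining TTL has counted down
now[0] = 300.0
print(cache.get("www.example.com.", "A"))     # entry has expired
```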
Negative Caching — Caching NXDOMAIN
DNS also caches negative responses: when a domain doesn't exist (NXDOMAIN) or has no records of the requested type (NOERROR with empty answer). This prevents repeated queries for non-existent domains.
    /* Negative caching (RFC 2308) */
    Query:    www.doesnotexist.example.com A
    Response: RCODE=3 (NXDOMAIN)
    Negative TTL: the lower of the SOA record's own TTL and its MINIMUM field (often 300–3600 seconds)
    Resolver caches: "www.doesnotexist.example.com A → NXDOMAIN" for TTL seconds

    /* Why this matters for NGFW */
    # Malware C2 domains often use DGA (Domain Generation Algorithms)
    # Generates thousands of random domains per day
    # Only the attacker's active C2 domain resolves; the rest return NXDOMAIN
    # Unusually high NXDOMAIN rate from a host = potential DGA/malware indicator

    /* NGFW DNS analytics: track per-client NXDOMAIN rate */
    # Normal:      0–5% NXDOMAIN rate
    # Suspicious:  >20% NXDOMAIN from single host in 1 minute
    # DGA malware: hundreds of NXDOMAIN per minute, all random names
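Tracking the per-client NXDOMAIN rate needs only two counters per host. The sketch below uses the 20% threshold quoted above; the `min_queries` floor is an added assumption so that a host with a handful of lookups is not flagged, and the class name is illustrative. A real deployment would also window the counters by time.

```python
from collections import defaultdict

class NxdomainTracker:
    """Per-client NXDOMAIN ratio; thresholds are illustrative defaults."""

    def __init__(self, threshold=0.20, min_queries=20):
        self.threshold = threshold
        self.min_queries = min_queries
        self.total = defaultdict(int)
        self.nxdomain = defaultdict(int)

    def record(self, client_ip, rcode):
        self.total[client_ip] += 1
        if rcode == 3:                       # RCODE 3 = NXDOMAIN
            self.nxdomain[client_ip] += 1

    def suspicious_clients(self):
        return sorted(
            ip for ip, n in self.total.items()
            if n >= self.min_queries and self.nxdomain[ip] / n > self.threshold
        )

tracker = NxdomainTracker()
for i in range(100):
    tracker.record("10.0.0.5", 3 if i % 2 else 0)        # 50% NXDOMAIN
    tracker.record("10.0.0.6", 3 if i % 50 == 0 else 0)  # 2% NXDOMAIN
print(tracker.suspicious_clients())
```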
💡 DNS prefetching: Some recursive resolvers refresh popular records shortly before their TTL expires, re-querying the authoritative server so the entry never falls out of cache (Unbound, for example, has a prefetch option). Browsers do something related on the client side, resolving the hostnames of links on a page before the user clicks. The result is that popular domains are almost always answered from cache despite low TTLs.
DNSSEC — CRYPTOGRAPHIC AUTHENTICATION OF DNS RESPONSES
Why DNSSEC Exists — The Cache Poisoning Problem
Classic DNS has no authentication: a resolver has no way to verify that a response is genuine and not forged. The Kaminsky Attack (2008) demonstrated that an attacker could poison a recursive resolver's cache with forged responses in minutes, redirecting millions of users to attacker-controlled servers.
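The attack works because DNS uses UDP with a 16-bit Transaction ID: only 65,536 possible values. An attacker triggers a lookup and sends a burst of forged responses with guessed Transaction IDs, hoping one matches before the legitimate response arrives. Classic poisoning allowed roughly one attempt per TTL, because a cached name is not re-queried. The Kaminsky trick removes that limit: the attacker queries random non-existent subdomains (each a guaranteed cache miss, so each a fresh race) and puts forged NS or glue records for the whole domain in the response, so a single win hijacks the entire zone, not just one hostname.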
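A back-of-the-envelope calculation shows why unlimited retries matter. The sketch assumes each race is independent, that the attacker lands `forged_per_race` distinct guesses before the real answer arrives, and (for the second figure) that source port randomisation adds about 16 more bits the attacker must guess; these numbers are illustrative, not measurements.

```python
def spoof_success_probability(forged_per_race, races, id_bits=16, port_bits=0):
    """Chance that at least one forged response matches across all races."""
    space = 2 ** (id_bits + port_bits)           # values the attacker must guess
    per_race = min(1.0, forged_per_race / space)
    return 1.0 - (1.0 - per_race) ** races

# 100 forged packets per race, 1000 races (one per random subdomain).
fixed_port = spoof_success_probability(100, 1000)
random_port = spoof_success_probability(100, 1000, port_bits=16)
print(f"fixed source port:  {fixed_port:.3f}")
print(f"random source port: {random_port:.6f}")
```

With only the Transaction ID to guess, a thousand races already give the attacker better-than-even odds under these assumptions; adding unpredictable source ports shrinks the same effort to a negligible chance.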
DNSSEC (DNS Security Extensions, RFC 4033–4035) adds digital signatures to DNS records. The resolver can cryptographically verify that a response came from the legitimate authoritative server and hasn't been tampered with.
DNSSEC Chain of Trust
    /* DNSSEC chain of trust: from root down to leaf record */
    /* google.com is used as a familiar name; the real zone has not been DNSSEC-signed,
       so the dig checks below use cloudflare.com, which is */
    Root Zone (.)
      DNSKEY (KSK) → the root KSK is the pre-configured "trust anchor"; it signs the root DNSKEY set
      DNSKEY (ZSK) → signs all other records in root zone
      DS record for .com → hash of .com's KSK, signed by root ZSK

    .com TLD Zone
      DNSKEY (KSK) → matches hash in root's DS record
      DNSKEY (ZSK) → signs all records in .com zone
      DS record for google.com → hash of google.com's KSK, signed by .com ZSK

    google.com Zone
      DNSKEY (KSK + ZSK) → KSK matches hash in .com's DS record
      RRSIG (www.google.com A) → digital signature over A record, signed by ZSK
      A record: www.google.com → 142.250.x.x

    /* Resolver validation */
    1. Resolver has root trust anchor pre-configured (IANA root key)
    2. Verifies .com's KSK against the DS record for .com in the root zone
    3. Verifies google.com's KSK against the DS record for google.com in the .com zone
    4. Verifies www.google.com A record against its RRSIG using google.com's ZSK
    5. If all signatures valid → AD flag set in response (Authenticated Data)
    6. If any signature fails → SERVFAIL returned to client (not the forged answer)

    /* Check DNSSEC validation */
    dig +dnssec @8.8.8.8 cloudflare.com A
    # Look for "ad" flag in flags section: means DNSSEC validated
    dig +dnssec @8.8.8.8 cloudflare.com DNSKEY
    # Shows KSK and ZSK public keys for the zone
NSEC and NSEC3 — Authenticated Denial of Existence
DNSSEC must also authenticate that a name does NOT exist (NXDOMAIN). Without this, an attacker could forge unsigned "no such name" answers and make signed names appear to vanish. NSEC records provide signed proof that no names exist between two adjacent names in the zone, but that chain of names lets anyone walk and enumerate the zone. NSEC3 makes enumeration harder by chaining hashed names instead.
ENCRYPTED DNS — DoH, DoT, AND DoQ
Why Encrypted DNS — The Privacy Problem with Plain DNS
Plain DNS (UDP/TCP port 53) is completely unencrypted. Every DNS query your device makes is visible to:
- Your ISP (can log, sell, censor, or inject responses)
- Any network observer on the path (coffee shop WiFi, corporate network monitoring)
- Your recursive resolver (if not 8.8.8.8/1.1.1.1, likely your ISP)
- Any on-path attacker (can forge responses directly, with no need for Kaminsky-style guessing)
Encrypted DNS protocols hide the query content from all on-path observers except the recursive resolver you've chosen to trust.
Three Encrypted DNS Protocols Compared
| Protocol | Port | Transport | Introduced | Privacy | NGFW Challenge |
|---|---|---|---|---|---|
| DoT — DNS over TLS (RFC 7858) | TCP 853 | TLS 1.2/1.3 over TCP | 2016 | Hides query content, server can verify client cert (mTLS) | Distinct port 853 — easy to block or intercept at NGFW. TLS inspection needed to see queries. |
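| DoH — DNS over HTTPS (RFC 8484) | TCP 443 | HTTPS (TLS+HTTP/2) | 2018 | Hides query in HTTPS traffic, so it looks like web browsing | Same port as HTTPS: cannot block by port without blocking all HTTPS. Requires TLS inspection to detect. Browsers ship their own DoH clients: Firefox enables DoH by default in some regions, and Chrome upgrades to DoH when the configured resolver supports it, so queries can bypass port-53 inspection. |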
| DoQ — DNS over QUIC (RFC 9250) | UDP 853 | QUIC over UDP | 2022 | QUIC encryption hides DNS, lower latency than DoT | Newest — detection requires QUIC DPI. Blocked by blocking UDP 853. |
NGFW Implications of DoH — The Inspection Bypass Problem
    /* The DoH bypass problem */
    Traditional NGFW:
      Client → DNS query UDP 53 → NGFW intercepts/logs → Resolver
      NGFW sees: "who is resolving malware-c2.com?" → BLOCK + ALERT

    With DoH (Firefox/Chrome built-in):
      Client → HTTPS to 1.1.1.1:443 → NGFW sees encrypted HTTPS → Resolver
      NGFW sees: "HTTPS traffic to 1.1.1.1", cannot inspect query!
      malware-c2.com resolves successfully, client connects

    /* NGFW strategies to regain DNS visibility */
    Strategy 1: Block known DoH resolvers by IP
      Block 1.1.1.1 (Cloudflare), 8.8.8.8 (Google), 9.9.9.9 (Quad9) to port 443
      Force clients to use internal resolver (DNS policy in DHCP)
      Limitation: new DoH resolvers added constantly, list grows

    Strategy 2: TLS inspection (SSL inspection)
      NGFW acts as MITM for all HTTPS connections
      Decrypts → inspects DNS-over-HTTPS → re-encrypts
      Limitation: requires deploying custom CA cert to all clients
      Limitation: many apps use certificate pinning (defeats MITM)

    Strategy 3: Split-horizon DNS
      Internal DNS resolver configured to intercept all DNS queries
      Forward to DoH upstream, inspect responses before returning
      Clients only use internal resolver (enforced by firewall rule)

    Strategy 4: Application-layer control
      Group Policy (Windows) / MDM (mobile) to disable browser DoH
      Chrome: enterprise policy DnsOverHttpsMode=off
      Firefox: network.trr.mode=5 (off by explicit choice; 0 is the default-off state)

    # Detect DoH in traffic: Wireshark display filter (needs decrypted TLS)
    # HTTP/2 request whose path starts with /dns-query = DoH (GET adds ?dns=...)
    http2.headers.path contains "/dns-query"
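Knowing what a DoH request looks like on the wire helps when writing detection rules for decrypted traffic. RFC 8484 defines a GET form in which the raw DNS message is base64url-encoded, with padding removed, in a `dns` query parameter. The sketch below builds and decodes such a URL; the resolver URL `https://dns.example/dns-query` is a placeholder, and the helper names are made up for this example.

```python
import base64
import struct

def doh_get_url(dns_message, resolver_url="https://dns.example/dns-query"):
    """Build an RFC 8484 GET URL: base64url without padding in ?dns=."""
    encoded = base64.urlsafe_b64encode(dns_message).rstrip(b"=").decode("ascii")
    return f"{resolver_url}?dns={encoded}"

def decode_doh_param(value):
    """Reverse the encoding, restoring the stripped '=' padding first."""
    padded = value + "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(padded)

# Query for www.example.com A with transaction ID 0 (RFC 8484 suggests 0
# so that HTTP caches see identical requests for identical questions).
query = (struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
         + b"\x03www\x07example\x03com\x00"
         + struct.pack("!HH", 1, 1))
url = doh_get_url(query)
print(url)
```

An NGFW doing TLS inspection can apply `decode_doh_param` to the `dns` parameter and then run the recovered bytes through the same DNS parser it uses for port 53 traffic.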
DNS ATTACKS — FROM CACHE POISONING TO DNS TUNNELLING
DNS Attack Taxonomy
| Attack | Mechanism | Impact | Defence |
|---|---|---|---|
| Cache Poisoning (Kaminsky) | Flood resolver with forged responses to win the Transaction ID race. Poison NS delegation for entire domain. | Redirect all users of poisoned resolver to attacker-controlled IPs for any domain | DNSSEC validation, source port randomisation, DNS cookies (RFC 7873) |
| DNS Amplification DDoS | Send ANY/DNSKEY queries to open resolvers with victim's IP spoofed. DNS replies (1000–4000 bytes) hit victim. Replies are typically tens of times larger than the queries that trigger them. | Massive inbound traffic overwhelms victim | BCP38 anti-spoofing, disable open recursive resolvers, Response Rate Limiting (RRL) |
| DNS Tunnelling | Encode data in DNS query names: aGVsbG8K.tunnel.attacker.com A?. Response carries encoded data. Full TCP session over DNS. | Data exfiltration bypassing HTTP/HTTPS filters, C2 communication | Deep inspect DNS payload: high entropy names, unusually long labels, high query frequency, non-existent base domains |
| DNS Hijacking | Attacker compromises router/resolver to return false answers. ISPs sometimes redirect NXDOMAIN to ad pages. | Traffic redirection, phishing, ad injection | DNSSEC, DoH to trusted resolver, monitor resolver answers for deviations |
| NXDOMAIN Attack | Flood resolver with queries for non-existent subdomains of a legitimate domain. Forces resolver to query authoritative for every miss. | Overwhelm authoritative server, degrade DNS performance for the targeted domain | NXDOMAIN rate limiting, negative caching, RPZ (Response Policy Zones) |
| DGA (Domain Generation Algorithm) | Malware generates hundreds of random domain names daily, queries them all. Only the attacker-registered one resolves — rest are NXDOMAIN. | C2 communication that's very hard to block (infinite domain supply) | ML-based DGA detection (high entropy names, consistent patterns), track NXDOMAIN rate per host |
| Subdomain Takeover | CNAME points to a cloud service (GitHub Pages, Heroku) that the owner has abandoned. Attacker claims the service, serving content on the legitimate subdomain. | Phishing, credential harvesting, cookie theft under legitimate domain | Audit all CNAME records, verify targets still active, certificate transparency monitoring |
DNS Tunnelling — Deep Dive
DNS tunnelling is one of the most common data exfiltration techniques because DNS is almost never blocked at firewalls: it is essential for all network connectivity. Tools like iodine, dnscat2, and dns2tcp implement full bidirectional TCP-over-DNS tunnels.
    /* How DNS tunnelling works */
    Attacker controls: tunnel.attacker.com NS ns.attacker.com
                       ns.attacker.com A → attacker's server

    Client wants to exfiltrate: "secret data"
    Encode "secret data" as base32 (padding stripped): "ONSWG4TFOQQGIYLUME"
    Query: ONSWG4TFOQQGIYLUME.tunnel.attacker.com A
    DNS recursive resolver → attacker's NS server
    Attacker's NS server: "resolves" the query (reads the encoded data)
    Response: 127.0.0.1 (or any IP; larger replies carry data in AAAA/TXT/CNAME)
    Client reads response: TXT record → encoded response data from attacker
    Bidirectional channel established!

    /* Detection signatures in DNS traffic */
    1. Label entropy:
       Normal:     www.google.com (low entropy, readable words)
       Tunnelling: xK2mNpQr8vBz.tunnel.c2 (high entropy, random-looking)
    2. Label length:
       Normal:     max 5-15 chars per label
       Tunnelling: 30-63 chars per label (63 is the DNS maximum)
    3. Query frequency:
       Normal:     1-10 DNS queries/minute to a domain
       Tunnelling: 100-1000 queries/minute to same base domain
    4. Query uniqueness:
       Normal:     mostly same hostnames repeated (cached)
       Tunnelling: every query to tunnel.c2.com has a UNIQUE subdomain
    5. Response size:
       Normal:     A record = 4 bytes, AAAA = 16 bytes
       Tunnelling: TXT record with 200+ bytes of encoded data

    /* NGFW detection rule (pseudo-code) */
    if (dns_label_entropy > 3.5 AND
        subdomain_length > 30 AND
        query_rate_per_domain > 50/min):
        ALERT "Possible DNS tunnelling from " + client_ip
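The pseudo-code rule can be turned into a small runnable check. This is a sketch under stated assumptions: the thresholds are the ones quoted above, entropy is computed over the subdomain characters with dots removed, the caller supplies the base domain and the measured query rate, and the function names are invented for this example.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Bits per character of the empirical character distribution."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def looks_like_tunnelling(qname, base_domain, queries_per_min,
                          entropy_min=3.5, length_min=30, rate_min=50):
    """Apply the three-condition rule to the part of qname left of base_domain."""
    name = qname.rstrip(".").lower()
    base = base_domain.rstrip(".").lower()
    if not name.endswith("." + base):
        return False                       # no subdomain part to inspect
    subdomain = name[: -(len(base) + 1)]
    return (shannon_entropy(subdomain.replace(".", "")) > entropy_min
            and len(subdomain) > length_min
            and queries_per_min > rate_min)

print(looks_like_tunnelling("www.google.com", "google.com", 5))
print(looks_like_tunnelling(
    "0123456789abcdef0123456789abcdef.tunnel.attacker.com",
    "tunnel.attacker.com", 120))
```

Requiring all three conditions together keeps false positives down: long CDN hostnames may trip the length check alone, but rarely combine it with a high query rate to one base domain.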
NGFW DNS FEATURES — FILTERING, SINKHOLING, AND THREAT INTELLIGENCE
DNS Sinkholing — The Most Effective NGFW DNS Feature
DNS sinkholing redirects DNS queries for known-malicious domains to a "sinkhole" IP: either a local server that logs the connection attempt, or 0.0.0.0 (unroutable, so the connection fails silently). This is one of the most cost-effective threat blocking techniques: one DNS record blocks an entire attack infrastructure, stopping malware C2, phishing, and malware distribution sites before any TCP connection is made.
    /* DNS sinkhole architecture */
    Normal:
      Client → DNS: "what is malware-c2.com?"
      Resolver → Authoritative: real answer → 185.x.x.x (C2 server)
      Client → TCP connection to 185.x.x.x → malware beacons home

    With sinkhole:
      Client → DNS: "what is malware-c2.com?"
      NGFW intercepts query (transparent DNS proxy on UDP 53)
      NGFW checks: malware-c2.com is in threat feed → BLOCK
      NGFW returns: NXDOMAIN (or sinkhole IP 10.0.0.254)
      Client: can't resolve domain → malware can't phone home
      NGFW logs: "host 10.0.0.5 queried known-malicious domain malware-c2.com"

    /* Implementation approaches */
    1. Transparent DNS proxy (most common)
       NGFW intercepts all UDP/TCP port 53 traffic
       Checks query against threat feed (bihash lookup)
       Modifies response or drops query
    2. RPZ (Response Policy Zones): BIND/Unbound feature (described in an Internet-Draft, not an RFC)
       Operator configures "fake" DNS zone with override records
       zone "rpz.local" { type master; ... }
       Any query matching RPZ zone gets overridden response
    3. DNS Firewall (inline)
       Full NGFW DNS proxy: receives queries, applies policy, forwards to upstream
       Can apply category filtering (block all "gambling", "adult content" domains)
       Can enforce SafeSearch DNS (redirect Google/YouTube to safe variants)

    /* Threat intelligence feeds for DNS */
    Malware domains:   abuse.ch URLhaus, Malware Domain List
    C2 infrastructure: Emerging Threats, Talos
    Phishing:          PhishTank, OpenPhish, APWG
    Botnet C2:         Bambenek Consulting, Feodo Tracker
    Ad/tracking:       Pi-hole blocklists, AdGuard DNS Filter
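The threat-feed check has one subtlety worth showing: a feed usually lists a base domain, but malware may query any subdomain of it, so the lookup should walk up the name. The sketch below assumes the feed is an in-memory set and reuses the 10.0.0.254 sinkhole address from the example above; the function names and the `BLOCKLIST` contents are illustrative.

```python
def parent_domains(qname):
    """Yield qname and each parent domain, most specific first."""
    labels = qname.rstrip(".").lower().split(".")
    for i in range(len(labels) - 1):         # stop before the bare TLD
        yield ".".join(labels[i:])

def sinkhole_decision(qname, blocklist, sinkhole_ip="10.0.0.254"):
    """Return ('sinkhole', ip, matched_domain) or ('forward', None, None)."""
    for candidate in parent_domains(qname):
        if candidate in blocklist:
            return "sinkhole", sinkhole_ip, candidate
    return "forward", None, None

BLOCKLIST = {"malware-c2.com"}               # stand-in for a real threat feed
print(sinkhole_decision("beacon.eu.malware-c2.com.", BLOCKLIST))
print(sinkhole_decision("www.google.com", BLOCKLIST))
```

Each query costs at most one set lookup per label, so the check stays cheap even with feeds containing millions of entries.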
DNS Analytics for Threat Detection
Logging all DNS queries produces a rich dataset for threat hunting. Key analytics to run:
| Metric | Normal Baseline | Anomaly Threshold | Likely Cause |
|---|---|---|---|
| NXDOMAIN rate per host | <5% | >20% sustained | DGA malware, DNS-based reconnaissance (subdomain brute-forcing) |
| Queries to single domain per min | 1–5 | >50/min | DNS tunnelling, beaconing |
| Unique subdomains per base domain | 1–20 known subdomains | >100 unique in 1hr | DNS tunnelling (each query encodes data) |
| Label entropy (Shannon) | 1.5–2.5 (readable words) | >3.5 (random chars) | DGA, DNS tunnelling |
| Long labels (>30 chars) | Rare (<1%) | >5% of queries | DNS tunnelling |
| New domains first seen | Most queries to known domains | Host querying many never-before-seen domains | Malware discovery phase, beaconing to rotating C2 |
| TXT record queries | Occasional (SPF checking) | Frequent TXT to unusual domains | DNS tunnelling (TXT carries response data) |
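As an example of turning one row of the table into code, the sketch below counts unique query names per base domain and flags any base domain above the 100-name threshold. It assumes the query log is a plain list of names covering one hour, and it derives the base domain naively from the last two labels; a real implementation would consult the Public Suffix List so that names under suffixes like co.uk group correctly.

```python
from collections import defaultdict

def base_domain(qname, labels=2):
    """Naive registrable domain: the last `labels` labels of the name."""
    parts = qname.rstrip(".").lower().split(".")
    return ".".join(parts[-labels:])

def unique_subdomain_counts(qnames):
    """Map each base domain to the number of distinct names queried under it."""
    seen = defaultdict(set)
    for q in qnames:
        seen[base_domain(q)].add(q.rstrip(".").lower())
    return {base: len(names) for base, names in seen.items()}

def flag_bases(qnames, threshold=100):
    counts = unique_subdomain_counts(qnames)
    return sorted(base for base, n in counts.items() if n > threshold)

# Hypothetical one-hour log: 150 unique tunnel names plus ordinary repeats.
log = [f"chunk{i:04d}.tunnel.attacker.com" for i in range(150)]
log += ["www.google.com", "mail.google.com"] * 50
print(flag_bases(log))
```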
Implementing a DNS Proxy in C — NGFW Core
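    /* Minimal DNS proxy skeleton: intercept, inspect, forward or block.
       bind_to_port(), is_malicious(), category_blocked(), get_client_policy(),
       send_refused(), log_blocked() and forward_to_upstream() are placeholders
       for the surrounding NGFW code; parse_name() is the label parser shown earlier. */
    #include <stdint.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    void send_nxdomain(int sock, const uint8_t *query, int qlen, uint16_t txid,
                       struct sockaddr_in *client, socklen_t clen);

    int dns_proxy_main(void) {
        int sock = socket(AF_INET, SOCK_DGRAM, 0);
        bind_to_port(sock, 53);

        while (1) {
            uint8_t buf[512];
            struct sockaddr_in client;
            socklen_t clen = sizeof(client);
            ssize_t n = recvfrom(sock, buf, sizeof(buf), 0,
                                 (struct sockaddr *)&client, &clen);
            if (n < 12) continue;              /* error or shorter than a DNS header */

            /* Parse DNS header byte by byte (big-endian, no unaligned casts) */
            uint16_t txid  = (uint16_t)((buf[0] << 8) | buf[1]);
            uint16_t flags = (uint16_t)((buf[2] << 8) | buf[3]);
            int is_query = !(flags >> 15);     /* QR bit = 0 → query */
            if (!is_query) continue;           /* ignore responses */

            /* Parse QNAME */
            char domain[256];
            if (parse_name(buf, (int)n, 12, domain) < 0)
                continue;                      /* question starts at offset 12 */

            /* Check threat feed (bihash lookup by domain) */
            if (is_malicious(domain)) {
                send_nxdomain(sock, buf, (int)n, txid, &client, clen);
                log_blocked(client.sin_addr, domain);
                continue;
            }

            /* Check domain category for content filtering */
            if (category_blocked(domain, get_client_policy(&client))) {
                send_refused(sock, buf, (int)n, txid, &client, clen);
                continue;
            }

            /* Forward to upstream resolver */
            forward_to_upstream(sock, buf, (int)n, &client, clen);
        }
    }

    /* Send NXDOMAIN response */
    void send_nxdomain(int sock, const uint8_t *query, int qlen, uint16_t txid,
                       struct sockaddr_in *client, socklen_t clen) {
        uint8_t resp[512];
        (void)txid;                            /* already present in the copied header */
        if (qlen < 12 || qlen > (int)sizeof(resp)) return;
        memcpy(resp, query, qlen);
        /* Set QR=1, RA=1, RCODE=3 (NXDOMAIN); keep the client's Opcode and RD.
           For a normal RD=1 query this yields flags 0x8183. */
        resp[2] = (uint8_t)(0x80 | (query[2] & 0x79));
        resp[3] = 0x83;
        sendto(sock, resp, qlen, 0, (struct sockaddr *)client, clen);
    }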
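For experimenting outside the C data path, the NXDOMAIN rewrite can be prototyped in a few lines of Python. This sketch differs from the skeleton above in one deliberate way: it echoes only the question section and zeroes the other counts, so any EDNS OPT record in the query is dropped. The function names are illustrative, and it assumes a single question, which is what real clients send.

```python
import struct

def question_end(message):
    """Return the offset just past the first question (QNAME, QTYPE, QCLASS)."""
    offset = 12
    while True:
        if offset >= len(message):
            raise ValueError("truncated QNAME")
        length = message[offset]
        if length == 0:
            offset += 1
            break
        if length & 0xC0:
            raise ValueError("unexpected compression in question")
        offset += 1 + length
    if offset + 4 > len(message):
        raise ValueError("truncated question")
    return offset + 4

def make_nxdomain(query):
    """Turn a DNS query into an NXDOMAIN response that echoes the question."""
    if len(query) < 12:
        raise ValueError("shorter than a DNS header")
    txid, flags = struct.unpack("!HH", query[:4])
    if flags & 0x8000:
        raise ValueError("not a query (QR bit set)")
    # QR=1, RA=1, RCODE=3; copy Opcode (0x7800) and RD (0x0100) from the query.
    resp_flags = 0x8000 | (flags & 0x7900) | 0x0080 | 0x0003
    header = struct.pack("!HHHHHH", txid, resp_flags, 1, 0, 0, 0)
    return header + query[12:question_end(query)]

query = (struct.pack("!HHHHHH", 0xBEEF, 0x0100, 1, 0, 0, 0)
         + b"\x0amalware-c2\x03com\x00" + struct.pack("!HH", 1, 1))
response = make_nxdomain(query)
print(response[:4].hex(), len(response))
```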
DNS Resolution Analysis with dig and Wireshark
Objective: Observe the full DNS resolution chain — query, response, caching, TTL. Decode DNS packets byte by byte. Compare authoritative vs cached responses.
1. Capture a query: start sudo tcpdump -i eth0 -w /tmp/dns.pcap 'port 53'. In another terminal, run: dig @8.8.8.8 www.google.com A +norecurse. Note the difference between +norecurse (ask the server, don't do recursion for me) and the default. Stop capture and open in Wireshark.
2. Observe caching: run dig @8.8.8.8 www.google.com A twice, 30 seconds apart. Compare the TTL in the answer section: it should be lower on the second query (TTL decreased). Run a third time immediately. If the same TTL appears, it was served from Google's resolver cache.
3. Walk the chain by hand: dig @a.root-servers.net google.com NS (ask root for .com NS), dig @a.gtld-servers.net google.com NS (ask .com TLD for google.com NS), dig @ns1.google.com www.google.com A (ask authoritative for A record). This is the full iterative resolution path.
4. Query other record types: dig google.com MX, dig google.com TXT, dig google.com NS, dig google.com SOA, dig -x 8.8.8.8 (reverse PTR). For the TXT record: find the SPF record. For the SOA: identify the primary NS, admin email, serial number, and minimum TTL.
5. Verify DNSSEC: dig +dnssec @8.8.8.8 cloudflare.com A. Look for the "ad" flag (Authenticated Data) in the response header: this means DNSSEC was validated. Also request the DNSKEY: dig +dnssec @8.8.8.8 cloudflare.com DNSKEY. Identify the KSK and ZSK (flags field: 257=KSK, 256=ZSK).

Build a DNS Resolver in Python
Objective: Write a Python script that constructs a raw DNS query packet, sends it over UDP, and parses the response — without using any DNS library. This forces you to understand the wire format completely.
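1. Build the query: encode the QNAME in label format, for example b'\x03www\x06google\x03com\x00'. Use struct.pack for numeric fields. Send via socket.sendto() to 8.8.8.8:53.
2. Parse the response: for an A record (TYPE=1), decode the 4-byte RDATA with socket.inet_ntoa(rdata). For CNAME (TYPE=5), RDATA is a domain name in label format, so parse it. Print results: "www.google.com A 142.250.x.x TTL=300".
3. Extend the script into a sinkhole server listening on port 5300 and test it with dig @127.0.0.1 -p 5300 google.com and dig @127.0.0.1 -p 5300 malware.com.

Detect DNS Tunnelling Patterns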
Objective: Install a DNS tunnelling tool, generate tunnelled traffic, capture it, and write detection logic based on the anomaly signatures.
1. Generate tunnelling-like traffic with Scapy: import os; from scapy.all import *; [send(IP(dst="8.8.8.8")/UDP(dport=53)/DNS(rd=1,qd=DNSQR(qname=os.urandom(20).hex()+".example.com"))) for _ in range(50)]. Capture with tcpdump.
2. Compute the Shannon entropy of each query name s: import math; entropy = -sum(p*math.log2(p) for c in set(s) if (p := s.count(c)/len(s)) > 0). Print entropy for each query name. Normal names should have entropy 1.5–2.5; tunnelling typically >3.5.

M07 MASTERY CHECKLIST
- Can explain DNS's 4 roles in NGFW: URL filtering, threat intel correlation, DPI target, evasion vector
- Know the three server types: Recursive Resolver, Authoritative Name Server, Root Name Server — and what each does
- Can walk through the 9-step DNS resolution process: browser cache → OS cache → recursive resolver → root → TLD → authoritative → return + cache
- Know the difference between recursive (client asks, resolver iterates) and iterative (each server gives referral) resolution
- Know the DNS message structure: Header (12B fixed) + Question + Answer RRs + Authority RRs + Additional RRs
- Know the 6 header flag fields: QR, Opcode, AA, TC, RD, RA and the 4-bit RCODE
- Know the key RCODE values: 0=NOERROR, 2=SERVFAIL, 3=NXDOMAIN, 5=REFUSED
- Understand label format encoding: each label preceded by its length byte, terminated by \x00
- Understand DNS compression: pointer bytes (0xC0 prefix) reference earlier name occurrences in the message
- Know when DNS uses UDP vs TCP: UDP for most queries, TCP when TC=1 (truncated) or for zone transfers
- Know EDNS0: lets a client advertise a UDP payload size above 512 bytes (commonly up to 4096) via the OPT pseudo-RR
- Know 12 DNS record types and their purpose: A, AAAA, CNAME, NS, MX, TXT, PTR, SOA, SRV, CAA, DNSKEY, RRSIG
- Know SPF, DKIM, DMARC — what each does and which DNS record type they use (all TXT)
- Understand TTL: set by zone owner, controls cache duration, low=fast failover, high=fewer queries
- Understand negative caching: NXDOMAIN is cached for a negative TTL derived from the zone's SOA record (RFC 2308); high NXDOMAIN rate = DGA indicator
- Know DNSSEC's purpose: cryptographic authentication against cache poisoning (Kaminsky attack)
- Understand DNSSEC chain of trust: Root DNSKEY → .com DS → google.com DNSKEY → RRSIG on A record
- Know RRSIG (signature), DS (delegation signer hash), NSEC/NSEC3 (authenticated denial) records
- Know three encrypted DNS protocols: DoT (TCP 853), DoH (TCP 443 / HTTPS), DoQ (UDP 853 / QUIC)
- Know why DoH is an NGFW challenge: same port as HTTPS, and browsers can bypass the system resolver with their built-in DoH clients
- Know 4 NGFW strategies to handle DoH: block resolver IPs, TLS inspection, split-horizon DNS, MDM policy
- Know 7 DNS attacks: cache poisoning, amplification DDoS, tunnelling, hijacking, NXDOMAIN flood, DGA, subdomain takeover
- Know DNS tunnelling detection signatures: high label entropy, long labels, many unique subdomains per domain, high TXT query rate
- Know DNS sinkholing: intercept query to malicious domain → return NXDOMAIN → block C2/phishing before TCP connection
- Know key DNS threat intelligence feeds: abuse.ch, Malware Domain List, Feodo Tracker, PhishTank
- Know DNS analytics anomalies to alert on: NXDOMAIN rate >20%, >50 queries/min to single domain, entropy >3.5
- Completed Lab 1: walked full resolution chain manually with dig, decoded packet headers, verified DNSSEC AD flag
- Completed Lab 2: built raw DNS resolver in Python from scratch; implemented DNS sinkhole server
- Completed Lab 3: generated DNS tunnelling traffic, wrote entropy-based detection script with pcap analysis
✅ When complete: Move to M08 - HTTP/1.1, HTTP/2, HTTP/3 and QUIC. HTTP carries the majority of internet traffic: it is the primary application protocol your NGFW must inspect, and most of it now travels inside TLS as HTTPS. Understanding HTTP deeply is essential for URL filtering, SSL inspection, and application identification.