BGP — THE ROUTING PROTOCOL OF THE INTERNET
What Makes BGP Different
BGP (Border Gateway Protocol, RFC 4271) is the routing protocol that makes the internet work. It is the only EGP (Exterior Gateway Protocol) in use today, connecting the ~75,000 Autonomous Systems (ASes) that make up the global internet. Unlike OSPF, which computes the lowest-cost path from link metrics, BGP is policy-driven: its primary goal is to express business routing policy, not to find the mathematically shortest path.
BGP's defining characteristics:
- Path-vector protocol: routes carry the full AS-PATH (the list of ASes the prefix has traversed). Loop prevention is achieved by rejecting any route whose path already contains your own AS number
- TCP-based: sessions run over TCP port 179. TCP's reliability and ordering mean BGP needs no retransmission mechanism of its own
- Rich attribute system: routes carry attributes (AS-PATH, NEXT-HOP, LOCAL-PREF, MED, COMMUNITY) that encode routing policy
- Scales to internet size: the global IPv4 BGP table holds roughly 950,000 prefixes (2024); inside a large AS, route reflectors and confederations keep the iBGP session count manageable
- Incremental updates: after the initial table exchange, BGP sends only changes (UPDATE messages) instead of periodically re-advertising the full table as RIP-style distance-vector protocols do
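The AS-PATH loop check from the list above can be sketched in a few lines of Python. This is a minimal illustration, not router code: the function names `accept_route` and `advertise_to_ebgp` are invented for the example, and the ASNs are private-range placeholders.

```python
def accept_route(as_path, local_asn):
    """Return True if the route passes the AS-PATH loop check."""
    # A route whose AS-PATH already contains our own ASN has looped back to us.
    return local_asn not in as_path


def advertise_to_ebgp(as_path, local_asn):
    """Return the AS-PATH as it would look after we send it to an eBGP peer."""
    # The sender prepends its own ASN, so the most recent hop is leftmost.
    return [local_asn] + list(as_path)


if __name__ == "__main__":
    path = advertise_to_ebgp([65002, 65003], 65001)
    print(path)                        # [65001, 65002, 65003]
    print(accept_route(path, 65004))   # True: 65004 is not in the path
    print(accept_route(path, 65003))   # False: 65003 would create a loop
```

Because every AS applies the same check, a route can never circulate back into an AS it has already crossed, which is why BGP needs no hop-count limit.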
Autonomous Systems — The BGP Addressing Model
The internet is divided into Autonomous Systems (ASes): networks under a single administrative control (an ISP, a company, a university). Each AS is assigned a unique AS Number (ASN) by a Regional Internet Registry (RIR), such as APNIC for the Asia-Pacific region. Jio's ASN, for example, is AS55836.
/* ASN ranges */
16-bit ASNs (legacy):  1–65535
  Public range:        1–64495 (23456 is reserved as AS_TRANS)
  Documentation:       64496–64511 (RFC 5398)
  Private range:       64512–65534 (RFC 6996; like RFC 1918 for IPs)
  Reserved:            65535
32-bit ASNs (modern):  1–4294967295
  Private range:       4200000000–4294967294
  Public range:        everything else that IANA has not reserved

/* AS relationships */
Transit:  AS-A pays AS-B to carry traffic to/from the internet (customer → provider relationship)
Peering:  AS-A and AS-B exchange traffic for free (both benefit; settlement-free peering)
IXP:      Internet Exchange Point, a physical location where many ASes peer simultaneously (AMS-IX, DE-CIX, NIXI)

/* Look up any ASN */
whois -h whois.radb.net AS55836
bgp.he.net/AS55836                 # HE BGP toolkit
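As a rough sketch of how the ranges above partition the ASN space, the following Python function classifies a number. The function name `classify_asn` and the category labels are invented for illustration, and the "public" bucket is a simplification: IANA holds further reserved and unallocated blocks that are not modelled here.

```python
def classify_asn(asn):
    """Classify an ASN into a coarse category (illustrative, not exhaustive)."""
    if not 0 <= asn <= 4294967295:
        raise ValueError("ASN must fit in 32 bits")
    if asn in (0, 65535, 4294967295):
        return "reserved"
    if asn == 23456:
        return "as-trans"          # placeholder ASN used toward 2-octet-only peers
    if 64496 <= asn <= 64511 or 65536 <= asn <= 65551:
        return "documentation"     # RFC 5398
    if 64512 <= asn <= 65534 or 4200000000 <= asn <= 4294967294:
        return "private"           # RFC 6996
    return "public"                # simplification: ignores other IANA-reserved blocks


if __name__ == "__main__":
    for asn in (55836, 64512, 65001, 64496, 4200000000, 65535):
        print(asn, classify_asn(asn))
```

A check like this is handy in lab tooling, for example to warn when a private ASN is about to be used on a public-facing session.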
eBGP vs iBGP — EXTERNAL AND INTERNAL BGP
The Critical Difference Between eBGP and iBGP
| Property | eBGP (External) | iBGP (Internal) |
|---|---|---|
| Between | Routers in different ASes | Routers in the same AS |
| Default TTL | 1 (peer must be directly connected unless ebgp-multihop is configured) | 255 (peer can be multiple hops away) |
| AS-PATH handling | Prepends own AS number to AS-PATH | Does NOT modify AS-PATH |
| NEXT-HOP handling | Sets NEXT-HOP to own IP address | Does NOT change NEXT-HOP (leaves as eBGP learned next-hop) |
| Route propagation rule | Routes can be sent to any eBGP peer | iBGP split-horizon: routes learned from iBGP peer NOT re-advertised to another iBGP peer |
| Full mesh requirement | No — each AS has its own eBGP peers | Yes — requires full mesh OR Route Reflectors OR Confederation |
| Administrative Distance | 20 (preferred over IGP routes) | 200 (least preferred) |
| LOCAL-PREF | Not sent between ASes | Shared between all iBGP peers in AS |
💡 iBGP split-horizon is the key scaling challenge. Because iBGP routes can't be re-advertised between iBGP peers, every router must have a direct iBGP session with every other router — O(N²) sessions. With 100 BGP routers in an AS: 4950 sessions. Solutions: Route Reflectors (a designated RR re-advertises iBGP routes to all clients) or Confederation (divide the AS into sub-ASes with eBGP between them).
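The session arithmetic is easy to check. The sketch below compares a full mesh with a route-reflector design; the helper names are invented, and `rr_sessions` assumes every client peers with every reflector and that the reflectors are fully meshed with each other, which is one common redundant layout rather than the only option.

```python
def full_mesh_sessions(routers):
    """iBGP sessions needed when every router peers with every other router."""
    return routers * (routers - 1) // 2


def rr_sessions(routers, reflectors=1):
    """iBGP sessions with route reflectors (assumed layout, see lead-in)."""
    clients = routers - reflectors
    # Each client peers with each reflector; reflectors full-mesh among themselves.
    return clients * reflectors + reflectors * (reflectors - 1) // 2


if __name__ == "__main__":
    print(full_mesh_sessions(100))   # 4950
    print(rr_sessions(100, 1))       # 99
    print(rr_sessions(100, 2))       # 197
```

Even with two reflectors for redundancy, the count grows linearly with the number of routers instead of quadratically.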
Route Reflectors — Solving the Full-Mesh Problem
/* Without Route Reflector (full mesh required) */
R1 ←→ R2, R1 ←→ R3, R1 ←→ R4
R2 ←→ R3, R2 ←→ R4
R3 ←→ R4
Total: N(N-1)/2 = 6 sessions for 4 routers

/* With Route Reflector RR */
R1 (RR) ←→ R2 (client)
R1 (RR) ←→ R3 (client)
R1 (RR) ←→ R4 (client)
Total: N-1 = 3 sessions

/* RR re-advertises routes received from: */
- iBGP client     → to all other clients, non-client iBGP peers, and eBGP peers
- eBGP peer       → to all clients, non-client iBGP peers, and other eBGP peers
- Non-client iBGP → to clients and eBGP peers (NOT to other non-clients)

/* RR adds ORIGINATOR-ID and CLUSTER-LIST attributes to prevent loops */
ORIGINATOR-ID: Router-ID of the router that originated the route into the AS
CLUSTER-LIST:  List of route reflector cluster IDs the route has passed through
If a router receives a route with its own Router-ID in ORIGINATOR-ID → discard
If an RR receives a route with its own cluster ID in CLUSTER-LIST → discard

/* FRR BGP Route Reflector config */
router bgp 65001
 neighbor 10.0.0.2 remote-as 65001
 neighbor 10.0.0.2 route-reflector-client   # make this peer a RR client
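The reflection rules and the two loop checks can be written down as a small decision function. This is a sketch of the RFC 4456 behaviour as I understand it, with invented names (`reflect_targets`, `rr_accepts`) and routes modelled as plain dictionaries; a real implementation handles many more cases.

```python
def reflect_targets(learned_from):
    """Peer classes a route reflector re-advertises a route to."""
    if learned_from in ("ebgp", "client"):
        return {"client", "non-client", "ebgp"}
    if learned_from == "non-client":
        # Ordinary iBGP split-horizon still applies between non-clients.
        return {"client", "ebgp"}
    raise ValueError(f"unknown peer class: {learned_from}")


def rr_accepts(route, router_id, cluster_id):
    """Apply the ORIGINATOR-ID and CLUSTER-LIST loop checks."""
    if route.get("originator_id") == router_id:
        return False   # we originated this route ourselves
    if cluster_id in route.get("cluster_list", []):
        return False   # the route already passed through our cluster
    return True


if __name__ == "__main__":
    print(sorted(reflect_targets("non-client")))
    looped = {"originator_id": "10.0.0.2", "cluster_list": ["10.0.0.1"]}
    print(rr_accepts(looped, router_id="10.0.0.9", cluster_id="10.0.0.1"))
```

The second call returns False because the reflector finds its own cluster ID in the CLUSTER-LIST.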
BGP SESSION ESTABLISHMENT AND MESSAGE TYPES
BGP Session Establishment
/* BGP uses TCP port 179; sessions are manually configured */
/* Unlike OSPF (auto-discovers neighbours), BGP peers must be explicitly configured */

/* BGP FSM (Finite State Machine) */
Idle        → (start event) → Connect
Connect     → TCP connection attempt → (success) send OPEN, move to OpenSent / (fail) Active
Active      → listen for, or retry, the TCP connection
OpenSent    → TCP connected, OPEN sent, waiting for peer's OPEN
OpenConfirm → Both OPENs exchanged, waiting for KEEPALIVE
Established → Session up; routes are exchanged via UPDATE messages

/* BGP OPEN message fields */
Version:         4 (BGPv4)
My AS:           local AS number
Hold Time:       proposed max seconds between messages (the session uses the lower of the two peers' values)
BGP Identifier:  router-id (32-bit)
Optional Params: capabilities (4-octet ASN, route-refresh, multiprotocol)

/* BGP Message Types */
OPEN:          Session establishment; exchange capabilities
UPDATE:        Route advertisements and withdrawals
KEEPALIVE:     Heartbeat; prevents Hold Timer expiry (sent every HoldTime/3 by default)
NOTIFICATION:  Error notification; followed by TCP teardown
ROUTE-REFRESH: Ask the peer to re-send its routes for an address family (RFC 2918)

/* BGP timers (RFC 4271 suggested values; Cisco IOS and FRR default to hold 180s / keepalive 60s) */
Connect Retry: 120s (retry TCP connect after failure)
Hold Timer:    90s (reset on each KEEPALIVE or UPDATE received)
Keepalive:     HoldTime/3 = 30s

/* FRR BGP basic config */
router bgp 65001
 bgp router-id 1.1.1.1
 neighbor 203.0.113.1 remote-as 65002     # eBGP peer
 neighbor 10.0.0.2 remote-as 65001        # iBGP peer
 neighbor 10.0.0.2 update-source lo       # use loopback for iBGP
 !
 address-family ipv4 unicast
  network 192.0.2.0/24                    # advertise this prefix
  neighbor 203.0.113.1 activate
  neighbor 10.0.0.2 activate
  neighbor 10.0.0.2 next-hop-self         # fix iBGP next-hop issue
 exit-address-family
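Hold-time negotiation is simple enough to sketch: each side proposes a value in its OPEN, the session uses the smaller one, and the keepalive interval is conventionally one third of the result. The function name `negotiate_timers` is invented for this example; the rule that a hold time must be zero or at least three seconds comes from RFC 4271.

```python
def negotiate_timers(local_hold, peer_hold):
    """Return (hold_time, keepalive_interval) in seconds for a BGP session."""
    hold = min(local_hold, peer_hold)
    if hold in (1, 2):
        # RFC 4271: a hold time of one or two seconds must be rejected.
        raise ValueError("hold time must be 0 or at least 3 seconds")
    # A hold time of 0 disables the hold timer and keepalives altogether.
    keepalive = hold // 3
    return hold, keepalive


if __name__ == "__main__":
    print(negotiate_timers(90, 180))    # (90, 30)
    print(negotiate_timers(180, 180))   # (180, 60)
```

Note that the lower value always wins, so one aggressively tuned peer shortens failure detection for the whole session.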
BGP PATH ATTRIBUTES — THE ROUTING POLICY TOOLKIT
BGP Path Attributes Reference
| Attribute | Type | Values | Used For |
|---|---|---|---|
| ORIGIN | Well-known mandatory | IGP(i), EGP(e), Incomplete(?) | How the prefix entered BGP. IGP = best, Incomplete = redistributed from IGP or static |
| AS-PATH | Well-known mandatory | Sequence of AS numbers: [65001, 65002, 65003] | Loop prevention + path selection (shorter = preferred) + policy matching |
| NEXT-HOP | Well-known mandatory | IP address of next-hop router | Tells receiver which router to send traffic to. Key iBGP problem: may not be reachable without IGP. |
| LOCAL-PREF | Well-known discretionary | 0–4294967295 (default 100) | Prefer exit point within an AS. Higher = preferred. NOT sent to eBGP peers. |
| MED | Optional non-transitive | 0–4294967295 (default 0) | Multi-Exit Discriminator — hint to neighbor AS about preferred entry point. Lower = preferred. Compared only between routes from same AS. |
| COMMUNITY | Optional transitive | 32-bit: ASN:value (e.g., 65001:100) | Tag routes for policy matching. Common: no-export(65535:65281), no-advertise(65535:65282), blackhole(65535:666) |
| ATOMIC-AGGREGATE | Well-known discretionary | Flag (present/absent) | Indicates the route is an aggregate and that path information from the more-specific routes was lost during aggregation |
| AGGREGATOR | Optional transitive | ASN + IP | Identifies which router created an aggregate route |
| ORIGINATOR-ID | Optional non-transitive | Router-ID (32-bit) | Route Reflector: identifies original route source for loop detection |
| CLUSTER-LIST | Optional non-transitive | List of cluster IDs | Route Reflector: prevents loops between route reflectors |
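The COMMUNITY row describes a 32-bit value written as `ASN:value`. The sketch below shows how the two 16-bit halves pack into one integer; the helper names and the `WELL_KNOWN` table are invented for illustration, and only the three well-known values from the table above are included.

```python
WELL_KNOWN = {
    0xFFFFFF01: "no-export",      # 65535:65281
    0xFFFFFF02: "no-advertise",   # 65535:65282
    0xFFFF029A: "blackhole",      # 65535:666
}


def encode_community(text):
    """Pack 'ASN:value' into the 32-bit integer carried in the attribute."""
    asn, value = (int(part) for part in text.split(":"))
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("both halves must fit in 16 bits")
    return (asn << 16) | value


def decode_community(raw):
    """Turn the 32-bit integer back into a name or 'ASN:value' text."""
    return WELL_KNOWN.get(raw, f"{raw >> 16}:{raw & 0xFFFF}")


if __name__ == "__main__":
    raw = encode_community("65001:100")
    print(hex(raw), decode_community(raw))
    print(decode_community(encode_community("65535:666")))   # blackhole
```

The 16-bit limit on each half is why standard communities cannot carry a 4-octet ASN.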
BGP BEST-PATH SELECTION — THE 13-STEP ALGORITHM
BGP Best-Path Decision Process
When BGP receives multiple paths to the same prefix, it selects one "best path" to install in the routing table and advertise to peers. The selection follows a strict ordered list of criteria, evaluated in sequence and stopping at the first criterion that differentiates the paths.
/* BGP best-path selection, in order (Cisco/FRR) */
/* Mnemonic: "We Love Oranges As Oranges Mean Pure Refreshment" */
 1. Weight                  (Cisco proprietary) Higher preferred. Local to router.
 2. LOCAL-PREF              Higher preferred. Shared within AS.
 3. Locally Originated      Routes originated by this router preferred.
 4. AS-PATH length          Shorter (fewer ASes) preferred.
 5. ORIGIN code             Lowest preferred: IGP(i) < EGP(e) < Incomplete(?)
 6. MED                     Lower preferred (by default compared only between paths from the same neighbouring AS).
 7. eBGP over iBGP          eBGP-learned routes preferred over iBGP.
 8. IGP metric to NEXT-HOP  Lower preferred (closest exit).
 9. Oldest eBGP path        If all equal so far, the oldest (most stable) path is preferred.
10. Lowest Router-ID of advertising router.
11. Shortest CLUSTER-LIST length (Route Reflector environments).
12. Lowest neighbour IP address (final tie-break).
/* Cisco's documentation also counts a multipath-eligibility check after step 8, which brings the total to 13 */

/* Verify best path selection */
show ip bgp 10.0.0.0/8             # show all paths, best marked with ">"
show ip bgp 10.0.0.0/8 bestpath    # show only the selected best path

/* Policy knobs to influence best-path */
LOCAL-PREF:       control which exit from your AS is preferred (outbound traffic)
AS-PATH prepend:  make your AS look farther away (discourage inbound traffic on a path)
MED:              influence which of your routers a neighbour enters through
Communities:      tag routes and have neighbours apply policy based on tags
💡 The most important attributes for enterprise policy: LOCAL-PREF controls outbound traffic (which exit path your AS uses for a destination). AS-PATH prepending controls inbound traffic (which path remote ASes use to reach you). MED provides a hint to directly connected neighbours about preferred entry points but is often ignored or overridden.
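Because the decision process is an ordered list of tie-breakers, it maps naturally onto a tuple sort key. The sketch below covers steps 1 to 8 and step 10 only, and it is deliberately simplified: MED is compared across all paths (as if always-compare-med were enabled), the oldest-path, CLUSTER-LIST, and neighbour-address steps are omitted, and the `Path` class and its field names are invented for the example.

```python
from dataclasses import dataclass

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Path:
    name: str
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0
    is_ebgp: bool = True
    igp_metric: int = 0
    router_id: int = 0


def sort_key(p):
    """Smaller tuple wins; 'higher is better' fields are negated."""
    return (
        -p.weight,                  # 1. highest weight
        -p.local_pref,              # 2. highest LOCAL-PREF
        not p.locally_originated,   # 3. locally originated first
        len(p.as_path),             # 4. shortest AS-PATH
        ORIGIN_RANK[p.origin],      # 5. lowest ORIGIN code
        p.med,                      # 6. lowest MED (simplified comparison)
        not p.is_ebgp,              # 7. eBGP over iBGP
        p.igp_metric,               # 8. lowest IGP metric to NEXT-HOP
        p.router_id,                # 10. lowest Router-ID
    )


def best_path(paths):
    return min(paths, key=sort_key)


if __name__ == "__main__":
    a = Path("via-ISP-A", local_pref=200, as_path=(65002, 65010, 65020))
    b = Path("via-ISP-B", local_pref=100, as_path=(65003,))
    print(best_path([a, b]).name)   # via-ISP-A: LOCAL-PREF beats AS-PATH length
```

The example shows why LOCAL-PREF is such a strong knob: it is evaluated before AS-PATH length, so a longer path wins when its LOCAL-PREF is higher.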
BGP ROUTE POLICY — FILTERING AND MANIPULATION
Route Maps and Filtering Tools
/* BGP filtering tools */
1. Prefix lists: match by prefix/length
ip prefix-list BLOCK-DEFAULT seq 5 deny 0.0.0.0/0
ip prefix-list ALLOW-ALL seq 10 permit 0.0.0.0/0 le 32

2. AS-PATH access lists: match by regex on AS-PATH
ip as-path access-list 1 permit ^65002$    # path is exactly AS 65002
ip as-path access-list 2 permit ^65002_    # received from neighbour AS 65002 (use _65002$ to match routes originated by 65002)
ip as-path access-list 3 deny .*           # deny all

3. Community lists: match by community value
ip community-list 1 permit 65001:100

4. Route maps: combine match + set operations
route-map POLICY permit 10
 match ip address prefix-list MY-PREFIXES
 set local-preference 200
 set community 65001:100 additive
route-map POLICY deny 20                   # deny everything else

/* Apply to BGP peer */
router bgp 65001
 neighbor 203.0.113.1 route-map POLICY in    # filter incoming updates
 neighbor 203.0.113.1 route-map POLICY out   # filter outgoing updates

/* AS-PATH prepending: make the path look longer to discourage use */
route-map SET-PREPEND permit 10
 set as-path prepend 65001 65001 65001     # prepend own AS 3 times
# Result: the route appears 3 AS hops further away on this path

/* Communities for ISP signalling (example values; each ISP publishes its own list) */
# Send community 65002:100 to ISP → they set LOCAL-PREF to 100 on your routes (the lower of the two = backup path)
# Send community 65002:200 → they set LOCAL-PREF to 200 (higher = prefer this path)
# BGP Blackhole community (RFC 7999): 65535:666
# Many ISPs: if you advertise a /32 with 65535:666, they null-route it → DDoS mitigation
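The `ge` and `le` keywords in a prefix list are a frequent source of confusion, so here is a small model of the matching rule using Python's `ipaddress` module. It reflects the usual Cisco/FRR semantics as I understand them (no keyword means exact match; `le` alone matches from the entry's length up to `le`; `ge` alone matches from `ge` up to the maximum length). The function name `prefix_list_match` is invented for this sketch.

```python
import ipaddress


def prefix_list_match(entry, ge, le, candidate):
    """Return True if candidate matches a prefix-list entry with optional ge/le."""
    entry_net = ipaddress.ip_network(entry)
    cand = ipaddress.ip_network(candidate)
    if ge is None and le is None:
        return cand == entry_net            # exact prefix and length only
    low = ge if ge is not None else entry_net.prefixlen
    high = le if le is not None else cand.max_prefixlen
    # The candidate must sit inside the entry and have a length within [low, high].
    return cand.subnet_of(entry_net) and low <= cand.prefixlen <= high


if __name__ == "__main__":
    # "deny 0.0.0.0/0" matches only the default route itself.
    print(prefix_list_match("0.0.0.0/0", None, None, "10.0.0.0/8"))      # False
    # "permit 0.0.0.0/0 le 32" matches every IPv4 prefix.
    print(prefix_list_match("0.0.0.0/0", None, 32, "10.0.0.0/8"))        # True
    # "permit 0.0.0.0/0 ge 24" matches /24 and longer, so a /22 is not matched.
    print(prefix_list_match("0.0.0.0/0", 24, None, "198.51.100.0/22"))   # False
```

This explains why `deny 0.0.0.0/0` blocks only the default route while `permit 0.0.0.0/0 le 32` permits everything.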
BGP SECURITY — ROUTE HIJACKING, RPKI, AND PROTECTION
BGP Attacks — Route Hijacking and Leaks
| Attack | How It Happens | Impact | Example |
|---|---|---|---|
| Prefix Hijacking | AS announces a prefix it doesn't own. BGP prefers more-specific prefixes — attacker announces /24 of a /16 they don't own. Their announcement wins globally. | Traffic for the victim prefix redirected to attacker (interception or blackhole) | Pakistan Telecom 2008 hijacked YouTube's prefixes for 2 hours |
| Route Leak | AS re-advertises routes it shouldn't, e.g. a customer leaks one provider's full table to another provider, causing traffic to flow through the customer (sub-optimal or broken) | Traffic disruption, possible interception | June 2019: routes leaked through a Verizon customer were accepted and propagated by Verizon, disrupting Cloudflare and other networks |
| BGP Session Hijacking | Attacker spoofs TCP RST to tear down a BGP session, disrupting routing updates | BGP convergence event, potential route withdrawal causing traffic drops | RFC 4953 — mitigated by MD5/TCP-AO authentication |
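The reason a more-specific hijack "wins globally" is longest-prefix matching in the forwarding decision. The sketch below demonstrates this with a toy route table; the function name `longest_match`, the 10.20.0.0/16 addressing, and the use of documentation ASNs 64496 (legitimate origin) and 64511 (attacker) are all assumptions made for the example.

```python
import ipaddress


def longest_match(routes, destination):
    """Return (prefix, origin_asn) of the most specific route covering destination."""
    addr = ipaddress.ip_address(destination)
    candidates = []
    for prefix, origin in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net:
            candidates.append((net, origin))
    if not candidates:
        return None
    net, origin = max(candidates, key=lambda item: item[0].prefixlen)
    return str(net), origin


if __name__ == "__main__":
    routes = {"10.20.0.0/16": 64496}               # legitimate announcement
    print(longest_match(routes, "10.20.5.9"))      # ('10.20.0.0/16', 64496)

    routes["10.20.5.0/24"] = 64511                 # attacker's more-specific /24
    print(longest_match(routes, "10.20.5.9"))      # ('10.20.5.0/24', 64511)
    print(longest_match(routes, "10.20.9.1"))      # ('10.20.0.0/16', 64496)
```

Only destinations inside the hijacked /24 are diverted; the rest of the /16 still follows the legitimate route, which is part of why such hijacks can go unnoticed.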
RPKI — Resource Public Key Infrastructure
RPKI (RFC 6480) is the cryptographic defence against BGP prefix hijacking. IP address holders (using their RIR account) create signed objects called Route Origin Authorizations (ROAs) that state "AS X is authorised to originate prefix P/len". Routers with RPKI-enabled BGP validate incoming routes against the ROA database.
/* RPKI Route Origin Validation (ROV) */
ROA: "192.0.2.0/24 may be originated by AS64496, max-length /24"
Signed by: the IP address holder's key, chained to the RIR's certificate

Router receives BGP update: 192.0.2.0/24 originated by AS64497
RPKI check: a covering ROA exists but names AS64496, so this route is Invalid

/* Validation states */
Valid:   a covering ROA matches the origin AS and the prefix length is within max-length → safe to use (install, may be preferred)
Invalid: a covering ROA exists but the origin AS or prefix length does not match → likely hijack or misconfiguration; should be dropped (or given low preference)
Unknown: no covering ROA exists (called NotFound in RFC 6811) → treat as before RPKI (accept, optionally at lower preference than Valid)

/* FRR RPKI config */
rpki
 rpki cache rpki.example.com 3323 preference 1     # RTR server
router bgp 65001
 # "bgp bestpath prefix-validate allow-invalid" appears to be Cisco IOS syntax;
 # in FRR the validation state has no effect until a route-map matches on it
 neighbor 203.0.113.1 route-map FROM-PEER in
route-map FROM-PEER deny 5
 match rpki invalid                                # drop RPKI-invalid routes
route-map FROM-PEER permit 10

/* Check RPKI status */
show bgp ipv4 unicast 192.0.2.0/24                 # shows the RPKI state: valid / invalid / not found
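The three validation states follow mechanically from the ROA set. The function below is a minimal sketch of the origin-validation logic described in RFC 6811; the name `validate_origin` and the tuple layout `(prefix, max_length, asn)` are invented for the example, and real validators work on data fetched from an RTR cache rather than on a hard-coded list.

```python
import ipaddress


def validate_origin(roas, prefix, origin_asn):
    """Return 'valid', 'invalid', or 'not-found' for a route announcement."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_length, roa_asn in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue                      # this ROA does not cover the route
        covered = True
        if roa_asn == origin_asn and route.prefixlen <= max_length:
            return "valid"                # one matching ROA is enough
    # Covered by at least one ROA but matched by none means invalid.
    return "invalid" if covered else "not-found"


if __name__ == "__main__":
    roas = [("192.0.2.0/24", 24, 64496)]
    print(validate_origin(roas, "192.0.2.0/24", 64496))      # valid
    print(validate_origin(roas, "192.0.2.0/24", 64497))      # invalid: wrong origin AS
    print(validate_origin(roas, "192.0.2.0/25", 64496))      # invalid: longer than max-length
    print(validate_origin(roas, "198.51.100.0/24", 64497))   # not-found: no covering ROA
```

The third case shows why max-length matters: even the legitimate AS is rejected if it announces a prefix more specific than its ROA allows, which also blocks more-specific hijacks that forge the origin.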
⚠️ BGP session authentication. Always configure MD5 or TCP-AO authentication on BGP sessions to prevent session teardown via spoofed RST packets: neighbor 203.0.113.1 password strongpassword. MD5 has weaknesses but is widely deployed; TCP-AO (RFC 5925) is the modern replacement.
BGP Peering with FRR — eBGP Between Two ASes
Objective: Configure eBGP peering between two Linux VMs running FRR. Advertise prefixes, observe path attributes, and manipulate best-path selection.
- Configure the peering. On AS65001: neighbor 10.1.0.2 remote-as 65002; on AS65002: neighbor 10.1.0.1 remote-as 65001. Verify the session reaches Established: show bgp summary.
- Advertise a prefix from each side: network 192.0.2.0/24 (AS65001) and network 198.51.100.0/24 (AS65002). Verify routes are received: show bgp ipv4 unicast. Examine the UPDATE: note the AS-PATH, NEXT-HOP, and ORIGIN attributes.
BGP Route Filtering and Community Tagging
Objective: Implement prefix-list and community-based filtering. Practice the route policy tools used in production ISP and enterprise BGP configurations.
- ip prefix-list ACCEPT-SPECIFICS permit 0.0.0.0/0 ge 24. Apply inbound. Verify: attempt to advertise a /22 from the peer; it should be filtered.
- Verify that show bgp ipv4 unicast 198.51.100.0/24 shows the community value.
M12 MASTERY CHECKLIST
- Know BGP is a path-vector EGP: carries full AS-PATH, policy-driven, connects ASes on the internet
- Know BGP uses TCP 179 (reliable transport, manually configured peers)
- Know Autonomous System (AS) concept, ASN ranges, private ASNs (64512–65534 and 4200000000–4294967294)
- Know eBGP vs iBGP: different ASes vs same AS; TTL 1 vs 255; AS-PATH modified vs not; NEXT-HOP modified vs not
- Know iBGP split-horizon: routes from iBGP peers NOT re-advertised to other iBGP peers
- Know why full-mesh iBGP is unscalable: N(N-1)/2 sessions
- Know Route Reflectors: one RR re-advertises iBGP routes to all clients; ORIGINATOR-ID and CLUSTER-LIST for loop prevention
- Know the 6 BGP FSM states: Idle, Connect, Active, OpenSent, OpenConfirm, Established
- Know 5 BGP message types: OPEN, UPDATE, KEEPALIVE, NOTIFICATION, ROUTE-REFRESH
- Know mandatory BGP attributes: ORIGIN (i/e/?), AS-PATH, NEXT-HOP
- Know LOCAL-PREF: higher preferred, within AS only, controls outbound traffic exit
- Know MED: lower preferred, hint to neighbor AS about preferred entry, only compared within same AS
- Know COMMUNITY: 32-bit tags, used for policy matching and signaling between ASes
- Know well-known communities: no-export(65535:65281), no-advertise(65535:65282), blackhole(65535:666)
- Can recall BGP best-path selection order: Weight → LOCAL-PREF → Locally Originated → AS-PATH length → ORIGIN → MED → eBGP over iBGP → IGP metric → Router-ID
- Know how to steer traffic: LOCAL-PREF for outbound traffic (within your AS); AS-PATH prepending and MED for inbound traffic (signalled to other ASes)
- Know route-map components: match conditions (prefix-list, community, as-path) + set actions (local-pref, community, prepend)
- Know BGP prefix hijacking: attacker announces specific prefix it doesn't own; more-specific wins globally
- Know RPKI and ROAs: cryptographic proof that ASN X can originate prefix P; states = Valid/Invalid/Unknown
- Know BGP authentication: MD5 or TCP-AO prevents session teardown via spoofed RST
- Completed Lab 1: configured eBGP between two FRR instances, tested LOCAL-PREF and AS-PATH prepending
- Completed Lab 2: implemented prefix-list filtering and community-based policy
✅ When complete: Move to M13 - MPLS, VxLAN, GRE and Tunneling — the final Phase 3 module covering overlay networks and tunnelling mechanisms critical to modern data centres and VPN deployments.