NETWORKING MASTERY · PHASE 3 · MODULE 12 · WEEK 10
🌍 BGP Internals
eBGP vs iBGP · Path attributes · Best-path selection · Route policy · Communities · BGP security
Intermediate → Advanced · Prerequisite: M10, M11 · RFC 4271 · Internet Routing Protocol · 2 Labs

BGP — THE ROUTING PROTOCOL OF THE INTERNET

What Makes BGP Different

OVERVIEW

BGP (Border Gateway Protocol, RFC 4271) is the routing protocol that makes the internet work. It is the only EGP (Exterior Gateway Protocol) in use today, connecting the ~75,000 Autonomous Systems (ASes) that make up the global internet. Unlike OSPF, which computes the lowest-cost path, BGP is a policy-driven protocol: its primary goal is to express complex business routing policies, not to find the mathematically shortest path.

BGP's defining characteristics:

  • Path-vector protocol — routes carry the full AS-PATH (list of ASes the prefix has traversed). Loop prevention is achieved by rejecting routes that already contain your own AS number in the path
  • TCP-based — sessions run over TCP port 179. The reliability and ordering of TCP replace BGP's need for its own retransmission mechanism
  • Rich attribute system — routes carry attributes (AS-PATH, NEXT-HOP, LOCAL-PREF, MED, COMMUNITY) that encode routing policy
  • Scales to internet size: the global IPv4 BGP table held roughly 950,000 prefixes in 2024; inside a large AS, route reflectors and confederations keep the iBGP design manageable
  • Incremental updates — BGP sends only changes (UPDATE messages), not full table re-advertisements like distance-vector protocols
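
The AS-PATH loop check described in the first bullet can be sketched in a few lines of Python. This is only an illustration: `LOCAL_AS`, `accept_route` and `advertise_to_ebgp` are hypothetical names, and a real BGP speaker applies this check while parsing the AS_PATH attribute of an UPDATE.

```python
LOCAL_AS = 65001  # hypothetical local AS number (private range)


def accept_route(as_path, local_as=LOCAL_AS):
    """Reject any route whose AS-PATH already contains our own AS (a loop)."""
    return local_as not in as_path


def advertise_to_ebgp(as_path, local_as=LOCAL_AS):
    """Prepend the local AS before sending the route to an eBGP peer."""
    return [local_as] + list(as_path)


print(accept_route([65002, 65003]))          # True: our AS is not in the path
print(accept_route([65002, 65001, 65003]))   # False: the route looped back to us
print(advertise_to_ebgp([65002, 65003]))     # [65001, 65002, 65003]
```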

Autonomous Systems — The BGP Addressing Model

AS MODEL

The internet is divided into Autonomous Systems (ASes): networks under a single administrative control (an ISP, a company, a university). Each AS is assigned a unique AS Number (ASN) by a Regional Internet Registry (RIR), such as APNIC for Asia-Pacific. Jio Platforms' ASN is AS55836.

/* ASN ranges */
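16-bit ASNs (legacy): 1–65534 (0 and 65535 are reserved)
  Private range:      64512–65534 (RFC 6996; like RFC 1918 for IPs)
  Documentation:      64496–64511 (RFC 5398)
  Public range:       1–64495, except 23456 (AS_TRANS)

32-bit ASNs (modern): 1–4294967294 (0 and 4294967295 are reserved)
  Private range:      4200000000–4294967294
  Public range:       everything else outside the reserved and documentation blocks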

/* AS relationships */
Transit:   AS-A pays AS-B to carry traffic to/from the internet
           (customer → provider relationship)
Peering:   AS-A and AS-B exchange traffic for free
           (both benefit — settlement-free peering)
IXP:       Internet Exchange Point — physical location where
           many ASes peer simultaneously (AMS-IX, DE-CIX, NIXI)

/* Look up any ASN */
whois -h whois.radb.net AS55836
bgp.he.net/AS55836   # HE BGP toolkit
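
As a rough aid for reading ASNs, the ranges can be turned into a small classifier. This is a simplified sketch based on RFC 6996 (private use), RFC 5398 (documentation) and RFC 7300 (reserved last ASNs); it ignores other IANA-reserved blocks, and `classify_asn` is a hypothetical helper name.

```python
def classify_asn(asn):
    """Coarsely classify an AS number as reserved, documentation, private or public."""
    if not 0 <= asn <= 4294967295:
        raise ValueError("ASN must fit in 32 bits")
    if asn in (0, 65535, 4294967295):
        return "reserved"
    if asn == 23456:
        return "reserved (AS_TRANS)"
    if 64496 <= asn <= 64511 or 65536 <= asn <= 65551:
        return "documentation"
    if 64512 <= asn <= 65534 or 4200000000 <= asn <= 4294967294:
        return "private"
    return "public"  # simplification: other IANA-reserved blocks are not modelled


for asn in (55836, 64512, 4200000000, 65535, 64496):
    print(asn, classify_asn(asn))
```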

eBGP vs iBGP — EXTERNAL AND INTERNAL BGP

The Critical Difference Between eBGP and iBGP

COMPARISON
| Property | eBGP (External) | iBGP (Internal) |
|---|---|---|
| Between | Routers in different ASes | Routers in the same AS |
| Default TTL | 1 (must be directly connected) | 255 (can be multiple hops away) |
| AS-PATH handling | Prepends own AS number to AS-PATH | Does NOT modify AS-PATH |
| NEXT-HOP handling | Sets NEXT-HOP to own IP address | Does NOT change NEXT-HOP (leaves as eBGP learned next-hop) |
| Route propagation rule | Routes can be sent to any eBGP peer | iBGP split-horizon: routes learned from iBGP peer NOT re-advertised to another iBGP peer |
| Full mesh requirement | No: each AS has its own eBGP peers | Yes: requires full mesh OR Route Reflectors OR Confederation |
| Administrative Distance | 20 (preferred over IGP routes) | 200 (least preferred) |
| LOCAL-PREF | Not sent between ASes | Shared between all iBGP peers in AS |

💡 iBGP split-horizon is the key scaling challenge. Because iBGP routes can't be re-advertised between iBGP peers, every router must have a direct iBGP session with every other router — O(N²) sessions. With 100 BGP routers in an AS: 4950 sessions. Solutions: Route Reflectors (a designated RR re-advertises iBGP routes to all clients) or Confederation (divide the AS into sub-ASes with eBGP between them).
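
The session arithmetic is easy to check. The sketch below compares a full mesh with a route-reflector design in which every client peers with each reflector and the reflectors are fully meshed among themselves (an assumed topology; the function names are illustrative).

```python
def full_mesh_sessions(routers):
    """iBGP sessions needed when every router peers with every other router."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(routers, reflectors=1):
    """Sessions when each client peers with every RR and the RRs are fully meshed."""
    clients = routers - reflectors
    return clients * reflectors + full_mesh_sessions(reflectors)


print(full_mesh_sessions(100))            # 4950
print(route_reflector_sessions(100))      # 99
print(route_reflector_sessions(100, 2))   # 197 (two redundant reflectors)
```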

Route Reflectors — Solving the Full-Mesh Problem

ROUTE REFLECTORS
/* Without Route Reflector (full mesh required) */
R1 ←→ R2, R1 ←→ R3, R1 ←→ R4
R2 ←→ R3, R2 ←→ R4
R3 ←→ R4
Total: N(N-1)/2 = 6 sessions for 4 routers

/* With Route Reflector RR */
R1 (RR) ←→ R2 (client)
R1 (RR) ←→ R3 (client)
R1 (RR) ←→ R4 (client)
Total: N-1 = 3 sessions!

RR re-advertises routes received from:
  - iBGP client     → to all other clients, all non-client iBGP peers, and eBGP peers
  - eBGP peer       → to all iBGP peers (clients and non-clients) and other eBGP peers
  - Non-client iBGP → to clients and eBGP peers (NOT to other non-clients)

/* RR adds ORIGINATOR-ID and CLUSTER-LIST attributes to prevent loops */
ORIGINATOR-ID: Router-ID of the original route source
CLUSTER-LIST:  List of route reflector clusters the route passed through
If a router receives a route with its own Router-ID in ORIGINATOR-ID → discard
If an RR receives a route with its own cluster ID in CLUSTER-LIST → discard

/* FRR BGP Route Reflector config */
router bgp 65001
  neighbor 10.0.0.2 remote-as 65001
  neighbor 10.0.0.2 route-reflector-client  # make this peer a RR client
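
The reflection rules and loop checks, as specified in RFC 4456, can be modelled as a small decision function. This is an illustrative sketch only: the peer-type strings, the dictionary keys and both function names are assumptions made for the example.

```python
def reflect_targets(learned_from):
    """Return the peer types a route reflector advertises a best route to.

    learned_from is 'client', 'non-client' (ordinary iBGP peer) or 'ebgp'.
    """
    if learned_from in ("client", "ebgp"):
        return {"client", "non-client", "ebgp"}
    if learned_from == "non-client":
        # Normal iBGP split-horizon still applies towards other non-clients.
        return {"client", "ebgp"}
    raise ValueError(f"unknown peer type: {learned_from}")


def passes_loop_checks(route, router_id, cluster_id):
    """Apply the ORIGINATOR-ID and CLUSTER-LIST checks to a reflected route."""
    if route.get("originator_id") == router_id:
        return False  # we originated this route ourselves
    if cluster_id in route.get("cluster_list", []):
        return False  # the route already passed through our cluster
    return True


print(sorted(reflect_targets("non-client")))   # ['client', 'ebgp']
print(passes_loop_checks({"originator_id": "1.1.1.1"}, "1.1.1.1", "10.0.0.1"))  # False
```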

BGP SESSION ESTABLISHMENT AND MESSAGE TYPES

BGP Session Establishment

SESSION
/* BGP uses TCP port 179 — sessions are manually configured */
/* Unlike OSPF (auto-discovers neighbours), BGP peers must be explicitly configured */

/* BGP FSM (Finite State Machine) */
Idle        → (start) → Connect
Connect     → TCP connection attempt → (success) OpenSent / (fail) Active
Active      → retry TCP connection
OpenSent    → TCP connected, OPEN sent, waiting for peer's OPEN
OpenConfirm → Both OPENs received, waiting for KEEPALIVE
Established → Session up! Exchanging routes via UPDATE messages

/* BGP OPEN message fields */
Version:        4 (BGPv4)
My AS:          local AS number
Hold Time:      max seconds between messages (negotiate min of peers' values)
BGP Identifier: router-id (32-bit)
Optional Params: capabilities (4-octet ASN, route-refresh, multiprotocol)

/* BGP Message Types */
OPEN:        Session establishment — exchange capabilities
UPDATE:      Route advertisements and withdrawals
KEEPALIVE:   Heartbeat — prevents Hold Timer expiry (default every HoldTime/3)
NOTIFICATION: Error notification, followed by TCP teardown
ROUTE-REFRESH: Request peer to re-send full routing table (RFC 2918)

/* BGP timers */
Connect Retry: 120s (retry TCP connect after failure)
Hold Timer:    90s suggested by RFC 4271 (reset on each KEEPALIVE or UPDATE); Cisco IOS and FRR default to 180s
Keepalive:     HoldTime/3 = 30s with a 90s hold time (60s with a 180s hold time)

/* FRR BGP basic config */
router bgp 65001
  bgp router-id 1.1.1.1
  neighbor 203.0.113.1 remote-as 65002       # eBGP peer
  neighbor 10.0.0.2    remote-as 65001       # iBGP peer
  neighbor 10.0.0.2    update-source lo      # use loopback for iBGP
  !
  address-family ipv4 unicast
    network 192.0.2.0/24                      # advertise this prefix
    neighbor 203.0.113.1 activate
    neighbor 10.0.0.2 activate
    neighbor 10.0.0.2 next-hop-self          # fix iBGP next-hop issue
  exit-address-family
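
To make the OPEN fields and timer negotiation concrete, here is a minimal Python sketch that packs an OPEN message in the RFC 4271 wire format (16-byte all-ones marker, 2-byte length, 1-byte type, then the OPEN body) with no optional parameters. `build_open` and `negotiated_timers` are hypothetical helper names; a real speaker would also advertise capabilities and use AS_TRANS (23456) in the 2-byte My AS field when its ASN needs 4 octets.

```python
import socket
import struct

BGP_MARKER = b"\xff" * 16   # header marker: 16 bytes of all ones
MSG_OPEN = 1                # message type code for OPEN


def build_open(my_as, hold_time, router_id):
    """Build a BGP-4 OPEN message without optional parameters."""
    body = struct.pack(
        "!BHH4sB",
        4,                             # Version
        my_as,                         # My AS (2 bytes)
        hold_time,                     # proposed Hold Time in seconds
        socket.inet_aton(router_id),   # BGP Identifier
        0,                             # Optional Parameters Length
    )
    header = BGP_MARKER + struct.pack("!HB", 19 + len(body), MSG_OPEN)
    return header + body


def negotiated_timers(local_hold, peer_hold):
    """Both sides use the smaller hold time; keepalive is one third of it."""
    hold = min(local_hold, peer_hold)
    return hold, hold // 3


message = build_open(65001, 90, "1.1.1.1")
print(len(message))                  # 29 bytes: 19-byte header + 10-byte body
print(negotiated_timers(90, 180))    # (90, 30)
```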

BGP PATH ATTRIBUTES — THE ROUTING POLICY TOOLKIT

BGP Path Attributes Reference

ATTRIBUTES
| Attribute | Type | Values | Used For |
|---|---|---|---|
| ORIGIN | Well-known mandatory | IGP(i), EGP(e), Incomplete(?) | How the prefix entered BGP. IGP = best, Incomplete = redistributed from IGP or static |
| AS-PATH | Well-known mandatory | Sequence of AS numbers: [65001, 65002, 65003] | Loop prevention + path selection (shorter = preferred) + policy matching |
| NEXT-HOP | Well-known mandatory | IP address of next-hop router | Tells receiver which router to send traffic to. Key iBGP problem: may not be reachable without IGP. |
| LOCAL-PREF | Well-known discretionary | 0–4294967295 (default 100) | Prefer exit point within an AS. Higher = preferred. NOT sent to eBGP peers. |
| MED | Optional non-transitive | 0–4294967295 (default 0) | Multi-Exit Discriminator: hint to neighbor AS about preferred entry point. Lower = preferred. Compared only between routes from same AS. |
| COMMUNITY | Optional transitive | 32-bit: ASN:value (e.g., 65001:100) | Tag routes for policy matching. Common: no-export(65535:65281), no-advertise(65535:65282), blackhole(65535:666) |
| ATOMIC-AGGREGATE | Well-known discretionary | Flag (present/absent) | Indicates the route is an aggregate and specific routes exist that were lost during aggregation |
| AGGREGATOR | Optional transitive | ASN + IP | Identifies which router created an aggregate route |
| ORIGINATOR-ID | Optional non-transitive | Router-ID (32-bit) | Route Reflector: identifies original route source for loop detection |
| CLUSTER-LIST | Optional non-transitive | List of cluster IDs | Route Reflector: prevents loops between route reflectors |
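
A standard community is a single 32-bit value written as two 16-bit halves. The sketch below converts between the two forms and shows that the well-known values listed in the table are simply fixed 32-bit constants; the helper names are illustrative.

```python
WELL_KNOWN = {
    0xFFFFFF01: "no-export",     # 65535:65281
    0xFFFFFF02: "no-advertise",  # 65535:65282
    0xFFFF029A: "blackhole",     # 65535:666 (RFC 7999)
}


def encode_community(text):
    """Convert 'ASN:value' notation into the 32-bit wire value."""
    asn, value = (int(part) for part in text.split(":"))
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("both halves of a standard community are 16-bit")
    return (asn << 16) | value


def decode_community(raw):
    """Convert the 32-bit wire value back into 'ASN:value' notation."""
    return f"{raw >> 16}:{raw & 0xFFFF}"


print(hex(encode_community("65535:65281")))           # 0xffffff01
print(WELL_KNOWN[encode_community("65535:666")])      # blackhole
print(decode_community(encode_community("65001:100")))  # 65001:100
```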

BGP BEST-PATH SELECTION — THE 13-STEP ALGORITHM

BGP Best-Path Decision Process

BEST PATH

When BGP receives multiple paths to the same prefix, it selects one "best path" to install in the routing table (and from there the FIB) and advertise to peers. The selection follows a strict ordered list of criteria, evaluated in sequence and stopping at the first differentiating criterion.

/* BGP best-path selection, in order (Cisco/FRR). Cisco's 13-step list also has a multipath check after the IGP metric step, not shown here */
/* Mnemonic: "We Love Oranges As Oranges Mean Pure Refreshment" */

1.  Weight           (Cisco proprietary) — higher preferred. Local to router.
2.  LOCAL-PREF       Higher preferred. Shared within AS.
3.  Locally Originated  Routes originated by this router preferred.
4.  AS-PATH length   Shorter (fewer hops) preferred.
5.  ORIGIN code      Lowest preferred: IGP(i) < EGP(e) < Incomplete(?)
6.  MED              Lower preferred (by default compared only between paths from the same neighbouring AS).
7.  eBGP over iBGP   eBGP-learned routes preferred over iBGP.
8.  IGP metric       to NEXT-HOP — lower preferred (closest exit).
9.  Oldest eBGP path If all equal so far — oldest (most stable) preferred.
10. Lowest Router-ID of advertising router.
11. Shortest CLUSTER-LIST length (Route Reflector environments).
12. Lowest neighbour IP address (tie-break).

/* Verify best path selection */
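show ip bgp 10.0.0.0/8          # show all paths for this prefix; the selected one is flagged "best" (">" in the table view)
show ip bgp 10.0.0.0/8 bestpath # show only the best path; recent FRR releases also print the deciding step, e.g. "best (AS Path)"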

/* Policy knobs to influence best-path */
LOCAL-PREF: control which exit from your AS is preferred (outbound traffic)
AS-PATH prepend: make your AS look farther away (discourage inbound traffic on a path)
MED: influence which of your routers a neighbour enters through
Communities: tag routes and have neighbours apply policy based on tags

💡 The most important attributes for enterprise policy: LOCAL-PREF controls outbound traffic (which exit path your AS uses for a destination). AS-PATH prepending controls inbound traffic (which path remote ASes use to reach you). MED provides a hint to directly connected neighbours about preferred entry points but is often ignored or overridden.
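
The ordered decision process maps naturally onto a sort key: compare the first criterion, and fall through to the next only on a tie. The sketch below is a deliberately simplified model, not a router implementation. It compares MED across all paths (as if always-compare-med were set), omits the oldest-path and CLUSTER-LIST steps, and uses a hypothetical `Path` record.

```python
from dataclasses import dataclass

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Path:
    neighbor: str                    # peer IP address the path was learned from
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0
    is_ebgp: bool = True
    igp_metric: int = 0
    router_id: str = "0.0.0.0"


def ip_key(address):
    """Turn a dotted-quad string into a tuple so addresses compare numerically."""
    return tuple(int(octet) for octet in address.split("."))


def preference_key(path):
    """Smaller tuple = more preferred; 'higher wins' fields are negated."""
    return (
        -path.weight,
        -path.local_pref,
        not path.locally_originated,
        len(path.as_path),
        ORIGIN_RANK[path.origin],
        path.med,
        not path.is_ebgp,
        path.igp_metric,
        ip_key(path.router_id),
        ip_key(path.neighbor),
    )


def best_path(paths):
    return min(paths, key=preference_key)


short = Path(neighbor="203.0.113.1", local_pref=100, as_path=(65002,))
preferred = Path(neighbor="203.0.113.5", local_pref=200, as_path=(65003, 65004, 65005))
print(best_path([short, preferred]).neighbor)  # 203.0.113.5: LOCAL-PREF beats AS-PATH length
```

Because LOCAL-PREF sits above AS-PATH length in the key, the longer path wins here, which is exactly why LOCAL-PREF is the main outbound policy knob.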

BGP ROUTE POLICY — FILTERING AND MANIPULATION

Route Maps and Filtering Tools

POLICY
/* BGP filtering tools */

1. Prefix lists — match by prefix/length
   ip prefix-list BLOCK-DEFAULT seq 5 deny 0.0.0.0/0
   ip prefix-list BLOCK-DEFAULT seq 10 permit 0.0.0.0/0 le 32   # permit everything else (a list ends with an implicit deny)

2. AS-PATH access lists — match by regex on AS-PATH
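   ip as-path access-list 1 permit ^65002$    # path is exactly "65002": originated by the directly connected AS 65002
   ip as-path access-list 2 permit ^65002_    # received from neighbour AS 65002 (first AS in the path)
   ip as-path access-list 3 permit _65002$    # originated by AS 65002 (last AS in the path)
   ip as-path access-list 4 deny .*           # deny all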

3. Community lists — match by community value
   ip community-list 1 permit 65001:100

4. Route maps — combine match + set operations
   route-map POLICY permit 10
     match ip address prefix-list MY-PREFIXES
     set local-preference 200
     set community 65001:100 additive
   route-map POLICY deny 20  # deny everything else

/* Apply to BGP peer */
router bgp 65001
  neighbor 203.0.113.1 route-map POLICY in   # filter incoming updates
  neighbor 203.0.113.1 route-map POLICY out  # filter outgoing updates

/* AS-PATH prepending — make path look longer to discourage use */
route-map SET-PREPEND permit 10
  set as-path prepend 65001 65001 65001  # prepend own AS 3 times
# Result: route appears 3 hops further away on this path

/* Communities for ISP signaling */
# Send community 65002:100 to ISP → they set your LOCAL-PREF to 100 (low)
# Send community 65002:200 → they set LOCAL-PREF to 200 (high = prefer this path)
# BGP Blackhole community (RFC 7999): 65535:666
# Many ISPs support this: if you advertise a /32 tagged with 65535:666, they null-route it → DDoS mitigation
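
AS-PATH regular expressions are easier to reason about when you can test them. The sketch below emulates the Cisco-style `_` metacharacter (which matches the start or end of the path or a delimiter such as a space) on top of Python's `re` module. It is an approximation for experimenting with patterns, and `as_path_matches` is a hypothetical helper, not part of any router CLI.

```python
import re

# "_" matches the start of the string, the end of the string, or a delimiter.
UNDERSCORE = r"(?:^|$|[ ,{}()])"


def as_path_matches(pattern, as_path):
    """Test a Cisco-style AS-PATH regex against a list of AS numbers."""
    text = " ".join(str(asn) for asn in as_path)
    return re.search(pattern.replace("_", UNDERSCORE), text) is not None


print(as_path_matches("^65002$", [65002]))          # True: path is exactly 65002
print(as_path_matches("^65002_", [65002, 65003]))   # True: received from neighbour 65002
print(as_path_matches("_65002$", [65003, 65002]))   # True: originated by 65002
print(as_path_matches("^65002_", [65003, 65002]))   # False: 65002 is not the first AS
```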

BGP SECURITY — ROUTE HIJACKING, RPKI, AND PROTECTION

BGP Attacks — Route Hijacking and Leaks

SECURITY
| Attack | How It Happens | Impact | Example |
|---|---|---|---|
| Prefix Hijacking | AS announces a prefix it doesn't own. Forwarding always follows the most specific matching prefix, so an attacker who announces a /24 out of someone else's /16 attracts that traffic wherever the /24 is accepted. | Traffic for the victim prefix redirected to attacker (interception or blackhole) | Pakistan Telecom 2008 hijacked YouTube's prefixes for 2 hours |
| Route Leak | AS re-advertises routes it shouldn't, e.g., a customer leaks one provider's full table to another provider, causing traffic to flow through the customer (sub-optimal or broken) | Traffic disruption, possible interception | Cloudflare 2019: AS routing leak from Verizon caused widespread outage |
| BGP Session Hijacking | Attacker spoofs TCP RST to tear down a BGP session, disrupting routing updates | BGP convergence event, potential route withdrawal causing traffic drops | RFC 4953; mitigated by MD5/TCP-AO authentication |
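
Why a more-specific announcement wins can be shown with a toy longest-prefix-match lookup. The prefixes and the documentation ASNs 64500 and 64501 below are made up for the example.

```python
import ipaddress


def lookup(rib, destination):
    """Return the (prefix, origin AS) entry with the longest match, or None."""
    address = ipaddress.ip_address(destination)
    candidates = [(net, origin) for net, origin in rib.items() if address in net]
    if not candidates:
        return None
    return max(candidates, key=lambda entry: entry[0].prefixlen)


rib = {
    ipaddress.ip_network("10.20.0.0/16"): 64500,  # legitimate holder's announcement
    ipaddress.ip_network("10.20.5.0/24"): 64501,  # hijacker's more-specific announcement
}

print(lookup(rib, "10.20.5.9"))   # matches the /24, so traffic goes to AS64501
print(lookup(rib, "10.20.9.9"))   # only the /16 matches, traffic stays with AS64500
```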

RPKI — Resource Public Key Infrastructure

RPKI

RPKI (RFC 6480) is the cryptographic solution to BGP prefix hijacking. IP address holders (using their RIR account) create signed objects called Route Origin Authorizations (ROAs) that state "AS X is authorised to originate prefix P/len". Routers with RPKI-enabled BGP validate incoming routes against the ROA database.

/* RPKI Route Origin Validation (ROV) */

ROA: "192.0.2.0/24 may be originated by AS64496, max-length /24"
Signed by: the IP address holder's RIR certificate chain

Router receives BGP update: 192.0.2.0/24 from AS64497
  RPKI check:
    Valid:    a covering ROA matches the origin AS and the prefix length is within max-length → install, prefer
    Invalid:  covering ROAs exist but none matches (wrong AS or prefix longer than max-length) → DROP (or low pref)
    NotFound: no ROA covers this prefix → accept (no info); some tools display this state as "Unknown"

/* Validation states */
Valid:    Route passes RPKI validation, safe to use
Invalid:  Route fails RPKI, likely a hijack or misconfiguration → should be dropped
NotFound: No covering ROA exists; treat as before RPKI (accept, optionally at lower preference than Valid)

/* FRR RPKI config */
rpki
  rpki cache rpki.example.com 3323 preference 1  # RTR server

router bgp 65001
  bgp bestpath prefix-validate allow-invalid     # don't drop invalid (log only)
  # For production: configure route-map to drop invalid routes

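route-map FROM-PEER deny 5
  match rpki invalid   # drop RPKI-invalid routes
route-map FROM-PEER permit 10

router bgp 65001
  address-family ipv4 unicast
    neighbor 203.0.113.1 route-map FROM-PEER in  # the route-map has no effect until applied to a peer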

/* Check RPKI status */
show bgp ipv4 unicast 192.0.2.0/24  # shows "rpki: valid/invalid/not found"
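
The validation outcome follows the origin-validation procedure of RFC 6811: a route is valid if any covering ROA matches both the origin AS and the max-length, invalid if covering ROAs exist but none matches, and not found otherwise. A minimal sketch, with ROAs represented as hypothetical `(prefix, max_length, asn)` tuples:

```python
import ipaddress


def validate_origin(prefix, origin_as, roas):
    """Return 'valid', 'invalid' or 'notfound' for a route against a ROA list."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_length, asn in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        # A ROA covers the route if the route lies inside the ROA prefix.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if asn == origin_as and route.prefixlen <= max_length:
            return "valid"
    return "invalid" if covered else "notfound"


roas = [("192.0.2.0/24", 24, 64496)]
print(validate_origin("192.0.2.0/24", 64496, roas))     # valid
print(validate_origin("192.0.2.0/24", 64497, roas))     # invalid: wrong origin AS
print(validate_origin("192.0.2.0/25", 64496, roas))     # invalid: longer than max-length
print(validate_origin("198.51.100.0/24", 64496, roas))  # notfound: no covering ROA
```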

⚠️ BGP session authentication. Always configure MD5 or TCP-AO authentication on BGP sessions to prevent session teardown via spoofed RST packets: neighbor 203.0.113.1 password strongpassword. MD5 has weaknesses but is widely deployed; TCP-AO (RFC 5925) is the modern replacement.

LAB 1

BGP Peering with FRR — eBGP Between Two ASes

Objective: Configure eBGP peering between two Linux VMs running FRR. Advertise prefixes, observe path attributes, and manipulate best-path selection.

1
Set up two VMs: AS65001 (10.1.0.1) and AS65002 (10.1.0.2) on a shared /30 segment. Install FRR on both. Configure eBGP: on AS65001: neighbor 10.1.0.2 remote-as 65002; on AS65002: neighbor 10.1.0.1 remote-as 65001. Verify session reaches Established: show bgp summary.
2
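Advertise a loopback prefix on each side: network 192.0.2.0/24 (AS65001) and network 198.51.100.0/24 (AS65002). Recent FRR releases enable bgp ebgp-requires-policy by default, so no routes are exchanged over eBGP until inbound and outbound route-maps are applied to the neighbor or the check is disabled with no bgp ebgp-requires-policy. Verify routes are received: show bgp ipv4 unicast. Examine the UPDATE: note AS-PATH, NEXT-HOP, ORIGIN attributes.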
3
Test LOCAL-PREF: on AS65001, add a second path via a third router (AS65003). Apply a route-map to set LOCAL-PREF=200 on routes from AS65003 and LOCAL-PREF=100 on routes from AS65002. Verify AS65001 prefers AS65003 for routes it can reach via either.
4
Test AS-PATH prepending: on AS65001, apply a route-map outbound to AS65002 that prepends your AS three times. Verify AS65002 sees your AS in the path as "65001 65001 65001 65001" and prefers the shorter path via AS65003.
LAB 2

BGP Route Filtering and Community Tagging

Objective: Implement prefix-list and community-based filtering. Practice the route policy tools used in production ISP and enterprise BGP configurations.

1
Create a prefix-list that accepts only /24 or longer prefixes (rejecting anything shorter than /24): ip prefix-list ACCEPT-SPECIFICS permit 0.0.0.0/0 ge 24. Apply inbound. Verify: attempt to advertise a /22 from the peer; it should be filtered.
2
Tag all received routes with a community: route-map that sets community 65001:100 on all routes from AS65002. Apply on inbound. Verify: show bgp ipv4 unicast 198.51.100.0/24 shows community value.
3
Use community for conditional policy: create a route-map that sets LOCAL-PREF=150 for routes with community 65001:100, and LOCAL-PREF=50 for community 65001:200. Advertise two prefixes with different communities from AS65002 and verify AS65001 applies different preferences.
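
Before applying a prefix-list in the lab, it can help to check the ge/le semantics offline. The sketch below models a single prefix-list entry under the usual Cisco/FRR rules: the route must fall inside the entry's prefix, and its length must equal the entry's length when neither ge nor le is given, or lie within the ge..le window otherwise. `prefix_list_match` is a hypothetical helper.

```python
import ipaddress


def prefix_list_match(entry_prefix, route, ge=None, le=None):
    """Return True if `route` matches one prefix-list entry."""
    entry = ipaddress.ip_network(entry_prefix)
    net = ipaddress.ip_network(route)
    if net.version != entry.version or not net.subnet_of(entry):
        return False
    if ge is None and le is None:
        return net.prefixlen == entry.prefixlen      # exact match only
    low = ge if ge is not None else entry.prefixlen  # only "le": from entry length up to le
    high = le if le is not None else net.max_prefixlen  # only "ge": from ge up to /32
    return low <= net.prefixlen <= high


# ip prefix-list ACCEPT-SPECIFICS permit 0.0.0.0/0 ge 24
print(prefix_list_match("0.0.0.0/0", "10.0.0.0/22", ge=24))      # False: /22 is filtered
print(prefix_list_match("0.0.0.0/0", "198.51.100.0/24", ge=24))  # True
# ip prefix-list BLOCK-DEFAULT deny 0.0.0.0/0 matches only the default route
print(prefix_list_match("0.0.0.0/0", "10.0.0.0/8"))              # False
```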

M12 MASTERY CHECKLIST

When complete: Move to M13 - MPLS, VxLAN, GRE and Tunneling — the final Phase 3 module covering overlay networks and tunnelling mechanisms critical to modern data centres and VPN deployments.
