Lecture 4

Recap

The above work by borrowing routing from L3 into L2
The loops still exist physically, but because there is no need for MAC flooding, this is not a problem

Link Aggregation Control Protocol (LACP)

LACP creates Link Aggregation Groups (LAGs)
Combined connection of physical ports
Pasted image 20250912102354.png|300
Defined in 802.3ad
Made to achieve the following:

In Research

SLICES
Create a fabric connecting research facilities through Europe
Based on L2

L2 Architectures

How do you design an efficient, reliable and flexible network?

Network Size

You need to consider the size:

The larger the network, the bigger the need for load balancing and redundancy

Data Centers

Warning

From here on out we consider L3 capable switches

Mostly rely on L2, but nowadays also L3, with L3 links between racks
10,000 to 100,000 hosts

Three-layer Hierarchy (Mostly Legacy)

Pasted image 20250912103051.png|400
Core:

Collapsed Core Design (Also Legacy)

Pasted image 20250912103500.png|400
Simplify, but still keep benefits of 3-layer design
Lower cost due to fewer devices
More load on core devices: access control AND throughput
Harder to scale

Problems with three-tier and collapsed core

Oversubscription: not all nodes can transmit at the same time
The links to the internet have less bandwidth than the devices connected to them need
Uses techniques such as Equal Cost Multi Path routing (ECMP)
Performs static load splitting, does not account for flow size
Routing tables become very large due to multiple paths
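
The static splitting that ECMP performs can be sketched as follows. This is a minimal illustration, not how any particular switch implements it: the function name `pick_next_hop`, the CRC32 hash and the example flow are assumptions made for the example.

```python
import zlib

def pick_next_hop(flow, next_hops):
    """Pick one of the equal-cost next hops by hashing the flow's 5-tuple."""
    key = "|".join(str(field) for field in flow).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]

# Hypothetical equal-cost uplinks and a flow: (src, dst, protocol, src port, dst port)
next_hops = ["core-1", "core-2", "core-3", "core-4"]
flow = ("10.0.1.2", "10.2.0.3", 6, 40000, 443)

# Every packet of the same flow takes the same path, whether the flow
# carries a few kilobytes or many gigabytes.
print(pick_next_hop(flow, next_hops))
```

Because the choice depends only on the header fields, two large flows can land on the same link while other links stay idle.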

Fat Trees

Pasted image 20250912103910.png|400
More modern design
Use many small switches
The higher you go in the tree, the more interconnected the devices are
Network divided into pods (the dotted boxes)
Each pod has edge and aggregation switches
Not necessarily a single broadcast domain, can have multiple within it
Can also contain multiple VLANs

Less oversubscription
Not just because of hierarchy, but also rules on how many devices and interconnects a pod should have
Routing and addressing scheme
Fewer core bottlenecks
Not just a single link between core and aggregation
Can easily scale horizontally by adding more pods
Built on commodity hardware, making it cheaper

K-Tree

Given k ports on a switch

48 ports -> 27,648 hosts (k^3/4, roughly 28,000) w/o oversubscription
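
The host count can be checked with a short calculation. The sketch assumes the standard k-ary fat-tree construction (k pods, k/2 edge and k/2 aggregation switches per pod, k/2 hosts per edge switch, (k/2)^2 core switches); the function name `fat_tree_size` is made up for the example.

```python
def fat_tree_size(k):
    """Element counts of a k-ary fat tree built from k-port switches."""
    if k % 2 != 0:
        raise ValueError("k must be even")
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,
        "aggregation_switches": k * half,
        "core_switches": half * half,
        "hosts": k * half * half,  # equals k^3 / 4
    }

print(fat_tree_size(48)["hosts"])  # 27648
```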

Addressing in Fat-Tree

Use 10.0.0.0/8 private IP address space
Pod switches have address 10.pod.switch.1
Core switches have address 10.k.j.i
i and j denote the switch's position among the core switches
The second octet is always k; pods are numbered 0 to k-1, so the value k marks the address as not belonging to a pod
Hosts have address 10.pod.switch.ID
ID is host-ID in switch subnet
k < 256 (each field must fit in one octet), so this scheme does not scale indefinitely

This works because the IP shows topology, making it easier to route
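
A small sketch of how the position can be read back from an address. The helper names are hypothetical, and the sketch assumes that host IDs start at 2 so that `.1` is reserved for the pod switch itself.

```python
def pod_switch_address(pod, switch):
    return f"10.{pod}.{switch}.1"

def core_switch_address(k, j, i):
    return f"10.{k}.{j}.{i}"

def host_address(pod, switch, host_id):
    # Assumption: host_id >= 2, because .1 is the switch itself.
    return f"10.{pod}.{switch}.{host_id}"

def describe(address, k):
    """Recover the topological position encoded in a fat-tree address."""
    _, second, third, fourth = (int(part) for part in address.split("."))
    if second == k:
        return ("core", third, fourth)
    if fourth == 1:
        return ("pod switch", second, third)
    return ("host", second, third, fourth)

k = 4
print(describe(core_switch_address(k, 1, 2), k))  # ('core', 1, 2)
print(describe(pod_switch_address(2, 0), k))      # ('pod switch', 2, 0)
print(describe(host_address(2, 0, 3), k))         # ('host', 2, 0, 3)
```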

Two-level Lookup Table

Needed because there are many equal paths
Storing all these paths in a normal routing table would be very inefficient

First level:
Prefix lookup
Used to route down the topology to servers (inside a pod)
If the destination is not found, then the destination is not in this pod, so check suffix to see to which core switch it should go
Second level:
Suffix lookup
Used to route up towards core (between pods)
Used to load balance in a random but deterministic way
Pasted image 20250912105352.png|400
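
The two levels can be sketched as follows. This is a simplified illustration: the table layout, the port numbers and the function name `lookup` are assumptions, and the suffix table is consulted only when no prefix matches.

```python
import ipaddress

def lookup(dst, prefixes, suffixes):
    """Two-level lookup: prefix match first, host-ID suffix as fallback."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, port) for net, port in prefixes if addr in net]
    if matches:
        # Level 1: destination is inside this pod, route down.
        return max(matches, key=lambda match: match[0].prefixlen)[1]
    # Level 2: destination is in another pod, pick an uplink from the last octet.
    return suffixes[int(addr) & 0xFF]

# Hypothetical table of an aggregation switch in pod 2 of a k=4 fat tree.
prefixes = [
    (ipaddress.ip_network("10.2.0.0/24"), 0),
    (ipaddress.ip_network("10.2.1.0/24"), 1),
]
suffixes = {2: 2, 3: 3}  # host ID -> uplink port

print(lookup("10.2.1.3", prefixes, suffixes))  # 1 (down, same pod)
print(lookup("10.3.0.2", prefixes, suffixes))  # 2 (up, towards core)
print(lookup("10.3.0.3", prefixes, suffixes))  # 3 (up, other core switch)
```

Traffic to the same destination host always takes the same uplink, while different host IDs spread over the available uplinks.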

Spine-Leaf Topology

We don't always need a full fat tree
Nowadays there is more east-west traffic than north-south traffic (more traffic inside the datacenter than in/out of the datacenter)
Pasted image 20250912105510.png|400
Spine switches to route between leaf switches
Leaf switch to route between racks
Each rack has its own switch to connect hosts
Rich interconnection among switches
Any server to any other is only 4 hops
Even more scalability by adding more leaf/spine switches
Every leaf connects to every spine
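
Since every leaf connects to every spine, the number of links and the oversubscription of a leaf follow directly from the switch counts. A minimal sketch; the port counts and link speeds are made-up example values, not figures from the lecture.

```python
def leaf_spine_links(leaves, spines):
    """Full mesh between the two layers: one link per (leaf, spine) pair."""
    return leaves * spines

def leaf_oversubscription(host_ports, host_gbps, spines, uplink_gbps):
    """Host-facing bandwidth divided by spine-facing bandwidth (1.0 = none)."""
    return (host_ports * host_gbps) / (spines * uplink_gbps)

print(leaf_spine_links(leaves=8, spines=4))                                       # 32
print(leaf_oversubscription(host_ports=48, host_gbps=10, spines=4, uplink_gbps=40))  # 3.0
```

Adding a spine adds one uplink to every leaf, which lowers the ratio; adding a leaf adds hosts without changing it.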

Protocol Innovation

IPv6

Shoulda given it more address space
~ Vint Cerf

4× the bits (128 instead of 32), but 2^96 times the address space
Auto configuration
StateLess Address AutoConfiguration (SLAAC)
Stateful configuration (DHCPv6)

Security: Built-in IPsec

Optimised headers:
Fixed 40 byte length
Extension header mechanism for optional information; each extension header is a multiple of 8 bytes long

Mobility support:
Support for end-to-end route optimisation
Even if one host changes networks

No NAT needed, we have enough addresses
End-to-end principle could be used again
But "smart" hosts in the middle might interfere

IPv6 adoption is quite slow
Started in 2011, now at about 50%

Hierarchical addressing
Needed because of the huge address space
Standard is to follow prefix lengths /16, /32, /48, /64 for RIR-ISP-ORG-NET (registry, provider, organisation, network)
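
The hierarchy can be made visible by cutting one address at each boundary. A sketch using Python's `ipaddress` module; the level names are taken from the line above, with the first level read as the Regional Internet Registry (an assumption), and the function name `hierarchy` is made up.

```python
import ipaddress

# Prefix length at each level of the hierarchy.
LEVELS = {"RIR": 16, "ISP": 32, "ORG": 48, "NET": 64}

def hierarchy(address):
    """Return the enclosing prefix of an address at each hierarchy level."""
    return {
        name: str(ipaddress.ip_network(f"{address}/{length}", strict=False))
        for name, length in LEVELS.items()
    }

for name, prefix in hierarchy("2001:610:158:bad0::1").items():
    print(name, prefix)
# RIR 2001::/16
# ISP 2001:610::/32
# ORG 2001:610:158::/48
# NET 2001:610:158:bad0::/64
```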

Notation

8 colon (:) separated blocks of 4 hex-digits, totalling 128 bits
Leading zeroes may be skipped
One run of consecutive all-zero blocks may be replaced by :: (only once per address)
No broadcasts, only multicast
No subnet masks, only prefixes

IPv4

IPv4 address: 131.211.140.89
Subnet mask: 255.255.255.192
Wildcard mask: 0.0.0.63
Network: 131.211.140.64/26
Broadcast: 131.211.140.127
Mixed notation: 131.211.140.89/26 (Host and network combined)
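
The values above can be reproduced with Python's standard `ipaddress` module, starting from the mixed notation:

```python
import ipaddress

iface = ipaddress.ip_interface("131.211.140.89/26")

print(iface.ip)                         # 131.211.140.89
print(iface.netmask)                    # 255.255.255.192
print(iface.network.hostmask)           # 0.0.0.63 (wildcard mask)
print(iface.network)                    # 131.211.140.64/26
print(iface.network.broadcast_address)  # 131.211.140.127
```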

IPv6

IPv6 address: 2001:0610:0158:bad0:0000:0000:0000:0001
Short form: 2001:610:158:bad0::1
Network: 2001:610:158:bad0::/64
Mixed notation: 2001:610:158:bad0::1/64
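
The same module applies the shortening rules for IPv6, which makes it a convenient way to check a hand-written short form:

```python
import ipaddress

addr = ipaddress.IPv6Address("2001:0610:0158:bad0:0000:0000:0000:0001")
print(addr.compressed)  # 2001:610:158:bad0::1
print(addr.exploded)    # 2001:0610:0158:bad0:0000:0000:0000:0001

iface = ipaddress.ip_interface("2001:610:158:bad0::1/64")
print(iface.network)    # 2001:610:158:bad0::/64
```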

IPv6 is globally routable
Pasted image 20250912113447.png|400

Type of Addresses

Unicast: used to identify one interface
Anycast: multiple devices share the same IP, packets get delivered to the "closest" one

IPv6 has 2 addresses per interface:

Link-local unicast is used for communicating within the same LAN
Part of the fe80::/10 address block
Each host can compute its own link-local address by concatenating the fe80::/10 prefix with the 64 bit identifier of its interface
Needed for neighbour discovery and routing
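
One common way to obtain the 64 bit identifier is modified EUI-64, derived from the MAC address. The sketch below assumes that method (hosts may also use a random identifier); the function names and the example MAC are made up.

```python
import ipaddress

def eui64_interface_id(mac):
    """Modified EUI-64: flip the universal/local bit and insert ff:fe in the middle."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02
    eui64 = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return int.from_bytes(eui64, "big")

def link_local(mac):
    """Combine the fe80:: prefix with the interface identifier."""
    prefix = int(ipaddress.IPv6Address("fe80::"))
    return ipaddress.IPv6Address(prefix | eui64_interface_id(mac))

print(link_local("00:11:22:33:44:55"))  # fe80::211:22ff:fe33:4455
```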

StateLess Address AutoConfiguration (SLAAC)

First compute link-local address (device can do that on its own)
Check it with Duplicate Address Detection (DAD), done through multicast
If it's already in use, generate a new one (with MAC or random value)

Then listen for Router Advertisement (RA) messages
Contains prefix of the router and flags
Use the prefix from the router + interface identifier (based on MAC or random value)
Again, check usage with DAD (multicast in LAN, since prefix is unique and only used behind your router)
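
The second half of the procedure, sketched under some assumptions: the router advertises a /64 prefix, the interface identifier is already known, and DAD probes are sent to the solicited-node multicast group of the candidate address (a detail not spelled out in these notes). Function names and values are illustrative.

```python
import ipaddress

def slaac_address(ra_prefix, interface_id):
    """Combine the prefix from a Router Advertisement with a 64 bit interface identifier."""
    net = ipaddress.IPv6Network(ra_prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch expects a /64 prefix")
    return ipaddress.IPv6Address(int(net.network_address) | interface_id)

def solicited_node_multicast(address):
    """Multicast group used to probe for an address: ff02::1:ff + its last 24 bits."""
    low24 = int(ipaddress.IPv6Address(str(address))) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

candidate = slaac_address("2001:610:158:bad0::/64", 0x021122FFFE334455)
print(candidate)                            # 2001:610:158:bad0:211:22ff:fe33:4455
print(solicited_node_multicast(candidate))  # ff02::1:ff33:4455
```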

Multicast Addresses

All end systems automatically belong to the ff02::1 multicast group
All routers automatically belong to the ff02::2 multicast group
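
Both groups are link-local multicast addresses, which can be verified from the address itself. A small check; the helper `multicast_scope` is a name made up for the example and reads the scope nibble from the second byte.

```python
import ipaddress

ALL_NODES = ipaddress.IPv6Address("ff02::1")
ALL_ROUTERS = ipaddress.IPv6Address("ff02::2")

def multicast_scope(address):
    """Return the 4 bit scope field of a multicast address (2 = link-local)."""
    return (int(address) >> 112) & 0xF

for group in (ALL_NODES, ALL_ROUTERS):
    print(group, group.is_multicast, multicast_scope(group))
# ff02::1 True 2
# ff02::2 True 2
```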

Neighbour Discovery Protocol (NDP)

IPv6 does not use ARP
IPv6 does use ICMPv6 to:

ICMPv6 Errors

Error messages:

ICMPv6 Information

Information messages:

IPv6 header

Fixed header makes it easier to deal with
Pasted image 20250912115031.png|400
Next header field identifies the type of header that follows: either an extension header or the upper-layer protocol (e.g. TCP, UDP, ICMPv6)
No checksum means error checking is left to a higher layer
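
Because the layout is fixed, the header can be packed and parsed with a single format string. A minimal sketch using `struct`; the function names and example values are assumptions, and 58 is the next-header value for ICMPv6.

```python
import ipaddress
import struct

# 4 bytes version/traffic class/flow label, 2 bytes payload length,
# 1 byte next header, 1 byte hop limit, 16 bytes source, 16 bytes destination.
HEADER = struct.Struct("!IHBB16s16s")

def build_header(payload_length, next_header, hop_limit, src, dst,
                 traffic_class=0, flow_label=0):
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    return HEADER.pack(first_word, payload_length, next_header, hop_limit,
                       ipaddress.IPv6Address(src).packed,
                       ipaddress.IPv6Address(dst).packed)

def parse_header(data):
    first_word, payload_length, next_header, hop_limit, src, dst = HEADER.unpack(data[:40])
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_length,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "src": str(ipaddress.IPv6Address(src)),
        "dst": str(ipaddress.IPv6Address(dst)),
    }

raw = build_header(8, 58, 255, "fe80::1", "ff02::1")
print(len(raw))                          # 40
print(parse_header(raw)["next_header"])  # 58
```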

Header fields:
Pasted image 20250912115125.png|400
Header types:
Pasted image 20250912115201.png|400
IPv6-Frag used to indicate the sender is fragmenting
Pasted image 20250912115248.png|400