EVPN-VXLAN Fabrics

EVPN-VXLAN Fabric Design: From Fundamentals to Production

A practical guide to designing EVPN-VXLAN data center fabrics, covering overlay/underlay architecture, multi-tenancy, and operational best practices.

February 15, 2026 · 3 min read · evpn, vxlan, fabric-design, data-center

Why EVPN-VXLAN?

EVPN-VXLAN has become the de facto standard for modern data center fabrics. It decouples the logical network (overlay) from the physical topology (underlay), giving operators the flexibility to build multi-tenant, scalable, and operationally simple networks.

But "simple" is relative: getting EVPN-VXLAN right requires understanding the interplay between the BGP control plane, the VXLAN data plane, and the operational tooling that ties them together.

The Architecture

Underlay: IP Fabric

The underlay is a pure Layer 3 IP fabric, typically a Clos (spine-leaf) topology:

  • Leaf switches: Top-of-rack (ToR), connect servers and endpoints
  • Spine switches: Interconnect all leaves, provide equal-cost paths
  • Routing protocol: eBGP (one AS per leaf, shared AS per spine tier) or OSPF/IS-IS

The underlay's only job is to provide IP reachability between VTEP (VXLAN Tunnel Endpoint) loopback addresses. Keep it simple.
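As a rough illustration of that job, the sketch below hands out one /32 loopback and one ASN per switch, following the eBGP model listed above (a shared ASN for the spine tier, a unique ASN per leaf). The loopback pool `10.0.0.0/24`, the private ASNs starting at 65000, and the `spineN`/`leafN` names are assumptions chosen for the example, not values from any particular deployment.

```python
import ipaddress


def underlay_plan(num_spines, num_leaves, loopback_pool="10.0.0.0/24",
                  spine_asn=65000, leaf_asn_base=65001):
    """Assign a /32 loopback and an ASN to every switch in a spine-leaf fabric.

    All spines share spine_asn; each leaf gets its own ASN.
    The pool and ASN values are illustrative assumptions.
    """
    # iter() makes next() work whether hosts() returns a generator or a list.
    hosts = iter(ipaddress.ip_network(loopback_pool).hosts())
    plan = {}
    for i in range(1, num_spines + 1):
        plan[f"spine{i}"] = {"asn": spine_asn, "loopback": f"{next(hosts)}/32"}
    for i in range(1, num_leaves + 1):
        plan[f"leaf{i}"] = {"asn": leaf_asn_base + i - 1,
                            "loopback": f"{next(hosts)}/32"}
    return plan


if __name__ == "__main__":
    for name, attrs in underlay_plan(num_spines=2, num_leaves=4).items():
        print(name, attrs["asn"], attrs["loopback"])
```

The leaf loopbacks in this plan are the VTEP addresses the underlay has to make reachable; nothing else needs to be advertised for the overlay to work.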

Overlay: EVPN + VXLAN

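The overlay runs EVPN (RFC 7432, applied to VXLAN by RFC 8365) over iBGP sessions between loopbacks, with route reflectors (typically the spine switches). VXLAN (RFC 7348) encapsulates Layer 2 frames in UDP and tags each one with a 24-bit VXLAN Network Identifier (VNI), tunneling them across the IP underlay.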
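To make the encapsulation concrete, here is a minimal sketch of the 8-byte VXLAN header defined in RFC 7348: one flags byte with the "I" bit set, 24 reserved bits, a 24-bit VNI (VXLAN Network Identifier), and 8 more reserved bits. The function names are made up for the example, and the outer Ethernet, IP, and UDP headers are left out.

```python
import struct

VXLAN_UDP_PORT = 4789   # IANA-assigned destination port for VXLAN
VNI_VALID_FLAG = 0x08   # "I" flag: the VNI field carries a valid value


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for the given VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, then 24 reserved bits.
    # Word 2: VNI in the top 24 bits, then 8 reserved bits.
    return struct.pack("!II", VNI_VALID_FLAG << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Extract the VNI from a VXLAN header, checking the I flag."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VNI_VALID_FLAG:
        raise ValueError("I flag not set; VNI is not valid")
    return vni_word >> 8


if __name__ == "__main__":
    header = pack_vxlan_header(10100)
    print(header.hex())        # 0800000000277400
    print(unpack_vni(header))  # 10100
```

The original Ethernet frame follows this header as payload, and the whole thing rides in a UDP datagram addressed to the remote VTEP's loopback on port 4789.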

Key EVPN route types:

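  • Type 2 (MAC/IP Advertisement): Distributes host MAC addresses and MAC-to-IP bindings
  • Type 3 (Inclusive Multicast Ethernet Tag): Announces which VTEPs participate in a VNI so BUM traffic (broadcast, unknown unicast, multicast) can be delivered
  • Type 5 (IP Prefix): Distributes IP prefixes within a tenant VRF for inter-subnet routing, including routes learned from outside the fabric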

Design Decisions That Matter

1. Symmetric vs. Asymmetric IRB

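For routing between VNIs (integrated routing and bridging, or IRB), you have two models:

  • Asymmetric: The ingress leaf routes the packet and bridges it into the destination VNI; the egress leaf only bridges. Requires all of a tenant's VNIs on every leaf that routes for that tenant.
  • Symmetric: Both ingress and egress leaves route. Traffic crosses the fabric in a per-VRF transit L3 VNI, so a leaf needs only its locally attached VNIs plus the L3 VNI. More scalable.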

Recommendation: Use symmetric IRB for production. It scales better and doesn't require every VLAN to exist on every leaf.
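The scaling difference is easy to see by counting the VNIs a single leaf has to carry under each model. The sketch below is a simplified model for one tenant; the function name and the VNI numbers are illustrative assumptions.

```python
def vnis_required_on_leaf(local_l2_vnis, tenant_l2_vnis, l3_vni, model):
    """Return the set of VNIs one leaf must be configured with for a tenant.

    local_l2_vnis:  L2 VNIs with endpoints attached to this leaf
    tenant_l2_vnis: every L2 VNI that belongs to the tenant
    l3_vni:         the tenant's transit L3 VNI (symmetric IRB only)
    """
    if model == "asymmetric":
        # The ingress leaf bridges into the destination VNI after routing,
        # so every L2 VNI of the tenant must exist locally.
        return set(tenant_l2_vnis)
    if model == "symmetric":
        # Routed traffic crosses the fabric in the L3 VNI, so only the
        # locally attached L2 VNIs plus the L3 VNI are needed.
        return set(local_l2_vnis) | {l3_vni}
    raise ValueError(f"unknown IRB model: {model}")


if __name__ == "__main__":
    tenant = range(10001, 10051)   # 50 L2 VNIs in the tenant
    local = {10001, 10002}         # this leaf hosts two of them
    for model in ("asymmetric", "symmetric"):
        count = len(vnis_required_on_leaf(local, tenant, 50001, model))
        print(model, count)
```

With 50 tenant VNIs and two of them local, the leaf carries 50 VNIs under asymmetric IRB and 3 under symmetric IRB.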

2. Multi-tenancy with VRFs

Each tenant gets its own VRF with a unique L3 VNI. This provides:

  • Complete isolation between tenants
  • Independent routing tables
  • Per-tenant policy and security
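One way to keep the per-tenant identifiers unique is to derive them from a single allocator. The sketch below is an assumption-heavy illustration: the L3 VNI base of 50000, the `ASN:L3VNI` route-target convention, and the `TenantVrf` and `build_vrfs` names are all hypothetical choices, not something EVPN mandates.

```python
from dataclasses import dataclass

L3_VNI_BASE = 50000  # hypothetical base, kept clear of the L2 VNI range


@dataclass(frozen=True)
class TenantVrf:
    name: str
    l3_vni: int
    route_target: str


def build_vrfs(tenants, fabric_asn=65000):
    """Give each tenant a VRF with a unique L3 VNI and route target.

    Tenants are numbered in the order given, so appending a new tenant
    never renumbers the existing ones.
    """
    seen = set()
    vrfs = []
    for index, name in enumerate(tenants, start=1):
        if name in seen:
            raise ValueError(f"duplicate tenant name: {name}")
        seen.add(name)
        l3_vni = L3_VNI_BASE + index
        vrfs.append(TenantVrf(name, l3_vni, f"{fabric_asn}:{l3_vni}"))
    return vrfs


if __name__ == "__main__":
    for vrf in build_vrfs(["red", "blue"]):
        print(vrf)
```

A real allocator would persist its assignments rather than recompute them from a list, so that deleting a tenant does not shift the identifiers of the others.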

3. BUM Traffic Handling

Choose between:

  • Ingress replication: Each VTEP replicates BUM to all remote VTEPs (simpler, no multicast required)
  • Multicast underlay: Uses PIM-SM for efficient BUM distribution (more complex, better at scale)

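As a rule of thumb, ingress replication works well for most deployments under about 100 leaves, because the replication cost on the source VTEP grows linearly with the number of remote VTEPs in the VNI.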
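The trade-off comes down to simple arithmetic: with ingress replication the source VTEP sends one unicast copy of each BUM frame to every other VTEP in the VNI, whereas with a multicast underlay it sends a single copy and the fabric replicates it. The sketch below is a back-of-the-envelope model; the function names and the traffic figures are assumptions for illustration.

```python
def ingress_replication_copies(vteps_in_vni: int) -> int:
    """Copies the source VTEP sends per BUM frame under ingress replication."""
    if vteps_in_vni < 1:
        raise ValueError("a VNI needs at least one VTEP")
    # One unicast copy to every VTEP in the VNI except the sender itself.
    return vteps_in_vni - 1


def source_uplink_load_mbps(vteps_in_vni: int, bum_rate_mbps: float) -> float:
    """BUM load the source VTEP places on its uplinks, in Mbit/s."""
    return ingress_replication_copies(vteps_in_vni) * bum_rate_mbps


if __name__ == "__main__":
    for vteps in (10, 100, 500):
        load = source_uplink_load_mbps(vteps, bum_rate_mbps=2.0)
        print(f"{vteps} VTEPs: {load} Mbit/s of BUM on the source uplinks")
```

Only VTEPs that actually participate in the VNI count toward the total, so a VNI stretched across a handful of leaves stays cheap even in a large fabric.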

Operational Best Practices

  1. Use intent-based tools: Platforms like Apstra automate Day 0-2 operations, validate intent against state, and detect anomalies
  2. Monitor EVPN route counts: Unexpected route growth often signals misconfigurations
  3. Standardize VNI allocation: Use a consistent mapping (e.g., VNI = VLAN ID + 10000)
  4. Test failover regularly: Validate that traffic reconverges within your SLA after spine/leaf failures
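Two of these practices are simple enough to sketch in code: the VNI = VLAN ID + 10000 mapping and a basic route-count drift check. The function names, the 20% tolerance, and the shape of the input dictionaries are assumptions for the example; in practice the counts would come from whatever telemetry or CLI scraping you already have.

```python
VNI_OFFSET = 10000  # offset from the mapping example above


def vlan_to_vni(vlan_id: int) -> int:
    """Map a VLAN ID to its L2 VNI using a fixed offset."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    return vlan_id + VNI_OFFSET


def vni_to_vlan(vni: int) -> int:
    """Reverse the mapping, rejecting VNIs outside the mapped range."""
    vlan_id = vni - VNI_OFFSET
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VNI is outside the VLAN-mapped range")
    return vlan_id


def route_count_alerts(baseline, current, tolerance=0.2):
    """Return the route types whose count grew more than tolerance over baseline.

    baseline and current map a route-type label to a route count.
    """
    return [route_type for route_type, count in current.items()
            if count > baseline.get(route_type, 0) * (1 + tolerance)]


if __name__ == "__main__":
    print(vlan_to_vni(100))  # 10100
    print(route_count_alerts({"type2": 1000, "type5": 200},
                             {"type2": 1500, "type5": 210}))  # ['type2']
```

A fixed offset keeps the mapping reversible, which makes it easy to go from a VNI seen in a capture back to the VLAN an operator recognizes.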

Coming Up

Next in this series: deep dives into EVPN Type-5 routes for DC interconnect, migration strategies from traditional L2 networks to EVPN-VXLAN, and automation workflows for fabric lifecycle management.