Arista (VMware) SD-WAN Deep Dive — Part 3: The Data Plane — VCMP, DMPO, and Per-Flow Steering
Series map. Part 3 of five.
- Components, Gateways, and the Three Planes
- Routing — Overlay, Underlay, BGP, and the Gateway as Route Reflector
- The Data Plane — VCMP, DMPO, and Per-Flow Steering (this post)
- Topology Walkthroughs — MPLS-only meets Internet-only Across Continents
- Best Practice, Failure Modes, and a Design Checklist
Part 2 ended with the Edge having decided which overlay path to use (Direct, Hub, or Gateway) to a specific peer. This post is everything that happens between that decision and the bits hitting the wire.
There are three pieces. They run continuously, in parallel, and they talk to each other:
- VCMP — the encapsulation that wraps every customer packet for transit between Arista SD-WAN nodes.
- DMPO — the measurement and remediation engine that picks the underlay path under each VCMP tunnel and applies on-path fixes when the underlay starts to fail.
- Business Policy — the per-flow classification and priority rules that decide what DMPO is allowed to optimise for and which underlays / overlays a given flow is allowed on.
Order of operations on the Edge for an outbound packet:
LAN packet in
→ policy classifier (Business Policy)
→ flow lookup (existing flow? new flow?)
→ destination overlay route selection (from Part 2)
→ DMPO underlay path selection
→ VCMP encapsulation (encrypt + header + remediation flags)
→ UDP/2426 out the chosen underlay interface
Let’s go through each one.
VCMP — what’s on the wire
VCMP is the proprietary encapsulation Arista SD-WAN (still using the VeloCloud name in protocol terms) uses for every overlay packet. It’s transported over UDP, on port 2426 by default. UDP rather than a dedicated IP protocol number, because UDP traverses CGNAT, hotel Wi-Fi, and most broken middleboxes without complaint, which matters when “Internet underlay” sometimes means “consumer-grade DSL”.
The packet on the wire looks like:
+------------------------+
| Outer Ethernet |
+------------------------+
| Outer IP (src/dst Edge | <- underlay-routable addresses for this path
| WAN interfaces, or | (public IP on Internet, VRF-attached IP on
| Edge ↔ Gateway pair) | MPLS)
+------------------------+
| Outer UDP (sport, 2426)|
+------------------------+
| VCMP header | <- flow ID, sequence, segment ID, flags
+------------------------+
| AES-encrypted payload | <- the original customer L2/L3 frame
+------------------------+
The pieces that matter operationally:
- Flow ID — every flow has a unique identifier the two endpoints agree on. This is what lets the receiver tie packets arriving on different tunnels back to one flow, so the sequence numbers can restore the original send order even when DMPO has duplicated them on two paths.
- Sequence number — per-flow, monotonically increasing. Used for FEC reconstruction and for jitter buffer reordering.
- Segment ID — which VRF this packet belongs to. The receiving node uses this to pick the right egress forwarding table.
- Flags — most importantly, whether this packet is a duplicate (sent on a second underlay path), an FEC parity packet, or a real data packet. Also whether the packet should be acknowledged (control sub-channel) or not (data fast path).
- AES-encrypted payload — the original packet, encrypted with a per-tunnel key negotiated at tunnel establishment. Key rotation happens periodically without dropping the tunnel.
The MTU consequence is real: the outer L3 + UDP + VCMP headers cost about 40 bytes, plus the encryption overhead. The Edge advertises a reduced effective MTU on the LAN side (PMTUD-friendly) and sets the DF bit on the outer packet so PMTUD on the underlay actually works. A mismatched MTU between the Edge’s LAN side and any LAN-side router is one of the most common silent performance problems; we’ll mention it again in Part 5.
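To make that arithmetic concrete, here is a minimal sketch. The 40-byte header figure is the approximate one quoted above; the 32-byte encryption overhead and the function name are assumptions for illustration only, not published VCMP values.

```python
def effective_inner_mtu(underlay_mtu: int,
                        header_overhead: int = 40,
                        crypto_overhead: int = 32) -> int:
    """Largest customer packet that fits in one underlay packet.

    header_overhead: outer IP + UDP + VCMP header (about 40 bytes per the text).
    crypto_overhead: ASSUMED placeholder for IV/tag/padding, not a vendor figure.
    """
    inner = underlay_mtu - header_overhead - crypto_overhead
    if inner <= 0:
        raise ValueError("underlay MTU too small for the overlay overhead")
    return inner


for name, mtu in [("Ethernet", 1500), ("PPPoE broadband", 1492)]:
    print(f"{name}: underlay MTU {mtu} -> inner MTU {effective_inner_mtu(mtu)}")
```

Whatever the real overhead is on your release, the point stands: the LAN-side MTU has to shrink by the full overlay overhead of the smallest underlay, or something will fragment or drop.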
A tunnel exists per underlay path per peer. If Bristol has MPLS + broadband, and it’s running a Direct tunnel to HQ which also has MPLS + DIA, the Edge has four VCMP tunnels to HQ simultaneously — one per (Bristol underlay × HQ underlay) combination. DMPO picks among them at flow time. This is why “tunnel count” on an Edge grows quickly as you add Edges and underlays and allow Direct mode.
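The growth is easy to see with a few lines of Python. This is only a counting sketch of the rule just described (one tunnel per local underlay × remote underlay pair); the site names and the helper functions are made up for the example.

```python
from itertools import product


def direct_tunnels(local_underlays, remote_underlays):
    """One candidate tunnel per (local underlay, remote underlay) pair."""
    return list(product(local_underlays, remote_underlays))


def tunnel_count(local_underlays, peers):
    """Total tunnels on one Edge, given each peer's list of underlays."""
    return sum(len(local_underlays) * len(remote) for remote in peers.values())


bristol = ["MPLS", "broadband"]
hq = ["MPLS", "DIA"]
for local, remote in direct_tunnels(bristol, hq):
    print(f"Bristol {local} <-> HQ {remote}")

# Hypothetical estate: 50 Direct peers, each with two underlays.
peers = {f"site-{i}": ["MPLS", "broadband"] for i in range(50)}
print(tunnel_count(bristol, peers), "tunnels on the Bristol Edge")
```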
DMPO — the measurement loop
DMPO (Dynamic Multi-Path Optimisation) is the engine that turns a noisy underlay into a usable overlay. Two phases, both running continuously per tunnel:
Phase 1 — measurement
Every VCMP tunnel is being measured all the time, in two modes:
- Synthetic probes. When a tunnel is idle, the Edge sends timestamped VCMP probe packets — small, frequent (sub-second cadence by default) — and the far end echoes them back. Round-trip latency, jitter (variation in inter-arrival), and probe loss are computed locally on each Edge.
- Passive piggybacking. When the tunnel is carrying traffic, every data packet carries its own sequence number and timestamp in the VCMP header. The far end reports the observed inter-arrival statistics back over the control sub-channel. So a busy tunnel doesn’t need separate probes — its own traffic is the measurement.
This gives the Edge per-tunnel, per-underlay-path state at all times. The three numbers that drive everything else:
- Latency — one-way, computed from timestamp differences. Modest clock skew is tolerated: DMPO doesn’t need NTP-grade clocks because it tracks deltas.
- Jitter — variance of inter-arrival, computed over a sliding window.
- Loss — sequence-gap counted over a sliding window.
The Edge holds these as both instantaneous (last few seconds) and smoothed (last few minutes), and compares both to SLA thresholds defined per application class.
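A rough sketch of the sliding-window bookkeeping, to show how loss and jitter fall out of nothing more than sequence numbers and arrival times. The class name, the window size, and the use of standard deviation as the jitter statistic are assumptions for the example; the product's actual estimator isn't documented here.

```python
from collections import deque
from statistics import pstdev


class TunnelStats:
    """Sliding-window loss and jitter for one tunnel (illustrative only)."""

    def __init__(self, window: int = 100):
        # Each sample is (sequence number, arrival time in ms).
        self.samples = deque(maxlen=window)

    def record(self, seq: int, arrival_ms: float) -> None:
        self.samples.append((seq, arrival_ms))

    def loss(self) -> float:
        """Fraction of sequence numbers missing inside the window."""
        if len(self.samples) < 2:
            return 0.0
        seqs = {s for s, _ in self.samples}
        expected = max(seqs) - min(seqs) + 1
        return 1.0 - len(seqs) / expected

    def jitter_ms(self) -> float:
        """Spread (population std dev) of inter-arrival gaps, in ms."""
        times = [t for _, t in self.samples]
        gaps = [b - a for a, b in zip(times, times[1:])]
        return pstdev(gaps) if len(gaps) >= 2 else 0.0


stats = TunnelStats()
for seq, t in [(1, 0.0), (2, 20.0), (4, 60.0), (5, 80.0)]:  # seq 3 never arrived
    stats.record(seq, t)
print(f"loss={stats.loss():.0%} jitter={stats.jitter_ms():.1f}ms")
```

Keeping two instances per tunnel, one with a short window and one with a long window, would give the instantaneous and smoothed views described above.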
Phase 2 — selection and remediation
When a new flow lands and needs a tunnel, DMPO picks the best one — best meaning “passes the SLA for this flow’s class with the most margin”. An interactive voice flow has tight jitter and loss thresholds; a backup flow has loose latency and very loose jitter. Same Edge pair, same underlay options, different DMPO output.
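One way to read “most margin” is as the smallest fractional headroom across the three metrics. The sketch below implements that reading; the scoring formula, the class names, and every threshold in it are assumptions chosen to illustrate the idea, not DMPO's actual algorithm, and it ignores link preference from Business Policy.

```python
from dataclasses import dataclass


@dataclass
class Sla:
    latency_ms: float
    jitter_ms: float
    loss_pct: float


@dataclass
class Measured:
    name: str
    latency_ms: float
    jitter_ms: float
    loss_pct: float


def margin(m: Measured, sla: Sla) -> float:
    """Smallest fractional headroom over all metrics; negative means out of SLA."""
    return min(
        1 - m.latency_ms / sla.latency_ms,
        1 - m.jitter_ms / sla.jitter_ms,
        1 - m.loss_pct / sla.loss_pct,
    )


def pick_tunnel(tunnels, sla):
    """Return the in-SLA tunnel with the most margin, or None if none qualify."""
    in_sla = [t for t in tunnels if margin(t, sla) >= 0]
    return max(in_sla, key=lambda t: margin(t, sla), default=None)


paths = [
    Measured("path-a", latency_ms=5, jitter_ms=8, loss_pct=0.0),
    Measured("path-b", latency_ms=40, jitter_ms=1, loss_pct=0.1),
]
voice = Sla(latency_ms=150, jitter_ms=10, loss_pct=1.0)          # jitter-sensitive
transactional = Sla(latency_ms=50, jitter_ms=50, loss_pct=1.0)   # latency-sensitive

print("voice ->", pick_tunnel(paths, voice).name)
print("transactional ->", pick_tunnel(paths, transactional).name)
```

Same two paths, same measurements, and the two classes land on different tunnels because each class is squeezed by a different metric.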
Once the flow is established, DMPO continues to evaluate. If the chosen underlay degrades, DMPO does one of three things, in escalating order of cost:
1. Forward Error Correction (FEC)
For flows where loss is the worst symptom, DMPO can turn on FEC: for every N data packets, send a parity packet that lets the receiver reconstruct one lost packet out of every N+1. Costs ~1/N extra bandwidth. Used aggressively on voice / video where retransmits aren’t tolerable.
This is on-path — neither Edge needs to reorder the flow back to the LAN, because the FEC is constructed and consumed at the VCMP layer. The LAN-side packets come out in order.
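The “one parity packet per N data packets” idea can be shown with plain XOR parity. This is a sketch of the general technique, assuming equal-length packets; whether VCMP uses XOR or a different code is not stated here, and the function names are invented for the example.

```python
def xor_parity(packets: list[bytes]) -> bytes:
    """XOR equal-length packets together into one parity packet."""
    parity = bytearray(len(packets[0]))
    for pkt in packets:
        for i, b in enumerate(pkt):
            parity[i] ^= b
    return bytes(parity)


def recover(received: dict[int, bytes], parity: bytes, n: int) -> dict[int, bytes]:
    """Rebuild at most one missing packet in a block of n data packets."""
    missing = [i for i in range(n) if i not in received]
    if len(missing) != 1:
        return dict(received)  # nothing lost, or too much lost to repair
    repaired = dict(received)
    # XOR of the survivors and the parity packet equals the missing packet.
    repaired[missing[0]] = xor_parity(list(received.values()) + [parity])
    return repaired


block = [b"aaaa", b"bbbb", b"cccc", b"dddd"]   # N = 4 data packets
parity = xor_parity(block)                     # 1 extra packet: 25% overhead

arrived = {0: block[0], 1: block[1], 3: block[3]}   # packet 2 was lost
print(recover(arrived, parity, n=4)[2])
```

Lose two packets from the same block and this scheme can't help, which is why FEC suits light random loss rather than bursts.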
2. Packet duplication
For flows where any loss matters and there’s spare capacity on a second underlay, DMPO can send each packet on both underlays simultaneously. The receiving Edge sees two copies, forwards the first to arrive, and drops the second. This doubles the bandwidth cost but eliminates loss as long as at least one of the two underlays delivers each packet.
Duplication is brutal but effective; it’s the right answer for a small number of high-value flows (an interactive trading desk, a critical voice trunk). Don’t turn it on for everything.
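The receive side of duplication boils down to “first copy wins”, keyed on flow ID and sequence number. A minimal sketch, with an assumed class name and an arbitrary history size:

```python
class Deduplicator:
    """Forward the first copy of each (flow_id, seq); drop later copies."""

    def __init__(self, history: int = 1024):
        self.history = history
        self.seen: dict[int, set[int]] = {}

    def accept(self, flow_id: int, seq: int) -> bool:
        seen = self.seen.setdefault(flow_id, set())
        if seq in seen:
            return False  # second copy, arrived on the other underlay
        seen.add(seq)
        # Bound memory: forget sequence numbers far behind the newest one.
        if len(seen) > self.history:
            floor = max(seen) - self.history
            self.seen[flow_id] = {s for s in seen if s > floor}
        return True


dedup = Deduplicator()
arrivals = [("broadband", 1), ("mpls", 1), ("mpls", 2), ("broadband", 2)]
for underlay, seq in arrivals:
    verdict = "forward" if dedup.accept(flow_id=7, seq=seq) else "drop"
    print(f"seq {seq} via {underlay}: {verdict}")
```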
3. Path switching
If the underlay degrades past where FEC and duplication help, or if a better path becomes available, DMPO moves the flow to a different underlay tunnel mid-flow. The flow ID stays the same, the sequence numbers continue, the LAN sees no disruption. The new underlay carries the rest of the packets.
This is the headline DMPO feature and it works because the overlay flow is decoupled from the underlay tunnel. From the LAN’s point of view, there is one persistent overlay session. From DMPO’s point of view, that session has been migrated from broadband to MPLS without the LAN noticing.
Jitter buffer
A small but important piece: the receiving Edge has a per-flow jitter buffer for high-priority flows. Out-of-order arrivals — which happen naturally when duplication is on, or when a path switch occurs and a few in-flight packets arrive on the old path after newer packets have arrived on the new path — are reordered before being handed to the LAN. The buffer is sized small (single-digit ms typical) so it doesn’t add user-perceptible latency to the flow.
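A reorder buffer of this kind can be sketched with a heap: release packets in sequence order, and give up waiting for a gap once the packet behind it has been held for the buffer's maximum wait. The class, the 5 ms default, and the drop-late-packets behaviour are assumptions for illustration.

```python
import heapq


class ReorderBuffer:
    """Per-flow reordering with a bounded wait (illustrative only)."""

    def __init__(self, max_wait_ms: float = 5.0, first_seq: int = 0):
        self.max_wait_ms = max_wait_ms
        self.next_seq = first_seq
        self.heap = []  # entries are (seq, arrival_ms, payload)

    def push(self, seq: int, arrival_ms: float, payload: str) -> None:
        if seq >= self.next_seq:  # packets behind the cursor are too late
            heapq.heappush(self.heap, (seq, arrival_ms, payload))

    def pop_ready(self, now_ms: float) -> list:
        out = []
        while self.heap:
            seq, arrival_ms, payload = self.heap[0]
            if seq < self.next_seq:  # duplicate of something already released
                heapq.heappop(self.heap)
                continue
            gap_expired = now_ms - arrival_ms >= self.max_wait_ms
            if seq != self.next_seq and not gap_expired:
                break  # still waiting for the missing packet
            heapq.heappop(self.heap)
            out.append(payload)
            self.next_seq = seq + 1
        return out


buf = ReorderBuffer(max_wait_ms=5.0, first_seq=1)
buf.push(2, 0.0, "pkt-2")        # arrives first, on the new path
print(buf.pop_ready(1.0))        # held back: seq 1 is still missing
buf.push(1, 2.0, "pkt-1")        # straggler from the old path
print(buf.pop_ready(2.0))        # both released, in send order
```

The wait bound is the trade-off: a longer wait repairs more reordering but adds that much latency to every packet stuck behind a gap.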
Business Policy — the per-flow ruleset
DMPO is the engine; Business Policy is the configuration that tells it what to do for which flows.
Each policy rule has, roughly:
- Match criteria — application (DPI), 5-tuple, user/group, source/destination prefix, segment, time-of-day.
- Priority class — High / Normal / Low. Drives queuing and DMPO’s SLA targeting.
- Service definition — must traverse a specific overlay path (e.g. always go to HQ via Hub), or specific service insertion (e.g. always egress via Cloud Web Security).
- Link selection — preferred underlay, mandatory underlay, backup underlay. “Voice on MPLS only”, “Bulk on broadband first, MPLS only if broadband down”, etc.
- Remediation profile — when degradation is detected, which remediations are allowed. Voice: FEC + duplication. Backup: nothing.
Rules are ordered, first-match-wins. The Edge classifies the first packet of every flow, picks the matching rule, and the rest of the flow follows that rule’s path.
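First-match-wins plus “the rest of the flow follows the rule” is a rule walk in front of a flow table. The sketch below models that; the rule fields loosely mirror the list above, and the names, match keys, and example rules are all hypothetical rather than the Orchestrator's schema.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]
    priority: str
    links: list[str]
    remediation: list[str]


def classify(flow: dict, rules: list[Rule]) -> Optional[Rule]:
    """Walk the ordered rule list and return the first rule that matches."""
    for rule in rules:
        if rule.matches(flow):
            return rule
    return None


rules = [
    Rule("voice", lambda f: f.get("app") == "teams-audio", "High",
         ["broadband", "mpls"], ["fec", "duplication"]),
    Rule("backup", lambda f: f.get("dst_port") == 873, "Low",
         ["broadband"], []),
    Rule("default", lambda f: True, "Normal", ["broadband", "mpls"], ["fec"]),
]

flow_table: dict[tuple, Optional[Rule]] = {}


def rule_for_packet(five_tuple: tuple, flow: dict) -> Optional[Rule]:
    """Classify on the first packet of a flow; later packets reuse the result."""
    if five_tuple not in flow_table:
        flow_table[five_tuple] = classify(flow, rules)
    return flow_table[five_tuple]


key = ("10.0.0.5", 50000, "52.1.1.1", 3478, "udp")
print(rule_for_packet(key, {"app": "teams-audio"}).name)
print(rule_for_packet(key, {}).name)  # same flow, so same rule
```

Because the walk stops at the first hit, a broad rule placed above a narrow one silently shadows it, so order the specific rules first.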
Two things people get bitten by:
- Application classification needs a few packets. DPI can’t identify an application from a TCP SYN, so the first 2–3 packets of a new flow might be classified differently from the rest. This is fine for almost everything; it matters for very short connections (sub-second HTTPS) where the entire flow is over before DPI converges. There’s a per-policy knob, “early classification by destination”, to handle this.
- Encrypted traffic. TLS 1.3, QUIC, and ECH have eroded the visibility DPI used to have. The product compensates with destination-based classification (known SaaS IP ranges → known app) and SNI sniffing where ECH isn’t in play. If you assume the DPI tells you what’s in every QUIC stream, you’ll be disappointed. We talk about this more in Part 5.
Per-flow path selection — worked example
Bristol Edge has:
- MPLS underlay (50 Mbps, 5ms RTT to HQ).
- Broadband underlay (200 Mbps, 12ms RTT to HQ, occasional jitter spikes).
- Direct VCMP tunnels to HQ on each underlay — so two Direct tunnels exist.
A flow opens for Teams audio. Business Policy:
- Match: application = Microsoft Teams (Audio sub-classifier), Voice priority class.
- Link selection: prefer broadband (cheap), require jitter < 15ms.
- Remediation: FEC, allow duplication if jitter exceeds threshold for >2s.
DMPO state at flow open:
- MPLS: 5ms RTT, 0.5ms jitter, 0% loss. In SLA.
- Broadband: 12ms RTT, 4ms jitter, 0.1% loss. In SLA.
Decision: use broadband. Both underlays are in SLA, so the policy’s link preference (broadband, the cheaper one) decides.
Mid-call, broadband jitter spikes to 25ms (a neighbour torrents something). DMPO sees broadband fall out of SLA. The remediation policy allows duplication once jitter has exceeded the threshold for more than 2s, so when the spike persists that long DMPO starts duplicating every packet onto MPLS as well. From that packet onwards, every audio packet is on both paths. The receiving Edge forwards whichever copy arrives first to the LAN and drops the other. The call doesn’t notice.
Two seconds later, broadband is still bad. DMPO switches the flow to MPLS only. The duplicate-on-MPLS is now the primary; broadband is dropped from the flow. Bandwidth cost returns to 1×.
Three seconds after that, broadband recovers. DMPO does not immediately switch back — there’s hysteresis in the path-selection logic specifically to avoid flapping. It waits until broadband has been in SLA for a longer window before considering it eligible again.
The LAN never sees a packet loss, a re-INVITE, or a codec downgrade.
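The hold-down at the end of that sequence is ordinary hysteresis: fall out of eligibility immediately, earn it back slowly. A minimal sketch, assuming a single hold timer; the class name and the 30-second value are invented for the example, not product defaults.

```python
from typing import Optional


class PathEligibility:
    """A path loses eligibility as soon as it violates SLA, but must stay
    in SLA for hold_s seconds before it is eligible again."""

    def __init__(self, hold_s: float = 30.0):
        self.hold_s = hold_s
        self.healthy_since: Optional[float] = 0.0  # None while out of SLA
        self.eligible = True

    def update(self, now_s: float, in_sla: bool) -> bool:
        if not in_sla:
            self.healthy_since = None
            self.eligible = False
        else:
            if self.healthy_since is None:
                self.healthy_since = now_s  # recovery starts counting here
            if now_s - self.healthy_since >= self.hold_s:
                self.eligible = True
        return self.eligible


broadband = PathEligibility(hold_s=30.0)
timeline = [(0, True), (10, False), (15, True), (44, True), (45, True)]
for t, in_sla in timeline:
    print(f"t={t:>2}s in_sla={in_sla!s:<5} eligible={broadband.update(t, in_sla)}")
```

The asymmetry is deliberate: leaving a bad path quickly protects the call, while returning slowly stops a marginal link from dragging the flow back and forth.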
Symmetric vs. asymmetric flow handling
A common worry: if DMPO at the Bristol end picks MPLS, but DMPO at the HQ end picks broadband for the return, won’t asymmetry break things?
In Direct mode, no, because both Edges share the flow ID and flow-affinity info via the VCMP control sub-channel. The HQ Edge sees Bristol’s outbound traffic on the MPLS tunnel, learns the flow’s preferred underlay, and pins its return packets to that same tunnel unless its own DMPO state forces it onto a different one. Asymmetry is allowed but discouraged at flow level.
In Cloud Gateway mode, the Gateway sees both halves and enforces flow symmetry by routing the return through the same tunnel pair. This is one of the points where Gateway-mediated has an edge over Direct for tricky environments.
In Partner Gateway mode, same as Cloud Gateway — the Partner Gateway sits on both halves of the flow.
Latency, jitter, and the wide-area question
This series exists in part because the Edge being smart doesn’t help if the underlay between Edge and Gateway is the bottleneck. DMPO measures and reacts to the Edge ↔ peer path — which, for a flow going Edge → Gateway → Edge, is two paths chained, and remediation happens on each leg independently. Loss on the Edge → Gateway leg is fixed at that leg; loss on the Gateway → Edge leg is fixed at that leg; the two halves don’t have a unified end-to-end remediator.
For most deployments this is fine. For pathological cases — Edge in China going via a Gateway in the UK because that’s where the customer’s Gateway pool lives — it isn’t. We come back to this hard in Part 4.
What’s next
You now have the full picture: Edges advertise (Part 2), the Gateway reflects (Part 2), Edges select an overlay path (Part 2), DMPO selects an underlay path (this post), VCMP encapsulates (this post), and remediation happens in-flight (this post).
Part 4 puts it all on the table and walks through the actual GlobalCo packet flows from Part 1’s scenario — including the awkward ones where it really shouldn’t work but it does, and the ones where it really shouldn’t work and it doesn’t.