FortiOS 7.6.6 SD-WAN: VRF1 Transport and Loopback Design
A previous post on this site covered the basic three-VRF SD-WAN pattern with the underlay in VRF 0. That layout works, but on a multi-tenant or audit-bound estate it conflates the system plane (FortiGuard, FortiCloud, NTP, DNS, FortiManager registration) with the transport plane. Both live in VRF 0 by default and there’s nothing structural separating them.
This post refines the design for FortiOS 7.6.6, follows current Fortinet documentation and best-practice guidance, and addresses three questions that came up in conversation:
- Should the transport live in VRF 1 instead of VRF 0?
- Do we need separate management and BGP loopbacks, and where do they live?
- Do we need NPU-VLINKs?
The refined reference design
Same three customer VRFs, but VRF 0 becomes a system-only VRF and VRF 1 becomes the transport.
| VRF | Purpose | Notes |
|---|---|---|
| VRF 0 | System plane only — HA heartbeat, FortiCloud, central-management default routes | Cannot be eliminated. HA heartbeat must remain here. |
| VRF 1 | Transport / underlay — WAN, IPsec tunnels, BGP loopback, default route to ISP | Pinned source for FortiGuard, NTP, DNS, syslog. |
| VRF 20 | Out-of-band management — FortiAnalyzer, FortiManager (data path), SNMP | Has its own loopback. |
| VRF 30 | Customer SD-WAN — LAN, servers, voice | Has its own loopback for in-VRF service sourcing. |
| VRF 99 | Guest Wi-Fi | DIA via VRF 1 with NAT. No loopback. |
Everything that used to default into VRF 0 must now be explicitly pinned to VRF 1 or whichever customer VRF needs it. That’s the cost. The benefit is that any service you forget to pin has no route out of VRF 0 and fails outright instead of quietly sharing the transport’s table, so a per-service check during commissioning catches it on day one rather than in year one.
Loopback design — and the answer to “do we need both?”
Yes, you need separate loopbacks. The MGMT loopback and the BGP loopback live in different VRFs and serve different purposes. Combining them defeats the design.
Best practice on FortiOS 7.6.6 for an MP-BGP/VRF SD-WAN is one loopback per VRF that has a control or management role:
| Loopback | VRF | Purpose |
|---|---|---|
| lo-transport | 1 | iBGP update-source, BGP router-id, ADVPN local-id, source for FortiGuard / NTP / DNS / external-resource fetches |
| lo-mgmt | 20 | Source-IP for FortiAnalyzer logs, FortiManager management traffic, SNMP traps, syslog destined for the management network |
| lo-cust | 30 | Source for in-VRF probes, NetFlow exporter, application telemetry that should appear from the customer VRF |
| (none) | 99 | Guest VRF has no control plane; it’s pure NAT egress |
Why separate:
- Different reachability scope. lo-transport only needs to be reachable inside the underlay; you don’t advertise it via MP-BGP into customer VRFs. lo-mgmt is advertised into VRF 20 with RT 65000:20 and reachable from FortiManager / FortiAnalyzer at the hub.
- Different RT membership. The BGP loopback isn’t a customer prefix. If you put it in VRF 30 to “share” it, you’ve leaked your control-plane address into a customer RIB.
- Different failure semantics. If an operator drops VRF 30 to debug a customer issue, your iBGP session shouldn’t go down with it. Putting BGP in VRF 1 isolates the control plane from per-tenant churn.
- Diagnostic clarity. get router info bgp summary shows VRF 1 sessions; exec ping vrf 20 <faz-ip> proves the management path. They’re testable independently.
The mistake to avoid is the “single loopback for everything” pattern that’s fine on a small-office FortiGate but fragments under any kind of segmentation requirement.
Configuration walkthrough
1. Transport VRF (VRF 1)
WAN, transport loopback, IPsec tunnels, default route — all in VRF 1.
config system interface
edit "wan1"
set vrf 1
set role wan
set ip 203.0.113.10 255.255.255.0
next
edit "lo-transport"
set vrf 1
set type loopback
set ip 10.255.1.11 255.255.255.255
set allowaccess ping
next
edit "advpn-hub"
set vrf 1
set ip 10.200.0.11 255.255.255.255
set remote-ip 10.200.0.1 255.255.255.255
set interface "wan1"
set type tunnel
next
end
config router static
edit 1
set dst 0.0.0.0 0.0.0.0
set gateway 203.0.113.1
set device "wan1"
set vrf 1
next
end
The default route to the ISP is now in VRF 1’s RIB. Any service that wants to reach the internet must source itself from a VRF-1 address — which is the next section.
2. Pin every system-plane service to VRF 1
This is the part most operators get wrong on day one. FortiOS 7.6.6 supports set vrf-select on the management-plane services that need it, plus the older set source-ip / set interface-select-method specify knobs on everything else. The minimum complete set:
config system dns
set primary 1.1.1.1
set secondary 9.9.9.9
set source-ip 10.255.1.11
set interface-select-method specify
set interface "lo-transport"
end
config system ntp
set ntpsync enable
set source-ip 10.255.1.11
set interface-select-method specify
set interface "lo-transport"
config ntpserver
edit 1
set server "uk.pool.ntp.org"
next
end
end
config system fortiguard
set source-ip 10.255.1.11
set interface-select-method specify
set interface "lo-transport"
set vrf-select 1
end
config system central-management
set type fortimanager
set fmg "10.20.10.10"
set vrf-select 1
end
config log fortianalyzer setting
set status enable
set server "10.20.10.20"
# lo-mgmt address in VRF 20 (see below)
set source-ip 10.20.10.1
set interface-select-method specify
set interface "lo-mgmt"
end
config system snmp sysinfo
set status enable
end
config system snmp community
edit 1
set name "ro"
config hosts
edit 1
set ip 10.20.10.30 255.255.255.255
set source-ip 10.20.10.1
set interface-select-method specify
set interface "lo-mgmt"
next
end
next
end
Note the asymmetry: FortiGuard / NTP / DNS source from VRF 1 (they go to the public internet), but FortiAnalyzer and SNMP source from VRF 20 (they go to internal management subnets reachable through MP-BGP RT 65000:20). FortiManager registration goes via VRF 1 if your FortiManager is internet-reachable, or via the management VRF if it’s on-prem — pick one and pin accordingly.
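Syslog is named in the design tables but not shown in the block above. A minimal sketch of the same source-ip + interface-select-method pattern applied to it, assuming a syslog collector inside the management network: the server address 10.20.10.40 is a placeholder that does not appear elsewhere in this design, and the lines starting with # are annotations to strip before pasting if your CLI session rejects them.

```
# Hypothetical syslog collector in the management network (VRF 20).
# 10.20.10.40 is an assumed address; substitute your own.
config log syslogd setting
    set status enable
    set server "10.20.10.40"
    set source-ip 10.20.10.1
    set interface-select-method specify
    set interface "lo-mgmt"
end
```

If the collector sits on the public internet instead, the same three pinning lines would point at 10.255.1.11 and lo-transport, mirroring the DNS and NTP blocks.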
3. Management VRF (VRF 20) loopback
config system interface
edit "mgmt"
set vrf 20
set ip 172.20.10.1 255.255.255.0
set allowaccess ping ssh https
next
edit "lo-mgmt"
set vrf 20
set type loopback
set ip 10.20.10.1 255.255.255.255
set allowaccess ping
next
end
lo-mgmt is what FortiAnalyzer sees as the source IP. It’s stable across physical-interface changes and it’s in the management VRF, so MP-BGP advertises it cleanly with RT 65000:20.
4. Customer VRFs (30 and 99)
config system interface
edit "lan-vlan30"
set vrf 30
set ip 10.30.10.1 255.255.255.0
next
edit "lo-cust"
set vrf 30
set type loopback
set ip 10.30.0.1 255.255.255.255
next
edit "guest-vlan99"
set vrf 99
set ip 192.168.99.1 255.255.255.0
next
end
5. iBGP and MP-BGP — sourced from VRF 1
config router bgp
set as 65000
set router-id 10.255.1.11
set ibgp-multipath enable
config neighbor
edit "10.255.1.1"
set remote-as 65000
set update-source "lo-transport"
set ebgp-enforce-multihop enable
set soft-reconfiguration enable
set capability-graceful-restart enable
set additional-path receive
set advertisement-interval 1
set address-family vpnv4
next
end
config vrf
edit "20"
set role ce
set rd "65000:20"
set export-rt "65000:20"
set import-rt "65000:20"
next
edit "30"
set role ce
set rd "65000:30"
set export-rt "65000:30"
set import-rt "65000:30"
next
edit "99"
set role ce
set rd "65000:99"
set export-rt "65000:99"
set import-rt "65000:99"
next
end
config network
edit 1
set prefix 10.20.10.0 255.255.255.0
set vrf 20
next
edit 2
set prefix 172.20.10.0 255.255.255.0
set vrf 20
next
edit 3
set prefix 10.30.10.0 255.255.255.0
set vrf 30
next
edit 4
set prefix 10.30.0.1 255.255.255.255
set vrf 30
next
end
end
The iBGP session is between lo-transport (VRF 1) at the branch and the equivalent loopback at the hub. Customer VRFs ride on top via the VPNv4 address family. The router-id is the VRF 1 loopback — keep it consistent across reboots.
6. Guest DIA — leak from VRF 99 to VRF 1
config router static
edit 99
set dst 0.0.0.0 0.0.0.0
# ISP next-hop, resolved in VRF 1
set gateway 203.0.113.1
set device "wan1"
set vrf 99
set dstvrf 1
next
end
config firewall policy
edit 100
set name "guest-dia"
set srcintf "guest-vlan99"
set dstintf "wan1"
set srcaddr "guest-net"
set dstaddr "all"
set action accept
set schedule "always"
set service "DNS" "HTTP" "HTTPS" "PING"
set nat enable
set utm-status enable
set ssl-ssh-profile "certificate-inspection"
set webfilter-profile "guest-webfilter"
set logtraffic all
next
end
dstvrf 1 rather than dstvrf 0 is the only line that actually changes from the VRF-0-underlay version. NAT remains mandatory.
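The guest-dia policy references an address object named guest-net that this post never defines. A minimal sketch, assuming it simply covers the guest VLAN subnet configured on guest-vlan99 in section 4:

```
# Assumption: guest-net is the whole 192.168.99.0/24 guest subnet.
config firewall address
    edit "guest-net"
        set subnet 192.168.99.0 255.255.255.0
    next
end
```

The guest-webfilter profile referenced by the same policy is site-specific and is left out here.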
NPU-VLINKs — when you need them, when you don’t
set dstvrf performs inter-VRF lookup in software. On NP6 / NP7 hardware that’s a CPU cost, and at scale (large guest populations, high-throughput shared-services flows) it shows up as elevated CPU and reduced session-setup rate. The Fortinet-recommended way to keep inter-VRF traffic on the NPU fast path is NPU-VLINKs — a pair of virtual interfaces (npu0_vlink0 / npu0_vlink1, or per-NP-instance equivalents) that exist as a hardware-internal bridge between two VRFs.
You configure them as a normal pair, drop each end in a different VRF, and route through them:
config system interface
edit "npu0_vlink0"
set vdom "root"
set vrf 1
set ip 169.254.99.1 255.255.255.252
next
edit "npu0_vlink1"
set vdom "root"
set vrf 99
set ip 169.254.99.2 255.255.255.252
next
end
config router static
edit 99
set dst 0.0.0.0 0.0.0.0
set gateway 169.254.99.1
set device "npu0_vlink1"
set vrf 99
next
end
Then the policy from guest-vlan99 egresses to npu0_vlink1 and a second policy carries it from npu0_vlink0 out wan1 with NAT. The two halves of the policy lookup happen in separate VRFs, but the actual packet handoff is offloaded to the NPU rather than punted to CPU.
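The two-policy handoff described above might look like the following sketch. The policy IDs (101, 102), the policy names, and the static-route ID (199) are placeholders, and guest-net is assumed to cover 192.168.99.0/24. Because NAT is applied only on the second policy, traffic crosses the vlink with its original guest source address, so VRF 1 needs a return route to the guest subnet through npu0_vlink0; that route is an assumption of this sketch and is not part of the configuration shown earlier.

```
config firewall policy
    # First half: guest VLAN into the VRF 99 end of the vlink, no NAT.
    edit 101
        set name "guest-to-vlink"
        set srcintf "guest-vlan99"
        set dstintf "npu0_vlink1"
        set srcaddr "guest-net"
        set dstaddr "all"
        set action accept
        set schedule "always"
        set service "DNS" "HTTP" "HTTPS" "PING"
        set logtraffic all
    next
    # Second half: VRF 1 end of the vlink out to the ISP, with NAT.
    edit 102
        set name "vlink-to-wan"
        set srcintf "npu0_vlink0"
        set dstintf "wan1"
        set srcaddr "guest-net"
        set dstaddr "all"
        set action accept
        set schedule "always"
        set service "DNS" "HTTP" "HTTPS" "PING"
        set nat enable
        set logtraffic all
    next
end
config router static
    # Return path: guest subnet reachable from VRF 1 via the vlink peer.
    edit 199
        set dst 192.168.99.0 255.255.255.0
        set gateway 169.254.99.2
        set device "npu0_vlink0"
    next
end
```

These two policies replace the single guest-dia policy from section 6. If you carry its UTM profiles over, attach them to one half only so the traffic is inspected once; as noted in the next section, an inspected flow will not stay on the NPU fast path either way.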
Do you need them?
Use NPU-VLINKs when:
- The platform has an NP6 or NP7 (most hardware FortiGates from 100F upward; check get hardware npu np6 status or get hardware npu np7 status).
- Inter-VRF flow rates are non-trivial: guest Wi-Fi at scale, shared-services VRFs, transit-VRF designs.
- You’re chasing CPU headroom on a busy box.
- Throughput SLAs depend on the inter-VRF path staying offloaded.
Skip NPU-VLINKs when:
- The platform is a VM (FortiGate-VM has no NPU).
- The inter-VRF path is subject to UTM (web filter, IPS, AV). UTM forces the session to CPU regardless of NPU-VLINK; you gain nothing from the offload.
- Volumes are low (small branch with a handful of guests).
- The simpler dstvrf static route is sufficient and the operational cost of two extra interfaces, two extra IPs, and two extra policies isn’t justified.
For the guest DIA case specifically, the realistic answer is: if you’re applying web filtering to guest traffic (you should be), the UTM cost dominates and NPU-VLINK adds complexity without much performance benefit. For a transit VRF or a shared-services VRF carrying clean L4 traffic, NPU-VLINK is worth the wiring.
Verification
# Use np7 in place of np6 on newer hardware
get hardware npu np6 status
diagnose sys session list | grep npu
# Per-NPU session offload stats
diagnose npu np6 sse-stats
If npu_state on a session shows the offload bit set after it’s been established for a few seconds, the NPU is handling it. If it stays in slow-path, something (UTM, fragmentation, asymmetric routing) is keeping it on the CPU.
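On a busy box the unfiltered session list is too long to read. One way to narrow it to a single test flow is the session filter, sketched here with a hypothetical guest client at 192.168.99.10 making an HTTPS connection; the address and port are assumptions for illustration only.

```
# Assumption: 192.168.99.10 is a test client in the guest VLAN.
diagnose sys session filter clear
diagnose sys session filter src 192.168.99.10
diagnose sys session filter dport 443
diagnose sys session list
# Clear the filter afterwards so later commands see all sessions.
diagnose sys session filter clear
```

In the output, look at the npu_state and npu info lines for the matching session. The exact fields vary by platform and build, so treat them as a guide rather than a fixed format.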
Updated traffic flows
The flow logic from the earlier post still holds; the only mechanical change is that “VRF 0” becomes “VRF 1” everywhere except for HA heartbeat and FortiCloud auto-registration.
Management to FortiAnalyzer: Branch sources from lo-mgmt (10.20.10.1, VRF 20) → BGP best-path in VRF 20 → next-hop lo-transport at hub → recursive lookup in VRF 1 → IPsec encap → wan1 → ISP → hub decap → hub VRF 1 → MP-BGP RT 65000:20 import → hub VRF 20 → FortiAnalyzer.
Customer SD-WAN flow: unchanged conceptually; tunnels and SD-WAN health-checks now live in VRF 1 instead of VRF 0.
Guest DIA: unchanged in shape; dstvrf 1 instead of dstvrf 0. With NPU-VLINK, replace the direct dstvrf leak with the vlink pair.
Updated gotchas
The list from the earlier post still applies. Add these to it for FortiOS 7.6.6 with a VRF-1 transport:
1. HA heartbeat must stay in VRF 0. This is non-negotiable on 7.6.6 — hbdev and the FGCP heartbeat traffic do not honour set vrf on the heartbeat interface. Keep at least one dedicated heartbeat link untouched in VRF 0. If you re-VRF the heartbeat interface, the cluster splits at the next failover.
2. FortiCloud auto-registration sources from VRF 0. Even with vrf-select, the initial FortiCloud onboarding traffic uses VRF 0. Either give VRF 0 a route to the internet (less clean, but matches Fortinet’s default expectation) or pre-register the device against FortiCloud out-of-band before locking VRF 0 down.
3. vrf-select is not on every service. In 7.6.6, FortiGuard, central-management, and a few others have a clean set vrf-select N knob. Logging, SNMP, syslog still rely on the older source-ip + interface-select-method specify pattern. Don’t assume vrf-select exists on every service — check per-service in the CLI before relying on it.
4. Default-route ordering. With a default in VRF 1 and (intentionally) no default in VRF 0, anything that ends up sourcing from VRF 0 by accident becomes immediately broken. That’s actually the design goal — but it bites during initial commissioning when one knob isn’t pinned yet. Build the box, pin every service, then remove the VRF 0 default if it was there for bootstrapping.
5. Loopback allowaccess still applies. A loopback with allowaccess ssh https is reachable from anywhere that has a route to its address. If lo-mgmt is in VRF 20 and VRF 30 has no route to it, you’re safe; if you accidentally leak the route, you’ve just exposed your management plane. Keep RT imports tight.
6. ADVPN shortcut tunnels and VRFs. ADVPN 2.0 (in 7.6) creates branch-to-branch shortcuts dynamically. The shortcut tunnel inherits the VRF of its parent overlay — VRF 1 in this design. Any branch that puts the parent in a different VRF won’t form shortcuts with this one. Standardise the transport VRF number across the fleet.
7. set vrf on a tunnel interface is sticky. Changing the VRF of an active tunnel interface tears the SA down and requires a re-key. Plan VRF assignments before turning the tunnel up.
8. SD-WAN zones don’t span VRFs cleanly. Zone members should all be in the same VRF — VRF 1 in this design. If you mix zone members across VRFs, route lookup gets unpredictable and SLA logic breaks.
9. NPU-VLINK MTU. The default MTU on NPU-VLINK pairs is 1500. If your guest path is fragmenting at the WAN edge (PPPoE, GRE-encapsulated transport), set MTU explicitly on both ends of the vlink pair to match the eventual WAN MTU minus overhead. Mismatched MTU on the vlink pair is a silent fragmentation source.
10. set vrf-select and policy-route interaction. A policy-route with set vrf overrides the routing-table lookup. If a policy-route sends a packet to VRF 1 but the source service was bound to VRF 20 via vrf-select, you’ll get unexpected behaviour. Avoid mixing the two for the same flow.
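For gotcha 9, a sketch of setting the MTU explicitly on both ends of the vlink pair. It assumes a PPPoE WAN, where the usable MTU is 1500 minus 8 bytes of PPPoE overhead, giving 1492; other transports need their own arithmetic. It also assumes the vlink interfaces on your platform accept mtu-override in the same way a physical interface does.

```
# Assumption: PPPoE WAN, so 1500 - 8 = 1492 on both vlink ends.
config system interface
    edit "npu0_vlink0"
        set mtu-override enable
        set mtu 1492
    next
    edit "npu0_vlink1"
        set mtu-override enable
        set mtu 1492
    next
end
```

Whatever value you settle on, apply it to both ends in the same change window so the pair never runs mismatched.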
Verification commands worth memorising (updated)
# Per-VRF state
get router info routing-table vrf 1 all
get router info routing-table vrf 20 all
get router info routing-table vrf 99 all
# BGP control plane
get router info bgp summary
get router info bgp vpnv4-unicast all
get router info bgp vpnv4-unicast neighbors 10.255.1.1 advertised-routes
# Management-plane source verification
# Expect the ping to source from 10.255.1.11
exec ping-options interface "lo-transport"
exec ping pool.ntp.org
# FortiAnalyzer: expect the ping to source from 10.20.10.1
exec ping-options interface "lo-mgmt"
exec ping 10.20.10.20
# FortiGuard / cloud reachability
diagnose debug application update -1
diagnose test application fgtlog 1
get system fortiguard
exec central-mgmt status
# NPU offload
get hardware npu np7 status
diagnose npu np7 sse-stats
diagnose sys session list | grep -E "npu|state"
# Inter-VRF leak
get router info routing-table details 0.0.0.0
diagnose ip rtcache list
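A ping proves reachability, but it does not prove which source address the real service uses. A packet capture on the service's own traffic does. The sketch below assumes FortiAnalyzer logging on port 514 and SNMP traps on UDP 162, which are the usual defaults; adjust the ports if your deployment differs.

```
# Verbosity 4 prints packet headers with the interface name;
# the trailing 10 stops the capture after ten packets.
# FortiAnalyzer logging: expect source 10.20.10.1
diagnose sniffer packet any 'host 10.20.10.20 and port 514' 4 10
# SNMP traps: expect source 10.20.10.1
diagnose sniffer packet any 'host 10.20.10.30 and udp port 162' 4 10
```

If the capture shows a physical interface address instead of the loopback, that service is not pinned yet.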
Migration notes from a VRF 0 underlay
If you’re moving an existing fleet from VRF 0 underlay to VRF 1, do it greenfield-per-site, not in-place. The reason is the day-one silent-failure problem: any system-plane service you forget to pin keeps using VRF 0 with no route to the internet, and the failure modes (FortiGuard signature staleness, log loss, NTP drift) are slow-burning rather than immediate.
A migration sequence that’s worked for me:
- On a lab device, build the full VRF-1 config and verify every service from the pinning list above (exec ping-options for each).
- Cut a single low-stakes branch over and let it run for a week. Monitor FortiAnalyzer for log gaps and FortiGuard contract status daily.
- Roll out per region after that.
- Don’t remove the VRF 0 default route until every service is verified pinned. It’s a cheap safety net.
The end state is what you wanted: VRF 0 holds nothing but HA heartbeat and FortiCloud bootstrapping, every other plane is explicitly assigned, and the audit conversation about “where does management traffic go?” has a one-sentence answer.
Closing
The single-line summary: VRF 1 transport, separate transport and management loopbacks, every system-plane service explicitly pinned, NPU-VLINKs only where the offload pays for the extra config. That’s the FortiOS 7.6.6 best-practice shape, and it scales without re-design from a single hub-and-spoke pair to a multi-tenant managed-service deployment.
The earlier post in this series covers the underlying MP-BGP + VRF mechanics if you want the foundational read first.