A Day in the Life of a Packet on a 50G FortiGate, Part 3: Routing, Policy Routes, and SD-WAN Service Rules

Recap. In Part 2 the packet survived RPF, the DoS sensor, and IP integrity, and arrived at the kernel’s routing decision as a brand-new flow with a half-built session entry. Today we work out where the packet is going.

The myth: “the FortiGate looks at the routing table.” The reality: the FortiGate looks at several tables in a strict order, and only some of them are what you’d traditionally call routing tables. If you skip past this and assume the FIB is the source of truth, you will spend hours debugging an SD-WAN rule you forgot you wrote.

The order:

  1. Policy routes (PBR) — manually configured policy-based routing.
  2. SD-WAN service rules — application-aware, SLA-aware path selection. Rendered into the kernel as policy routes with a special index range.
  3. FIB lookup — the actual routing table.

Each is checked top-down, first match wins, with fall-through to the next. If nothing matches at any layer, the packet is dropped as unroutable.
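
To make the ordering concrete, here is a toy model in Python. It is a sketch only: the three lookup functions are hypothetical stand-ins for the real tables, and the match conditions inside them are invented for illustration. The only thing it is meant to show is the first-match-wins, fall-through shape of the decision.

```python
def policy_routes(pkt):
    # Hypothetical hand-written PBR: HTTPS from 10.10.10.0/24 goes out port4.
    if pkt["src"].startswith("10.10.10.") and pkt["dport"] == 443:
        return "port4"
    return None

def sdwan_rules(pkt):
    # Hypothetical service rule keyed on an already-classified application.
    if pkt.get("app") == "Microsoft 365":
        return "port5"
    return None

def fib(pkt):
    # Hypothetical FIB holding a single default route.
    return "port4"

STAGES = [("policy-route", policy_routes), ("sd-wan", sdwan_rules), ("fib", fib)]

def decide(pkt, stages=STAGES):
    """Walk the stages in order; the first one that returns an egress wins."""
    for name, lookup in stages:
        egress = lookup(pkt)
        if egress is not None:
            return name, egress
    return "none", None  # nothing matched anywhere: unroutable

print(decide({"src": "10.10.10.50", "dport": 443}))
print(decide({"src": "10.20.0.9", "dport": 443, "app": "Microsoft 365"}))
print(decide({"src": "10.20.0.9", "dport": 53}))
```

The first packet never reaches the SD-WAN stage, which is exactly the behaviour that makes a forgotten policy route so hard to spot.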

This is also the most CLI-rich part of the journey, so I’ll lean heavily on commands.

Stage 1: Policy routes

A policy route is a manually configured rule that says “if a packet has these characteristics, send it via this gateway out this interface, ignoring whatever the FIB would have said.”

config router policy
    edit 1
        set input-device "port3"
        set src "10.10.10.0/24"
        set dst "0.0.0.0/0"
        set protocol 6
        set start-port 443
        set end-port 443
        set gateway 203.0.113.1
        set output-device "port4"
    next
end

Match criteria are fewer than firewall policies — input interface, source, destination, protocol, port range, ToS — but they trump everything else.

To inspect:

diagnose firewall proute list
diagnose firewall proute6 list
get router info routing-table details

diagnose firewall proute list shows everything in the policy-route table — your hand-written rules plus the SD-WAN service rules that the SD-WAN engine has compiled into policy-route form. SD-WAN entries have IDs like 4294967294 (or 0xFFFFFFFE) descending — they live at the high end of the index space, so your hand-written rules win unless you’re very deliberate.

A useful query:

diagnose firewall proute list | grep -A5 'iif='

To trace why a particular flow took (or didn’t take) a policy route, use diag debug flow show iprope enable and watch:

msg="Match policy routing id=1: to 203.0.113.1 via ifindex-7"

or:

msg="No policy route match"

Policy routes accept a gateway of 0.0.0.0 with output-device set; that means “use this output interface but resolve next-hop via FIB on that interface.” Useful when you want to force traffic out a specific egress without hard-coding the next hop.

A lesser-known feature: a policy route’s action can be permit (default) or deny. deny means “ignore this PBR for this match and fall through to the next layer” — i.e. carve a hole in the PBR.
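
The permit, deny, and gateway 0.0.0.0 behaviours are easier to see side by side. The sketch below models them in Python under some assumptions of mine: the PolicyRoute class and its field names are hypothetical, the table is evaluated in list order, and a result of None stands for "fall through to the next layer". It is not how FortiOS stores policy routes internally.

```python
import ipaddress
from dataclasses import dataclass

@dataclass
class PolicyRoute:
    iif: str
    src: str
    dst: str
    proto: int
    ports: tuple              # (start, end), inclusive
    action: str = "permit"    # "permit" or "deny"
    gateway: str = "0.0.0.0"  # 0.0.0.0 = resolve the next hop from the FIB
    oif: str = ""

def matches(rule, pkt):
    return (pkt["iif"] == rule.iif
            and ipaddress.ip_address(pkt["src"]) in ipaddress.ip_network(rule.src)
            and ipaddress.ip_address(pkt["dst"]) in ipaddress.ip_network(rule.dst)
            and pkt["proto"] == rule.proto
            and rule.ports[0] <= pkt["dport"] <= rule.ports[1])

def lookup(rules, pkt):
    """Return (oif, gateway) for the first matching rule, or None to fall through."""
    for rule in rules:
        if not matches(rule, pkt):
            continue
        if rule.action == "deny":
            return None                      # carve-out: skip PBR for this flow
        if rule.gateway == "0.0.0.0":
            return rule.oif, None            # egress forced, next hop left to the FIB
        return rule.oif, rule.gateway
    return None

RULES = [
    # Hypothetical carve-out: internal HTTPS must not be policy-routed.
    PolicyRoute("port3", "10.10.10.0/24", "10.20.0.0/16", 6, (443, 443), action="deny"),
    # The rule from the config above.
    PolicyRoute("port3", "10.10.10.0/24", "0.0.0.0/0", 6, (443, 443),
                gateway="203.0.113.1", oif="port4"),
]

pkt = {"iif": "port3", "src": "10.10.10.50", "dst": "8.8.8.8", "proto": 6, "dport": 443}
print(lookup(RULES, pkt))
print(lookup(RULES, dict(pkt, dst="10.20.1.1")))
```

Order matters here: the deny entry only works as a carve-out because it sits above the broader permit entry.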

Stage 2: SD-WAN service rules

This is where most of FortiOS’s brain lives in a modern deployment.

SD-WAN in FortiOS has three layers:

  1. Members — the physical or logical interfaces participating in SD-WAN. Each member has a gateway, a cost, a priority, and optionally a source IP for SLA probes.
  2. Health checks (Performance SLAs) — periodic probes (ping, http, http-get, dns, twamp, tcp-echo, tcp-connect) from each member to one or more targets, computing per-member latency, jitter, and packet loss, and producing a binary in-SLA / out-of-SLA per SLA target.
  3. Service rules — match traffic by source, destination (address, internet service, application), user, schedule, and pick a member based on a strategy.

Members

config system sdwan
    set status enable
    config zone
        edit "underlay"
        next
        edit "overlay"
        next
    end
    config members
        edit 1
            set interface "port4"
            set gateway 203.0.113.1
            set cost 10
            set zone "underlay"
        next
        edit 2
            set interface "port5"
            set gateway 198.51.100.1
            set cost 20
            set zone "underlay"
        next
        edit 3
            set interface "ipsec_hub1"
            set zone "overlay"
        next
    end
end

To inspect:

diagnose sys sdwan member
get system sdwan member

diagnose sys sdwan member shows administrative state, link state, gateway, and the running SLA results. A member can be admin-up but operationally out of service if its health-check is failing or its gateway is unreachable.

Health checks

config system sdwan
    config health-check
        edit "ping-google"
            set server "8.8.8.8" "1.1.1.1"
            set protocol ping
            set interval 500
            set probe-count 5
            set probe-timeout 1000
            set members 1 2
            config sla
                edit 1
                    set latency-threshold 100
                    set jitter-threshold 30
                    set packetloss-threshold 1
                next
            end
        next
    end
end

Each health-check probes from each member to each server. The result per (member, server, sla-id) is a measured latency/jitter/loss and an in-SLA / out-of-SLA boolean. A member is considered “in SLA” overall if it meets the thresholds for the SLA being referenced by the rule that matches the traffic.

diagnose sys sdwan health-check
diagnose sys sdwan health-check status
diagnose sys sdwan log

diagnose sys sdwan log is the trace log — you can see members transitioning in and out of SLA in real time. If you ever wonder “did link X actually fail at 02:13?” this is where the answer lives, alongside the FortiAnalyzer log.
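
If it helps to think of the SLA verdict as a function, here is a minimal Python sketch. The thresholds mirror the ping-google SLA above (100 ms, 30 ms, 1 %). Everything else is my assumption: the source does not say how FortiOS averages probes or computes jitter, so I use the mean RTT and the mean absolute difference between consecutive answered probes, and I treat "equal to the threshold" as passing.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class SlaTarget:
    latency_ms: float
    jitter_ms: float
    loss_pct: float

def measure(rtts_ms):
    """rtts_ms: one entry per probe, RTT in ms or None for a lost probe."""
    answered = [r for r in rtts_ms if r is not None]
    loss = 100.0 * (len(rtts_ms) - len(answered)) / len(rtts_ms)
    if not answered:
        return None, None, loss
    latency = mean(answered)
    # Assumed jitter definition: mean absolute delta between consecutive replies.
    deltas = [abs(b - a) for a, b in zip(answered, answered[1:])]
    jitter = mean(deltas) if deltas else 0.0
    return latency, jitter, loss

def in_sla(rtts_ms, sla):
    latency, jitter, loss = measure(rtts_ms)
    if latency is None:          # every probe lost
        return False
    return (latency <= sla.latency_ms
            and jitter <= sla.jitter_ms
            and loss <= sla.loss_pct)

sla = SlaTarget(latency_ms=100, jitter_ms=30, loss_pct=1)
print(in_sla([18, 20, 19, 21, 18], sla))      # healthy link
print(in_sla([18, None, 19, 21, 18], sla))    # one of five probes lost
```

Note how a single lost probe in a window of five is already 20 % loss, far above a 1 % threshold. Tight loss thresholds and small probe windows interact badly.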

Service rules

config system sdwan
    config service
        edit 1
            set name "to-saas-via-best-quality"
            set mode sla
            set internet-service enable
            set internet-service-app-ctrl 16777289   # Microsoft 365
            config sla
                edit "ping-google"
                    set id 1
                next
            end
            set priority-members 1 2
        next
        edit 2
            set name "voip-lowest-latency"
            set mode priority
            set dst "10.0.0.0/8"
            set protocol 17
            set start-port 5060
            set end-port 5060
            set link-cost-factor latency
            set priority-members 1 2 3
        next
        edit 3
            set name "default-via-cheapest"
            set mode manual
            set dst "all"
            set priority-members 1 2
        next
    end
end

Strategy modes:

  • manual — pick the first available member from priority-members. Simplest. No SLA awareness.
  • priority — pick the member with the best link-cost-factor. Factors include latency, jitter, packet-loss, bandwidth, custom-profile-1, etc. Ranks members by the chosen metric.
  • sla — pick the highest-priority member that is currently in SLA. If the top member fails SLA, fail over to the next. This is the classic “primary/secondary with failover” mode.
  • load-balance — split sessions across members. hash-mode controls how (source, source-dest, etc.).
  • auto — Fortinet’s adaptive mode that reranks members based on combined metrics.
  • maximize-bandwidth — ECMP across all in-SLA members.
  • lowest-cost — cheapest in-SLA member.
  • lowest-quality — niche; used for testing.
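
The three modes I use most (manual, priority, sla) differ only in how they pick from the same ordered member list. Here is a Python sketch of that difference. The member dictionaries and the select_member function are hypothetical, and the fallback when no member is in SLA is an assumption on my part, not documented behaviour.

```python
def select_member(mode, members, factor="latency"):
    """members: dicts in priority-members order with keys
    id, alive, in_sla, plus one key per metric (lower is better)."""
    alive = [m for m in members if m["alive"]]
    if not alive:
        return None
    if mode == "manual":
        return alive[0]["id"]                 # first available, no SLA awareness
    if mode == "priority":
        # Best value of the chosen metric; min() keeps config order on ties.
        return min(alive, key=lambda m: m[factor])["id"]
    if mode == "sla":
        for m in alive:
            if m["in_sla"]:
                return m["id"]                # highest-priority member in SLA
        return alive[0]["id"]                 # ASSUMPTION: nothing in SLA, use first alive
    raise ValueError(f"mode not modelled: {mode}")

members = [
    {"id": 1, "alive": True, "in_sla": True, "latency": 18},
    {"id": 2, "alive": True, "in_sla": True, "latency": 11},
]
for mode in ("manual", "priority", "sla"):
    print(mode, select_member(mode, members))
```

Same members, same measurements, and priority mode picks member 2 while manual and sla both pick member 1. When a rule "picks the wrong link", check the mode before you check the link.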

Service rule match criteria are richer than policy routes and include:

  • Source address / source user / source group
  • Destination address / Internet Service DB entry / Application
  • Schedule
  • Protocol, port range
  • TOS / DSCP

To inspect:

diagnose sys sdwan service
diagnose sys sdwan service 1
diagnose sys sdwan internet-service-app-ctrl-list
get system sdwan service

diagnose sys sdwan service 1 shows, for that one rule, every priority member along with its current SLA state and rank. This is the truth of what the box is doing right now, irrespective of what the GUI shows.

In diag debug flow show iprope enable output you will see lines like:

msg="Match an SD-WAN service rule, id=1, find an SD-WAN member, id=1, ifindex=7"

or:

msg="No SD-WAN service rule matches, fall through to FIB"

A subtle gotcha: SD-WAN service rules require that there be an available route to the destination via the candidate member. If member 1 has no route to 8.8.8.8 (because there’s no default route via member 1’s gateway), member 1 is not eligible regardless of SLA. People forget this and then can’t understand why their failover isn’t engaging. The SD-WAN service rule does not synthesise a route; it picks among members that already have one.
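
The eligibility check amounts to a filter that runs before any SLA or metric comparison. A small Python sketch, assuming a simplified route list of (prefix, interface) pairs and a hypothetical eligible_members helper:

```python
import ipaddress

def eligible_members(members, routes, dst):
    """members: {member_id: interface}; routes: list of (prefix, interface).
    A member is a candidate only if some route to dst leaves via its interface."""
    dst_ip = ipaddress.ip_address(dst)
    with_route = {oif for prefix, oif in routes
                  if dst_ip in ipaddress.ip_network(prefix)}
    return [mid for mid, iface in members.items() if iface in with_route]

members = {1: "port4", 2: "port5"}
# port5 carries a default route; port4 only has a route to 10.0.0.0/8.
routes = [("0.0.0.0/0", "port5"), ("10.0.0.0/8", "port4")]

print(eligible_members(members, routes, "8.8.8.8"))    # [2]
print(eligible_members(members, routes, "10.1.2.3"))   # [1, 2]
```

For 8.8.8.8, member 1 is gone before its SLA state is even considered, which is why a rule listing it first can still send everything out member 2.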

Internet Service DB

The internet-service knob lets you match by FortiGuard-curated destination identity rather than IP. “Microsoft 365,” “Google,” “Salesforce” become single match objects whose underlying prefixes are kept up to date by the FortiGuard service. That’s what makes “send M365 traffic via the best-quality link” expressible in a single rule.

diagnose internet-service info
diagnose internet-service id <id>
diagnose internet-service match root <ip> <port> <proto>
get system internet-service-list

The third one, diagnose internet-service match, is enormously useful when a flow isn’t matching the rule you expect: it tells you which Internet Service entries (if any) the destination is part of, right now, on this box.
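
Conceptually, that match is a lookup of (IP, port, protocol) against a set of named prefix lists. The Python sketch below shows the idea with made-up data: the service names and prefixes are hypothetical (documentation ranges), and the real database is maintained by FortiGuard and structured differently.

```python
import ipaddress

# Hypothetical entries: name -> list of (prefix, protocol, (port_lo, port_hi)).
ISDB = {
    "Example-SaaS-Web": [("192.0.2.0/24", 6, (443, 443))],
    "Example-SaaS-All": [("192.0.2.0/24", 6, (1, 65535)),
                         ("192.0.2.128/25", 17, (3478, 3481))],
}

def isdb_match(ip, port, proto):
    """Return the names of every entry that covers this destination."""
    addr = ipaddress.ip_address(ip)
    hits = []
    for name, entries in ISDB.items():
        for prefix, entry_proto, (lo, hi) in entries:
            if (entry_proto == proto and lo <= port <= hi
                    and addr in ipaddress.ip_network(prefix)):
                hits.append(name)
                break                 # one hit per service name is enough
    return hits

print(isdb_match("192.0.2.7", 443, 6))
print(isdb_match("192.0.2.7", 80, 6))
print(isdb_match("10.1.1.1", 443, 6))
```

A destination can belong to more than one entry at once, so a flow may match a broader service rule than the one you had in mind.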

Stage 3: the FIB

If neither a policy route nor an SD-WAN service rule matched, FortiOS falls through to the FIB — the actual routing table.

The RIB / FIB on FortiOS is not unusual:

  • Connected, static, OSPF, BGP, IS-IS, RIP, BFD-influenced.
  • Per-VDOM (and within a VDOM, per-VRF if VRFs are configured).
  • Administrative distance and metric tie-breaks.
  • ECMP supported (controlled by set ecmp-max-paths and the load-balance method).

get router info routing-table all
get router info routing-table database
get router info routing-table details 8.8.8.8
get router info kernel
get router info bgp summary
get router info bgp neighbors
get router info ospf neighbor
get router info ospf database
get router info bfd neighbor
diagnose ip route list
diagnose ipv6 route list

get router info routing-table details <ip> is the canonical “which route would actually be used for this destination” query. It returns the longest-prefix-match entry plus next-hop and recursive resolution.
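
Longest-prefix match itself is simple enough to sketch in a few lines of Python. This toy version takes a flat list of (prefix, next hop) pairs, which are made up for the example, and deliberately ignores administrative distance, priority, and recursive resolution.

```python
import ipaddress

def longest_prefix_match(fib, dst):
    """fib: list of (prefix, next_hop). Returns (prefix, next_hop) or None."""
    dst_ip = ipaddress.ip_address(dst)
    candidates = []
    for prefix, next_hop in fib:
        net = ipaddress.ip_network(prefix)
        if dst_ip in net:
            candidates.append((net, next_hop))
    if not candidates:
        return None
    # The most specific covering prefix wins.
    net, next_hop = max(candidates, key=lambda c: c[0].prefixlen)
    return str(net), next_hop

FIB = [
    ("0.0.0.0/0", "203.0.113.1"),
    ("10.0.0.0/8", "10.255.0.1"),
    ("10.20.0.0/16", "10.255.0.2"),
]

print(longest_prefix_match(FIB, "10.20.1.1"))
print(longest_prefix_match(FIB, "10.9.9.9"))
print(longest_prefix_match(FIB, "8.8.8.8"))
```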

A few FortiOS-specific things worth knowing:

  • VRF is configured per-interface via vrf under config system interface. Routes learned on an interface are placed in the interface’s VRF. Inter-VRF leaking is done via static route with vrf set, or via BGP route targets with mp-bgp (covered in a separate post on this blog).

  • ECMP is on by default; the default load-balance method is source-ip-based. To change:

    config system settings
        set ecmp-max-paths 16
        set v4-ecmp-mode {source-ip-based | weight-based | usage-based | source-dest-ip-based}
    end

  • Recursive lookup: a static route with a next-hop that isn’t directly connected requires the FortiGate to recursively resolve the next-hop through another route. This works but it’s a common source of subtle weirdness when the recursive path itself is via a tunnel.

  • Blackhole routes (set blackhole enable) are real entries in the FIB and are a more graceful way to drop traffic to a known-bad prefix than relying on policy.

ECMP and per-flow consistency

When the FIB returns multiple equal-cost paths, FortiOS picks one based on the configured ECMP method:

  • source-ip-based (default): same source IP always uses the same path.
  • source-dest-ip-based: same (src, dst) pair always uses the same path.
  • weight-based: paths are weighted; selection is proportional.
  • usage-based: picks the least-loaded path.

The session entry pins the chosen egress to the session, so subsequent packets in the flow take the same path even if the ECMP method would otherwise reshuffle them. That’s important: ECMP reshuffling mid-flow breaks TCP and stresses stateful upstream devices.
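
Here is a Python sketch of the two ideas together: a deterministic hash for the two IP-based methods, and a session table that pins the result. The hash (SHA-256 truncated to 32 bits) is purely my stand-in, since the actual FortiOS hash is not something this post covers, and the sessions dictionary is a hypothetical simplification of the real session table.

```python
import hashlib

def ecmp_pick(paths, src, dst, mode):
    """Deterministically map a flow to one of the equal-cost paths."""
    if mode == "source-ip-based":
        key = src
    elif mode == "source-dest-ip-based":
        key = f"{src}->{dst}"
    else:
        raise ValueError(f"mode not modelled: {mode}")
    digest = hashlib.sha256(key.encode()).digest()
    return paths[int.from_bytes(digest[:4], "big") % len(paths)]

sessions = {}  # 5-tuple -> pinned egress

def egress_for(flow, paths, mode):
    """flow: (src, sport, dst, dport, proto). Hash once, then reuse the pin."""
    if flow not in sessions:
        sessions[flow] = ecmp_pick(paths, flow[0], flow[2], mode)
    return sessions[flow]

flow = ("10.10.10.50", 51000, "8.8.8.8", 443, 6)
first = egress_for(flow, ["port4", "port5"], "source-dest-ip-based")
# The path set changes mid-flow, but the existing session keeps its egress.
later = egress_for(flow, ["port5", "port4", "port6"], "source-dest-ip-based")
print(first == later)
```

Without the pin, adding port6 would change the modulo and could move an established TCP flow to a different link mid-stream.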

Putting it together — a worked example

Suppose:

  • A user at 10.10.10.50 in VDOM root originates HTTPS to outlook.office365.com.
  • The FortiGate has two WAN members: port4 (MPLS-backed) and port5 (broadband).
  • There’s an SD-WAN service rule M365-best-quality matching the M365 Internet Service DB, mode priority, factor latency, members port4 and port5.
  • There’s a default static route via port4 (metric 1) and via port5 (metric 10).
  • No policy routes apply.

Flow:

  1. Packet arrives, session miss, RPF passes.
  2. Policy route lookup — none match. Fall through.
  3. SD-WAN service rule lookup: diagnose internet-service match root 13.107.6.152 443 6 returns “Microsoft Office 365.” Service rule M365-best-quality matches.
  4. Member selection — both port4 and port5 are in SLA. Latency: port4 = 18 ms, port5 = 11 ms. Mode priority, factor latency. port5 wins.
  5. Egress decided: port5, gateway 198.51.100.1.

diagnose debug flow ... will show:

Match an SD-WAN service rule, id=2, find an SD-WAN member, id=2 (port5), priority by latency
find a route: gw-198.51.100.1 via port5

Now invert the scenario: link quality on port5 degrades, and its latency climbs to 90 ms with packet loss. The health check flips port5 to out-of-SLA, and the next session matched by the rule gets port4. In priority mode that follows directly from the metric: port4 at 18 ms now beats port5 at 90 ms. In sla mode the result would be the same even if port5 were still marginally lower latency than port4, because there the SLA gate trumps the metric. That is the difference between priority and sla modes, though both consult member health.

Diagnostics summary for routing decisions

# Show me the policy routes (including SD-WAN renders)
diagnose firewall proute list

# Which SD-WAN rule will match this flow?
diagnose sys sdwan service
diagnose sys sdwan service <id>

# What's the live SLA state of each member?
diagnose sys sdwan health-check
diagnose sys sdwan member

# Live SLA transitions
diagnose sys sdwan log

# Internet Service classification of a destination
diagnose internet-service match root <ip> <port> <proto>

# What does the FIB say for this destination?
get router info routing-table details <ip>
get router info routing-table all
get router info kernel

# Per-protocol diagnostics
get router info bgp summary
get router info bgp neighbors
get router info ospf neighbor
get router info bfd neighbor

# Watch the live decision in flow trace
diagnose debug flow show iprope enable
diagnose debug flow show function-name enable
diagnose debug flow trace start 50

Where we are

The packet has an egress interface and a next-hop. The session entry in the kernel knows where it’s going at L3. What it doesn’t yet know is whether it’s allowed to, and whether anything needs to be inspected, translated, or rewritten on the way out.

That’s Part 4: firewall policy match, NAT, and security profiles.