A Day in the Life of a Packet on a 50G FortiGate, Part 2: Stateful Inspection, Session Lookup, and Anti-Spoofing
Recap. In Part 1 the packet arrived on a 50 Gbps-class FortiGate (1800F-class with NP7), got DMA’d into NP7 memory, was checked against the on-chip session cache, and either took the express lane out the egress port (offloaded) or got punted into the kernel. Today picks up on that punt.
If you came up on traditional firewalls (Cisco ASA, PIX, IOS ZBFW, even Palo Alto), much of this will feel familiar in shape and surprising in detail. FortiOS’s stateful engine has its own personality, and a few of the differences (helper behaviour, RPF strictness, the iprope chain layout, conserve mode) are common sources of “why is this packet vanishing” tickets.
The big picture, slow path edition
Once the kernel has the packet, the order of operations for a new flow is roughly:
1. IP header integrity (length, version, checksum).
2. IPsec inbound decapsulation: if this is encrypted IPsec on a phase-2 SA, decap happens here and we re-enter the path with the inner packet.
3. DoS policy check (if any DoS policy is attached to ingress).
4. Conserve-mode check.
5. Session table lookup.
   - Hit, established: state machine update, fast-forward.
   - Hit, half-open or being torn down: state machine update.
   - Miss: new-session path begins.
6. Anti-spoof / Reverse Path Forwarding (RPF) check.
7. Routing decision (covered in Part 3).
8. SD-WAN service rule match (covered in Part 3).
9. Firewall policy match (covered in Part 4).
10. UTM dispatch (covered in Part 4).
11. NAT decision (covered in Part 4).
12. Session install + NPU offload eligibility evaluation (covered in Part 5).
13. Egress (covered in Part 5).
Part 2 covers steps 1 to 6.
IP integrity and IPsec inbound
fnsysctl and diag will not show you the IP integrity check happening; it’s silent unless it fails. If the packet’s IP header is malformed (total length is wrong, the version is unrecognised, or the IHL contradicts the options length), the packet is dropped here. The only artefact is a counter on the relevant interface or in diag npu np7 anomaly-drop-counter, because the NP7 catches most of these before the kernel even sees them.
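To make those checks concrete, here is a minimal Python sketch of the same kind of IPv4 header validation. It is not FortiOS or NP7 code: the function names, the drop strings, and the sample header are all made up for illustration, and the real check order on the box may differ.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """Internet checksum (RFC 1071) over an even-length header."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def check_ipv4_header(packet: bytes) -> str:
    """Return "ok" or a hypothetical drop reason for an IPv4 packet."""
    if len(packet) < 20:
        return "drop: truncated header"
    version, ihl = packet[0] >> 4, packet[0] & 0x0F
    if version != 4:
        return "drop: bad version"
    header_len = ihl * 4
    if ihl < 5 or header_len > len(packet):
        return "drop: bad IHL"
    total_len = struct.unpack("!H", packet[2:4])[0]
    if total_len < header_len or total_len > len(packet):
        return "drop: bad total length"
    # A header that carries a correct checksum sums to zero.
    if ipv4_checksum(packet[:header_len]) != 0:
        return "drop: bad checksum"
    return "ok"


def sample_header() -> bytes:
    """Build a valid 20-byte header, 10.1.1.10 -> 8.8.8.8, proto TCP."""
    hdr = bytearray(struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20, 0, 0, 64, 6, 0,
        bytes([10, 1, 1, 10]), bytes([8, 8, 8, 8]),
    ))
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)


if __name__ == "__main__":
    good = sample_header()
    print(check_ipv4_header(good))
    print(check_ipv4_header(bytes([0x65]) + good[1:]))
```

Flipping any single header byte in the sample is enough to land in one of the drop branches, which is the behaviour you only ever see as a counter on the real box.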
If the packet is ESP destined for the FortiGate and matches a phase-2 SA, the kernel decrypts (or pushes the SA index to the CP9 for hardware decrypt) and then recursively re-enters the forwarding path with the inner packet — but the ingress interface is now the IPsec virtual tunnel interface, not the physical interface the ESP arrived on. This is hugely important when you’re writing policies and routes: VPN traffic is matched against the tunnel interface name, not the WAN port.
diagnose vpn ike gateway list
diagnose vpn tunnel list
diagnose vpn tunnel name <tunnel> stat
get vpn ipsec tunnel summary
diagnose vpn ipsec status
diagnose vpn ipsec esp ...
If a tunnel is up at IKE but no traffic is passing, diagnose vpn tunnel list is the first place to look. The two things that matter are npu_flag (is the SA offloaded to NP7/CP9?) and the byte counters in each direction (are we actually encrypting and decrypting traffic?).
DoS policy
If you have a DoS policy attached to the ingress interface, the packet hits it before session lookup. DoS policies in FortiOS check anomalies (TCP SYN flood, UDP flood, ICMP flood, IP errors, port scans) per source, per destination, or globally, with a threshold and an action (pass | block | proxy). They run in the kernel and they cost CPU on every packet that matches the rule, which is why you don’t put DoS policies on busy core links unless you mean it.
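The threshold-plus-action idea can be sketched in a few lines of Python. This is an illustrative per-source, fixed one-second window counter, not how FortiOS counts internally (that is not described here); the class name, the threshold value, and the documentation-range source address are assumptions.

```python
class PerSourceMeter:
    """Count packets per source in one-second windows and pick an action."""

    def __init__(self, threshold_pps: int, action: str = "block"):
        self.threshold_pps = threshold_pps
        self.action = action          # "pass" would mean monitor only
        self._window = {}             # src -> (second, count)

    def observe(self, src: str, now: float) -> str:
        second = int(now)
        start, count = self._window.get(src, (second, 0))
        if start != second:           # new window: reset the counter
            start, count = second, 0
        count += 1
        self._window[src] = (start, count)
        return self.action if count > self.threshold_pps else "pass"


if __name__ == "__main__":
    meter = PerSourceMeter(threshold_pps=3)
    for i in range(5):
        print(meter.observe("198.51.100.7", 100.0 + i / 10))
```

The sketch also shows why a threshold set too low hurts: the meter has no notion of "legitimate" traffic, only of rate, so anything above the number gets the action.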
diagnose ips anomaly list
diagnose ips anomaly status
diagnose sys dos-policy list
get firewall DoS-policy
A DoS policy that quietly blackholes an attacker is fantastic; one that’s set above your real traffic ceiling silently rate-limits production. Always tune thresholds in monitor mode first.
Conserve mode
FortiOS has a memory pressure regime called conserve mode. When kernel memory crosses a configurable threshold (default red is 88%, extreme is 95%), the box enters conserve mode and starts shedding load: new sessions can be refused, kernel proxies (the AV proxy, the SSL proxy) start failing closed, and management responsiveness drops. From a packet’s point of view: a packet that would have started a new session under normal load may be dropped here.
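As a rough mental model, the thresholds behave like a small state machine with hysteresis. The Python sketch below uses the red (88%) and extreme (95%) defaults quoted above; the exit ("green") threshold of 82% and the exact transition rules are assumptions made for illustration and should be checked against your FortiOS version.

```python
RED = 88       # enter conserve mode (default quoted above)
EXTREME = 95   # enter extreme conserve mode (default quoted above)
GREEN = 82     # assumed exit threshold, not stated in this article


def next_state(state: str, mem_pct: float) -> str:
    """Return "normal", "red" or "extreme" for the next sample."""
    if mem_pct >= EXTREME:
        return "extreme"
    if mem_pct >= RED:
        return "red"
    # Hysteresis: once in conserve mode, stay there until memory
    # falls back below the green threshold.
    if state in ("red", "extreme") and mem_pct >= GREEN:
        return "red"
    return "normal"


if __name__ == "__main__":
    state = "normal"
    for sample in (70, 85, 90, 85, 96, 90, 80):
        state = next_state(state, sample)
        print(sample, state)
```

The hysteresis is the part that surprises people during triage: memory can be back under the entry threshold while the box is still refusing new sessions.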
diagnose hardware sysinfo conserve
diagnose sys session stat
diagnose hardware sysinfo memory
If conserve mode: on ever shows up during a triage, you have a bigger problem than the specific session you were investigating. Check session counts, top processes (diagnose sys top), and whether IPS or AV signature memory has ballooned.
Session table lookup — the heart of the slow path
Every packet that survives the above arrives at the session lookup. FortiOS hashes the 5-tuple (plus VDOM ID and a few other discriminators) and probes the session table.
The session table is the central data structure. Everything about an active flow lives in its session entry:
- 5-tuple (or 7-tuple including VDOM and tun_id)
- State (TCP state, UDP “established,” ICMP echo state)
- Original and translated source/destination (NAT)
- Ingress and egress interface (or tunnel interface)
- Policy ID that matched
- UTM profile pointers (IPS, AV, web filter, app control, DLP, file filter)
- Helper attached (FTP, SIP, RTSP, PPTP, TFTP, etc.)
- NPU offload state (offload=8/8, no_ofld_reason=...)
- TTL / timeout
- Byte and packet counters in each direction
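A toy version of that structure helps when reading session output. The Python sketch below keys a dictionary on VDOM plus 5-tuple and installs both directions so that return traffic finds the same entry. It deliberately ignores NAT (where the reply key would use the translated addresses), helpers, and offload state; every name in it is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    key: tuple
    policy_id: int
    state: str = "new"
    packets: dict = field(default_factory=lambda: {"org": 0, "reply": 0})


class SessionTable:
    def __init__(self):
        self._by_key = {}  # (vdom, proto, src, sport, dst, dport) -> (Session, direction)

    def install(self, vdom, proto, src, sport, dst, dport, policy_id):
        session = Session(key=(vdom, proto, src, sport, dst, dport),
                          policy_id=policy_id)
        # Index the flow in both directions so replies hit the same entry.
        self._by_key[(vdom, proto, src, sport, dst, dport)] = (session, "org")
        self._by_key[(vdom, proto, dst, dport, src, sport)] = (session, "reply")
        return session

    def lookup(self, vdom, proto, src, sport, dst, dport):
        hit = self._by_key.get((vdom, proto, src, sport, dst, dport))
        if hit is None:
            return None            # miss: the new-session path starts here
        session, direction = hit
        session.packets[direction] += 1
        return session, direction


if __name__ == "__main__":
    table = SessionTable()
    table.install(0, 6, "10.1.1.10", 54321, "8.8.8.8", 443, policy_id=12)
    print(table.lookup(0, 6, "8.8.8.8", 443, "10.1.1.10", 54321))
    print(table.lookup(0, 6, "10.1.1.11", 54321, "8.8.8.8", 443))
```

Including the VDOM in the key is what lets two VDOMs carry overlapping address space without their flows colliding.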
You can list and filter the table with extreme precision:
diagnose sys session filter clear
diagnose sys session filter src 10.1.1.10
diagnose sys session filter dst 8.8.8.8
diagnose sys session filter dport 443
diagnose sys session filter proto 6
diagnose sys session list
Or by vdom:
diagnose sys session filter vd <id>
To clear:
diagnose sys session clear
(That clears whatever the current filter selects. Be careful.)
For statistics:
diagnose sys session stat
diagnose sys session full-stat
full-stat prints session counts by state, by protocol, half-open, ephemeral, perm, NPU-offloaded, and so on. On a healthy 1800F you expect the vast majority of sessions to show a non-zero npu_state.
Reading a session entry
A session entry is dense. Here’s an annotated example:
session info: proto=6 proto_state=01 duration=120 expire=3596 timeout=3600 flags=00000000 ...
state=may_dirty npu synced
statistic(bytes/packets/allow_err): org=14823/45/1 reply=87234/68/1
hook=post dir=org act=snat 10.1.1.10:54321->8.8.8.8:443(203.0.113.5:54321)
hook=pre dir=reply act=dnat 8.8.8.8:443->203.0.113.5:54321(10.1.1.10:54321)
src_mac=00:0c:29:aa:bb:cc dst_mac=00:0c:29:dd:ee:ff
misc=0 policy_id=12 auth_info=0 chk_client_info=0 vd=0
serial=00abcdef tos=ff/ff app_list=0 app=0
dd_type=0 dd_mode=0
npu_state=0x4000
npu info: flag=0x81/0x81, offload=8/8, ips_offload=0/0, epid=160/162, ipid=160/162, ...
no_ofld_reason:
Things to read off it:
- proto_state=01 for TCP means SYN, SYN-ACK and ACK have all been seen, i.e. the session is fully ESTABLISHED. (00=NONE, 01=ESTABLISHED, 02=SYN_SENT, 03=SYN_RECV, 04=FIN_WAIT, 05=TIME_WAIT, 06=CLOSE, 07=CLOSE_WAIT, 08=LAST_ACK, 09=LISTEN; proto_state values are well-documented in the FortiOS handbook.)
- hook=post dir=org act=snat ... ->(203.0.113.5:54321) is the post-routing SNAT translating the inside source to the WAN address.
- hook=pre dir=reply act=dnat is the symmetrical reverse for return traffic.
- policy_id=12 is the policy that matched. To see it: show firewall policy 12 or diagnose firewall iprope show 100004 12.
- npu info: ... offload=8/8 is the gold star: both directions in hardware.
- no_ofld_reason: is the diagnostic field that explains why a session is not offloaded, when it isn’t. Empty here, which is what you want.
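If you triage a lot of these, a small parser saves squinting. The Python sketch below pulls key=value pairs out of an entry and summarises the three fields discussed above. It assumes the layout of the sample entry in this article, which can differ between FortiOS versions; the function names are illustrative, and the state table is the TCP mapping listed above (other protocols use proto_state differently).

```python
import re

FIELD = re.compile(r"(\w+)=([^\s,]+)")

# TCP mapping as listed in the article.
TCP_STATES = {
    "00": "NONE", "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT", "05": "TIME_WAIT", "06": "CLOSE", "07": "CLOSE_WAIT",
    "08": "LAST_ACK", "09": "LISTEN",
}

SAMPLE = """\
session info: proto=6 proto_state=01 duration=120 expire=3596 timeout=3600
misc=0 policy_id=12 auth_info=0 chk_client_info=0 vd=0
npu_state=0x4000
npu info: flag=0x81/0x81, offload=8/8, ips_offload=0/0, epid=160/162
"""


def parse_session(text: str) -> dict:
    """Collect key=value pairs; a repeated key keeps its last value."""
    return dict(FIELD.findall(text))


def summarize(text: str) -> dict:
    fields = parse_session(text)
    return {
        "tcp_state": TCP_STATES.get(fields.get("proto_state", ""), "unknown"),
        "policy_id": int(fields["policy_id"]),
        "offloaded": fields.get("offload") == "8/8",
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```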
Helpers, ALGs, and why your FTP session won’t go fast
If the kernel decides this session needs a helper — typically because the destination port is the well-known port for a protocol with secondary connections (FTP/21, TFTP/69, PPTP/1723, SIP/5060, RSH/512, RTSP/554, MMS/1755, MGCP, H.323) — it attaches the helper to the session. Helpers do two things: they parse the application protocol to spot embedded IP/port references (for NAT fixups) and they install pinhole sessions for the secondary connections (e.g. FTP data on a randomly negotiated port).
A session with a helper attached typically cannot be fully offloaded to NP7 until the helper is satisfied, because the NP7 can’t parse the application payload. You’ll see this as no_ofld_reason: helper. For high-rate FTP or SIP, this is a real bottleneck.
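To see why this needs payload parsing, consider active-mode FTP, where the client announces its data endpoint inside a PORT command (RFC 959). The Python sketch below shows the two jobs a helper has for that one command: extract the endpoint for the pinhole, and rewrite it for NAT. It is an illustration of the technique, not the FortiOS helper; the function names are made up.

```python
import re

PORT_RE = re.compile(r"^PORT (\d+),(\d+),(\d+),(\d+),(\d+),(\d+)$")


def parse_port_command(line: str):
    """Return (ip, port) announced by an FTP PORT command, or None."""
    match = PORT_RE.match(line.strip())
    if match is None:
        return None
    nums = [int(n) for n in match.groups()]
    if any(n > 255 for n in nums):
        return None
    ip = ".".join(str(n) for n in nums[:4])
    port = nums[4] * 256 + nums[5]   # port is sent as two bytes, high then low
    return ip, port


def rewrite_port_command(nat_ip: str, nat_port: int) -> str:
    """Build the PORT command as it should look after source NAT."""
    host = nat_ip.replace(".", ",")
    return f"PORT {host},{nat_port >> 8},{nat_port & 0xFF}"


if __name__ == "__main__":
    print(parse_port_command("PORT 10,1,1,10,212,49\r\n"))
    print(rewrite_port_command("203.0.113.5", 54321))
```

The address and port live in the TCP payload as ASCII text, which is exactly the kind of thing a header-matching fast path cannot follow.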
To audit:
show system session-helper
config system session-helper
show
end
If you don’t actually use FTP fixup (e.g. all your FTP is FTPS or you’re not doing NAT on FTP traffic), removing the helper entry can dramatically improve offload rates for those flows. Don’t remove SIP helper unless you’ve done your homework — voice will break in interesting ways.
diagnose sys session list | grep -i helper
diagnose test application sessionhelper 0 # general health
RPF — Reverse Path Forwarding
For new sessions only, FortiOS does an anti-spoof / RPF check. The kernel takes the source address of the inbound packet and asks “if I sent a packet to this source address right now, would it leave on the same interface this packet arrived on?” If yes, the packet is allowed; if no, it’s dropped as spoofed.
Two RPF modes per VDOM, selected with a single setting:
config system settings
set strict-src-check {enable | disable}
end
- Feasible path (strict-src-check disable, the default; older Fortinet documentation calls this loose): there must be an active route to the source via the ingress interface, but it doesn’t have to be the best route. A default route out of the ingress interface is enough, which is why this mode tolerates a fair amount of asymmetric routing.
- Strict (strict-src-check enable): the best route to the source must point out of the ingress interface.
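The difference between the feasible and strict checks is easiest to see on a tiny routing table. The Python sketch below is a simplified model: the routes and interface names are hypothetical, "best" means longest prefix only, and real FortiOS also weighs distance and priority.

```python
import ipaddress

# Hypothetical routing table: (prefix, egress interface).
ROUTES = [
    ("0.0.0.0/0", "port4"),
    ("10.1.1.0/24", "port3"),
]


def matching_routes(src: str):
    addr = ipaddress.ip_address(src)
    nets = [(ipaddress.ip_network(prefix), iface) for prefix, iface in ROUTES]
    return [(net, iface) for net, iface in nets if addr in net]


def feasible_rpf(src: str, ingress: str) -> bool:
    """Pass if any route covering the source uses the ingress interface."""
    return any(iface == ingress for _, iface in matching_routes(src))


def strict_rpf(src: str, ingress: str) -> bool:
    """Pass only if a longest-prefix route to the source uses the ingress interface."""
    routes = matching_routes(src)
    if not routes:
        return False
    best_len = max(net.prefixlen for net, _ in routes)
    return any(iface == ingress for net, iface in routes
               if net.prefixlen == best_len)


if __name__ == "__main__":
    # 10.1.1.10 showing up on the WAN side (port4):
    print(feasible_rpf("10.1.1.10", "port4"), strict_rpf("10.1.1.10", "port4"))
```

In this model a packet sourced from 10.1.1.10 arriving on port4 passes the feasible check because the default route covers it, and fails the strict check because the best route points at port3.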
This is the single most common cause of “packet gets to the FortiGate but is silently dropped.” diag debug flow will tell you:
msg="reverse path check fail, drop"
If you see that, your RPF mode is incompatible with your routing. The fix is one of: (a) advertise the source’s network back across the ingress interface so RPF passes, (b) loosen RPF mode, (c) for individual interfaces, set src-check disable per-interface to opt that interface out.
config system interface
edit "port3"
set src-check disable
next
end
Doing that on a WAN-facing interface in the wild is a giant red flag, by the way. The right answer is almost always to fix the routing.
For IPv6:
config system settings
set strict-src-check {enable | disable}
end
(IPv6 RPF behaviour is symmetrical to v4 in FortiOS; the same modes apply.)
To see which routes RPF will use:
get router info routing-table all
get router info kernel
diagnose ip rtcache list # cached forwarding entries
State machine details that matter on busy boxes
A few state-machine behaviours to be aware of when the session table is large:
- TCP half-open timeout is governed by tcp-halfopen-timer in config system global. Default 10 seconds. SYN floods that get past DoS sit here briefly before being aged out.
- TCP idle timeout is governed per-protocol service entry and per-session-ttl override. Default TCP timeout is 3600 s.
- UDP timeout is 180 s by default. SIP, RADIUS, and DNS often need higher.
- ICMP timeout is 30 s.
- session-ttl is a global table that lets you override timeouts per service, src, or dst:
config system session-ttl
config port
edit 1
set protocol 6
set start-port 22
set end-port 22
set timeout 7200
next
end
end
- session-pickup in HA: when a session is created on the active unit, FortiOS replicates it to the standby. On failover the standby has the session and traffic continues without re-establishment. Without session-pickup enable, every flow has to renegotiate on failover.
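The way a port-range override sits on top of the protocol defaults can be modelled in a few lines. The Python sketch below uses the default values quoted in this list and the SSH example from the session-ttl snippet; the first-match rule and the fallback for other protocols are assumptions for illustration, not documented FortiOS behaviour.

```python
# Defaults as quoted in this article: TCP 3600 s, UDP 180 s, ICMP 30 s.
PROTOCOL_DEFAULTS = {6: 3600, 17: 180, 1: 30}

# (protocol, start_port, end_port, timeout), mirroring the SSH example.
PORT_OVERRIDES = [
    (6, 22, 22, 7200),
]


def session_timeout(proto: int, dport: int) -> int:
    """Return the idle timeout in seconds for a new session."""
    for o_proto, start, end, timeout in PORT_OVERRIDES:
        if o_proto == proto and start <= dport <= end:
            return timeout            # first matching override wins (assumed)
    # Fallback of 3600 s for unlisted protocols is an assumption.
    return PROTOCOL_DEFAULTS.get(proto, 3600)


if __name__ == "__main__":
    print(session_timeout(6, 22), session_timeout(6, 443), session_timeout(17, 53))
```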
diagnose sys session full-stat
diagnose sys session list | grep -i synced
diagnose sys ha checksum show
Watching it in diag debug flow
For a brand-new TCP SYN, the slow-path trace looks like this (annotated, abbreviated):
received a packet(proto=6, 10.1.1.10:54321->8.8.8.8:443) from port3. <- ingress
allocate a new session-00abcdef <- session miss, new
in-[port3], out-[] <- not yet routed
find a route: flag=04000000 gw-203.0.113.1 via port4 <- routing decision
find SNAT: IP-203.0.113.5(from IPPOOL), port-54321 <- NAT decided
checking SDWAN service rule: 3 matched <- SD-WAN match
iprope_in_check() check failed on policy 0, drop <- (hypothetical, not this case)
Allowed by Policy-12: SNAT <- policy match, NAT yes
trace_id=1 ... msg="enter fast path" <- session built, NPU push
If RPF fails, the trace ends early, before any route or policy line, with:
reverse path check fail, drop
If session lookup hits an existing session, you get:
Find an existing session, id-00abcdef, original direction
and then the rest of the trace is the state machine update, not a fresh forwarding decision.
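When you capture many traces, a first-pass classifier is handy. The Python sketch below scans a trace for the message fragments quoted in this section and returns a coarse verdict. The verdict labels and the matching order are my own; actual debug flow output carries extra fields (id, trace_id, func, line) and varies by version, so treat the substrings as the only thing taken from the article.

```python
# (substring to look for, verdict) in priority order: drops first.
MARKERS = [
    ("reverse path check fail, drop", "dropped: RPF"),
    ("check failed on policy", "dropped: policy check"),
    ("Find an existing session", "existing session"),
    ("Allowed by Policy-", "new session allowed"),
    ("allocate a new session", "new session, no verdict captured"),
]


def classify(trace_lines) -> str:
    text = "\n".join(trace_lines)
    for needle, verdict in MARKERS:
        if needle in text:
            return verdict
    return "unknown"


if __name__ == "__main__":
    new_flow = [
        "received a packet(proto=6, 10.1.1.10:54321->8.8.8.8:443) from port3.",
        "allocate a new session-00abcdef",
        "find a route: flag=04000000 gw-203.0.113.1 via port4",
        "Allowed by Policy-12: SNAT",
    ]
    print(classify(new_flow))
    print(classify(["received a packet(...) from port3.",
                    "reverse path check fail, drop"]))
```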
Summary
By the end of stateful inspection, the kernel knows whether this is a known flow it can simply update, or a brand-new flow it needs to fully classify. If it’s a new flow, the packet is now sitting at the routing decision: where should this go, and via which interface?
That is the entirety of Part 3 — and it’s a much bigger conversation than you’d think, because on a FortiGate the order of operations is policy routes, then SD-WAN service rules, then the FIB, and SD-WAN itself is a many-layered onion.