The Ultimate FortiOS CLI Reference for the NSE 4 Exam – Part 3: VPN & HA

VPN and HA are two of the highest-weighted topics on the NSE 4 exam, and both share the same diagnostic challenge: the system state you care about is maintained in daemon memory, not visible in the GUI, and the CLI output requires you to cross-reference multiple commands to reach a conclusion. This module gives you the complete toolkit and teaches you how the fields relate to each other.


Module 5: VPN Infrastructure Troubleshooting


diagnose vpn ike gateway list

Command Syntax & Architectural Impact

diagnose vpn ike gateway list
diagnose vpn ike gateway list name <phase1-name>

This command queries the IKE daemon (iked) for its in-memory record of every Phase 1 negotiation state. IKE (Internet Key Exchange) is the control-plane protocol for IPsec: Phase 1 establishes a secure authenticated channel (the IKE SA) between the two peers; Phase 2 runs inside that channel to negotiate the actual encryption parameters for data-plane traffic (the IPsec SA). diagnose vpn ike gateway list shows Phase 1 exclusively.

Each entry in the output corresponds to a configured Phase 1 object (config vpn ipsec phase1-interface). The entry persists in iked’s memory regardless of whether Phase 1 is currently up — this is how you can distinguish between “Phase 1 has never succeeded”, “Phase 1 was up and dropped”, and “Phase 1 is currently established.”

The cookies (initiator and responder) are the IKE SA identifiers: a pair of random 64-bit values exchanged during IKEv1 Main Mode or the IKEv2 initial exchange (IKEv2 calls them the IKE SPIs). Much as the address and port tuple identifies a TCP connection, the cookie pair uniquely identifies one IKE SA instance, so that retransmits and rekeying messages can be correlated with the right SA.
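
To make the cookie pair concrete, here is a minimal Python sketch that pulls the two cookies out of a captured IKE UDP payload so they can be compared with the values iked reports. It relies only on the fixed 28-byte IKE header layout (two 8-byte cookies, then next payload, version, exchange type, flags, message ID and length) and on the 4-byte all-zero non-ESP marker that precedes IKE messages on UDP 4500. The function name parse_ike_cookies and the dictionary keys are illustrative choices, not part of any Fortinet tool.

```python
import struct

# Fixed IKE header: initiator cookie, responder cookie, next payload,
# version, exchange type, flags, message ID, total length (28 bytes).
IKE_HEADER = struct.Struct("!QQBBBBII")


def parse_ike_cookies(udp_payload: bytes, udp_port: int = 500) -> dict:
    """Extract the cookie pair from the UDP payload of an IKE packet."""
    if udp_port == 4500:
        # On UDP 4500, IKE is prefixed with four zero bytes; anything else
        # on that port is ESP-in-UDP and carries no IKE header.
        if udp_payload[:4] != b"\x00\x00\x00\x00":
            raise ValueError("no non-ESP marker: this looks like ESP, not IKE")
        udp_payload = udp_payload[4:]
    if len(udp_payload) < IKE_HEADER.size:
        raise ValueError("payload is shorter than the IKE header")
    init, resp, _next, version, exchange, _flags, msg_id, length = (
        IKE_HEADER.unpack_from(udp_payload)
    )
    return {
        "initiator": f"{init:016x}",
        "responder": f"{resp:016x}",
        "ike_version": version >> 4,  # major version is the high nibble
        "exchange_type": exchange,
        "message_id": msg_id,
        "length": length,
    }
```

Feed it the UDP payload bytes exported from a capture; if the returned initiator and responder strings match the cookie pair in the gateway list output, the packet belongs to that IKE SA.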

Real-World Use Case Scenario

A site-to-site IPsec VPN to a remote branch has been reported as “down” by the branch manager. You need to determine whether Phase 1 ever negotiated successfully, whether it is currently in negotiation (mid-handshake), or whether iked has not even attempted it (suggesting a trigger or configuration problem). If Phase 1 is up but traffic is not flowing, the problem is in Phase 2 or routing. If Phase 1 is not up, the problem is authentication (PSK mismatch, certificate issue) or reachability (UDP 500/4500 blocked on path).

Live Output Breakdown

FortiGate-100F # diagnose vpn ike gateway list

name: BRANCH-LONDON
version: 2
interface: port1 7
addr: 203.0.113.1:500 -> 198.51.100.5:500
tun_id: 203.0.113.1/::203.0.113.1
network-id: 0
created: 4d21h ago
peer_notif: 0
dpd-expire: 0
auto-up: 1
natt: type= NAT-T dst= remote-port=4500 src-port=4500
IKE SA: created 1/1  established 1/1  time 0/130/240 ms
  id=2 lifetime=86400 rekey=82000 reauth=0
  ESP proposal: AES_CBC-128/SHA1/MODP_2048
initiator: ce8e7d3a1f4b2091:b829e57d3c14a9f0
  cur: initiator ce8e7d3a1f4b2091:b829e57d3c14a9f0
  life: type=seconds bytes=0 active=15482 negotiating=0
  responder: 198.51.100.5 203.0.113.1
  created: 4d21h ago expires: 18h
  DPD seq no: 53, 53
  DPD state: sendack

Key Exam Indicators

Field | What to look for
IKE SA: created 1/1 established 1/1 | Format is created N/M established P/Q. created is total SA attempts; established is successful completions. 1/1 established = Phase 1 is currently up. 1/0 established = Phase 1 was attempted but failed (authentication or proposal mismatch). 0/0 = no attempt has been made.
addr: 203.0.113.1:500 -> 198.51.100.5:500 | If the port is 4500 instead of 500, NAT-T is active (one or both peers are behind a NAT device). IKEv2 uses UDP 4500 for all traffic after the initial IKE_SA_INIT when NAT is detected.
life: active=15482 negotiating=0 | active is the number of seconds this SA has been established. negotiating is the number of seconds spent in the current renegotiation. Non-zero negotiating means a rekey is in progress. If negotiating is stuck at a high value, the rekey is failing silently.
DPD seq no: 53, 53 | DPD (Dead Peer Detection) sent/received counts. Both numbers should be equal if the peer is responding. If sent is much higher than received, the remote peer is not responding to DPD probes and the tunnel may be a “zombie” (Phase 1 state held locally while the remote has lost the SA).
version: 2 | IKEv2. version: 1 = IKEv1. The negotiation process, message exchange count, and re-auth behaviour differ. IKEv2 uses fewer round trips (4 messages for the initial exchange vs. 6 for IKEv1 main mode) and has built-in NAT-T and MOBIKE support.
Initiator cookie ce8e7d3a1f4b2091 | This cookie pair identifies the specific IKE SA instance. If Phase 1 renegotiates, the cookies change. Correlate this with Wireshark/sniffer captures of IKE UDP traffic if you need to match on-wire packets to the FortiGate’s internal state.
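
The IKE SA counters can be turned into a quick triage helper. The Python sketch below assumes the line format shown in the sample output of this section (real output varies between FortiOS versions) and treats the first number of each pair as the one that matters; the function name phase1_state and its return strings are hypothetical.

```python
import re

# Matches e.g. "IKE SA: created 1/1  established 1/1  time 0/130/240 ms"
IKE_SA_RE = re.compile(
    r"IKE SA:\s+created\s+(\d+)/(\d+)\s+established\s+(\d+)/(\d+)"
)


def phase1_state(gateway_output: str) -> str:
    """Classify Phase 1 from 'diagnose vpn ike gateway list' text."""
    match = IKE_SA_RE.search(gateway_output)
    if match is None:
        return "unknown"  # no IKE SA line found in the text
    created_a, created_b, established_a, _established_b = map(int, match.groups())
    if established_a > 0:
        return "up"
    if created_a > 0 or created_b > 0:
        # An SA was created but never completed: look at PSK, certificates,
        # proposals, or blocked UDP 500/4500.
        return "attempted, not established"
    return "no attempt"
```

A result of "no attempt" points at a trigger or configuration problem rather than at authentication, which is the three-way split described in the scenario above.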

diagnose vpn tunnel list

Command Syntax & Architectural Impact

diagnose vpn tunnel list
diagnose vpn tunnel list name <phase2-name>

This command queries iked for Phase 2 IPsec Security Association (SA) state. Where Phase 1 is the control channel, Phase 2 SAs are the actual data-plane encryption contexts — one SA for each direction of traffic (an inbound SA and an outbound SA), each identified by a unique SPI (Security Parameter Index).

The SPI is a 32-bit value carried in every ESP or AH packet header. When the receiving FortiGate sees an inbound ESP packet, it looks up the destination IP + SPI combination to find the correct SA and derive the decryption key. SPI mismatches — where the local unit expects one SPI but the remote is sending another — cause all inbound traffic to fail decryption silently: the packets arrive, the FortiGate cannot find a matching SA, and they are dropped without any policy-level log entry.
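
The lookup described above can be sketched in a few lines of Python. The ESP header really does start with a 32-bit SPI followed by a 32-bit sequence number; everything else here is illustrative: the inbound_sas table, the find_sa helper, and the choice of which sample SPI counts as the inbound one are assumptions made for the example, not FortiOS internals.

```python
import struct
from typing import Optional


def esp_spi(esp_packet: bytes) -> int:
    """Return the SPI, the first 4 bytes of the ESP header."""
    spi, _sequence = struct.unpack_from("!II", esp_packet)
    return spi


# Hypothetical inbound SA table keyed by (destination IP, SPI).
inbound_sas = {
    ("203.0.113.1", 0x3D9F1C04): "BRANCH-LONDON",
}


def find_sa(dst_ip: str, esp_packet: bytes) -> Optional[str]:
    """Return the SA name for an inbound ESP packet, or None if unknown."""
    # None models the silent drop: the packet arrived, but no SA matches
    # its SPI, so there is no key to decrypt it with.
    return inbound_sas.get((dst_ip, esp_spi(esp_packet)))
```

If the remote peer is still encrypting with an SPI from a previous negotiation, find_sa returns None for every packet, which is why the symptom is dropped traffic with no policy-level log.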

The packet counters (enc pkts, dec pkts, enc bytes, dec bytes) are per-SA, per-direction, and reset on each Phase 2 renegotiation. Cross-referencing enc/dec counter asymmetry between the two peers is the primary method for diagnosing one-way VPN traffic.

Real-World Use Case Scenario

Phase 1 is established (confirmed by diagnose vpn ike gateway list). Users at the branch can ping the HQ FortiGate but cannot reach any servers in the HQ LAN. You suspect Phase 2 is either not established or is up in one direction only. You run diagnose vpn tunnel list on the HQ unit and the branch unit simultaneously (via separate console sessions) and compare the enc/dec packet counters for the relevant Phase 2 SA.

Live Output Breakdown

FortiGate-100F # diagnose vpn tunnel list

name=BRANCH-LONDON ver=2 serial=1 203.0.113.1:0->198.51.100.5:0 tun_id=198.51.100.5 tun_id6=::198.51.100.5
bound_if=7 lgwy=203.0.113.1:0 tun_if=ssl.root rgwy=198.51.100.5:0
proxyid=BRANCH-LONDON proto=0 sa=1 ref=4 serial=1
  src: 0:10.10.0.0-10.10.0.255:0
  dst: 0:172.16.0.0-172.16.0.255:0
  SA:  ref=6 options=18200 type=00 soft=0 mtu=1438 expire=28640
  softexpire: 28340 dst-addr=198.51.100.5 src-addr=203.0.113.1
  life: type=seconds-kilobytes bytes=0 active=15542 negotiating=0
  SPI: 00000000 0x8f2a3b01 (2401452801)
  SPI: 00000000 0x3d9f1c04 (1033412612)
  enc pkts=4218 enc bytes=3841920
  dec pkts=0 dec bytes=0
  rekey: lifetime=3600 negotiating=0
  npu_flag=12 npu_rgwy=198.51.100.5 npu_lgwy=203.0.113.1 npu_selid=0
  run_tally: 0

Key Exam Indicators

Field | What to look for
sa=1 | Number of Phase 2 SAs currently active for this proxy-id. sa=0 means Phase 2 has not negotiated: investigate a Phase 2 proposal mismatch (cipher, hash, PFS group) or a proxy-id mismatch.
enc pkts=4218 dec pkts=0 | This is the smoking gun for one-way VPN traffic. Packets are being encrypted and sent (enc pkts incrementing) but nothing is being decrypted (dec pkts=0). The remote end is either not sending traffic, sending to the wrong SPI, or the inbound SA has a different SPI than the remote’s outbound SA, which is the classic SPI mismatch scenario.
SPI: 0x8f2a3b01 and SPI: 0x3d9f1c04 | Two SPI values appear: one for the outbound SA (used when encrypting), one for the inbound SA (used when decrypting). The inbound SPI of the local unit must match the outbound SPI of the remote unit, and vice versa. If they do not match after a failed renegotiation, the SAs are out of sync.
mtu=1438 | The Phase 2 SA’s effective MTU after IPsec overhead is subtracted. This is the value FortiOS uses when deciding whether to fragment or send ICMP Fragmentation Needed. It is derived from the interface MTU and the negotiated transforms rather than set directly; if it looks too high for the path (e.g. 1500 where 1438 is expected), suspect large packets being dropped after encryption.
src: 0:10.10.0.0-10.10.0.255:0 and dst: 0:172.16.0.0-172.16.0.255:0 | The proxy-id (or traffic selector in IKEv2). Only traffic matching this source/destination range is carried by this Phase 2 SA. If a client’s IP falls outside this range, its packets will not use this SA; they may follow the normal routing table instead and potentially be forwarded in clear text.
npu_flag=12 | Offload status of this SA on NP hardware. Fortinet’s hardware acceleration documentation lists 00 as no offload (both SAs handled in the kernel only, expected on VM appliances), 01, 02 and 03 as outbound, inbound and both SAs offloaded, and 20 as an SA that cannot be offloaded because of an unsupported cipher or HMAC. A non-zero value alone therefore does not prove hardware encrypt/decrypt; read the specific value.
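
The counter comparison between the two peers can be written down as a small decision procedure. This Python sketch assumes the enc pkts= / dec pkts= layout used in the sample output of this section (actual FortiOS output formats the counters differently depending on version), and the helper names and verdict strings are hypothetical.

```python
import re


def counters(tunnel_output: str) -> dict:
    """Pull enc/dec packet counts out of tunnel-list text (sample format)."""
    found = {}
    for direction in ("enc", "dec"):
        match = re.search(rf"\b{direction} pkts=(\d+)", tunnel_output)
        found[direction] = int(match.group(1)) if match else 0
    return found


def one_way_verdict(local: dict, remote: dict) -> str:
    """Compare both peers' counters for the same Phase 2 SA."""
    if local["enc"] > 0 and remote["dec"] == 0:
        return "local->remote lost"      # sent here, never decrypted there
    if remote["enc"] > 0 and local["dec"] == 0:
        return "remote->local lost"      # sent there, never decrypted here
    if local["enc"] > 0 and remote["enc"] == 0:
        return "remote receives but never replies"
    if remote["enc"] > 0 and local["enc"] == 0:
        return "local receives but never replies"
    if local["enc"] == 0 and remote["enc"] == 0:
        return "no traffic entering the tunnel"
    return "bidirectional"
```

The two "lost" verdicts point at the path or at an SPI mismatch, while the "never replies" verdicts point at routing, policy, or the servers behind the peer that is not encrypting anything.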

diagnose debug application sslvpn -1

Command Syntax & Architectural Impact

diagnose debug application sslvpn -1
diagnose debug enable

This command sets the debug verbosity for the sslvpnd daemon to its maximum level (-1). sslvpnd is a standalone process that handles the entire SSL-VPN lifecycle: TLS handshake, certificate validation, user authentication (against LDAP/RADIUS/local), group lookup, portal assignment, IP pool allocation (for tunnel mode), and web bookmarks (for web mode). It operates independently of the main firewall policy engine — SSL-VPN users are processed through sslvpnd before their traffic enters the normal policy pipeline.

The debug output is streamed directly to the console in real time. Every authentication attempt produces a structured log chain: TLS negotiation, authentication protocol selection (RADIUS/LDAP/local), bind or lookup result, group membership evaluation, portal assignment, and final accept/reject decision. Each step is annotated with a result code.

The level argument is a bitmask rather than a severity threshold: -1 sets every bit, so sslvpnd emits all of its debug messages. Smaller positive values enable only a subset of message classes and produce less noise, but which bit maps to which messages is daemon-specific. For initial troubleshooting, -1 is the correct starting point.

Real-World Use Case Scenario

Remote users are being rejected by the SSL-VPN portal with a generic “authentication failed” message. The RADIUS server administrator insists their server is healthy and accepting requests from other services. You need to determine: (1) is sslvpnd successfully contacting the RADIUS server, (2) is the RADIUS server returning Accept or Reject, (3) if Accept, is group membership evaluation then failing (e.g. the user is not in the FortiGate’s configured SSL-VPN user group), and (4) which portal is being assigned (or failing to be assigned) after successful authentication?

Live Output Breakdown

FortiGate-100F # diagnose debug application sslvpn -1
FortiGate-100F # diagnose debug enable

[sslvpnd 1234 - SSL] Incoming connection from 203.0.113.200:54211
[sslvpnd 1234 - SSL] SSL negotiation done, peer cert: NONE
[sslvpnd 1234 - AUTH] user 'jsmith' attempting authentication via RADIUS
[sslvpnd 1234 - RADIUS] sending Access-Request to 10.10.20.10:1812 id=42
[sslvpnd 1234 - RADIUS] received Access-Accept from 10.10.20.10:1812 id=42
[sslvpnd 1234 - AUTH] user 'jsmith' authenticated successfully
[sslvpnd 1234 - GRPCHK] checking group membership for user 'jsmith'
[sslvpnd 1234 - GRPCHK] user 'jsmith' NOT found in group 'SSL-VPN-USERS'
[sslvpnd 1234 - POLICY] no portal assignment matched for user 'jsmith'
[sslvpnd 1234 - POLICY] sending deny: no portal policy matched

-- Authentication and group check pass scenario --

[sslvpnd 5678 - SSL] Incoming connection from 198.51.100.9:61200
[sslvpnd 5678 - AUTH] user 'mgarner' authenticated successfully
[sslvpnd 5678 - GRPCHK] user 'mgarner' found in group 'SSL-VPN-USERS'
[sslvpnd 5678 - POLICY] portal 'full-access' assigned to user 'mgarner'
[sslvpnd 5678 - TUNNEL] allocating tunnel IP from pool 'sslvpn-pool': 10.200.0.5
[sslvpnd 5678 - TUNNEL] tunnel established, pushing routes: 10.10.0.0/16

Key Exam Indicators

Line | What to look for
Access-Accept from RADIUS then NOT found in group | RADIUS authentication succeeded but the FortiGate group membership check failed. The RADIUS server is healthy. The problem is the SSL-VPN user group configuration on the FortiGate: either the user is not a member of the group referenced in the SSL-VPN portal policy, or RADIUS group attributes (VSA / Fortinet-Group-Name) are not being sent.
Access-Reject from RADIUS | RADIUS authentication failed server-side. The credentials are wrong, the RADIUS shared secret is mismatched, or the NAS IP is not permitted on the RADIUS server. Cross-reference with RADIUS server logs.
no portal policy matched | The user authenticated, but no authentication rule matched the combination of user/group/realm, so no portal was assigned. The SSL-VPN settings (config vpn ssl settings, then config authentication-rule) must explicitly map this group to a portal defined under config vpn ssl web portal.
allocating tunnel IP from pool | Tunnel mode is active and IP assignment succeeded. If the pool is exhausted, this line reads “no IP available in pool” and the user receives a VPN-connected status with no tunnel IP, so traffic never flows.
SSL negotiation done, peer cert: NONE | The client is not presenting a client certificate. If the portal requires certificate authentication, this results in a deny. If the portal allows password-only auth, this is fine.
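
When the debug stream is long, it helps to scan a saved copy for the first line that explains the rejection. The Python sketch below does that using the marker strings from the sample output in this section; those strings, the FAILURE_MARKERS table, and the first_failure helper are all assumptions for illustration, and real sslvpnd messages are worded differently across FortiOS versions.

```python
# (marker substring, meaning) pairs taken from the sample output above.
FAILURE_MARKERS = [
    ("Access-Reject", "RADIUS rejected the request (server-side failure)"),
    ("NOT found in group", "authenticated, but the group check failed"),
    ("no portal policy matched", "no portal is mapped to this user/group"),
    ("no IP available in pool", "tunnel IP pool is exhausted"),
]


def first_failure(debug_lines):
    """Return the meaning of the earliest failure line, or None if clean."""
    for line in debug_lines:
        for marker, meaning in FAILURE_MARKERS:
            if marker in line:
                return meaning
    return None
```

Returning the earliest match matters because later lines are usually consequences: in the first sample session the missing portal assignment follows from the failed group check, not the other way round.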

Stopping SSL-VPN debug:

FortiGate-100F # diagnose debug application sslvpn 0
FortiGate-100F # diagnose debug disable

Note: setting sslvpn verbosity to 0 (rather than just disable) stops that daemon’s output specifically while preserving any other active debug streams.


Module 6: High Availability Cluster Mechanics


get system ha status

Command Syntax & Architectural Impact

get system ha status

This command reports the cluster state maintained by the FortiGate HA daemons. hatalk handles heartbeat link monitoring and master election (and, with them, split-brain prevention); its companion daemon hasync handles configuration synchronisation, while session table synchronisation provides stateful failover. get system ha status provides a snapshot of everything the cluster knows about itself at the time of the command.

The master election algorithm is the critical concept the exam tests. When the cluster first forms (or after a failover), the winning unit is selected by evaluating the following criteria in order, each acting only as a tiebreaker when the previous criterion produces a tie:

  1. Connected monitored ports: the unit with more monitored interfaces (set monitor under config system ha) still up wins.
  2. HA uptime: the longer-running unit wins. Differences smaller than ha-uptime-diff-margin (300 seconds by default) are ignored, which prevents flapping on near-simultaneous boot.
  3. HA priority: configurable integer (0-255), higher value wins.
  4. Serial number: higher serial number wins (deterministic tiebreaker).

HA override (set override enable under config system ha) is not a criterion of its own. It swaps steps 2 and 3 so that priority is evaluated before uptime, which lets the higher-priority unit reclaim the master role after a reboot even though its uptime is lower.

Understanding this election order is essential: a common misconfiguration is giving the intended primary a higher priority and assuming that is enough. Without override, uptime is evaluated before priority, so the unit with the higher uptime after a maintenance window becomes master, which may be the wrong unit.
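
The election can be modelled as an ordered chain of tiebreakers. The Python sketch below assumes the documented FortiOS order: connected monitored ports, then HA uptime, then priority, then serial number, with override moving priority ahead of uptime and uptime differences inside a 300-second margin treated as a tie. The Unit class, the elect_primary function, and the priority and uptime figures in the example are hypothetical; only the unit names and serial numbers come from this section's sample output.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    name: str
    serial: str
    monitored_ports_up: int
    priority: int    # 0-255, higher wins
    ha_uptime: int   # seconds


def elect_primary(a: Unit, b: Unit, override: bool = False,
                  uptime_margin: int = 300) -> Unit:
    """Return the unit that wins a two-member election."""
    def by_ports() -> int:
        return a.monitored_ports_up - b.monitored_ports_up

    def by_uptime() -> int:
        diff = a.ha_uptime - b.ha_uptime
        return diff if abs(diff) > uptime_margin else 0  # small gaps tie

    def by_priority() -> int:
        return a.priority - b.priority

    def by_serial() -> int:
        return (a.serial > b.serial) - (a.serial < b.serial)

    if override:
        order = [by_ports, by_priority, by_uptime, by_serial]
    else:
        order = [by_ports, by_uptime, by_priority, by_serial]
    for criterion in order:
        result = criterion()
        if result != 0:
            return a if result > 0 else b
    return a  # identical units; cannot happen with distinct serials


core01 = Unit("FW-CORE-01", "FG6H0E5818900001", 4, 200, 1000)
core02 = Unit("FW-CORE-02", "FG6H0E5818900002", 4, 100, 1800)
print(elect_primary(core01, core02, override=False).name)  # FW-CORE-02
print(elect_primary(core01, core02, override=True).name)   # FW-CORE-01
```

With override disabled, the 800-second uptime lead decides the election before priority is ever consulted; enabling override lets the priority of 200 win instead.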

Real-World Use Case Scenario

After a planned maintenance window, the secondary FortiGate was rebooted first and the primary second. When the primary came back up, the secondary (which had been up longer at that moment) won the election and became master. The primary is now incorrectly in slave mode. You need to verify: (1) which unit is currently master, (2) what election criteria caused this result, and (3) whether override is enabled and set correctly to force the intended primary back to master on next election.

Live Output Breakdown

FortiGate-600F # get system ha status
HA Health Status: OK
Model: FortiGate-600F
Mode: HA A-P
Group: 1
Debug: 0
Cluster Uptime: 4 days, 21:03:44
Master Selected using: Connected Monitored Ports, HA Group ID
ses sync: done
ses_pickup: enable
HA uptime:  in sync
Master:
        FW-CORE-02, serialno FG6H0E5818900002, managed_id=0
        Connected Monitored Ports: 4
        Last rebooted: Wed Apr 03 07:48:02 2024
        HA Group ID: 0
        Last FGFM heartbeat:  00:00:01
        HA Primary heartbeat up:  YES
        HA Secondary heartbeat up: YES
Slave:
        FW-CORE-01, serialno FG6H0E5818900001, managed_id=1
        Connected Monitored Ports: 4
        Last rebooted: Wed Apr 03 08:01:22 2024
        HA Group ID: 0

Configuration Status:
        FW-CORE-02(updated 1): in-sync
        FW-CORE-01(updated 1): in-sync

Key Exam Indicators

Field | What to look for
Master Selected using: Connected Monitored Ports, HA Group ID | This tells you which election criteria determined the current master. In this case, both units had equal monitored ports so Group ID was the tiebreaker. If you see Uptime listed here, uptime was the deciding factor, indicating the override setting is probably not enabled.
Master: FW-CORE-02 vs. intended primary | If FW-CORE-01 is the intended primary but FW-CORE-02 is currently master, the fix is either: (a) set override enable and set priority 200 on FW-CORE-01, set priority 100 on FW-CORE-02, and let the cluster renegotiate, or (b) run diagnose sys ha reset-uptime on the current master (FW-CORE-02), which resets its HA uptime so that the other unit wins on uptime at the next election. Rebooting the slave does not help, because a freshly rebooted unit has the lowest uptime.
ses sync: done | Session synchronisation is complete. If this shows syncing for an extended period, the heartbeat links may be saturated or the master is creating sessions faster than they can be synced to the slave.
ses_pickup: enable | After a failover, the new primary will attempt to continue existing TCP sessions using the synchronised session table rather than resetting all connections. If ses_pickup is disable, all sessions reset on failover, which is important to know for exam scenarios about failover behaviour.
Last FGFM heartbeat: 00:00:01 | Time since the last heartbeat was received. Should always be under 1-2 seconds. If this grows, the HA heartbeat link (usually dedicated interfaces or port sharing) has a problem.
HA Primary heartbeat up: YES / HA Secondary heartbeat up: YES | Both heartbeat paths are up. If either is NO, the cluster is operating on a degraded heartbeat topology, which becomes a split-brain risk if the remaining heartbeat link also fails.
Configuration Status: in-sync | Configuration is synchronised. out-of-sync here means a change was made on the master that has not yet replicated to the slave, or a manual change was made directly on the slave (which breaks the golden rule: all config changes must go through the master).

diagnose sys ha checksum cluster

Command Syntax & Architectural Impact

diagnose sys ha checksum cluster
diagnose sys ha checksum show
diagnose sys ha checksum recalculate

FortiOS HA synchronisation works by maintaining an MD5 checksum of each configuration zone (called a “debug zone”) on both cluster members. After every configuration change on the master, the HA synchronisation daemon (hasync) pushes the modified zone to the slave and both units recompute the checksum for that zone. If the checksums match, the zone is in sync. If they diverge, the master schedules a resync for that zone.

diagnose sys ha checksum cluster shows the per-zone checksum for the local unit alongside the checksum received from the peer unit in the most recent heartbeat exchange. A mismatch in any zone identifies exactly which section of configuration is out of sync — without this command, “out-of-sync” is all you know; with it, you know which zone to investigate.

The zones correspond to FortiOS configuration objects: firewall.policy, vpn.ipsec.phase1-interface, system.interface, router.static, etc. Each zone maps to a branch of the config tree. Knowing the zone name lets you target your comparison.

Real-World Use Case Scenario

get system ha status shows out-of-sync for the slave. You have just completed a maintenance window that involved adding 15 new firewall policies, modifying 3 VPN tunnels, and updating 2 static routes. The GUI is showing an HA sync warning. You need to determine: (1) which specific configuration zone(s) diverged, (2) whether this is a normal post-change propagation delay or a genuine sync failure, and (3) whether you need to force a resync or whether the cluster will resolve it automatically.

Live Output Breakdown

FortiGate-600F # diagnose sys ha checksum cluster

==[root]
chassis-id=0 slot-id=0 box-id-code=2
is_manage_master=1

global: 4d2a1f8e3c9b0517

root
  firewall.policy: a3f2b1c4d5e60718
  firewall.address: 9c8b7a6d5e4f3021
  system.interface: 12345678abcdef01
  router.static: deadbeef01234567
  vpn.ipsec.phase1-interface: 8f7e6d5c4b3a2190
  vpn.ipsec.phase2-interface: 1a2b3c4d5e6f7080
  user.local: aaaa1111bbbb2222
  user.group: cccc3333dddd4444

==[slave slot-id=1]
chassis-id=0 slot-id=1 box-id-code=2

global: 4d2a1f8e3c9b0517

root
  firewall.policy: a3f2b1c4d5e60718
  firewall.address: 9c8b7a6d5e4f3021
  system.interface: 12345678abcdef01
  router.static: deadbeef01234567
  vpn.ipsec.phase1-interface: 8f7e6d5c4b3a2190
  vpn.ipsec.phase2-interface: FFFFFFFF00000001    <-- MISMATCH
  user.local: aaaa1111bbbb2222
  user.group: cccc3333dddd4444

Key Exam Indicators

Field | What to look for
Matching checksums across all zones | All zone hashes identical between master and slave = fully synchronised. This is the desired state. get system ha status would show in-sync.
vpn.ipsec.phase2-interface: FFFFFFFF00000001 on slave vs 1a2b3c4d5e6f7080 on master | The hashes differ for the Phase 2 configuration zone only. This tells you exactly which configuration object to compare between units. The fix is typically diagnose sys ha checksum recalculate on the slave (to rule out a stale cached hash), followed by manual comparison of the Phase 2 config on both units.
global: hash matching | The global zone covers config system global settings (hostname, timezone, admin settings). If this mismatches, fundamental system settings diverged, usually the result of someone accessing the slave CLI directly and making a change.
is_manage_master=1 | This output is from the master unit (is_manage_master=1). On the slave, this shows 0. Always confirm which unit you are running the command on before interpreting the “master” vs “slave” sections of the output.
All checksums 00000000 on slave zones | If the slave shows all-zero checksums, the slave has not yet received a full configuration sync: either the slave just joined the cluster, or the heartbeat link is broken and configuration sync has not completed. If it does not recover on its own once the heartbeat is healthy, running execute ha synchronize start on the out-of-sync unit to request a fresh synchronisation may be required.
diagnose sys ha checksum recalculate | Forces the unit to recompute checksums from the actual on-disk configuration rather than using the cached in-memory values. Run this when a mismatch appears but you believe the configurations are actually identical, since a daemon state inconsistency may be producing a false mismatch.

Closing Notes: Putting It All Together

The six modules in this three-part guide form an ordered diagnostic ladder. Start at the top (system health) and work down:

  1. Is the unit healthy? get system status (firmware, HA role, license), get system performance status (CPU, memory, session rate)
  2. Is the interface up at L1/L2? get system interface physical, diagnose hardware deviceinfo nic
  3. Is routing correct? get router info routing-table all (FIB), routing-table database (RIB), get system arp
  4. Are sessions being established? diagnose sys session filter + list
  5. Are packets flowing? diagnose sniffer packet (is traffic arriving?)
  6. Where is it being dropped? diagnose debug flow (which kernel function drops it?)
  7. For VPN specifically? diagnose vpn ike gateway list (Phase 1), diagnose vpn tunnel list (Phase 2 + SPI counters)
  8. For HA desync? get system ha status (role + heartbeat), diagnose sys ha checksum cluster (which zone)

Working through this ladder systematically eliminates entire categories of failure at each step — the exam constructs scenarios that require exactly this kind of structured elimination, and knowing which command answers which question is the skill that separates candidates who pass from those who don’t.


Part of the NSE4 Study Series. For IPsec VPN configuration theory, see Part 8: IPsec VPN. For SSL-VPN configuration, see Part 7: SSL VPN. For HA architecture, see Part 10: High Availability.