NSE5 Part 3: High Availability
Part 3 of the NSE5 series. FortiManager HA is similar in spirit to FortiGate FGCP but very different in mechanics — there is no virtual MAC, no shared IP, and no transparent failover. This post walks through how the cluster actually behaves and the diagnostics you’ll be asked to read on the exam.
What FortiManager HA is and isn’t
A FortiManager HA cluster is up to five units that synchronise their database with one primary that all writes flow through. Secondary units are read-only mirrors. There is no load balancing — you don’t aim a managed FortiGate at “the cluster”, you aim it at the primary’s IP. Failover is manual unless you’ve explicitly configured it otherwise.
That’s the most common misconception on the exam: people assume FortiManager HA behaves like FortiGate HA. It doesn’t.
| FortiGate HA (FGCP) | FortiManager HA |
|---|---|
| Active–passive or active–active | Primary–secondary only |
| Shared virtual MAC | No virtual MAC; each unit has its own IP |
| Automatic failover with sub-second convergence | Manual failover by default |
| Up to 4 cluster members | Up to 5 cluster members |
| Heartbeat over dedicated link | Sync over any IP-reachable interface |
Cluster requirements
- Same hardware model (or VM size).
- Same FortiManager firmware to the build number.
- Reachable peer IP between members on the configured port (TCP/5199 by default).
- Same time — NTP must be synchronised. Out-of-sync clocks cause silent log replay failures.
There is no equivalent of FGCP’s “session pickup” — the device-manager state, ADOM database, and revision history are what’s synchronised, not network sessions.
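The requirements above lend themselves to a pre-flight check before you try to form the cluster. The following is a minimal Python sketch, not a Fortinet tool: the `HaUnit` record, the `preflight` function, the one-second skew threshold, and the lower bound of two units are all assumptions made for illustration. Only the five-unit ceiling and the model, firmware, and time requirements come from the list above, and reachability on TCP/5199 is not tested here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HaUnit:
    """Hypothetical inventory record for one FortiManager unit."""
    name: str
    model: str             # hardware model or VM size
    firmware_build: str    # full version string, including the build number
    clock_offset_s: float  # offset from a common NTP reference, in seconds


def preflight(units, max_skew_s=1.0):
    """Return a list of problems; an empty list means the checks passed.

    max_skew_s is an illustrative threshold, not a documented limit.
    """
    problems = []
    if not 2 <= len(units) <= 5:
        problems.append(f"cluster size {len(units)} is outside 2..5")
    if len({u.model for u in units}) > 1:
        problems.append("hardware model / VM size differs between units")
    if len({u.firmware_build for u in units}) > 1:
        problems.append("firmware build differs between units")
    offsets = [u.clock_offset_s for u in units]
    if offsets and max(offsets) - min(offsets) > max_skew_s:
        problems.append("clock skew exceeds threshold; check NTP")
    return problems
```

Returning a list rather than raising on the first failure means one run reports every mismatch at once, which is usually what you want when building a cluster from scratch.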
Configuration
config system ha
    set mode primary
    set group-id 1
    set group-name "fmg-ha"
    set password ********
    set hb-interface "port1"
    set hb-interval 5
    set file-quota 4096
    config peer
        edit 1
            set ip 10.10.20.11
            set serial-number FMG-VM0000000001
        next
    end
end
On the secondary:
config system ha
    set mode secondary
    set group-id 1
    set group-name "fmg-ha"
    set password ********
    set hb-interface "port1"
    config peer
        edit 1
            set ip 10.10.20.10
            set serial-number FMG-VM0000000000
        next
    end
end
What each line does:
- mode: primary, secondary, or standalone. There is no automatic election.
- group-id / group-name / password: must match across the cluster. A mismatched group-id is the most common reason a freshly built cluster won’t join.
- hb-interval: heartbeat frequency in seconds. Default is 5; the exam expects you to know that.
- file-quota: disk space (MB) reserved for sync data on the primary. If the secondary falls far behind and uses up the quota, sync stops. Increase to 8192 on a busy ADOM.
- config peer: explicit peer list. Must include the serial number of the other unit, not just an IP. This is unusual and is a frequent exam gotcha.
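Since most join failures come down to a mismatched setting or a wrong peer serial number, it can help to compare the two configurations mechanically. Below is a rough Python sketch that parses the `config system ha` text exactly as it appears in this post and checks the must-match settings and the peer serial numbers. The function names `parse_ha` and `join_problems` are made up for this example, and the parser only understands the handful of keywords used above; it is not a general FortiManager config parser.

```python
import shlex

MUST_MATCH = ("group-id", "group-name", "password")


def parse_ha(text):
    """Parse a `config system ha` block into (settings, peers)."""
    settings, peers, peer = {}, [], None
    in_peer_table = False
    for raw in text.splitlines():
        words = shlex.split(raw)  # also strips the quotes around "fmg-ha"
        if not words:
            continue
        if words[:2] == ["config", "peer"]:
            in_peer_table = True
        elif words[0] == "edit" and in_peer_table:
            peer = {}
        elif words[0] == "next" and peer is not None:
            peers.append(peer)
            peer = None
        elif words[0] == "end":
            in_peer_table = False
        elif words[0] == "set" and len(words) >= 3:
            target = peer if peer is not None else settings
            target[words[1]] = " ".join(words[2:])
    return settings, peers


def join_problems(local_cfg, local_serial, remote_cfg, remote_serial):
    """List the reasons these two units would fail to form a cluster."""
    (l_set, l_peers), (r_set, r_peers) = parse_ha(local_cfg), parse_ha(remote_cfg)
    problems = [f"{key} differs" for key in MUST_MATCH
                if l_set.get(key) != r_set.get(key)]
    if remote_serial not in {p.get("serial-number") for p in l_peers}:
        problems.append("local peer list lacks the remote serial number")
    if local_serial not in {p.get("serial-number") for p in r_peers}:
        problems.append("remote peer list lacks the local serial number")
    return problems
```

Fed the primary and secondary blocks from this section, with serial numbers FMG-VM0000000000 and FMG-VM0000000001 respectively, `join_problems` returns an empty list.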
What syncs
The primary continuously sends the secondary:
- Device database (registered FortiGates, model devices, serial numbers).
- Policy packages and objects per ADOM.
- Provisioning templates and CLI templates.
- Scripts.
- ADOM revisions and revision history.
- Admin users, profiles, and SSO configuration.
- Most config system settings, except the local-only ones below.
What does not sync:
- HA configuration itself (each unit has its own).
- Hostname.
- Local interface IPs (each unit needs its own).
- Logs and reports (live FortiAnalyzer-style data, where applicable).
- Backups stored on the device.
The exam will ask “if I add a managed device on the secondary, does it appear on the primary?” — answer: no, because writes only succeed on the primary. The secondary’s GUI is read-only by design.
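The write path is easy to get wrong under exam pressure, so here is a toy Python model of it. This is purely illustrative: `Node` and `ToyCluster` are invented names, and replication is shown as instantaneous, whereas real sync is asynchronous and can lag.

```python
class Node:
    """Hypothetical stand-in for one FortiManager unit."""

    def __init__(self, name):
        self.name = name
        self.devices = set()  # serial numbers of managed devices


class ToyCluster:
    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = list(secondaries)

    def add_device(self, via, serial):
        """Attempt a write through the unit `via`."""
        if via is not self.primary:
            # Secondaries are read-only mirrors: the write never happens,
            # so there is nothing to propagate back to the primary.
            raise PermissionError(
                f"{via.name} is read-only; write on {self.primary.name}")
        self.primary.devices.add(serial)
        for node in self.secondaries:
            node.devices.add(serial)  # simplified: real sync is not instant
```

A write through the primary shows up on every secondary; a write through a secondary raises and changes nothing anywhere, which is the behaviour the exam question is probing.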
Monitor IPs
A monitor IP lets the cluster detect a network partition. Configure one on each unit; in this example 10.10.20.1 is the upstream gateway:
config system ha
    config monitored-ips
        edit 1
            set ip 10.10.20.1
            set interface "port1"
        next
    end
end
When a unit can’t ping its monitor IP, it considers itself partitioned. By default the secondary will not auto-promote — it stays read-only — but the primary will log that it has lost peering. Auto-promotion requires explicitly enabling failover-on-IP-loss.
Manual failover
The most common case. On the current primary:
execute ha-manage demote
On the unit you want to become primary:
execute ha-manage promote
There is no “graceful” semantic: promote and demote take effect immediately. If both units end up believing they are primary (split-brain), whichever unit later rejoins as secondary has its database overwritten, so be careful about the order in which you run the commands.
For a planned failover, the safe sequence is:
- Confirm sync status is healthy (see diagnostics below).
- Demote the current primary.
- Promote the chosen secondary.
- Repoint managed FortiGates’ FGFM target if the IP has changed (it usually has).
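The sequence above can be written down as a small runbook generator that refuses to proceed when sync is unhealthy. This is a sketch only: `planned_failover` and its parameters are hypothetical, it does not talk to any device, and the CLI strings are simply the commands quoted in this post.

```python
def planned_failover(sync_healthy, old_primary, new_primary,
                     primary_ip_changes=True):
    """Return the ordered (where, action) steps for a planned failover.

    sync_healthy is whatever verdict you reached from the diagnostics;
    this function does not query any device itself.
    """
    if not sync_healthy:
        raise RuntimeError("sync is not healthy; fix that before failing over")
    steps = [
        (old_primary, "execute ha-manage demote"),   # demote first
        (new_primary, "execute ha-manage promote"),  # then promote
    ]
    if primary_ip_changes:
        steps.append(("managed FortiGates",
                      f"repoint FGFM target to {new_primary}"))
    return steps
```

Keeping demote ahead of promote in a fixed list is the point of the exercise: it avoids the window in which two units both believe they are primary.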
Repointing managed devices
Because there is no shared IP, every managed FortiGate has the primary’s IP in its central-management config. After a failover that changes the primary’s IP, every device must be told the new IP. From each FortiGate:
config system central-management
    set type fortimanager
    set fmg "10.10.20.11"
end
execute central-mgmt register-device <FMG-serial> ********
Or, if you’ve planned ahead, use a DNS name and let the FortiGates resolve it. The DNS approach is the production-friendly answer the exam looks for.
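If you do have to repoint a fleet, generating the snippet beats typing it per device. The Python sketch below renders the central-management block shown above for a given target and flags whether that target is a literal IP, which would need changing again after the next failover, or a name. The helper names are invented for this example, and how you push the snippet to each FortiGate is left out.

```python
import ipaddress


def is_literal_ip(target):
    """True if target is an IPv4/IPv6 literal rather than a DNS name."""
    try:
        ipaddress.ip_address(target)
        return True
    except ValueError:
        return False


def repoint_snippet(fmg_target):
    """Render the FortiGate central-management block for one target."""
    return "\n".join([
        "config system central-management",
        "    set type fortimanager",
        f'    set fmg "{fmg_target}"',
        "end",
    ])
```

Calling `repoint_snippet("10.10.20.11")` reproduces the block from this section; passing a DNS name instead produces the same block with the name in the `set fmg` line.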
Diagnostics
get system ha status
diagnose ha stats
diagnose ha sync-stat
get system ha status shows the cluster summary — who’s primary, last sync time, peer reachability. diagnose ha sync-stat shows the per-table sync state and is the command to run when “the cluster says it’s healthy but the secondary is missing my recent change”.
For the heartbeat:
diagnose ha hb-info
diagnose debug application haperiod -1
diagnose debug enable
hb-info shows which interface is being used and the last heartbeat seen. The debug switch (-1 = all flags) is verbose; remember to disable when done:
diagnose debug disable
Split-brain recovery
Split-brain on FortiManager is rare but not impossible — usually after a network partition where someone manually promoted the secondary. To recover:
- Decide which unit has the canonical database (usually the one that was primary before the split).
- On the other unit (the one whose data will be lost): demote it, then re-add it as a fresh secondary. Its database is wiped on rejoin and resynced from the surviving primary.
execute ha-manage demote
execute factoryreset
config system ha
    set mode secondary
    ...
end
execute factoryreset is reserved for the rebuild case — it nukes everything. Don’t run it on a healthy unit.
Common exam scenarios
- “Two-unit cluster, primary fails, secondary still read-only.” Expected — failover is manual unless explicitly configured otherwise.
- “Cluster shows healthy, but a recent policy package change isn’t on the secondary.” Sync lag: diagnose ha sync-stat will show the table that’s behind.
- “Secondary cannot rejoin after firmware upgrade.” Firmware mismatch: both units must be on the same build before HA forms.
- “Peer added but cluster still split.” Wrong serial number in config peer, or a mismatched HA password.
Part 4 takes us into ADOMs — administrative domains, the multi-tenancy primitive that drives almost every other configuration choice on the device.