Finding the Hop That's Eating Your Packets: pmtud-sweeper
“The VPN is up, but apps just hang”
Anyone who has run an IPsec or SD-WAN tunnel for long enough has seen this ticket. Ping works. SSH works. Anything substantial — a SQL query, a file transfer, a TLS handshake with a fat certificate chain — stalls dead. The tunnel is fine. Routing is fine. What’s broken is the MTU.
Somewhere on the path, a hop is silently dropping packets that have the Don’t Fragment bit set and are larger than its outgoing interface MTU, and the ICMP “Fragmentation Needed” reply that should fix this (RFC 1191) never makes it back to the sender. So the host keeps retransmitting the same too-big segment forever. That’s the PMTUD black hole, and it’s especially common on tunnels because every encapsulation layer (GRE, IPsec, VXLAN, GENEVE) shaves bytes off the underlying MTU and somebody, somewhere, has a firewall that drops ICMP unreachables for “security.”
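To put numbers on "shaves bytes off", here is a minimal sketch of the arithmetic. The overhead figures assume an IPv4 underlay with no IP options and GENEVE with no option TLVs. IPsec ESP overhead depends on cipher, mode, and padding, so it is left as a caller-supplied value rather than guessed. The names `ENCAP_OVERHEAD` and `effective_mtu` are illustrative and not part of pmtud-sweeper.

```python
from typing import Iterable

# Per-layer overhead in bytes over an IPv4 underlay (assumed: no IP options).
ENCAP_OVERHEAD = {
    "gre": 20 + 4,              # outer IPv4 + base GRE header
    "vxlan": 20 + 8 + 8 + 14,   # outer IPv4 + UDP + VXLAN + inner Ethernet
    "geneve": 20 + 8 + 8 + 14,  # outer IPv4 + UDP + GENEVE (no options) + inner Ethernet
}


def effective_mtu(underlay_mtu: int, layers: Iterable[str], extra: int = 0) -> int:
    """Return the MTU left for inner packets after stacking the given layers.

    `extra` covers overhead this table does not model, e.g. IPsec ESP.
    """
    return underlay_mtu - sum(ENCAP_OVERHEAD[name] for name in layers) - extra


print(effective_mtu(1500, ["gre"]))    # GRE over a 1500-byte underlay
print(effective_mtu(1500, ["vxlan"]))  # VXLAN over a 1500-byte underlay
```

Stack a second layer on top and the inner MTU drops again, which is why tunnelled paths hit the black hole so much sooner than plain routed ones.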
I built pmtud-sweeper to identify exactly which hop on the path is doing the clamping, without having to log into every box on the route.
What it actually does
For each TTL from 1 up to the destination, it binary-searches the largest DF-set IPv4 packet that hop will pass — using whichever probe protocol your network is willing to forward — and reports the result as a per-hop MTU table. Any hop whose MTU is lower than the previous hop’s is flagged as a bottleneck. If the bottleneck is the last reachable hop and the destination isn’t responding at all, you’ve found your black-hole router.
The search itself is the boring part: with DF set, bisect between a lower bound that should always pass (576 bytes) and an upper bound (default 1500), and at each step observe whether you get a reply, an ICMP type 3 code 4 (“Fragmentation Needed, Next-Hop MTU = N”), or silence. Repeat per hop. The interesting part is the probe.
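The bisection can be sketched in a few lines. This is an illustration of the idea, not the tool's actual code: `probe` is a hypothetical callable that sends one DF-set packet of the given size and returns an outcome string plus an optional Next-Hop MTU hint, and the outcome names are made up for this example.

```python
from typing import Callable, Optional, Tuple

# Hypothetical probe result: ("ok", None), ("frag_needed", mtu_or_None), ("silent", None)
ProbeResult = Tuple[str, Optional[int]]


def largest_passing_size(
    probe: Callable[[int], ProbeResult], low: int = 576, high: int = 1500
) -> Optional[int]:
    """Return the largest size in [low, high] that passes, or None if `low` fails."""
    if probe(low)[0] != "ok":
        return None
    outcome, hint = probe(high)
    if outcome == "ok":
        return high
    if outcome == "frag_needed" and hint is not None and low <= hint < high:
        high = hint + 1  # the router says nothing above `hint` will pass
    # Invariant from here on: `low` passes, `high` fails.
    while high - low > 1:
        mid = (low + high) // 2
        outcome, hint = probe(mid)
        if outcome == "ok":
            low = mid
            continue
        high = mid
        if outcome == "frag_needed" and hint is not None and low <= hint < high:
            high = hint + 1  # narrow the window, but still verify `hint` by probing
    return low


# Simulated hop that silently drops anything above 1380 bytes.
print(largest_passing_size(lambda size: ("ok", None) if size <= 1380 else ("silent", None)))
```

The hint only ever lowers the upper bound. The reported value is still confirmed by a probe that actually passed, so a router that lies about its Next-Hop MTU costs a few extra probes rather than a wrong answer.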
Why three probe modes (and a fourth)
A path-MTU tool with one probe type is a path-MTU tool that only works on a lab network. Real paths have firewalls with opinions. So pmtud-sweeper carries four:
ICMP echo (--probe icmp) is the obvious default. It’s cheap, every router knows how to TTL-expire it, and the destination answers with a matching echo reply. The downside: enterprise edges routinely rate-limit or outright drop ICMP, and “no reply” from an ICMP probe doesn’t always mean “this hop dropped your big packet.” It might just mean “this hop hates ICMP.” Good first probe; bad sole probe.
UDP (--probe udp) is what classic traceroute uses, for good reason. UDP to a high, unused destination port elicits an ICMP type 3 code 3 (“Port Unreachable”) from the destination and a type 11 (“Time Exceeded”) from intermediate hops, and the outgoing UDP datagram is treated as just another data packet by every hop in between — meaning your DF-set MTU test is exercising the actual data plane, not the control plane. UDP gets through ACLs that ICMP doesn’t, and crucially, the size you’re testing is paid for by a real UDP payload, so you’re measuring the same path your real traffic takes.
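The ICMP replies named above are easy to tell apart once the outer IP header is stripped. A minimal sketch follows, assuming the caller passes the raw ICMP message; the function name and outcome labels are illustrative, not taken from the tool. In a type 3 code 4 message, RFC 1191 places the Next-Hop MTU in bytes 6 and 7, and routers that predate it leave that field zero.

```python
import struct
from typing import Optional, Tuple


def classify_icmp(icmp: bytes) -> Tuple[str, Optional[int]]:
    """Classify an ICMP message (outer IP header already removed)."""
    if len(icmp) < 8:
        raise ValueError("ICMP message shorter than its 8-byte header")
    # type, code, checksum, then two 16-bit fields whose meaning depends on the type
    icmp_type, code, _checksum, _unused, next_hop_mtu = struct.unpack("!BBHHH", icmp[:8])
    if icmp_type == 0:
        return ("echo_reply", None)
    if icmp_type == 11:
        return ("time_exceeded", None)       # an intermediate hop expired the TTL
    if icmp_type == 3 and code == 3:
        return ("port_unreachable", None)    # the UDP probe reached the destination
    if icmp_type == 3 and code == 4:
        return ("frag_needed", next_hop_mtu or None)  # 0 means no MTU was supplied
    return ("other", None)


sample = struct.pack("!BBHHH", 3, 4, 0, 0, 1380)
print(classify_icmp(sample))
```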
TCP-SYN (--probe tcp-syn) is the probe of last resort, and on tunnels-via-the-public-internet it’s often the only one that works. Many edge firewalls and CGNATs will gleefully drop both ICMP and high-port UDP but happily forward a SYN to TCP/443. Padding the SYN with payload data up to the test size (TCP options alone top out at 40 bytes) lets you measure MTU using a packet shape that is very unlikely to be filtered on port alone, because if the firewall blocks TCP/443 to the destination, nothing else was going to work either. One caveat: some stateful firewalls drop SYNs that carry data, so treat silence here with the same suspicion as silence from ICMP. This is the probe that solves “ping works, my application doesn’t” the fastest.
End-to-end TCP MSS handshake (--probe tcp-mss) is the fourth mode and a different beast: instead of measuring per-hop, it completes a full TCP handshake to the destination on a real port, advertises a large MSS, and watches what gets clamped. This catches MSS-clamping middleboxes (Fortigates, ASAs, ScreenOS firewalls) that quietly rewrite the SYN/SYN-ACK MSS option to fit their tunnel, which is invisible to per-hop probes. If your per-hop sweep says 1500 end-to-end, which implies an MSS of 1460, but tcp-mss reports 1380, you’ve found a clamper.
Between the four, the heuristic is simple: ICMP if you trust the network, UDP for accurate per-hop on internal paths, TCP-SYN over the internet, TCP-MSS to confirm what the application will actually negotiate.
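The comparison between a per-hop MTU and a negotiated MSS is plain header arithmetic. A small sketch, assuming IPv4 and TCP headers without options (20 bytes each); the helper names are illustrative only.

```python
IPV4_HEADER = 20  # bytes, no IP options
TCP_HEADER = 20   # bytes, no TCP options


def expected_mss(path_mtu: int) -> int:
    """MSS an unclamped path with this MTU would be expected to carry."""
    return path_mtu - IPV4_HEADER - TCP_HEADER


def clamped_by(path_mtu: int, negotiated_mss: int) -> int:
    """Bytes by which the negotiated MSS falls short of the expected MSS (0 if none)."""
    return max(0, expected_mss(path_mtu) - negotiated_mss)


print(expected_mss(1500))      # what a clean 1500-byte path should negotiate
print(clamped_by(1500, 1380))  # shortfall when the handshake reports 1380
```

A non-zero shortfall on a path whose per-hop sweep looks clean points at a middlebox rewriting the MSS option rather than at a smaller link.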
What the output looks like
Run it across a tunnel and you get a Rich-formatted table: TTL, hop IP, RTT, the largest DF-set packet that made it, and a column flagging the bottleneck. The interesting row is always the one where MaxMTU drops:
TTL  Hop         RTT     MaxMTU  Note
...
6    10.10.20.1  4.1ms   1500    ok
7    10.10.20.2  4.4ms   1500    ok
8    192.0.2.5   18.2ms  1380    BOTTLENECK (Next-Hop MTU = 1380)
9    192.0.2.6   18.4ms  1380    ok
10   10.50.1.1   19.0ms  1380    ok
Hop 8 is your culprit. That’s the IPsec headend (or its upstream CPE) clamping the path, and the RFC 1191 Next-Hop MTU value tells you exactly what to set on your end to stop the black hole: typically tcp-mss on the Fortigate VPN interface (1340 here, which is 1380 minus 20 bytes each for the IPv4 and TCP headers), or a lower MTU on the underlying interface.
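The flagging rule itself (a hop whose MTU is lower than the previous hop's) is simple enough to sketch against the rows shown in the sample output. The `Hop` type and `find_bottlenecks` function are illustrative, and skipping hops that returned no measurement is an assumption of this sketch rather than documented behaviour.

```python
from typing import List, NamedTuple, Optional


class Hop(NamedTuple):
    ttl: int
    ip: str
    max_mtu: Optional[int]  # None when the hop produced no measurement


def find_bottlenecks(hops: List[Hop]) -> List[Hop]:
    """Return every hop whose measured MTU is lower than the last measured one."""
    bottlenecks = []
    previous_mtu = None
    for hop in hops:
        if hop.max_mtu is None:
            continue  # assumption: unmeasured hops neither flag nor reset the baseline
        if previous_mtu is not None and hop.max_mtu < previous_mtu:
            bottlenecks.append(hop)
        previous_mtu = hop.max_mtu
    return bottlenecks


path = [
    Hop(6, "10.10.20.1", 1500),
    Hop(7, "10.10.20.2", 1500),
    Hop(8, "192.0.2.5", 1380),
    Hop(9, "192.0.2.6", 1380),
    Hop(10, "10.50.1.1", 1380),
]
print(find_bottlenecks(path))
```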
The CLI exits 0 for clean paths and 2 when a bottleneck is detected, which makes it a drop-in CI gate for tunnel monitoring — schedule it, alert on non-zero exit, and you’ll catch the next time a carrier silently changes a path MTU on you.
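A scheduled check built on those exit codes might look like the sketch below. Only the 0 and 2 exit codes and the --probe flag come from the description above. The binary name on PATH, the positional destination argument, and treating every other exit code as an error are assumptions made for illustration, so check the repo's README for the real invocation.

```python
import subprocess
from typing import Tuple


def interpret_exit(code: int) -> str:
    """Map the CLI exit code to a status label."""
    if code == 0:
        return "clean"
    if code == 2:
        return "bottleneck"
    return "error"  # assumption: anything else means the run itself failed


def check_tunnel(
    destination: str, probe: str = "tcp-syn", binary: str = "pmtud-sweeper"
) -> Tuple[str, str]:
    """Run one sweep and return (status, captured stdout).

    The argument layout is hypothetical: destination as a positional argument.
    """
    result = subprocess.run(
        [binary, "--probe", probe, destination],
        capture_output=True,
        text=True,
    )
    return interpret_exit(result.returncode), result.stdout


print(interpret_exit(2))
```

Wiring `check_tunnel` into a cron job or a CI step and alerting on anything other than "clean" gives the monitoring loop described above.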
Where this fits
This is the diagnostic I wished I had during every “the tunnel works but Outlook hangs” ticket I’ve ever taken. It’s also the natural companion to the Fortinet packet-flow series — the same DF-bit and ICMP-unreachable behaviour the Fortigate enforces in its kernel is what every router on the path is supposed to enforce, and this tool is the way to find the one that isn’t.
Repo, install instructions, and the test suite are on GitHub: github.com/MichealGarner/pmtud-sweeper. MIT-licensed. PRs welcome — especially for IPv6, which is the obvious next milestone.