Adding Vendor Route-Table Parsers to route-compare, and Why the Work Lives on a Branch

Companion post: Comparing Route Tables Between Two Sources — the original write-up that introduced the tool.

Source on GitHub: the work-in-progress branch is MichealGarner/route-compare @ multi-format-support. main still ships the Excel-only version described in the companion post.

The gap the original tool left

The first version of route-compare solved a real problem — two Excel sheets in, one colour-coded report out — but it solved it on rails. Both inputs had to be .xlsx, with a prefix column or something close to it, and any other shape had to be flattened down to a spreadsheet first. That is exactly what a lot of audits look like in the wild: someone has already done the cleanup, and the tool’s job is to compare the result.

The other half of the time, that is not what they look like. A network engineer asks “is the edge Cisco’s table the same as the FortiGate aggregate?” and the inputs are a show ip route paste and a get router info routing-table all dump. Or someone is migrating off Junos and the source of truth is the live show route output, not a spreadsheet anyone curated. In all of these cases, the original tool’s first instruction is “open a spreadsheet and copy the prefixes in by hand”. That is a chore, and it is a chore that throws away most of the metadata the dump came with — admin distance, next-hops, source protocol — for nothing.

I called this out as future work in the original post under “cross-format input”: the comparison engine does not care where the prefixes came from. The branch I want to talk about now is what happens when I actually do that work.

What the branch adds

The multi-format-support branch teaches route-compare to read vendor route-table dumps directly. It adds five parsers alongside the existing Excel/CSV path, four for vendor CLI output and one plaintext fallback for “I just have a list of prefixes”:

  • Cisco IOS / IOS-XE / NX-OS: show ip route and show ipv6 route. Skips the legend, ignores the “directly connected” and [admin/metric] decorations, pulls the destination prefix.
  • Junos: show route. Recognises the + = Active Route marker and the inet.0: table-header convention.
  • FortiOS: get router info routing-table all and the IPv6 equivalent. Recognises the Routing table for VRF=N header.
  • Palo Alto PAN-OS: show routing route. Recognises the flags: legend and the VIRTUAL ROUTER: block header.
  • Plaintext: one prefix per line, with comma-/semicolon-/whitespace-separated values on the same line tolerated, and # or ; as comment markers.
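
The plaintext rules are the easiest of the five to illustrate. What follows is a minimal sketch, not the code in parsers.py: the function name parse_plaintext is hypothetical, and because ; is both a separator and a comment marker, the sketch assumes ; only starts a comment at the beginning of a line, while # starts a comment anywhere.

```python
import re

def parse_plaintext(text):
    """Return candidate prefix strings from a plaintext list.

    Hypothetical helper. Assumes ';' is a comment marker only when it is
    the first non-blank character; '#' starts a comment anywhere on a line.
    """
    tokens = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and whole-line comments.
        if not stripped or stripped[0] in "#;":
            continue
        # Drop a trailing '#' comment, if any.
        stripped = stripped.split("#", 1)[0]
        # Commas, semicolons, and whitespace all separate values.
        tokens.extend(t for t in re.split(r"[,;\s]+", stripped) if t)
    return tokens

sample = """# core ranges
10.0.0.0/8, 192.168.0.0/16
; legacy
172.16.0.0/12;10.1.0.0/16  # trailing note
"""
print(parse_plaintext(sample))
```

The result is still a list of strings. Validating that each string is a real prefix is left to the existing normaliser, which matches the split between the input layer and the engine.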

Format is detected automatically from a combination of file extension and content sniffing — Codes: somewhere in the file is a good Cisco hint, Routing table for VRF=N is unambiguous FortiOS — and you can override with --format-a / --format-b when the auto-detection picks the wrong parser.
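
As a rough illustration of that detection order, here is a minimal sketch. The function name detect_format, the format labels, and the precedence (override, then extension, then content) are assumptions for the example, not the branch's actual code. The marker strings come from the parser descriptions above. The most specific markers are checked first, since a Codes: legend is only a hint.

```python
from pathlib import Path

# (label, marker) pairs, most specific first. Labels are hypothetical.
CONTENT_MARKERS = [
    ("fortios", "Routing table for VRF="),
    ("panos", "VIRTUAL ROUTER:"),
    ("junos", "+ = Active Route"),
    ("cisco", "Codes:"),
]

# Extensions that identify a format without looking at the content.
EXTENSION_FORMATS = {".xlsx": "excel", ".csv": "csv"}

def detect_format(filename, text="", override=None):
    """Pick a parser label for one input file (illustrative only)."""
    # An explicit --format-a / --format-b value always wins.
    if override:
        return override
    ext = Path(filename).suffix.lower()
    if ext in EXTENSION_FORMATS:
        return EXTENSION_FORMATS[ext]
    # Fall back to sniffing the content for vendor-specific markers.
    for label, marker in CONTENT_MARKERS:
        if marker in text:
            return label
    # Nothing recognised: treat the file as a bare list of prefixes.
    return "plaintext"

print(detect_format("edge.txt", "Codes: L - local, C - connected"))
```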

Critically, the two inputs do not have to be the same format. You can compare a Cisco edge against a Juniper aggregate, or an Excel CMDB extract against a live FortiGate dump, in a single command:

python route_compare.py \
  --file-a example/cisco_show_ip_route.txt \
  --file-b example/juniper_show_route.txt \
  --label-a CiscoEdge --label-b JuniperEdge \
  --output cisco_vs_juniper.xlsx

That is the headline capability. The comparison engine (set arithmetic plus the overlaps() method on ipaddress network objects) is unchanged. The change is all in the input layer.
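
For readers who have not seen the companion post, the idea behind the engine can be sketched in a few lines. This is an illustration of set arithmetic plus overlap checks using the standard-library ipaddress module. The function name compare_prefixes and the shape of the result are assumptions, and the real engine may classify results differently.

```python
import ipaddress

def compare_prefixes(prefixes_a, prefixes_b):
    """Classify prefixes as common, unique to one side, or overlapping."""
    # strict=False tolerates host bits, e.g. "10.1.1.1/24" -> 10.1.1.0/24.
    nets_a = {ipaddress.ip_network(p, strict=False) for p in prefixes_a}
    nets_b = {ipaddress.ip_network(p, strict=False) for p in prefixes_b}

    only_a = nets_a - nets_b
    only_b = nets_b - nets_a

    # Among the non-identical prefixes, find pairs where one covers the other.
    overlapping = {
        (str(a), str(b))
        for a in only_a
        for b in only_b
        if a.version == b.version and a.overlaps(b)
    }
    return {
        "common": {str(n) for n in nets_a & nets_b},
        "only_a": {str(n) for n in only_a},
        "only_b": {str(n) for n in only_b},
        "overlapping": overlapping,
    }

result = compare_prefixes(
    ["10.0.0.0/8", "192.168.1.0/24", "172.16.0.0/12"],
    ["10.0.0.0/8", "192.168.0.0/16", "203.0.113.0/24"],
)
print(result["overlapping"])
```

Nothing in this function knows whether its inputs came from a spreadsheet or a CLI dump, which is why the parser work can leave it alone.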

Mechanically, that meant pulling the parsing logic out of route_compare.py and into a new parsers.py module, adding fixtures under example/ for every supported format, and standing up a smoke test suite (test_parsers.py) so I could exercise each parser against a known-good capture without running the full comparison pipeline. That separation of concerns is itself an improvement worth keeping — the engine no longer has to know what shape its inputs arrived in — but it is also a non-trivial reshape of the project’s structure, and that brings me to the second question.

Why a branch and not a push to main

The honest short answer is “because the work was big enough that being wrong about it would hurt”. The longer answer is more interesting, because it touches on what main is for on a project like this.

The original route-compare is a single file, around 350 lines, with two third-party dependencies. Anyone who clones it can read the whole thing top to bottom in fifteen minutes. The companion blog post describes what it does and how to install it, and the tool’s contract — Excel in, colour-coded Excel out — is small and stable. That contract is what main exists to keep. If someone pulled main six months ago and wired it into a quarterly audit script, that script should keep working.

The multi-format work puts that contract at risk in three meaningful ways, and each of them argues against pushing it straight to main.

The first is shape. The original input layer is one pd.read_excel call. The branch’s input layer is a dispatcher that picks a parser based on file extension, content sniffing, and an optional override flag, and then runs that parser to produce a list of strings the engine can normalise. That is not a feature flag; it is an architectural change. A change of that size shouldn’t land in main until I am sure the new shape is right, and the only way to be sure of that is to live with it for a while — write a few real comparisons against it, fix the rough edges that show up, and then merge. A branch is exactly the right place to do that living-with.

The second is blast radius. Vendor CLI output is fragile to parse. Every parser in parsers.py is a regex that says “I think this is what a Cisco/Juniper/FortiOS/PAN-OS route line looks like in 2026”, and every one of those regexes will eventually meet a corner case the captures I have don’t exercise — a VRF I haven’t seen, an IPv6 unicast/anycast distinction Junos prints differently, a FortiOS version that changed the table header. When (not if) one of those parsers produces a wrong result, the failure mode is silent miscompare: the tool happily tells you two route tables agree when they don’t. That is a worse failure than the tool exploding, because nobody goes back to check. The Excel-only main doesn’t have that failure mode. Until I’m confident the parsers fail loudly and the test suite catches the obvious regressions, I’d rather the safer-but-narrower tool stay the default.

The third is truthful documentation. The README on main says “Excel in, colour-coded Excel out”. The README on the branch says “Cisco, Juniper, Fortinet, Palo Alto, Excel, CSV, plaintext”. If I push the parser work into main before it’s ready, one of those README files is lying. Either main claims a capability that doesn’t really work yet, or the branch’s README is hiding behind a default-off flag that nobody finds. Keeping the work on a branch lets each side be honest about what it is.

There is also a smaller, more pragmatic reason: the branch’s diff is large and cohesive. New file (parsers.py), new tests (test_parsers.py), six new fixture files under example/, an IMPROVEMENTS.md that pulls the “future work” section out of the blog post and into the repo. Reviewing that as one branch — even if I’m the only reviewer — is much easier than reviewing it as a string of patch commits to main, where each commit either makes the tool half-broken or is too big to have meaningful comments on. The branch is, in effect, a draft pull request that I am free to read in one sitting, push back on, and decide whether to ship.

When the branch will merge

Before this lands on main, three things need to be true.

First, the parsers need to be exercised against real captures, not just the hand-crafted fixtures. Running the tool on actual show ip route output from the lab, on actual FortiOS dumps from the firewalls in the rack, on a real Junos export — that is what shakes out the regex bugs. The fixtures are small, deliberate, and parser-friendly; production output is none of those things.

Second, the silent-miscompare failure mode needs a guard. The branch should at least surface a per-source-file count of “lines that looked like routes but were not parsed” — analogous to the existing Invalid sheet, but at the format-parser layer rather than at the prefix-normaliser layer. If a parser silently drops 12% of the lines in a show ip route dump because it didn’t recognise the format of an OSPF E2 line, the tool should say so loudly, not quietly compare a partial route table.
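
One way such a guard could look is sketched below. Everything here is hypothetical: the helper names, the loose “looks like a route” heuristic, the 5% threshold, and the deliberately naive TOY_PATTERN, which stands in for a real parser regex and fails on an OSPF E2 line in the way described above.

```python
import re

# Loose heuristic (an assumption): a line containing an IPv4-looking token
# probably describes a route, even if the strict parser rejects it.
LOOKS_LIKE_ROUTE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")

# Naive stand-in for a parser regex: one code letter, whitespace, prefix.
# It does not understand two-token codes such as "O E2".
TOY_PATTERN = re.compile(r"^[A-Z]\s+(?P<prefix>\d+\.\d+\.\d+\.\d+/\d+)")

def parse_with_accounting(lines, route_pattern):
    """Return (parsed prefixes, route-looking lines that were not parsed)."""
    parsed, skipped = [], []
    for line in lines:
        match = route_pattern.search(line)
        if match:
            parsed.append(match.group("prefix"))
        elif LOOKS_LIKE_ROUTE.search(line):
            skipped.append(line)
    return parsed, skipped

def check_drop_rate(parsed, skipped, threshold=0.05):
    """Raise if too many route-looking lines were dropped; else return the rate."""
    total = len(parsed) + len(skipped)
    rate = len(skipped) / total if total else 0.0
    if rate > threshold:
        raise ValueError(
            f"{len(skipped)} of {total} route-like lines were not parsed"
        )
    return rate

SAMPLE = [
    "Codes: L - local, C - connected, O - OSPF",
    "O    10.1.0.0/24 [110/2] via 10.0.0.1, GigabitEthernet0/0",
    "C    192.168.1.0/24 is directly connected, GigabitEthernet0/1",
    "O E2 172.16.0.0/16 [110/20] via 10.0.0.1, GigabitEthernet0/0",
]
parsed, skipped = parse_with_accounting(SAMPLE, TOY_PATTERN)
print(len(parsed), "parsed;", len(skipped), "skipped")
```

Calling check_drop_rate(parsed, skipped) on this sample raises, because one of three route-like lines was dropped. Whether the right response is an exception, a warning, or an extra sheet in the report is a design choice the sketch does not settle.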

Third, the IMPROVEMENTS.md backlog should have at least the easiest item — --ignore-defaults — implemented, because the moment you start comparing real route tables you realise that 0.0.0.0/0 “overlapping” everything on the other side is exactly the noise you don’t want in the report.
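
The filter behind such a flag is small. A minimal sketch, assuming it runs on the parsed prefix strings before comparison; the function name drop_default_routes is hypothetical, and treating ::/0 the same way as 0.0.0.0/0 is an assumption beyond what the backlog item states.

```python
import ipaddress

def drop_default_routes(prefixes):
    """Return the prefixes with default routes removed.

    A prefix length of 0 covers the whole address family, so this matches
    0.0.0.0/0 and, by assumption, ::/0 as well.
    """
    kept = []
    for prefix in prefixes:
        if ipaddress.ip_network(prefix, strict=False).prefixlen == 0:
            continue
        kept.append(prefix)
    return kept

# Why the default route is noisy: it overlaps every IPv4 prefix.
default = ipaddress.ip_network("0.0.0.0/0")
print(default.overlaps(ipaddress.ip_network("10.0.0.0/8")))  # True

print(drop_default_routes(["0.0.0.0/0", "10.0.0.0/8", "::/0", "2001:db8::/32"]))
```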

Once those three are done, the branch becomes a routine merge to main and the README catches up to reality. Until then, it stays on a branch, and anyone who needs the multi-format capability can check it out explicitly.

Closing thought

There is a small school of thought that says short-lived feature branches are a code smell on a personal project — just push to main, you’re the only contributor. I think that’s right when the change is small enough that being wrong about it costs you a git revert. It stops being right the moment the change is shaped differently from the thing on main, or the moment a bad version of the change can be silently wrong in a way that produces confidently incorrect output. Both of those apply to the multi-format work.

The branch isn’t there to keep collaborators out. It is there to keep the version of the tool that already works for the original audience — Excel in, colour-coded Excel out — protected, while the more ambitious version of the tool earns its place. When it earns it, it gets the merge.