Most teams learn “high availability” by protecting what is obviously critical: spines, leafs, border routers, firewalls, links, power feeds, and route reflectors. Then they automate the network, often successfully, and quietly introduce a new single point of failure: the automation and control systems that now define how the network is operated.

That blind spot only shows up when the pressure is real.

A fabric controller becomes unreachable right in the middle of a production incident. A change must be made now. The on-call engineer asks two deceptively simple questions:

  1. “Can I touch the devices directly if the controller is down?”

  2. “If the controller is down, do I lose observability too?”

This is the deeper truth: modern networks are not just routers and switches. They’re a coupled system of data plane + control plane + management/automation plane + observability plane. If you want genuine resilience, you must engineer HA across all of them, especially the software stack that governs intent, change, and evidence.
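One way to make those two on-call questions concrete is to write the coupling down and ask what survives when a single component fails. The sketch below is only an illustration under stated assumptions: the capability names, the component names, and the dependency map itself are hypothetical and do not describe any particular fabric or vendor.

```python
# Hypothetical dependency map: each operational capability lists the
# components it needs in order to work. Every name here is an
# illustrative assumption, not a description of a real deployment.
DEPENDS_ON = {
    "push_config_change": {"fabric_controller"},
    "direct_device_cli": {"oob_management_network"},
    "view_telemetry": {"fabric_controller", "telemetry_collector"},
    "forward_traffic": {"spines", "leafs"},
}


def surviving_capabilities(depends_on, failed):
    """Return capabilities whose dependencies are all still up."""
    return sorted(
        capability
        for capability, needs in depends_on.items()
        if not (needs & failed)
    )


def lost_capabilities(depends_on, failed):
    """Return capabilities that need at least one failed component."""
    return sorted(
        capability
        for capability, needs in depends_on.items()
        if needs & failed
    )


if __name__ == "__main__":
    failed = {"fabric_controller"}
    print("still available:", surviving_capabilities(DEPENDS_ON, failed))
    print("lost:", lost_capabilities(DEPENDS_ON, failed))
```

In this made-up map, losing the controller takes out both change and visibility, because telemetry was routed through it. That is the kind of hidden coupling the two questions above are probing for. A real inventory would be larger, but the exercise is the same: list what each capability depends on, fail one component, and see what is left.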
