7. TCO and Risk Modeling: Beyond “Cheapest Box Wins”

At some point in every design review, someone will ask the question that sounds responsible but is often dangerously shallow:

“Which one is cheaper?”

If by “cheaper” they mean “lower line item on the vendor quote,” that’s the wrong question.

A network device is not a one-time purchase; it’s a multi-year cost and risk stream. It consumes power and cooling, it occupies space, it requires humans to babysit it, and when it fails or behaves badly, it sets money on fire in very creative ways.

This is why mature teams talk about Total Cost of Ownership (TCO) and risk, not just unit price.

And again, the smartphone metaphor teaches the same lesson in miniature:

  • The “cheapest phone” might save you money at the cash register.

  • But if the battery dies at noon, the screen cracks on the first drop, and apps constantly crash, the real cost is much higher: lost time, frustration, and replacements.

Let’s translate that into network language.

TCO 101: It’s Not Just the Box Price

Total Cost of Ownership (TCO) is everything you pay, directly and indirectly, over the lifetime of the technology. Broadly, it breaks into three buckets:

  • CAPEX – what you pay to acquire it.

  • OPEX – what you pay to power, house, support, and operate it.

  • Operational cost – the hidden labor and incident cost of living with it.

Let’s break those down with network-specific knobs.

CAPEX: Chassis, Linecards, Optics, Licenses

This is the part everyone sees:

  • Chassis and fixed platforms – list price vs discount, number of slots, base system.

  • Linecards and modules – port densities, 100G/400G/800G, special feature cards (e.g., encrypted ports, deep buffers).

  • Optics and cables – QSFPs, DACs, AOCs; often a large percentage of total CAPEX.

  • Software and feature licenses – base OS, advanced features (MPLS, EVPN, SR, telemetry, security bundles), bandwidth tiers.

This is where Vendor A might look “cheaper” on paper:

  • Lower chassis and linecard pricing.

  • Fewer or cheaper licenses.

  • Aggressive discounts for landing the deal.

If you stop here, you’re basically buying gas-station phones because they’re on sale.
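To make the CAPEX view concrete, here is a minimal sketch of rolling up a quote by line item. Every price, quantity, and discount below is an invented placeholder, and the `QuoteLine` structure is a hypothetical helper, not any vendor's quote format. The point is only to show how optics can end up as a large share of the total once chassis and linecards are heavily discounted.

```python
from dataclasses import dataclass


@dataclass
class QuoteLine:
    """One line of a vendor quote. All figures used below are made-up placeholders."""
    item: str
    list_price: float  # per-unit list price in USD
    quantity: int
    discount: float    # fractional discount, e.g. 0.50 for 50% off list

    def net_cost(self) -> float:
        # Net cost after discount for this line item.
        return self.list_price * self.quantity * (1.0 - self.discount)


def capex_total(lines: list[QuoteLine]) -> float:
    """Sum the discounted cost of every line on the quote."""
    return sum(line.net_cost() for line in lines)


# Hypothetical quote: big discounts on hardware, smaller ones on optics.
quote = [
    QuoteLine("chassis", 50_000.0, 2, 0.50),
    QuoteLine("linecard", 30_000.0, 8, 0.50),
    QuoteLine("optic", 1_000.0, 200, 0.25),
    QuoteLine("license", 10_000.0, 2, 0.0),
]

total = capex_total(quote)
optics_share = quote[2].net_cost() / total

print(f"Total CAPEX: ${total:,.0f}")
print(f"Optics share of CAPEX: {optics_share:.0%}")
```

With these placeholder numbers the optics line is the single largest item on the quote, even though each optic is the cheapest thing on it. Swap in your own quote to see where the money actually goes.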

OPEX: Power, Cooling, Space, Support, Training

Then comes OPEX, the stuff that recurs every month or year:

  • Power:

    • How many watts per chassis, per linecard, per port?

    • Over 5 years, the electricity bill for a power-hungry box can become a meaningful fraction of its purchase price, and with high energy rates or steep hardware discounts it can rival it.

  • Cooling:

    • More power = more heat = more cooling cost.

    • DCs have finite cooling capacity; a box that runs hot can force infrastructure upgrades.

  • Space:

    • RU per box, number of racks required.

    • More racks = more lease cost, more structured cabling, more everything.

  • Support contracts:

    • Vendor TAC contracts, advanced hardware replacement SLAs.

    • These scale with device count and feature sets.

  • Hardware sparing:

    • Spares you must stock for linecards, power supplies, and fabric modules.

    • Tied to your failure expectations and vendor RMA performance.

  • Training and enablement:

    • Courses, labs, certification paths.

    • The time your engineers spend ramping on a new NOS or architecture.

Some vendors win on CAPEX but lose badly on OPEX. Others cost more up front but sip power, fit more per rack, and come with simpler support models.
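The power and cooling line items are easy to estimate on the back of an envelope. The sketch below assumes a constant average draw and folds cooling overhead into a single PUE (Power Usage Effectiveness) multiplier, which is a common simplification rather than a precise facility model. The wattages, the $0.10/kWh rate, and the PUE of 1.5 are illustrative assumptions, not measurements.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap days


def energy_cost(avg_watts: float, usd_per_kwh: float,
                pue: float = 1.5, years: float = 5.0) -> float:
    """Estimate electricity cost for a device, including facility overhead.

    avg_watts   -- assumed constant average draw of the device
    usd_per_kwh -- assumed flat energy price
    pue         -- facility overhead multiplier (cooling, power distribution)
    """
    device_kwh = avg_watts / 1000.0 * HOURS_PER_YEAR * years
    return device_kwh * pue * usd_per_kwh


# Two hypothetical boxes with the same port count but different power draw.
lean_box = energy_cost(avg_watts=2_000, usd_per_kwh=0.10)
hungry_box = energy_cost(avg_watts=5_000, usd_per_kwh=0.10)

print(f"Lean box, 5 years:   ${lean_box:,.0f}")
print(f"Hungry box, 5 years: ${hungry_box:,.0f}")
print(f"Difference per box:  ${hungry_box - lean_box:,.0f}")
```

Under these assumptions the gap is a little under $20,000 per box over five years. That looks small next to a chassis price until you multiply it by the number of boxes in the fleet.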

Operational Cost: Toil, On-Call, and Incidents

This is the part that’s rarely quantified explicitly but hurts the most:

  • Hours per change:

    • How long does it take to safely roll out a config change?

    • Is it automated, or are engineers hand-editing configs?

    • Do you need “war rooms” for routine maintenance?

  • On-call load:

    • How often are people paged?

    • How long do they spend triaging issues related to this platform?

  • Incident frequency and severity:

    • Number of P1/P2 incidents attributable to this tech per year.

    • Blast radius when it fails.

    • Mean time to resolve.
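One way to stop this cost from staying invisible is to put rough numbers on it. The sketch below is a deliberately simple model, assuming a flat loaded hourly rate for engineers and a flat business cost per hour of incident. The `OpsProfile` fields and every figure in the two profiles are hypothetical, so treat the output as a conversation starter, not an audit.

```python
from dataclasses import dataclass


@dataclass
class OpsProfile:
    """Hypothetical yearly operational profile for one platform."""
    changes_per_year: int
    hours_per_change: float
    pages_per_year: int
    hours_per_page: float
    incidents_per_year: int        # P1/P2 incidents attributed to the platform
    mttr_hours: float              # mean time to resolve
    responders_per_incident: int
    outage_cost_per_hour: float    # assumed business impact in USD


def annual_ops_cost(p: OpsProfile, hourly_rate: float) -> float:
    """Labor cost of changes, pages, and incidents, plus incident impact."""
    labor_hours = (
        p.changes_per_year * p.hours_per_change
        + p.pages_per_year * p.hours_per_page
        + p.incidents_per_year * p.mttr_hours * p.responders_per_incident
    )
    impact = p.incidents_per_year * p.mttr_hours * p.outage_cost_per_hour
    return labor_hours * hourly_rate + impact


# Made-up profiles: a noisy platform versus a well-behaved one.
noisy = OpsProfile(100, 4.0, 50, 2.0, 4, 3.0, 5, 10_000.0)
quiet = OpsProfile(100, 1.0, 10, 1.0, 1, 2.0, 5, 10_000.0)

noisy_cost = annual_ops_cost(noisy, hourly_rate=100.0)
quiet_cost = annual_ops_cost(quiet, hourly_rate=100.0)

print(f"Noisy platform: ${noisy_cost:,.0f} per year")
print(f"Quiet platform: ${quiet_cost:,.0f} per year")
```

Even with modest assumptions, incident impact tends to dominate the labor hours, which is why blast radius and mean time to resolve deserve as much scrutiny as the quote itself.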

This is where Box A and Box B can look similar on a quote, yet behave very differently in production:

  • Box A generates a steady drip of weird bugs, manual work, and confusion.

  • Box B mostly behaves and integrates well with your automation, and it doesn’t keep people awake at night.

The difference becomes:

  • Burnout vs sustainability.

  • Firefighting vs engineering.

  • “We need more headcount just to keep this thing alive” vs “We can run this at scale with a small, sharp team.”

That’s OPEX in human form.
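Putting the three buckets together, a rough multi-year comparison can be sketched as cumulative cost per year. The model assumes flat annual costs with no inflation, refresh, or discounting, and all dollar figures for the two vendors are hypothetical placeholders chosen so that the box with the lower quote is not the cheaper one to own.

```python
def cumulative_tco(capex: int, annual_opex: int,
                   annual_ops_cost: int, years: int) -> list[int]:
    """Cumulative cost at the end of each year; index 0 is the purchase itself."""
    yearly = annual_opex + annual_ops_cost
    return [capex + yearly * year for year in range(years + 1)]


# Hypothetical vendors: A wins the quote, B is cheaper to live with.
vendor_a = cumulative_tco(capex=300_000, annual_opex=40_000,
                          annual_ops_cost=176_000, years=5)
vendor_b = cumulative_tco(capex=400_000, annual_opex=30_000,
                          annual_ops_cost=32_000, years=5)

# First year in which B's cumulative cost drops below A's, if any.
crossover = next(
    (year for year, (a, b) in enumerate(zip(vendor_a, vendor_b)) if b < a),
    None,
)

for year, (a, b) in enumerate(zip(vendor_a, vendor_b)):
    print(f"Year {year}: A=${a:,}  B=${b:,}")
print(f"B becomes cheaper in year: {crossover}")
```

In this made-up scenario Vendor A is $100,000 cheaper on day one and more expensive by the end of the first year. Real numbers will be less dramatic or more so, but the shape of the question is the same: compare the whole curve, not the starting point.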
