🌐Network Performance•8 min read•6/2/2026

Why BGP Route Selection Affects Your Application Latency More Than Your Cloud Provider Admits

IDACORE Team


Your application is slow. You've profiled the code, tuned the database indexes, added a CDN, and the p95 latency is still worse than it should be. The hyperscaler dashboard shows "healthy" across the board. So what's left?

Probably the network path your packets are actually taking — which almost certainly isn't what you'd draw on a whiteboard.

BGP is the protocol that decides which networks your traffic crosses on its way across the internet, and it operates on rules that have nothing to do with what's fastest for your users. Understanding why that matters, and what you can actually do about it, is the difference between chasing phantom performance problems and fixing real ones.

How BGP Actually Makes Routing Decisions

The Border Gateway Protocol was designed for policy and reachability, not speed. When a BGP router learns several routes to the same prefix, it picks one best path by walking a ranked list of attributes, roughly in this order. Local preference. AS path length. Origin type. MED values. Then, further down the list, the IGP metric to the next hop, which is the closest thing BGP has to a distance measure and still says nothing about round-trip time.

Notice what's not on that list at all: latency.

This matters because "shortest AS path" means the fewest autonomous system hops, not the fewest physical miles, and definitely not the lowest round-trip time. A route with three AS hops might traverse 2,000 miles of fiber. A route with five AS hops might stay within 200 miles. All else being equal, BGP will prefer the three-hop path unless an operator has configured policy, such as a higher local preference, to override it.
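To make that concrete, here is a minimal Python sketch of the comparison. It is a simplification, not a full implementation of the decision process: it covers only local preference, AS path length, origin, and MED, it ignores vendor-specific steps and later tie-breakers, and it compares MED across all routes even though real routers normally compare it only between routes from the same neighboring AS. The `Route` class, the route names, the ASNs (taken from the documentation range), and the `rtt_ms` values are all made up for illustration.

```python
from dataclasses import dataclass, replace

# Origin codes in preference order: IGP beats EGP beats INCOMPLETE.
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass(frozen=True)
class Route:
    name: str          # hypothetical label for this example
    local_pref: int    # higher wins
    as_path: tuple     # fewer ASNs wins
    origin: str        # "igp", "egp", or "incomplete"
    med: int           # lower wins
    rtt_ms: float      # measured latency; the selection logic never reads it


def preference_key(route):
    """Sort key where the smallest tuple is the most preferred route."""
    return (
        -route.local_pref,
        len(route.as_path),
        ORIGIN_RANK[route.origin],
        route.med,
    )


def best_path(routes):
    return min(routes, key=preference_key)


short_but_slow = Route(
    "three-AS path", 100, (64496, 64497, 64498), "igp", 0, rtt_ms=38.0
)
long_but_fast = Route(
    "five-AS path", 100, (64500, 64501, 64502, 64503, 64504), "igp", 0, rtt_ms=6.0
)

# With equal local preference, the shorter AS path wins despite the higher RTT.
print(best_path([short_but_slow, long_but_fast]).name)

# An operator can override that by raising local preference on the longer path.
preferred = replace(long_but_fast, local_pref=200)
print(best_path([short_but_slow, preferred]).name)
```

The first call picks the three-AS path and the second picks the five-AS path. In neither case does `rtt_ms` influence the result, which is the point.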

Hyperscalers run massive, distributed networks with complex internal routing policies. They're optimizing for their own traffic engineering goals — redundancy, cost, capacity utilization — not for your application's latency profile. When your traffic enters their network from Oregon and needs to reach a resource "in the cloud," it may travel through multiple internal handoffs before it gets anywhere near the compute you're paying for.

The Oregon Region Problem for Pacific Northwest Users

Here's a scenario I've seen play out more times than I can count. A company based in the Treasure Valley runs their infrastructure on AWS us-west-2 — the Oregon region, which is the closest major hyperscaler footprint to Idaho. On paper, that sounds fine. Oregon is right next door.

In practice, the round-trip from a Boise office to a workload in us-west-2 typically runs 20-40ms. Sometimes more, depending on the time of day and what the internet is doing. That's not a bug — it's the expected result of the actual network path, which often involves your traffic leaving Boise, transiting one or more upstream providers, entering AWS's network at a peering point that may or may not be geographically close to you, and then traversing AWS's internal fabric to reach your instance.

Each of those handoffs is a BGP decision. Each one adds latency. And because you don't control any of those routing tables, you have no visibility into why the path looks the way it does on any given day.

Sub-5ms to a data center 85 miles away in Weiser isn't marketing — it's physics. Fewer hops, shorter fiber runs, and direct network adjacency. BGP still makes the decisions, but there are far fewer decisions to make.
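The physics part is easy to check. The sketch below estimates round-trip propagation delay over fiber and nothing else: no queuing, no router processing, no serialization delay. It assumes a refractive index of about 1.47, which is typical for single-mode fiber, and it treats the quoted mileage as the fiber length unless you pass the hypothetical `route_factor` multiplier to account for fiber that doesn't run in a straight line.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
FIBER_REFRACTIVE_INDEX = 1.47  # assumption: typical single-mode fiber
KM_PER_MILE = 1.609344


def fiber_rtt_ms(distance_miles, route_factor=1.0):
    """Round-trip propagation delay in milliseconds over a fiber run.

    route_factor is an illustrative multiplier for indirect fiber routes;
    1.0 means the fiber is exactly as long as the quoted distance.
    """
    distance_km = distance_miles * KM_PER_MILE * route_factor
    speed_km_per_ms = SPEED_OF_LIGHT_KM_PER_S / FIBER_REFRACTIVE_INDEX / 1000
    return 2 * distance_km / speed_km_per_ms


for miles in (85, 200, 2000):
    print(f"{miles:>5} miles: {fiber_rtt_ms(miles):.1f} ms round trip")
```

Under those assumptions, 85 miles works out to a little over 1 ms of round-trip propagation, and 2,000 miles to a little over 30 ms. Everything above that floor comes from equipment, queuing, and detours in the path.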

What Traceroute Tells You (and What It Doesn't)

traceroute is the first tool most engineers reach for when investigating latency. It's useful, but it tells an incomplete story.

traceroute -n 8.8.8.8
 1  10.0.0.1       0.4 ms
 2  198.51.100.1   1.2 ms
 3  203.0.113.45   8.7 ms
 4  * * *
 5  72.14.232.1    22.1 ms
 6  8.8.8.8        23.4 ms

That * * * on hop 4 isn't a broken router. It's a router that is configured not to send ICMP Time Exceeded replies, or that rate-limits them. Common in carrier networks and hyperscaler infrastructure. You're seeing a gap in the path, not the path itself.

More importantly, traceroute shows you only the forward path. Return traffic may take a completely different route due to asymmetric routing, which is normal in BGP networks. Each per-hop time is a round trip to that hop, so it includes a return leg you can't see, and a jump in latency at one hop may come from the return path rather than from the forward link it appears to blame.

For a more complete picture, use mtr (Matt's Traceroute), which combines traceroute and ping into a continuous view:

mtr --report --report-cycles 100 your-server-ip

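The --report-cycles 100 flag gives you 100 samples, which surfaces packet loss and jitter that a single traceroute run will miss. If you're seeing 2-3% packet loss that starts at a mid-path hop and carries through to the final hop, that's a retransmission problem that will show up as latency spikes in your application even if average latency looks acceptable. Loss that shows up at one hop and disappears at the hops after it is usually that router rate-limiting its ICMP replies, not real loss.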
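One caveat when reading per-hop loss: routers often rate-limit the ICMP replies mtr depends on, so loss only counts as a real problem when it persists to the destination. The sketch below applies that rule of thumb to a list of per-hop loss percentages in path order. It assumes you've already pulled those numbers out of the report yourself; the function name, the labels, and the 1% threshold are illustrative, and this is a heuristic, not something mtr does for you.

```python
def classify_loss(loss_percent_by_hop, threshold=1.0):
    """Label each hop's loss using a simple persistence heuristic.

    A hop's loss is treated as real only if every later hop, including
    the destination, also shows loss at or above the threshold.
    """
    labels = []
    for index, loss in enumerate(loss_percent_by_hop):
        if loss < threshold:
            labels.append("clean")
        elif all(later >= threshold for later in loss_percent_by_hop[index:]):
            labels.append("loss persists to destination")
        else:
            labels.append("likely ICMP rate limiting")
    return labels


# Hop 3 drops replies, but everything behind it is clean.
print(classify_loss([0.0, 0.0, 3.0, 0.0, 0.0, 0.0]))

# Loss starts at hop 4 and carries through to the last hop.
print(classify_loss([0.0, 0.0, 0.0, 2.0, 2.5, 2.0]))
```

The first case is the one that sends people chasing a healthy router. The second is the one worth escalating.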

What you can't easily see from either tool: the BGP policy decisions that determined which path your traffic took in the first place. For that, you need looking glass access to the routers involved, or a relationship with someone who runs the network and can actually tell you what's happening.

When Latency Isn't Random — It's Scheduled

One pattern that catches teams off guard: latency that's consistently worse at certain times of day. You check your application metrics, see that p95 response times spike between 8-10am and again in the early afternoon, and assume it's your own infrastructure under load.

Sometimes it is. But sometimes it's BGP convergence events or traffic engineering shifts in your upstream provider's network.

Large carriers and hyperscalers regularly make routing changes — adding capacity, shifting traffic off congested links, responding to peering disputes. These changes trigger BGP updates that ripple through the routing tables of every connected network. During convergence, paths can temporarily shift to suboptimal routes. Traffic that normally takes 8ms might take 35ms for 30-90 seconds while the dust settles.

If those convergence windows happen to align with your peak traffic periods, you'll see latency spikes that look like application problems but are actually network events you have zero control over.

The mitigation strategy here isn't complicated, but it requires visibility: monitor your network path continuously, not just your application metrics. Tools like smokeping or a commercial equivalent will show you baseline latency trends and flag anomalies that correlate with BGP events rather than application load.
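The logic behind "flag anomalies against a baseline" is simple enough to sketch. The example below compares each RTT sample to the median of the samples just before it and flags anything more than twice that baseline. It is not how smokeping works internally. The function name, the 20-sample window, and the 2x factor are arbitrary choices for illustration, and the sample data reuses the 8ms and 35ms figures from the convergence example above.

```python
from statistics import median


def flag_anomalies(samples_ms, window=20, factor=2.0):
    """Return indexes of samples above factor x the median of the prior window."""
    flagged = []
    for index in range(window, len(samples_ms)):
        baseline = median(samples_ms[index - window:index])
        if samples_ms[index] > factor * baseline:
            flagged.append(index)
    return flagged


# Steady 8 ms path, a short excursion to 35 ms, then back to normal.
rtts = [8.0] * 30 + [35.0] * 5 + [8.0] * 30
print(flag_anomalies(rtts))  # [30, 31, 32, 33, 34]
```

Using the median rather than the mean keeps a short spike from dragging the baseline up, so the samples inside the spike still get flagged. Line those indexes up against timestamps and you can see whether the spikes track your traffic peaks or happen on their own schedule.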

What "Managed Cloud" Actually Means for Routing

There's a meaningful difference between running workloads on a hyperscaler that happens to have a region near you and running on infrastructure operated by a team that owns the network path end-to-end.

We've been running BGP since before most cloud providers existed. Our ASN and peering at the Seattle Internet Exchange aren't resume items — they're the reason we understand what's actually happening when your application has a latency problem. When a customer calls because something looks wrong with their network path, I can pull up the routing table and tell them exactly what's happening. Not open a ticket with a tier-1 support team and wait.

That matters practically in a few ways. First, we can make routing policy decisions that optimize for our customers' traffic patterns, not for our own internal cost optimization. Second, when something breaks, the person who answers knows what BGP is. Third, Idaho data residency means your traffic isn't transiting through Oregon or California datacenters on its way to your users — it stays in-state, which shortens the physical path and keeps data where compliance requirements say it needs to be.

A healthcare SaaS company that moved their core infrastructure here cut their AWS bill from roughly $40K/month to $26K — but the latency improvement was what their customers actually noticed. Their application had been running in us-west-2, serving users in Boise and the Treasure Valley with 25-30ms round trips. After the move, they were consistently under 5ms. That's not a marginal improvement. That's the difference between an application that feels local and one that feels like it's somewhere else.

The Part You Can Actually Control

You can't rewrite BGP. You can't tell AWS how to route packets inside their network. But you can make decisions that put your workloads closer to your users and reduce the number of autonomous system boundaries your traffic has to cross.

For applications serving Idaho and Pacific Northwest users, that means thinking carefully about where you actually run compute. The hyperscaler default of "pick the closest region" ignores the reality of how traffic actually flows from specific locations. Closest region on a map isn't the same as lowest latency in practice.

If you're running latency-sensitive workloads — anything real-time, anything with SLA commitments, anything where a 30ms round-trip is materially worse than a 4ms round-trip — the network path deserves the same attention you give to application architecture.

Start with measurement. Run continuous latency monitoring from your actual user locations to your actual infrastructure. Not from a synthetic probe in some cloud provider's network — from the places your users are. If you're serving Boise, measure from Boise. If the numbers are worse than they should be, work backwards through the path before you optimize anything else.
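If you don't have monitoring in place yet, a rough probe is a few lines of Python. This sketch times TCP handshakes, which take about one network round trip, and reports the median and p95. Treat it as a starting point under some assumptions: the function names are made up, port 443 is a guess at what your service listens on, and passing a hostname instead of an IP address will fold DNS lookup time into the numbers.

```python
import socket
import time
from statistics import quantiles


def tcp_connect_ms(host, port, timeout=2.0):
    """Time one TCP handshake in milliseconds; raises OSError on failure."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000


def summarize(samples_ms):
    """Return median and p95, or None when there are too few samples."""
    if len(samples_ms) < 2:
        return None
    cuts = quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94]}


def probe(host, port=443, count=20, interval_s=1.0):
    """Collect handshake timings from wherever this script runs."""
    samples = []
    for _ in range(count):
        try:
            samples.append(tcp_connect_ms(host, port))
        except OSError:
            pass  # a real monitor would count failures separately
        time.sleep(interval_s)
    return summarize(samples)


# Synthetic samples, so the summary logic can be checked without a network.
print(summarize([4.0] * 95 + [30.0] * 5))
```

Run `probe("your-server-ip")` from a machine where your users actually sit, not from a cloud instance, and keep the p95 alongside the median. The tail is where path problems show up first.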

And if the path itself is the problem, the answer isn't tuning — it's proximity.

If you're running applications that serve Idaho users and you're tired of paying Oregon-region prices for Oregon-region latency, talk to us about what your network path actually looks like. We can pull the routing data and show you exactly where the milliseconds are going.