How does data sent over the internet know where to go?

No Stupid Questions@lemmy.world – 108 points – 8 months ago

I saw a map of undersea internet cables the other day and it's crazy how many branches there are. It got me wondering - if I'm (based in the UK) playing an online game with someone in Japan for example, how is the route worked out? Does my ISP know that to get to place X, the data has to be routed via cable 1, cable 2 etc. but to get to place Z it needs to go via cable 3, 4?

There are things called routers that...route traffic. The dumbed-down version is that routers talk to other routers to find out which destinations they know about.

If a game server you connect to matches you with someone in Japan, your computer sends a packet with the address in Japan attached to it. Your home router probably has no clue where that is, so it goes to its upstream router and asks if they know, this process repeats until one figures it out and you get a route.

This all happens very quickly, and it's why people say the Internet routes around damage.

Your home router probably has no clue where that is, so it goes to its upstream router and asks if they know, this process repeats until one figures it out and you get a route.

That's not how that works. The router merely sends the packet to the next directly connected router.

Let's take a simplified example:

If you were in the middle of bumfuck nowhere, USA and wanted to send a packet to Kyouto, Japan, your router would send the packet to another router it's connected to on the west coast*. From your router's perspective, that's it; it just sends it over and never "thinks" about that packet again.
The router on the west coast receives the packet, looks at the headers, sees that it's supposed to go to Japan and sends it over a link to Hawaii.
The router in Hawaii again looks at the packet, sees that it's supposed to go to Japan and sends it over its link to Toukyou.
The router in Toukyou then sends it over its link to Kyouto and it'll be locally routed further to the exact host from there but you get the idea.

This is generally how IP routing works; always one hop to the next.
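
Here's a rough Python sketch of that hop-by-hop idea. The router names and the `NEXT_HOP` tables are made up for illustration (they just mirror the example above), and real routers match on IP prefixes rather than city names. The point is only that each router consults its own table and hands the packet to one neighbour:

```python
# Hypothetical next-hop tables: each router only knows which neighbour
# to hand a packet to for a given destination, not the full path.
NEXT_HOP = {
    "home": {"Kyouto": "west-coast"},
    "west-coast": {"Kyouto": "Hawaii"},
    "Hawaii": {"Kyouto": "Toukyou"},
    "Toukyou": {"Kyouto": "Kyouto"},
}


def forward(source, destination, max_hops=16):
    """Follow next-hop entries one router at a time and return the hops taken."""
    path = [source]
    current = source
    while current != destination:
        # Give up after too many hops, loosely like a packet's TTL running out.
        if len(path) > max_hops:
            raise RuntimeError("too many hops (possible routing loop)")
        # Each router looks only at its own table, never at the whole route.
        current = NEXT_HOP[current][destination]
        path.append(current)
    return path


print(forward("home", "Kyouto"))
```

Nothing in there ever stores the full path as state on a router; the `path` list exists only so we can print what happened.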

What I haven't explained is how your router knows that it can reach Kyouto via the west coast or how the west coast knows that it can reach Kyouto via Hawaii.
This is where routing protocols come in. You can look up how exactly these work in detail but what's important is their purpose: Build a "map" of the internet which you can look at to tell which way to send a packet at each intersection depending on its destination.

In operation, each router then simply looks at the one intersection it represents on the "map" and can then decide which way (link) to send each individual packet over.
The "map" (routing table) is continuously updated as conditions change.

Never at any point do routers establish a fixed route from one point to another or anything resembling a connection; the internet protocol is explicitly connectionless.

* in reality, there will be a few local routers between the gateway router sitting in your home and the big router that has a big link to the west coast

That sounds like quite a messy and inefficient process! But I guess as long as it can be done quickly enough, it doesn't really matter?

I think the previous comment omitted something, which is why you think it's inefficient: routers don't ask for directions for every packet, they record the directions in their routing table.

At the back-bone scale of the internet, routers actually announce the addresses they are responsible for.
Paths are judged by how specific these announcements are. A router announcing a single IP is the preferred destination, compared to a router that announces a block that contains it. So routers will forward a packet to whichever router more accurately describes the destination IP.
This makes up part of the calculated Path Cost of various routes to reach a destination.
If router A tries to contact router D and knows that router B and C can both forward that packet, router A will send it to the router that announced the lowest path cost to D.
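
A tiny Python sketch of that last step, assuming router A just keeps a dict of the costs its neighbours announced (the numbers and the `pick_next_hop` helper are hypothetical, and real protocols compare more attributes than a single cost):

```python
def pick_next_hop(announced_costs):
    """Return the neighbour that announced the lowest path cost."""
    return min(announced_costs, key=announced_costs.get)


# Hypothetical costs that B and C announced to router A for reaching D.
announced_to_a = {"B": 3, "C": 5}
print(pick_next_hop(announced_to_a))

# If C starts announcing a lower cost, A switches to C on its own.
announced_to_a["C"] = 1
print(pick_next_hop(announced_to_a))
```

Router A has no way to check the numbers in this sketch; it simply trusts whatever its neighbours announce.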

It's a lot more complicated than that, but that is how datacenters can disappear from the internet (by wrongly announcing they no longer have a path to the IPs inside the datacenter), or how a small ISP can accidentally route the entire internet through their network (by accidentally announcing extremely low path costs). Both of these have happened.
https://blog.cloudflare.com/october-2021-facebook-outage/
https://blog.cloudflare.com/how-verizon-and-a-bgp-optimizer-knocked-large-parts-of-the-internet-offline-today/

So, the internet is both fragile and resilient.
It can route around damage, but cannot deal with mistakes/maliciousness above a certain "ring" of control.

So, the internet is both fragile and resilient. It can route around damage, but cannot deal with mistakes/maliciousness above a certain “ring” of control.

and this kids, is why we don't like cloudflare, and DNS services.

I'm no expert but it seems like the most efficient way with the given technology! The hops between routers are much less frantic than (I think) you're imagining.

To oversimplify, think of it like boxes in boxes where each box is a router.

Your PC is in the first small box. It says "I want to connect to [IP]" and the box says "I don't have that IP, let me ask the bigger box"

The bigger box (your ISP) says "I don't have it either, I'll ask the big box"

The big box says "I don't have it but based on the address, I know it's in this other big box"

Other big box says the same thing and sends it to another small box. That small box has the PC you're looking for and the packet is delivered!

I’m no expert but it seems like the most efficient way with the given technology! The hops between routers are much less frantic than (I think) you’re imagining.

not just this, it's also worth considering that laying cables is expensive, so you better damn well use them. A system like this also ensures a very wide range of pathing. And in turn, a very spread out use pattern.

https://www.khanacademy.org/computing/computers-and-internet/xcae6f4a7ff015e7d:the-internet/xcae6f4a7ff015e7d:routing-with-redundancy/a/internet-routing

I wouldn't call that "messy and inefficient" but you do you. I'd be curious to know what's a "clean and efficient" solution for you when it comes to routing packets around the planet :)

Ah yeah this and @MelastSB@sh.itjust.works 's comment clarify the routing table thing. Before I was assuming they just blindly forwarded stuff until one router knows where to go, but if they have a rough idea from the IP address prefix that makes more sense.

They don't have a rough idea, they have a very accurate picture of where they should send a packet based on the IP address.
Routers at the internet-backbone scale actually announce the IP addresses they are responsible for, as well as other routes (with an additional path cost added) that they can reach.
So, they match a destination IP to the most specific IP block in their routing table (for a destination of 8.8.8.8 and two entries, 8.8.8.0/24 and 8.8.0.0/16, it will match the 8.8.8.0/24 route) and forward the packet to the router that announced it.
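
If it helps, here is a minimal longest-prefix-match sketch using Python's standard `ipaddress` module. The `ROUTES` table and the router names are invented for the example, and real routers use much faster lookup structures than a linear scan:

```python
import ipaddress

# Hypothetical routing table: prefix -> router that announced it.
ROUTES = {
    ipaddress.ip_network("8.8.8.0/24"): "router-A",
    ipaddress.ip_network("8.8.0.0/16"): "router-B",
    ipaddress.ip_network("0.0.0.0/0"): "default-upstream",
}


def lookup(destination):
    """Return the most specific matching prefix and the router behind it."""
    addr = ipaddress.ip_address(destination)
    # Collect every prefix that contains the address...
    matches = [net for net in ROUTES if addr in net]
    # ...and keep the one with the longest prefix length (most specific).
    best = max(matches, key=lambda net: net.prefixlen)
    return best, ROUTES[best]


for ip in ("8.8.8.8", "8.8.4.4", "1.1.1.1"):
    prefix, router = lookup(ip)
    print(ip, "->", prefix, "via", router)
```

8.8.8.8 sits inside all three blocks but goes with the /24, 8.8.4.4 only fits the /16 and the default, and 1.1.1.1 falls through to the default route.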

Routing at the internet scale is much smarter than routing at the home (or even business) level.

it's not efficient from the perspective of organization. But the thing nobody tells you here is that packets have no predefined route, they take whatever route gets them there optimally. So it's highly redundant, and very fault tolerant. When you consider that, for what it does, it's a highly efficient routing system.

To the point where you could cut an undersea cable, and traffic would still route perfectly fine, albeit probably a lot slower, assuming that isn't your only connection of course. The fact that it works at all is kind of a miracle.
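
A toy Python illustration of the cable-cut point. The link names and costs are made up, and I'm using a plain shortest-path search (Dijkstra) as a stand-in; real internet routing is distributed across many routers and doesn't run one global calculation like this:

```python
import heapq


def shortest_path(links, start, goal):
    """Dijkstra over an undirected graph given as {(a, b): cost}."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    queue = [(0, start, [start])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for neighbour, link_cost in graph.get(node, []):
            if neighbour not in seen:
                heapq.heappush(queue, (cost + link_cost, neighbour, path + [neighbour]))
    return None  # no route left at all


# Hypothetical links and costs, purely for illustration.
links = {
    ("UK", "US-East"): 4,
    ("US-East", "US-West"): 3,
    ("US-West", "Japan"): 5,
    ("UK", "Singapore"): 10,
    ("Singapore", "Japan"): 5,
}
print(shortest_path(links, "UK", "Japan"))

# "Cut" the US-West to Japan cable and look again.
damaged = {pair: cost for pair, cost in links.items() if pair != ("US-West", "Japan")}
print(shortest_path(damaged, "UK", "Japan"))
```

With the cable cut, the traffic still arrives via the other route at a higher cost, and only when every link to the destination is gone does the lookup come back empty.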