Mapping How Data Flows Across the Internet

The internet is a collection of about 90,000 interconnected networks. If each of those networks had a direct connection to every other network, there would be more than 4 billion individual connections. Considering that each network contains multiple computers, the number of possible connections gets very large very fast. Luckily, that's not how the internet works. Instead, a series of ingenious services and protocols takes your internet requests and routes them as efficiently as possible to where they need to go and back again, so you get just what you need from the internet. Let's meet some of these services and protocols so you can better understand how data flows across the internet.
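If you'd like to check that figure, here is a short Python sketch. It assumes every pair of networks shares exactly one link, so n networks would need n(n-1)/2 links. The function name full_mesh_links is just an illustrative choice.

```python
def full_mesh_links(networks: int) -> int:
    """Number of direct links needed if every network connects to every other one."""
    # Each of the n networks pairs with the other n - 1; dividing by 2
    # avoids counting the same link from both ends.
    return networks * (networks - 1) // 2


if __name__ == "__main__":
    print(f"{full_mesh_links(90_000):,}")  # 4,049,955,000
```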

Six pieces that make the internet work

Most of what you do on the internet boils down to sending a request and receiving data in return. There are many different paths your request can take depending on how the site that should receive it is set up. For the most part, the web is set up to make the transmission of data as efficient as possible, but there are a number of factors that can interfere with that. Here's a look at six protocols and services that do their best to ensure the ideal scenario.

BGP

The Border Gateway Protocol (BGP) is the fundamental routing protocol that controls how traffic is directed across the internet. Networks use BGP to tell one another which destinations they can reach, and together those announcements form a map that helps each network decide where information needs to go next to reach its final destination. No single party controls this map. BGP helps your local ISP find the path to the correct regional ISP or CDN so your request is delivered to the right place on the internet.
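To give a rough feel for the idea, the sketch below chooses between two routes to the same destination by preferring the one that crosses the fewest networks. This is a big simplification: real BGP weighs other attributes and each operator's own policies as well. The Route class, the best_route function, and the prefix and network numbers (taken from ranges reserved for documentation) are all illustrative assumptions, not real routes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str      # block of destination addresses
    as_path: tuple   # networks (AS numbers) the route passes through


def best_route(candidates):
    """Pick the candidate with the shortest AS path (a simplification of BGP)."""
    return min(candidates, key=lambda route: len(route.as_path))


if __name__ == "__main__":
    # Hypothetical announcements for one destination, heard from two neighbors.
    routes = [
        Route("203.0.113.0/24", (64496, 64497, 64499)),
        Route("203.0.113.0/24", (64498, 64499)),
    ]
    print(best_route(routes).as_path)  # (64498, 64499)
```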

DNS

The Domain Name System (DNS) translates the domain names you're used to using on the internet (think Speedtest.net) into the numerical IP addresses that internet protocols use to identify computers. In the case of a website using a CDN, the DNS server will typically give the IP address of a nearby CDN server, helping to make the data transfer faster and more reliable. DNS resolution is usually done by sending a request to a resolver run by your local ISP, but it can also be performed by third parties, depending on how your network is configured.
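You can watch this translation happen with a few lines of Python. This minimal sketch asks whatever resolver your system is configured to use, via the standard library's socket.getaddrinfo. The helper name resolve is an assumption made for this example.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the unique IP addresses the system resolver reports for hostname."""
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr); the address
    # is the first element of sockaddr for both IPv4 and IPv6.
    return sorted({sockaddr[0] for *_, sockaddr in results})
```

Calling resolve("speedtest.net") returns a list of addresses. The exact addresses depend on your resolver and where you are, which is how a site using a CDN can hand different visitors different nearby servers.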

CDN

A content delivery network (CDN) is a group of geographically distributed servers and data centers that delivers a lot of the content you see on the internet every day. This content can be anything from the images on a retail site to the movie you are currently streaming. In essence, CDNs store the content you want closer to you.

CDNs are especially useful when you think back to the math on connecting 90,000 networks. If you're in Japan and you're trying to get content that only exists on a server in Nigeria, you're going to be waiting a while because data cannot travel faster than the speed of light. That sounds fast, but the delay adds up when data has to travel across the globe and back. However, if that Nigerian content is mirrored on a CDN, a user in Japan (or anywhere else with a nearby server for that CDN) has quick access to what they were looking for. As noted in the DNS section above, using a CDN decreases your wait time for data and makes delivery more reliable.
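To put rough numbers on that, the sketch below computes the minimum possible round-trip time over a given distance. Both distances are assumptions chosen only for illustration, as is the two-thirds factor commonly used to approximate the speed of light in optical fiber. Real routes are not straight lines, and loading a page takes many round trips, so actual waits are longer.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # in a vacuum


def min_round_trip_ms(distance_km: float, speed_fraction: float = 1.0) -> float:
    """Lower bound on round-trip time over distance_km, in milliseconds.

    speed_fraction scales the speed of light, e.g. about 2/3 for optical fiber.
    """
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_PER_S * speed_fraction)
    return 2 * one_way_s * 1000


if __name__ == "__main__":
    far_km = 13_000  # assumed rough Japan-to-Nigeria distance, illustration only
    near_km = 50     # assumed distance to a nearby CDN server
    print(f"distant origin: {min_round_trip_ms(far_km, 2 / 3):.1f} ms")
    print(f"nearby CDN:     {min_round_trip_ms(near_km, 2 / 3):.1f} ms")
```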

Another way CDNs bring their content closer to you is through Anycast routing. Anycast provides a way for servers in multiple locations to share a single IP address. Regardless of where you are, BGP routes your request to the nearest server (in network terms) sharing the Anycast IP address.
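Here is a toy illustration of that behavior: two users ask for the same address and end up at different servers. The shared address (from a range reserved for documentation), the site names, and the hop counts are all made up, and counting hops is only a stand-in for the route selection that BGP actually performs.

```python
ANYCAST_IP = "192.0.2.53"  # documentation address standing in for a shared Anycast IP


def pick_site(hops_to_site: dict) -> str:
    """Return the site reachable over the fewest network hops from one user."""
    return min(hops_to_site, key=hops_to_site.get)


if __name__ == "__main__":
    # Hypothetical hop counts from two users to the same three server sites.
    user_in_tokyo = {"tokyo": 2, "frankfurt": 6, "lagos": 7}
    user_in_lagos = {"tokyo": 7, "frankfurt": 4, "lagos": 1}
    print(ANYCAST_IP, "->", pick_site(user_in_tokyo))  # 192.0.2.53 -> tokyo
    print(ANYCAST_IP, "->", pick_site(user_in_lagos))  # 192.0.2.53 -> lagos
```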

ISP

We typically think of an internet service provider (ISP) as the company we contract with to bring internet service to our homes. This is true, but ISPs also build out and maintain the infrastructure that carries internet data from one location to another, and they serve as waypoints as data is transferred across the internet. In fact, if your data needs to travel across the globe, it will pass through your local ISP (the one you contract with), a regional ISP (which can take the data from your locality across the country), and a global ISP (which transfers data between countries).

NAT

Network address translation (NAT) translates the private IP addresses on your network into a single public IP address before your data is sent to an external network. This became essential under IPv4, which has only about 4.3 billion possible addresses, far fewer than the number of devices now online. NAT is usually performed by your Wi-Fi router.
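The bookkeeping a router does for this can be sketched in a few lines. The example below assumes the port-based flavor of NAT common on home routers, where each private device and port is assigned its own port on the shared public address. The SimpleNat class, the starting port, and all addresses are hypothetical.

```python
class SimpleNat:
    """Toy port-based NAT: many private addresses share one public address."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_out(self, private_ip: str, private_port: int):
        """Rewrite the source of an outgoing packet to the shared public address."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def translate_in(self, public_port: int):
        """Find which private device a reply on public_port belongs to."""
        return self.inbound.get(public_port)


if __name__ == "__main__":
    nat = SimpleNat("198.51.100.7")
    print(nat.translate_out("192.168.1.10", 51000))  # ('198.51.100.7', 40000)
    print(nat.translate_out("192.168.1.11", 51000))  # ('198.51.100.7', 40001)
    print(nat.translate_in(40001))                   # ('192.168.1.11', 51000)
```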

TCP

The Transmission Control Protocol (TCP) breaks data into packets before sending them across the internet, then reassembles those packets in the correct order on the receiving end.
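The split-and-reassemble idea can be shown with a small sketch. It numbers each chunk by its position in the original data, shuffles the chunks to mimic out-of-order arrival, and then puts them back together. This is not real TCP, which also handles lost packets, acknowledgments, and congestion. The function names and packet size are illustrative.

```python
import random


def split_into_packets(data: bytes, size: int):
    """Cut data into (sequence_number, chunk) pairs of at most size bytes."""
    return [(offset, data[offset:offset + size]) for offset in range(0, len(data), size)]


def reassemble(packets):
    """Rebuild the original bytes, whatever order the packets arrived in."""
    return b"".join(chunk for _, chunk in sorted(packets))


if __name__ == "__main__":
    message = b"How data flows across the internet"
    packets = split_into_packets(message, size=8)
    random.shuffle(packets)                # simulate out-of-order arrival
    print(reassemble(packets) == message)  # True
```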

How these pieces get your data to the right place

Let's look at three examples of how the six pieces above combine to get you the data you are seeking on the internet. These are just examples, as much of the routing on the internet is dynamic and responds to changes such as network congestion. It's also important to note that different websites and services will have different arrangements that will change these flows.

How streaming works

[Figure: How internet data flows when you stream a show]

It is common for a streaming service to use a CDN that connects directly with your local or regional ISP. Netflix, for example, ships cache servers to ISPs, which host them in their own data centers. Popular shows can then be streamed directly from a local ISP, whereas a request for something less popular will instead go to the CDN or to the streaming provider directly.

What it looks like to visit a website with a CDN

[Figure: How internet data flows when using a site with a CDN]

As discussed above, many websites use CDNs to supply the data you request more efficiently and effectively. This makes a lot of web experiences more local than we think. It also makes the internet faster because the content you are requesting is closer. Behind the scenes, the CDN will either serve you a copy of the page it already has or reach out to the site to fetch it. The illustration above shows the first case, but it's possible that a regional or even a global ISP will be involved, depending on what's stored on the CDN and what's needed at the time.
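Those two cases, serving a stored copy or going back to the site, can be sketched as a tiny cache. This is only a toy model under simple assumptions: it ignores expiry, storage limits, and everything else a real CDN manages. The EdgeCache class and the sample page are hypothetical.

```python
class EdgeCache:
    """Toy CDN server: serve a stored copy if present, else fetch from the origin."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # callable: path -> content
        self.store = {}

    def get(self, path: str):
        if path in self.store:
            return self.store[path], "hit"      # served from the nearby copy
        content = self.fetch_from_origin(path)  # longer trip to the site itself
        self.store[path] = content              # keep a copy for the next visitor
        return content, "miss"


if __name__ == "__main__":
    # Hypothetical origin site, represented here as a plain dictionary.
    origin_site = {"/index.html": "<h1>Hello</h1>"}
    cdn = EdgeCache(origin_site.__getitem__)
    print(cdn.get("/index.html"))  # ('<h1>Hello</h1>', 'miss')
    print(cdn.get("/index.html"))  # ('<h1>Hello</h1>', 'hit')
```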

What's different when visiting a website with no CDN

[Figure: How internet data flows when using a site with no CDN]

You usually can't tell whether a website or service uses a CDN. But if you are requesting information from a site that does not use a CDN and you are far from the site's servers, you may notice significant lag. This is because your request and the replying data have to travel the entire distance between you and the server. As you can see from the image above, that involves many more touchpoints.

Now that you know a bit more about how data travels over the internet you can see two things. The first is how amazing the system is that transmits so much data pretty much seamlessly. The second is that there are a lot of places where things could potentially go wrong, even if for a short time, if one of these points fails. Remember this when you're troubleshooting your Wi-Fi, because sometimes the trouble you're having is with something larger that is out of your control. Check Downdetector® for alerts on larger internet and service outages and for more information about smaller ones.