A Content Delivery Network, commonly known as a CDN, is a geographically distributed network of edge servers (or “points of presence”, PoPs) linked to the origin server (the server where the website or web application is hosted). Its main objective is to deliver web content and assets faster and more reliably by reducing the physical distance between the source of the data and the end-user. It does this by having the edge servers cache a copy of the web content from the origin server: the edge servers cache the website’s static content, such as HTML pages, style sheets, JavaScript files, images, and videos, and deliver this content to end-users.
Without a CDN, the origin server needs to respond to every user’s request and deliver all content to users directly. This may result in very high and persistent traffic on the origin server, increasing the possibility of server failure and decreasing website reliability. With a CDN, since the edge servers cache a copy of the web content, they can respond to the user’s request and deliver static content to users instead of the origin server. This offloads traffic from the origin server, reducing hosting bandwidth costs and the risk of server failure. Since edge servers are geographically closer to the user, this also reduces latency and therefore website loading time, which has a positive effect on website performance and ultimately on the user’s experience. And since the PoPs are distributed all over the world, users anywhere can access the same web content at comparable speed, regardless of where the origin server is located.
Aside from those direct advantages, CDNs are also used to prevent service interruptions and even improve server security. With all of these advantages, CDNs have become a very popular way to relieve the major pain points of traditional web hosting, benefiting both website owners and end-users. This is also why CDNs are widely used by today’s internet giants, such as Netflix, Facebook, and Amazon.
How does a CDN work?
One of the main contributing factors to website performance is distance. The shorter the distance between the end-user and the origin server, the less time it takes for the web content to be delivered to the end-user. Distance is the factor that a CDN tries to address. A CDN significantly cuts down the distance between the “web content source” and the end-user by distributing edge servers around the globe and by caching the static web content from the origin server on them.
To see how a CDN (i.e. distributing edge servers around the globe and caching static content) improves website performance, let’s walk through the step-by-step process of web content access using the illustration below.
In this scenario, an end-user in Asia wants to access a website that is hosted in North America, and the website owner uses a CDN. This is what happens behind the scenes when the end-user tries to access the website:
- The end-user from Asia enters the domain name “www.mlytics.com” in the address bar of the web browser.
- The browser routes the request (query) for www.mlytics.com to the Domain Name System (DNS).
- The DNS will then proceed with the name resolution (check DNS lookup – steps 3 to 8) and return the CNAME of the CDN DNS server.
- The CDN DNS will also perform its own look up and will return the IP address of the closest edge server (i.e. Asia edge server) to the DNS server.
- The DNS will return the IP address of the closest edge server (i.e. Asia edge server) to the web browser.
- Now that the web browser has the IP address, the web browser then makes an HTTP GET request to the edge server.
- When a CDN is in place, the browser communicates with an edge server instead of the origin server. Edge servers are distributed all over the world, but the browser communicates with the one located nearest to the end-user.
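The "closest edge server" decision in the steps above can be sketched as a simple nearest-neighbour lookup. This is only an illustration under stated assumptions: the PoP names and coordinates are hypothetical, and real CDN DNS systems typically rely on signals such as measured latency, resolver location, or anycast routing rather than pure geographic distance.

```python
import math

# Hypothetical PoP locations as (latitude, longitude); not a real CDN's footprint.
EDGE_SERVERS = {
    "asia-singapore": (1.35, 103.82),
    "europe-frankfurt": (50.11, 8.68),
    "north-america-virginia": (38.95, -77.45),
}

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def closest_edge(client_location, edges=EDGE_SERVERS):
    """Return the name of the edge server nearest to the client."""
    return min(edges, key=lambda name: haversine_km(client_location, edges[name]))

# A client in Tokyo is mapped to the Asian PoP.
print(closest_edge((35.68, 139.69)))
```

With these sample coordinates, a client in Tokyo resolves to the Singapore PoP and a client in Paris to the Frankfurt PoP.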
If the edge server has the cached static content of www.mlytics.com, the static and dynamic content will follow different paths.
The static content:
- The edge server in Asia directly sends the cached static content to the web browser (i.e. skipping steps 7 and 8). This is the main way a CDN speeds up website performance.
- If it fails to locate the content, it will search for the content from other edge servers within the CDN platform. And if the content is still unavailable, the edge server will act as a reverse proxy, and send the request back to the origin server (step 7), fetch the content (step 8), cache the content to serve future requests, and finally send the content to the web browser.
The dynamic content:
- The edge server in Asia will request the dynamic content from the origin server in North America.
- The origin server in North America will then deliver the dynamic content to the edge server in Asia.
- The dynamic content will be delivered from the edge server in Asia to the web browser, and the web browser displays the webpage for the end-user.
It should be noted, however, that caching on edge servers happens on the first request. Edge servers initially do not have the static content cached, so the first request always has to fetch all content from the origin server before the edge server can cache the static content. In other words, on the first request both the dynamic and the static content follow steps 7 to 9, which makes the first request slower than subsequent requests.
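The first-request behaviour can be illustrated with a minimal pull-through cache. This is a sketch, not how any particular CDN is implemented: the `Origin` and `EdgeServer` classes, the file suffix list, and the HIT/MISS/BYPASS labels are assumptions made for the example, and the lookup on neighbouring edge servers described earlier is left out for brevity.

```python
class Origin:
    """Stand-in for the origin server; counts how often it is contacted."""

    def __init__(self, files):
        self.files = files
        self.requests = 0

    def fetch(self, path):
        self.requests += 1
        return self.files[path]


class EdgeServer:
    """Pull-through cache: fetch from the origin on a miss, serve from cache on a hit."""

    # Assumed set of suffixes treated as static content.
    STATIC_SUFFIXES = (".html", ".css", ".js", ".png", ".jpg", ".mp4")

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, path):
        if path in self.cache:
            return self.cache[path], "HIT"
        body = self.origin.fetch(path)
        if path.endswith(self.STATIC_SUFFIXES):
            self.cache[path] = body      # stored for future requests
            return body, "MISS"
        return body, "BYPASS"            # dynamic content is never stored


origin = Origin({"/style.css": "body { margin: 0 }", "/api/cart": '{"items": 2}'})
edge = EdgeServer(origin)

print(edge.get("/style.css")[1])  # first request goes to the origin
print(edge.get("/style.css")[1])  # second request is served from the edge
print(edge.get("/api/cart")[1])   # dynamic content always goes to the origin
print(origin.requests)            # number of requests that reached the origin
```

Of the three requests, only two reach the origin: the first request for the stylesheet and the dynamic cart request.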
In addition, which static content can be cached is also determined by the cache rules defined on the edge server. Depending on the rules, not all static content may be cached.
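One way to picture cache rules is as an ordered list of path patterns, each with a time-to-live. The rule format, patterns, and TTL values below are hypothetical and do not reflect any specific vendor's configuration syntax; they only show how a rule set can leave some static content uncached.

```python
from fnmatch import fnmatchcase

# Hypothetical rules, checked in order: (path pattern, TTL in seconds or None for "do not cache").
CACHE_RULES = [
    ("/admin/*", None),       # never cache anything under /admin/
    ("*.css", 86400),         # stylesheets: one day
    ("*.js", 86400),          # scripts: one day
    ("/images/*", 604800),    # images: one week
]

def cache_ttl(path, rules=CACHE_RULES):
    """Return the TTL of the first matching rule, or None if the path is not cached."""
    for pattern, ttl in rules:
        if fnmatchcase(path, pattern):
            return ttl
    return None

print(cache_ttl("/static/app.js"))    # matches "*.js"
print(cache_ttl("/admin/app.js"))     # the /admin/ rule wins because it comes first
print(cache_ttl("/index.html"))       # no rule matches, so this page is not cached
```

Because the first matching rule wins, rule order matters: here the `/admin/*` exclusion takes precedence over the `*.js` rule.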
Although dynamic content is not stored on the edge server, a CDN can still help speed up the delivery of dynamic content from the origin server through content compression. With content compression, text-based files generated by the origin server (e.g. js, html, css, xml, json, and shtml) are made significantly smaller so that they can reach the client device more quickly and efficiently.
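The effect of compression on a text-based response can be checked with Python's standard `gzip` module. The JSON payload below is made up for illustration, and the exact ratio depends heavily on the content; CDNs may also use other algorithms such as Brotli.

```python
import gzip
import json

# Made-up dynamic response: a repetitive JSON document, as API output often is.
payload = json.dumps(
    [{"id": i, "status": "in_stock", "price": 9.99} for i in range(500)]
).encode("utf-8")

compressed = gzip.compress(payload)

print(f"original:   {len(payload)} bytes")
print(f"compressed: {len(compressed)} bytes")

# Compression is lossless: the client recovers exactly the original bytes.
assert gzip.decompress(compressed) == payload
```

Already-compressed formats such as JPEG images or MP4 video gain little from this step, which is why compression is usually applied to text-based files only.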
CDN Infrastructure
Edge Server or Point of Presence
Each edge server or Point of Presence (PoP) is strategically placed at the exchange points between different networks (i.e. internet exchange points, or IXPs). IXPs are data centers where different Internet service providers (ISPs) connect in order to provide each other access to internet traffic originating from different networks. By connecting to these high-speed and highly interconnected locations, PoPs can communicate effectively with end-users in their geographic vicinity, reducing both the round-trip time for data delivery and its cost. Each PoP typically contains numerous caching servers.
Caching server
Each PoP contains a number of caching servers. These servers’ main function is to store and deliver cached files (i.e. static content) to nearby end-users. By caching web content, they reduce the bandwidth consumption of the origin server and at the same time speed up website load times. Like typical computers, caching servers hold multiple storage and memory devices so that they can cache files securely and serve them at great speed.
CDN DNS Configuration
Aside from the physical infrastructure, you also need to modify the DNS configuration of the root domain (and its subdomains) that you want to connect to the CDN. The purpose is to make the CDN your default inbound gateway for all incoming requests and traffic. In other words, the DNS will route all visitors to the CDN instead of routing them to the origin server. Activating a CDN generally follows 2 steps:
- Modify the A record of your root domain to point to an IP address within one of the CDN’s IP ranges.
- Modify the CNAME record of your subdomain to point to a “CDN-assigned edge address”.
CDN vendors and DNS providers may differ in the specifics of how you need to configure your DNS for CDN activation, so check the step-by-step instructions provided by your CDN vendor.
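The two record changes above can be modelled with a small in-memory lookup table to show how a visitor ends up at the CDN. Everything here is a placeholder: `example.com` stands in for your domain, `customer-1234.edge.cdn.example` for a CDN-assigned edge address, and the IP addresses come from a range reserved for documentation. A real lookup is performed by DNS resolvers, not by code like this.

```python
# Hypothetical DNS records after CDN activation.
RECORDS = {
    ("example.com", "A"): "203.0.113.10",                          # root domain -> CDN IP
    ("www.example.com", "CNAME"): "customer-1234.edge.cdn.example",  # subdomain -> edge address
    ("customer-1234.edge.cdn.example", "A"): "203.0.113.25",       # edge address -> edge server IP
}

def resolve(name, records=RECORDS, max_hops=5):
    """Follow CNAME records until an A record is found, then return its IP address."""
    for _ in range(max_hops):
        if (name, "A") in records:
            return records[(name, "A")]
        if (name, "CNAME") in records:
            name = records[(name, "CNAME")]
        else:
            raise LookupError(f"no record for {name}")
    raise LookupError("CNAME chain too long")

print(resolve("example.com"))      # answered directly by the A record
print(resolve("www.example.com"))  # follows the CNAME to the CDN edge address first
```

In both cases the address returned belongs to the CDN rather than to the origin server, which is what makes the CDN the inbound gateway.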
Benefits of CDN
1. Improved performance and lower latency
Distributing content to major network interconnection points brings it closer to end-users, who therefore experience faster loading times and lower latency. According to Ilya Grigorik, a Google web performance engineer, “Using a CDN allows us to terminate the connection close to the user, which can significantly reduce the cost of TCP and TLS handshake. For the best results, you should be using a CDN to serve both static and dynamic content.”
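As a rough illustration of the handshake savings described in the quote above, the sketch below counts round trips: one for the TCP handshake, one for a full TLS 1.3 handshake (two for TLS 1.2), and one for the HTTP request and response. The round-trip times are assumed values, server processing and DNS time are ignored, and the dynamic case assumes the edge server already holds an open connection to the origin.

```python
def setup_and_request_ms(rtt_ms, tls_round_trips=1):
    """TCP handshake (1 RTT) + TLS handshake + one HTTP request/response (1 RTT)."""
    return rtt_ms * (1 + tls_round_trips + 1)

ORIGIN_RTT_MS = 200  # assumed round-trip time from the user to a distant origin
EDGE_RTT_MS = 20     # assumed round-trip time from the user to a nearby edge server

direct = setup_and_request_ms(ORIGIN_RTT_MS)         # user talks to the origin directly
static_via_edge = setup_and_request_ms(EDGE_RTT_MS)  # cached content served by the edge
# Dynamic content: handshakes end at the edge, then one trip from edge to origin
# over an existing connection (assumed to cost roughly the same 200 ms).
dynamic_via_edge = static_via_edge + ORIGIN_RTT_MS

print(direct, static_via_edge, dynamic_via_edge)
```

Under these assumptions the direct request takes 600 ms, a cached asset 60 ms, and a dynamic response 260 ms, which shows why terminating connections near the user helps even for content that cannot be cached.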
2. Better user experience and user retention
End-users are more likely to abandon a session on a website that performs slowly. Deploying PoPs in various geographical locations speeds up website access and delivers high-quality, rich multimedia content reliably. Faster websites tend to have lower bounce rates, while reliable delivery of high-quality content encourages users to spend more time exploring the website and to visit it again.
3. Improved availability and resilience
If any point of presence (PoP) is down due to a huge spike in traffic or a hardware failure, requests are simply routed to the next closest or best-performing PoP available. This level of redundancy provides resilience and high availability (providers commonly target 99.99% or more). CDN providers also manage internal failover and disaster recovery systems that automatically route traffic around downed servers. Together, these features help keep the web experience consistent around the clock.
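The failover behaviour can be sketched as "pick the best PoP among those that are currently healthy". The `PoP` structure, the latency figures, and the health flag are assumptions for the example; a production CDN would derive them from continuous health checks and routing data.

```python
from dataclasses import dataclass

@dataclass
class PoP:
    name: str
    latency_ms: float   # assumed measured latency from the user to this PoP
    healthy: bool = True

def route(pops):
    """Return the healthy PoP with the lowest latency."""
    candidates = [p for p in pops if p.healthy]
    if not candidates:
        raise RuntimeError("no healthy PoP available")
    return min(candidates, key=lambda p: p.latency_ms)

pops = [
    PoP("singapore", 18.0),
    PoP("tokyo", 35.0),
    PoP("frankfurt", 160.0),
]

print(route(pops).name)   # the closest PoP while everything is healthy

pops[0].healthy = False   # simulate an outage at the closest PoP
print(route(pops).name)   # traffic moves to the next best PoP
```

When the Singapore PoP is marked unhealthy, requests fall back to Tokyo without any change on the user's side.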
4. Improved scalability and reduction of bandwidth costs
CDNs are designed to handle huge volumes of traffic. Since a CDN routes traffic to the edge servers, the bandwidth consumption of the origin server is greatly reduced. Not only does this increase the amount of traffic that the website can handle, it also lowers the bandwidth costs of website hosting: through caching and other optimizations, CDNs reduce the amount of data an origin server must provide.
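The bandwidth reduction can be estimated with simple arithmetic once a cache hit ratio is known. The traffic volume, hit ratio, and price per gigabyte below are invented numbers, the hit ratio is assumed to be measured by bytes, and the CDN's own fees are not included.

```python
def origin_egress_gb(total_gb, cache_hit_ratio):
    """Traffic the origin still serves when the edge answers a share of it from cache."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return total_gb * (1.0 - cache_hit_ratio)

def origin_bandwidth_saving(total_gb, cache_hit_ratio, price_per_gb):
    """Hosting bandwidth cost avoided at the origin (CDN fees not included)."""
    served_by_edge_gb = total_gb - origin_egress_gb(total_gb, cache_hit_ratio)
    return served_by_edge_gb * price_per_gb

# Invented example: 10,000 GB per month, 90% of bytes served from cache, 0.08 per GB.
remaining = origin_egress_gb(10_000, 0.90)
saving = origin_bandwidth_saving(10_000, 0.90, 0.08)
print(f"origin still serves about {remaining:.0f} GB")
print(f"origin bandwidth cost avoided: about {saving:.2f}")
```

With a 90% hit ratio, the origin serves roughly one tenth of the original volume, so the same origin can absorb about ten times the traffic before reaching its previous load.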
5. Improved website security
A CDN acts as a reverse proxy. With a reverse proxy in place, end-users’ requests do not reach the origin server directly. This hides the IP address of the origin server, making it more difficult for attackers to target it. Instead, attackers can only target the CDN edge servers, which are generally better protected against cyber attacks. In addition, due to the distributed nature of the edge servers, a CDN can also provide essential DDoS attack mitigation. Each edge server is capable of handling high volumes of traffic, up to tens of gigabytes each second. If one edge server is flooded, the CDN can simply re-route the traffic to a healthier edge server.
6. SEO advantages
In 2010, Google announced that website performance would begin to affect search engine rankings. In other words, faster websites have a better chance of ranking higher on search engine results pages (SERPs). By using a CDN, you can improve your website performance and potentially get a higher ranking.