As cloud products continue to evolve, learning to use them and understanding their underlying logic has become an essential part of the job. This series walks through the basic configuration of AWS CloudFront and analyzes how it behaves in use.
Too Long; Didn't Read
- What is CloudFront
- CDN principles and the problems it solves
- CloudFront basic configuration process
- tcpdump packet capture and analysis
- Conclusion
01/What is CloudFront
Here is an excerpt from the official website. CloudFront is a web service that speeds up the distribution of your static and dynamic web content, such as HTML, CSS, JS, and image files, to users. It delivers your content through a global network of data centers, referred to as edge locations.
When a user requests content that you serve with CloudFront, the request is routed to the edge location that provides the lowest latency (time delay) so that content is delivered with the best performance.
CloudFront literally translates to "cloud front end," and it is a CDN product.
02/CDN Principles and Solutions
Principles
The diagram below illustrates how a CDN works (external reference):
[Figure: CDN request flow diagram]
The flow shown in the diagram is as follows:
- User Initiation: The user requests the resource via http://www.test.com/1.jpg.
- DNS Resolution: The browser asks the DNS server to resolve the domain name. The DNS server finds a CNAME record pointing to the CDN's scheduling domain and recursively queries the CDN's scheduling system.
- Scheduling Decision: The scheduling service returns the best edge location IP for access.
- Cache Check: The browser requests the 1.jpg file from that IP. The request reaches the CDN edge location, which checks whether the content is in its cache.
- Cache Hit: If the content is in the cache, the edge location directly returns it to the user.
- Cache Miss: If the content is not in the cache, the edge location requests it from the origin server (possibly going through multiple intermediary nodes to reduce origin server load).
- Cache and Transfer Content: The edge location stores the content from the origin server in the cache and delivers it to the user.
- Browser Rendering: Upon receiving the content, the browser on the user's device begins rendering the page.
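The cache check, hit, and miss steps above can be sketched as a toy shell model. This is only an illustration of the logic, not how CloudFront is implemented: CACHE_DIR, ORIGIN_DIR, and edge_fetch are hypothetical names, two local directories stand in for the edge cache and the origin server, and real CDNs also apply TTLs and cache keys that are left out here.

```shell
#!/bin/sh
# Toy model of one edge location. CACHE_DIR and ORIGIN_DIR are hypothetical
# local directories standing in for the edge cache and the origin server.
CACHE_DIR=$(mktemp -d)
ORIGIN_DIR=$(mktemp -d)

edge_fetch() {
    # $1: object name, e.g. 1.jpg
    if [ -f "$CACHE_DIR/$1" ]; then
        # Cache hit: the edge answers directly.
        echo "HIT"
    elif [ -f "$ORIGIN_DIR/$1" ]; then
        # Cache miss: fetch from the origin and store a copy at the edge.
        cp "$ORIGIN_DIR/$1" "$CACHE_DIR/$1"
        echo "MISS"
    else
        # The origin does not have the object either.
        echo "NOT_FOUND"
    fi
}

echo "image bytes" > "$ORIGIN_DIR/1.jpg"
edge_fetch 1.jpg   # first request: prints MISS
edge_fetch 1.jpg   # second request: prints HIT
```

Only the first request reaches the origin; every later request for the same object is served from the edge copy, which is where the reduction in origin load comes from.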
Problems Solved
In summary, CDN services aim to reduce the service provider's costs (resource costs and operational costs) and improve user experience. They do this through a large network of edge nodes close to users, which provides fast responses and resource caching, along with value-added capabilities such as access control, edge computing, and security.
Cost Example:
[Figure: cost estimate]
Taking 100 GB of traffic per month as an example, the cost after discounts is approximately 96.89 RMB, which is usually cheaper than paying for bandwidth directly. It also costs less than operating your own servers and bandwidth without a CDN, and getting started requires only some configuration.
03/CloudFront Basic Configuration Process
Create Distribution
Origin Configuration
Cache Configuration
Functions, WAF, Alternate Domain Name, etc. (left at their defaults, not configured)
Once configuration was complete, the distribution was assigned the domain d37z7ecg72nt7t.cloudfront.net. This chapter uses that domain directly; configuring a custom domain will be covered in later chapters.
04/tcpdump Packet Capture and Analysis
Log into the server hosting the origin sg.lukachen.work, capture packets, and write them to a test.pcap file. The command below captures all incoming and outgoing packets on the network interface; filtering and reassembly are done afterwards in Wireshark. Note: packet capture consumes substantial CPU and disk resources. On a live server, run it during low-load periods, or only after assessing the impact and setting reasonable filter parameters.
tcpdump -i eth0 -w test.pcap
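On a busier server, a narrower capture may be preferable. The helper below is a minimal sketch of what "reasonable filter parameters" could look like. It assumes the origin serves plain HTTP on port 80 through eth0, which this article does not state explicitly, and capture_origin_http is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: a bounded, filtered capture instead of "everything".
# Assumption: the origin receives plain HTTP on TCP port 80.
capture_origin_http() {
    iface=${1:-eth0}      # network interface to listen on
    out=${2:-test.pcap}   # output capture file
    count=${3:-2000}      # stop after this many packets
    # -c bounds the capture; the 'tcp port 80' filter drops unrelated traffic
    # before it is written to disk.
    tcpdump -i "$iface" -c "$count" -w "$out" 'tcp port 80'
}

# Usage (typically needs root):
#   capture_origin_http eth0 test.pcap 2000
```

The packet limit keeps the capture from filling the disk if it is left running, and the port filter keeps unrelated traffic out of the file.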
Request the resource from a local browser (or with curl). The response is a 404, so something is wrong. Time to troubleshoot the cause.
curl https://d37z7ecg72nt7t.cloudfront.net/1.txt
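When troubleshooting a response like this, the response headers can be as useful as the body. CloudFront normally adds an X-Cache header with values such as "Hit from cloudfront", "Miss from cloudfront", or "Error from cloudfront". The sketch below classifies that header; cf_cache_status is a hypothetical helper name, and the exact header values should be checked against what your own distribution returns.

```shell
#!/bin/sh
# Hypothetical helper: read HTTP response headers on stdin and print
# hit, miss, error, or unknown based on the X-Cache header.
cf_cache_status() {
    awk 'BEGIN { s = "unknown" }
         tolower($0) ~ /^x-cache:/ {
             l = tolower($0)
             if (l ~ /hit from cloudfront/) s = "hit"
             else if (l ~ /miss from cloudfront/) s = "miss"
             else if (l ~ /error from cloudfront/) s = "error"
         }
         END { print s }'
}

# Usage (needs network access): -D - dumps the response headers to stdout,
# -o /dev/null discards the body.
#   curl -s -o /dev/null -D - https://d37z7ecg72nt7t.cloudfront.net/1.txt | cf_cache_status
```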
On the server, stop the packet capture (Ctrl+C) and download the capture file to the local machine for analysis in Wireshark.
sz test.pcap -y
In Wireshark, filter by the keyword 1.txt to locate the relevant packets, then use Follow TCP Stream to reassemble the TCP stream.
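The same lookup can also be done on the command line with tshark, Wireshark's terminal counterpart. This is a sketch under the assumption that the origin traffic in test.pcap is unencrypted HTTP, since the request fields cannot be read from encrypted traffic; list_matching_requests is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: list HTTP requests in a capture whose URI contains
# a keyword, printing the source IP, the Host header, and the URI.
list_matching_requests() {
    pcap=${1:-test.pcap}
    keyword=${2:-1.txt}
    # -Y applies a Wireshark display filter; -T fields -e <name> prints
    # only the selected fields, one request per line.
    tshark -r "$pcap" -Y "http.request.uri contains \"$keyword\"" \
        -T fields -e ip.src -e http.host -e http.request.uri
}

# Usage:
#   list_matching_requests test.pcap 1.txt
```

The display filter expression passed to -Y can also be typed into Wireshark's filter bar as-is.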
In the reassembled stream, inspect the request headers. The cause turns out to be that the Host value in the request header is not configured on my server.
Configure that Host in nginx, then reload the nginx configuration.
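To confirm the diagnosis without going through CloudFront, the origin can be probed locally with the same Host header. The sketch below assumes nginx listens for plain HTTP on 127.0.0.1:80; probe_host and reload_nginx are hypothetical helper names, and the Host value must be copied from the capture because it is not reproduced in this article.

```shell
#!/bin/sh
# Hypothetical helper: send a request to the local nginx with a chosen
# Host header and print only the HTTP status code.
probe_host() {
    host=$1               # Host header value seen in the capture
    path=${2:-/1.txt}     # resource to request
    curl -s -o /dev/null -w '%{http_code}\n' -H "Host: $host" "http://127.0.0.1$path"
}

# Hypothetical helper: validate the configuration first, and reload only
# if the syntax check passes.
reload_nginx() {
    nginx -t && nginx -s reload
}

# Usage, before and after editing the nginx configuration:
#   probe_host "<Host value from the capture>"
#   reload_nginx
#   probe_host "<Host value from the capture>"
```

Running nginx -t before the reload means a typo in the new configuration is reported instead of being applied.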
Request again, and it goes through!
Repeat the above packet capture actions and examine the related header information again.
05/Conclusion
That wraps up this opening chapter, which also covered server-side packet capture with tcpdump and stream reassembly and analysis in Wireshark.
A CDN is a crucial component of modern internet infrastructure, especially for large websites and services that need global content distribution, and it is key to improving performance and user satisfaction.
In this chapter, we explored the basic concepts, principles, and basic configuration of CloudFront as a CDN. In subsequent chapters, we will dig into how each configuration item is used, backed by packet capture analysis, and explore optimization for different business needs through test cases.