Nginx is an event-driven framework, where events mainly refer to network events. Each network connection in Nginx corresponds to two network events: a read event and a write event. To deeply understand the various principles of Nginx and handle certain error scenarios in extreme conditions, one first needs to understand what a network event is.
Network Transmission
In the diagram above, Host A is, for instance, a home laptop, and Host B is a server running Nginx. When Host A sends an HTTP GET request to Host B, what events occur along the way? The data flow in the diagram shows the following:
The application layer issues a GET request. At the transport layer, the main thing that happens is that the browser opens a port (this can be seen in the Windows Task Manager); this port and the port opened by Nginx, such as 80 or 443, are recorded in the transport layer header. At the network layer, the IP of our host and the public IP of the target host, that is, the server, are recorded. The message then passes through the link layer and over Ethernet to the home router (network layer), which records the IP of the next hop at the ISP. It travels across the WAN to the machine where Host B is located, where it passes up through the link layer and the network layer to the transport layer. There the operating system knows that the message is for the process that opened port 80 or 443, which is Nginx. Nginx then processes the request in its HTTP state machine (application layer).
What role does a network packet play in the above process?
TCP Streams and Packets
The data link layer wraps the data in a header and a footer, with the source and destination MAC addresses in the header. The network layer records the public IP address of Nginx (destination IP address) and the public address of the browser (source IP address). The TCP layer (transport layer) specifies the port opened by Nginx (destination port) and the port opened by the browser (source port). The application layer carries the HTTP protocol.
Together this makes up a packet, which means that the HTTP message we send is split into multiple small packets. The limit on the size of an IP packet is the MTU (maximum transmission unit), which is 1500 bytes on Ethernet. TCP (the transport layer) takes the smallest MTU of the links along the path into account and limits the payload of each segment accordingly; this limit is the MSS (maximum segment size). As a result, each packet typically carries from a few hundred bytes up to roughly 1,460 bytes of data. Every time a packet no larger than the MSS is received, it is a network event.
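The relationship between MTU and MSS can be sketched with a little arithmetic. The snippet below is only an illustration: it assumes IPv4 and TCP headers of 20 bytes each (no options), and the function names and the example path are hypothetical.

```python
IP_HEADER = 20   # bytes, IPv4 header without options (assumption)
TCP_HEADER = 20  # bytes, TCP header without options (assumption)

def mss_for_path(link_mtus):
    """Return the MSS usable over a path, given the MTU of each link."""
    path_mtu = min(link_mtus)  # the narrowest link limits the packet size
    return path_mtu - IP_HEADER - TCP_HEADER

def segment_count(payload_bytes, mss):
    """Number of TCP segments needed to carry payload_bytes."""
    return -(-payload_bytes // mss)  # ceiling division

# Hypothetical path: Ethernet, one link with a 1492-byte MTU, Ethernet again.
mss = mss_for_path([1500, 1492, 1500])
print(mss)                       # 1452
print(segment_count(4000, mss))  # 3
```

Under these assumptions, a 4000-byte HTTP message arrives as three packets, so the receiving side sees up to three separate read events for it.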
Now, let's see how the events in the TCP protocol relate to the interfaces we typically call (such as Accept, Read, Write, and Close).
TCP Protocol and Non-blocking Interfaces
Establishing a TCP connection essentially means sending a TCP packet, which reaches Nginx through the process explained above and corresponds to a read event. For Nginx, reading this packet translates into accepting a new connection.
When data arrives on an established TCP connection, the peer has sent a message. This is also a read event for Nginx, and it translates to reading the message.
If the peer (i.e., the browser) actively disconnects, the Windows operating system on the browser's side sends a packet requesting that the connection be closed. This is also a read event for Nginx, because Nginx merely reads a packet.
So, what is a write event? When Nginx needs to send a response to the browser, the message has to be written to the operating system, which is asked to send it over the network. That is a write event.
Nginx, like any asynchronous event processing framework, has an event collector or dispatcher for these network read and write events. It assigns a consumer to each type of event. In other words, events are produced automatically by the network and delivered to Nginx, and we have to set up a consumer for each of them. For example, the consumer for the connection establishment event calls Accept, and the HTTP module sets up a new connection. For the many read and write events on messages, the HTTP state machine calls different methods at different stages, so that each consumer handles its specific scenario.
The same dispatcher and consumer model also covers AIO, that is, asynchronous disk read/write events, and timer events, such as timeout checks (for example worker_shutdown_timeout).
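The dispatcher and consumer idea can be sketched in a few lines. This is not Nginx's implementation: it is a minimal illustration built on Python's standard selectors module, and the handler names (on_accept, on_read, on_write), the log list, and the upper-casing reply are all made up for the example.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
log = []  # records which consumer handled each event

def on_accept(listener):
    conn, _ = listener.accept()        # read event on the listening socket
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, on_read)
    log.append("accept")

def on_read(conn):
    data = conn.recv(4096)             # read event on a connection
    if data:
        log.append("read")
        reply = data.upper()
        sel.modify(conn, selectors.EVENT_WRITE, lambda c: on_write(c, reply))
    else:                              # empty read: the peer closed the connection
        log.append("close")
        sel.unregister(conn)
        conn.close()

def on_write(conn, reply):
    conn.send(reply)                   # write event; a real server must handle partial sends
    log.append("write")
    sel.modify(conn, selectors.EVENT_READ, on_read)

def dispatch_once(timeout=1.0):
    # The dispatcher: wait for events and hand each one to its consumer.
    for key, _ in sel.select(timeout):
        key.data(key.fileobj)

listener = socket.socket()
listener.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
listener.listen()
listener.setblocking(False)
sel.register(listener, selectors.EVENT_READ, on_accept)

client = socket.create_connection(listener.getsockname())
dispatch_once()                        # new connection -> on_accept
client.sendall(b"ping")
dispatch_once()                        # incoming message -> on_read
dispatch_once()                        # socket writable -> on_write
reply = client.recv(4096)
client.close()
dispatch_once()                        # peer closed -> on_read sees an empty read

sel.unregister(listener)
listener.close()
print(log, reply)
```

Note that connection setup, incoming data, and the peer's close all arrive as read events; only sending the reply is a write event.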
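Timer events fit the same loop: before blocking, the dispatcher asks how long it may wait until the next timer is due, and afterwards it runs the timers that have expired. The sketch below uses a binary heap as a stand-in (Nginx itself keeps its timers in a red-black tree), and the TimerQueue class and its method names are hypothetical.

```python
import heapq
import itertools

class TimerQueue:
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker so callbacks are never compared

    def add(self, deadline, callback):
        heapq.heappush(self._heap, (deadline, next(self._seq), callback))

    def next_timeout(self, now):
        """How long the dispatcher may block before the next timer is due."""
        if not self._heap:
            return None
        return max(0.0, self._heap[0][0] - now)

    def expire(self, now):
        """Run every callback whose deadline has passed; return how many ran."""
        fired = 0
        while self._heap and self._heap[0][0] <= now:
            _, _, callback = heapq.heappop(self._heap)
            callback()
            fired += 1
        return fired

# Clock values are passed in explicitly to keep the example deterministic.
fired = []
timers = TimerQueue()
timers.add(5.0, lambda: fired.append("read timeout"))
timers.add(2.0, lambda: fired.append("keepalive timeout"))
print(timers.next_timeout(now=0.0))  # 2.0
timers.expire(now=3.0)
print(fired)                         # ['keepalive timeout']
```

In a real loop, the value returned by next_timeout would be passed as the timeout of the call that waits for network events, so that timers and network events share one dispatcher.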
Above, we introduced the sending of network packets and the corresponding network events in Nginx. For example, establishing a new connection with an Accept method corresponds to receiving a read event. Next, we analyze how Nginx receives read events during the three-way handshake using packet capture tools like Wireshark.
First, we install Wireshark, capture packets for Nginx's IP and port, and then access a page. In this capture, the browser opens the page from local port 1875, while Nginx listens on port 8080. Two layers deserve attention:
The TCP layer primarily handles communication between processes, which are identified by these ports.
The IP layer addresses how machines find each other.
In the three-way handshake, Windows first sends a [SYN] to Nginx, and the server where Nginx resides replies with a [SYN, ACK]. At this stage Nginx is unaware of anything, because the connection is still half-open. Only when the Windows client sends an [ACK] back to the server does the operating system of that server notify Nginx that a read event corresponding to a new connection has arrived, so Nginx should call the Accept method to establish the new connection.
Three-way Handshake
The Wireshark capture above shows how a regular three-way handshake triggers a read event, which Nginx handles by establishing a new connection.
This article explained network events and used packet capture to analyze how they reach Nginx. This is very helpful for understanding Nginx's asynchronous processing framework; even OpenResty relies heavily on network events and event dispatching.
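The same behavior can be observed without a packet capture. The following sketch, which uses only Python's standard socket and selectors modules and a loopback address chosen for the example, shows that the operating system completes the handshake on its own and that the application only sees a read event on the listening socket, after which it calls accept.

```python
import selectors
import socket

listener = socket.socket()
listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
listener.listen()
listener.setblocking(False)

sel = selectors.DefaultSelector()
sel.register(listener, selectors.EVENT_READ)

# Nobody has connected yet, so no read event is reported.
before = sel.select(timeout=0.1)

# The client connects; the kernel on the server side answers the handshake.
# The server application has not called accept() at this point.
client = socket.create_connection(listener.getsockname())

# Once the handshake is complete, the listening socket becomes readable.
after = sel.select(timeout=1.0)
conn, peer = listener.accept()    # the application picks up the finished connection

print(len(before), len(after), peer == client.getsockname())  # 0 1 True

conn.close()
client.close()
sel.unregister(listener)
listener.close()
```

The connect call on the client side returns even though the server has not yet called accept, which matches the observation that the application is only notified after the final [ACK].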