Understanding Data Packet Transmission: How Packets Navigate Networks and Return to Source

This article is compiled from various sources on the internet and focuses on data packet transmission.

First, I encountered a problem: during the journey of a data packet from our computer, passing through multiple layers of switches and routers to reach the target server, what changes occur to the data packet? How is it transmitted step by step, and how does it return?

Data flow diagram

We first need to understand some basic concepts.

Network Models

OSI Seven-Layer Model

The OSI (Open Systems Interconnection) reference model, developed by the International Organization for Standardization (ISO), is a standard framework for interconnecting computer or communication systems.

OSI Seven-Layer Model

  • Application Layer: DHCP · DNS · FTP · Gopher · GTP · HTTP · IMAP4 · IRC · NNTP · NTP · POP3 · RPC · RTCP · RTP · RTSP · SIP · SMTP · SNMP · SSH · SDP · SOAP · STUN · SSDP · TELNET · XMPP
  • Presentation Layer: HTTP/HTML · FTP · Telnet · ASN.1 (with presentation layer functionalities)
  • Session Layer: ADSP · ASP · H.245 · ISO-SP · iSNS · NetBIOS · PAP · RPC · RTCP · SMPP · SCP · SSH · ZIP · SDP (with session layer functionalities)
  • Transport Layer: TCP · UDP · TLS · DCCP · SCTP · RSVP · PPTP
  • Network Layer: IP (IPv4 · IPv6) · ICMP · ICMPv6 · IGMP · IS-IS · IPsec · BGP · RIP · OSPF · ARP · RARP
  • Data Link Layer: Wi-Fi (IEEE 802.11) · WiMAX (IEEE 802.16) · ATM · DTM · Token Ring · Ethernet · FDDI · Frame Relay · GPRS · EVDO · HSPA · HDLC · PPP · L2TP · ISDN · STP
  • Physical Layer: Ethernet card · Modem · Power line communication (PLC) · SONET/SDH (Synchronous Optical Networking) · G.709 (Optical Transport Network) · Optical fiber · Coaxial cable · Twisted pair

TCP/IP Four-Layer Model

The TCP/IP protocol stack is the reference model used by the Advanced Research Projects Agency Network (ARPANET) of the United States Department of Defense and its successor, the Internet. ARPANET was a research network sponsored by the United States Department of Defense. Initially, it only connected four universities within the United States. In the years that followed, it connected hundreds of universities and government departments through leased telephone lines. Eventually, ARPANET developed into the largest interconnected network in the world—the Internet. The original ARPANET was permanently shut down in 1990.

The ISO’s OSI reference model was criticized for being too large and complex. In contrast, the TCP/IP protocol stack, developed by technical personnel themselves, has been more widely used.

TCP/IP Four-Layer Model

Application Layer

  • DHCP (Dynamic Host Configuration Protocol)
  • DNS (Domain Name System)
  • FTP (File Transfer Protocol)
  • Gopher (The Internet Gopher Protocol as defined by RFC-1436)
  • HTTP (Hypertext Transfer Protocol)
  • IMAP4 (Internet Message Access Protocol 4)
  • IRC (Internet Relay Chat)
  • NNTP (Network News Transport Protocol, RFC-977)
  • XMPP (Extensible Messaging and Presence Protocol)
  • POP3 (Post Office Protocol 3)
  • SIP (Session Initiation Protocol)
  • SMTP (Simple Mail Transfer Protocol)
  • SNMP (Simple Network Management Protocol)
  • SSH (Secure Shell)
  • SSL: Secure Sockets Layer Protocol
  • TELNET (TELNET Protocol)
  • RPC (Remote Procedure Call Protocol, RFC-1831)
  • RTCP (RTP Control Protocol)
  • RTSP (Real-Time Streaming Protocol)
  • TLS (Transport Layer Security Protocol)
  • SDP (Session Description Protocol)
  • SOAP (Simple Object Access Protocol)
  • GTP (GPRS Tunneling Protocol)
  • STUN (Simple Traversal of UDP Through NATs)
  • NTP (Network Time Protocol)

Transport Layer

  • TCP (Transmission Control Protocol)
  • UDP (User Datagram Protocol)
  • DCCP (Datagram Congestion Control Protocol)
  • SCTP (Stream Control Transmission Protocol)
  • RTP (Real-time Transport Protocol)
  • RSVP (Resource ReSerVation Protocol)
  • PPTP (Point to Point Tunneling Protocol)

Network Layer

  • IP: (IPv4 · IPv6) Internet Protocol
  • ARP: Address Resolution Protocol, used to find the physical address from an IP address.
  • RARP: Reverse Address Resolution Protocol allows a physical machine in a local area network to request its IP address from a gateway server’s ARP table or cache.
  • ICMP: Internet Control Message Protocol used for sending control messages between IP hosts and routers.
  • ICMPv6: Internet Control Message Protocol version 6, the counterpart of ICMP used with IPv6.
  • IGMP: Internet Group Management Protocol used in the Internet Protocol family for reporting group membership to routers.
  • RIP: Routing Information Protocol, a standard protocol for exchanging routing information between gateways and hosts.
  • OSPF: Open Shortest Path First
  • BGP: Border Gateway Protocol, used for routing between autonomous systems on the Internet
  • IS-IS: Intermediate System to Intermediate System Routing Protocol
  • IPsec: A framework for secure Internet Protocol communications by using cryptographic security services.

Data Link Layer

802.11 · 802.16 · Wi-Fi · WiMAX · ATM · DTM · Token Ring · Ethernet · FDDI · Frame Relay · GPRS · EVDO · HSPA · HDLC · PPP · L2TP · ISDN

Physical Layer

Ethernet physical layer · Modem · PLC · SONET/SDH · G.709 · Optical fiber · Coaxial cable · Twisted pair

Relationship between OSI Seven-Layer and TCP/IP Four-Layer

  1. OSI introduced the concepts of services, interfaces, protocols, and layering, which influenced the formation of the TCP/IP model.
  2. OSI first developed a model and then protocols, whereas TCP/IP emerged through establishing protocols and applications first, then defining the model, referring to the OSI model.
  3. OSI is a theoretical model, while TCP/IP has been widely adopted as the standard for network interconnection.

Packet Encapsulation and Analysis

Encapsulation

Applications generate and send data, which is encapsulated layer by layer, ultimately forming an Ethernet frame that is sent through the link. During communication, each protocol layer adds a header to the data, a process known as encapsulation.

Encapsulation Process

Decapsulation

The data packet reaches the target machine and goes through the reverse process, ultimately being received by the target application.

Decapsulation Process
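The layer-by-layer wrapping and unwrapping can be sketched roughly as follows. This is a toy illustration only: the three "headers" carry just a few fields each (ports, IP addresses, MAC addresses plus a type code), whereas real TCP, IP, and Ethernet headers are considerably larger. All function names, addresses, and port numbers are made up for the example.

```python
import socket
import struct

# Toy headers for illustration; real headers contain many more fields.

def tcp_encapsulate(payload: bytes, src_port: int, dst_port: int) -> bytes:
    # Transport layer: prepend a 4-byte toy header (source port, destination port).
    return struct.pack("!HH", src_port, dst_port) + payload

def ip_encapsulate(segment: bytes, src_ip: str, dst_ip: str) -> bytes:
    # Network layer: prepend an 8-byte toy header (source IP, destination IP).
    return socket.inet_aton(src_ip) + socket.inet_aton(dst_ip) + segment

def eth_encapsulate(packet: bytes, src_mac: bytes, dst_mac: bytes) -> bytes:
    # Data link layer: prepend destination MAC, source MAC, and a type code
    # (0x0800 is the value used for IPv4).
    return dst_mac + src_mac + struct.pack("!H", 0x0800) + packet

def decapsulate(frame: bytes) -> bytes:
    # The receiver strips the headers in the reverse order.
    packet = frame[14:]   # remove 6 + 6 + 2 bytes of toy Ethernet header
    segment = packet[8:]  # remove 4 + 4 bytes of toy IP header
    return segment[4:]    # remove 2 + 2 bytes of toy TCP header

app_data = b"GET / HTTP/1.1\r\n\r\n"
segment = tcp_encapsulate(app_data, 50000, 80)
packet = ip_encapsulate(segment, "192.168.1.10", "192.168.2.20")
frame = eth_encapsulate(packet, b"\x02\x00\x00\x00\x00\x01", b"\x02\x00\x00\x00\x00\x02")
recovered = decapsulate(frame)
```

Each layer treats whatever it receives from the layer above as opaque payload, which is why the receiver can peel the headers off one at a time.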

Message Structure at Each Layer

TCP Datagram

TCP Datagram Structure

Wireshark Capture Data

Datagram

IP Datagram

IP Datagram Structure

Wireshark Capture Data

Datagram

Ethernet Frame

Ethernet Frame Structure

  • Preamble: 7 bytes (56 bits) of alternating 1s and 0s, used for clock synchronization.
  • Start Frame Delimiter (SFD): 1 byte (10101011) that marks the start of the frame proper. The two consecutive 1s at the end mark the transition out of the preamble, signaling that the following bits are no longer for clock synchronization.
  • Destination address and source address: the MAC addresses of the receiver and the sender.
  • Type: identifies the upper-layer protocol carried in the frame, such as IP or ARP.
  • Data: the payload handed down from the layer above, between 46 and 1500 bytes. If it is shorter than 46 bytes, zeros are added as padding; if it is longer than 1500 bytes, it must be split across multiple frames.
  • CRC error detection: a checksum computed over the destination and source MAC addresses, the type field, and the data; the frame is discarded if an error is found.
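As a rough sketch of the padding and error-detection rules, the snippet below assembles a frame (without the preamble and SFD, which the hardware normally handles) and verifies it. It assumes that Python's `zlib.crc32` is an acceptable stand-in for the Ethernet CRC-32 and that the checksum is appended in little-endian byte order; the function names and MAC addresses are invented for illustration.

```python
import struct
import zlib

MIN_PAYLOAD = 46    # shorter payloads are padded with zeros
MAX_PAYLOAD = 1500  # longer payloads must be split by the upper layer

def build_frame(dst_mac: bytes, src_mac: bytes, ethertype: int, payload: bytes) -> bytes:
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds 1500 bytes and must be split")
    payload = payload.ljust(MIN_PAYLOAD, b"\x00")  # zero padding up to 46 bytes
    header = dst_mac + src_mac + struct.pack("!H", ethertype)
    # Assumption: CRC over addresses, type, and data, stored little-endian.
    fcs = struct.pack("<I", zlib.crc32(header + payload))
    return header + payload + fcs

def frame_is_valid(frame: bytes) -> bool:
    # Recompute the CRC over everything except the trailing 4 bytes.
    body, fcs = frame[:-4], frame[-4:]
    return struct.pack("<I", zlib.crc32(body)) == fcs

frame = build_frame(b"\xff" * 6, b"\x02\x00\x00\x00\x00\x01", 0x0800, b"hi")

# Flip one byte to simulate corruption on the wire.
corrupted = bytearray(frame)
corrupted[20] ^= 0x01
```

A receiver that finds `frame_is_valid` returning false would simply drop the frame; recovery is left to higher layers such as TCP.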

Ethernet Encapsulation

Wireshark Capture Data

Datagram

Data Packet Transmission

Transmission Diagram


Transmission of Ethernet Frames at the Data Link Layer

TCP Data Transmission Process

TCP Three-Way Handshake and Four-Way Termination

Establishing Connection

First Handshake: The client sends a SYN packet to the designated server port, with an initial sequence number (ISN) X stored in the sequence number field of the packet header.

Second Handshake: The server sends back a SYN+ACK packet. Both the SYN and ACK flags are set to 1, the acknowledgment number is set to the client's ISN plus 1, i.e., X+1, and the sequence number field carries the server's own ISN, Y.

Third Handshake: The client sends another ACK packet, with the SYN flag set to 0 and the ACK flag set to 1. It places the server's ISN plus 1, i.e., Y+1, in the acknowledgment field and X+1 in the sequence number field.
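The sequence and acknowledgment numbers exchanged during connection setup can be traced with a small model. This is only a sketch of the standard TCP behavior, in which the final ACK acknowledges the server's initial sequence number plus 1; the `Segment` class and the sample values of X and Y are hypothetical and no real sockets are involved.

```python
from dataclasses import dataclass

MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around

@dataclass
class Segment:
    syn: bool
    ack: bool
    seq: int
    ack_num: int = 0

def three_way_handshake(client_isn: int, server_isn: int) -> list:
    # 1. Client -> server: SYN carrying the client's ISN (X).
    syn = Segment(syn=True, ack=False, seq=client_isn)
    # 2. Server -> client: SYN+ACK carrying the server's ISN (Y), acknowledging X+1.
    syn_ack = Segment(syn=True, ack=True, seq=server_isn,
                      ack_num=(syn.seq + 1) % MOD)
    # 3. Client -> server: ACK with seq X+1, acknowledging Y+1.
    ack = Segment(syn=False, ack=True, seq=(client_isn + 1) % MOD,
                  ack_num=(syn_ack.seq + 1) % MOD)
    return [syn, syn_ack, ack]

segments = three_way_handshake(client_isn=1000, server_isn=5000)
for s in segments:
    print(s)
```

The SYN flag consumes one sequence number even though it carries no data, which is why each side acknowledges the other's ISN plus 1.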

Terminating Connection

First Termination: The client sends a FIN packet to close the client-to-server direction of the connection and enters the FIN_WAIT_1 state.

Second Termination: Upon receiving the FIN, the server sends an ACK to the client, with the acknowledgment number set to the received sequence number plus 1 (as with SYN, a FIN occupies one sequence number), and enters the CLOSE_WAIT state.

Third Termination: The server sends a FIN to close the server-to-client direction of the connection and enters the LAST_ACK state.

Fourth Termination: After receiving the FIN, the client sends an ACK back to the server, with the acknowledgment number set to the received sequence number plus 1, and enters the TIME_WAIT state. On receiving this ACK, the server enters the CLOSED state, completing the four-way termination.
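The state changes on each side can be written down as a small lookup table. The sketch below follows the standard TCP state machine for the case where the client closes first; it includes FIN_WAIT_2 and the timeout that ends TIME_WAIT, which the description above does not spell out. The event names are invented labels, not real API calls.

```python
# (current state, event) -> next state, for the side that closes first
CLIENT_TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",  # the client also sends the final ACK here
    ("TIME_WAIT", "timeout"): "CLOSED",
}

# (current state, event) -> next state, for the side that closes second
SERVER_TRANSITIONS = {
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",  # the server replies with an ACK
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}

def run(transitions: dict, events: list, state: str = "ESTABLISHED") -> list:
    # Apply each event in order and record every state visited.
    path = [state]
    for event in events:
        state = transitions[(state, event)]
        path.append(state)
    return path

client_path = run(CLIENT_TRANSITIONS, ["send FIN", "recv ACK", "recv FIN", "timeout"])
server_path = run(SERVER_TRANSITIONS, ["recv FIN", "send FIN", "recv ACK"])
```

Four segments are needed because each direction of the connection is closed independently: the server may still have data to send after acknowledging the client's FIN.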

Network Devices and Network Structure

First, we need to understand what switches, routers, gateways, and similar devices are, as well as typical network structures.

Switch

This refers to Ethernet switches. The architecture of Ethernet switches entails each port being directly connected to a host, generally operating in full-duplex mode. The switch can simultaneously connect multiple port pairs, allowing each pair of communicating hosts to transmit data without conflict as if they were the sole users of the communication medium.

Ethernet switches operate at the data link layer (Layer 2) of the OSI reference model. They recognize the MAC (Media Access Control) address and forward Ethernet data frames based on these MAC addresses.

On receiving a data frame from a connected computer at a port, the switch checks the destination MAC address in the frame header, looks up the MAC address table, and forwards the frame out through the corresponding port to achieve data exchange.
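The table lookup can be sketched as a tiny model of a switch. Two behaviors in it are assumptions drawn from how Ethernet switches commonly work rather than from the description above: the switch learns the source MAC address of every frame it sees, and it floods a frame to all other ports when the destination is not yet in the table. The class and method names are hypothetical.

```python
class LearningSwitch:
    def __init__(self, num_ports: int):
        self.ports = list(range(num_ports))
        self.mac_table = {}  # MAC address -> port number

    def receive(self, in_port: int, src_mac: str, dst_mac: str) -> list:
        """Return the list of ports the frame is forwarded out of."""
        # Learn which port the sender is attached to.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the incoming one.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is on the same port the frame came from: nothing to do.
            return []
        return [out_port]

switch = LearningSwitch(num_ports=4)
first = switch.receive(0, "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb")   # B unknown, flood
reply = switch.receive(1, "bb:bb:bb:bb:bb:bb", "aa:aa:aa:aa:aa:aa")   # A is known
second = switch.receive(0, "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb")  # B is now known
```

Note that the switch never looks at IP addresses; forwarding decisions at this layer depend on MAC addresses alone.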

Router

A router is a hardware device that connects two or more networks, functioning as a gateway. It reads every data packet’s address and decides on forwarding paths intelligently.

Routers can also be referred to as gateway devices. They function at the network layer (Layer 3) of the OSI reference model, handling the storage and forwarding of data packets between different networks. Data transmission from one subnet to another can be processed through the routing capabilities of a router. In network communications, routers determine network addresses and select IP paths, allowing the creation of flexible link systems within different network environments by linking subnets through various data packet forwarding and media access techniques. Routers operate by processing information received from the source station or other relevant routers based on the network layer.

Routers forward data based on IP addresses, each of which consists of a network address and a host address. Computers can communicate directly only when they share the same network address. To communicate with computers on other subnets, the data must be forwarded through a router.

A router can connect multiple network segments. The network address of each port's IP address matches that of the segment the port is connected to, so different ports have different network addresses, one per segment. Hosts in each segment send data destined for other segments to the router port's IP address on their own segment.

For every received data packet, the router recalculates the checksum and writes new physical addresses. The router's main function is to select, from the available network routes, the optimal path for delivering each packet to its destination.
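Path selection can be illustrated with a minimal routing table. The sketch assumes the common longest-prefix-match rule (the most specific matching network wins) and uses made-up interface names and routes; it covers only the lookup step, not checksum recalculation or rewriting of physical addresses.

```python
import ipaddress

# Hypothetical routing table: (destination network, outgoing interface)
ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "eth0"),
    (ipaddress.ip_network("192.168.2.0/24"), "eth1"),
    (ipaddress.ip_network("0.0.0.0/0"), "wan0"),  # default route
]

def select_route(dst_ip: str):
    """Return the interface for dst_ip, or None if no route matches."""
    addr = ipaddress.ip_address(dst_ip)
    matches = [(net, iface) for net, iface in ROUTES if addr in net]
    if not matches:
        return None
    # Longest prefix match: prefer the network with the most specific mask.
    best_net, best_iface = max(matches, key=lambda m: m[0].prefixlen)
    return best_iface

to_subnet_b = select_route("192.168.2.7")
to_internet = select_route("8.8.8.8")
```

Because the default route `0.0.0.0/0` matches every address with a prefix length of zero, it is chosen only when no more specific route applies.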

Gateway

A gateway is also known as an inter-network connector or protocol converter. In this strict sense, a gateway interconnects networks above the network layer and is the most complex network interconnection device. It is used only to interconnect two networks whose higher-layer protocols differ. Its structure resembles that of a router, but it interconnects at a different layer. Gateways can be used to interconnect both wide area networks and local area networks.

Explanation: For historical reasons, many documents related to TCP/IP refer to the routers used at the network layer as gateways. Today, in many LANs, the term gateway usually refers to the router's IP address.

So, what exactly is a gateway? Essentially, it is an IP address that a network uses to connect to other networks. For instance, imagine Network A with an IP address range from "192.168.1.1 to 192.168.1.254" and a subnet mask of 255.255.255.0, and Network B with an IP range from "192.168.2.1 to 192.168.2.254," using the same subnet mask. Without a router, there is no TCP/IP communication between these networks, even if both are connected to the same switch or hub, because TCP/IP uses the subnet mask to determine that the hosts of the two networks are on different networks. To communicate, the two networks must go through a gateway. If a host in Network A finds that the destination of a packet is not on the local network, it forwards the packet to its own gateway, which forwards it to Network B's gateway, which in turn delivers it to the specific host in Network B.
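The decision a host makes before sending a packet can be sketched with Python's `ipaddress` module. The host addresses and the gateway address 192.168.1.1 below are assumptions chosen to fit the Network A and Network B example; the function name is hypothetical.

```python
import ipaddress

def next_hop(src_ip: str, netmask: str, dst_ip: str, gateway: str) -> str:
    """Return the address the host should hand the packet to."""
    # Derive the local network from the host's own address and subnet mask.
    local_net = ipaddress.ip_network(f"{src_ip}/{netmask}", strict=False)
    if ipaddress.ip_address(dst_ip) in local_net:
        return dst_ip   # same network: deliver directly
    return gateway      # different network: hand over to the gateway

same_network = next_hop("192.168.1.10", "255.255.255.0", "192.168.1.20", "192.168.1.1")
other_network = next_hop("192.168.1.10", "255.255.255.0", "192.168.2.20", "192.168.1.1")
```

With the mask 255.255.255.0, only the first three octets are compared, so 192.168.2.20 is treated as remote even when both hosts are plugged into the same switch.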

Only by correctly setting up the gateway’s IP address can the TCP/IP protocol facilitate communication between different networks. Which machine’s IP is it? The gateway’s IP is that of a device with routing capabilities—such as a router, a server with a routing protocol enabled (essentially acting as a router), or a proxy server, also acting as a router.

Difference between Gateway and Router

Firstly, a “gateway” is a broad concept and not limited to a specific product. Any device that connects two different networks can be called a gateway. On the other hand, a “router” typically refers to a specific category of products capable of performing routing searches and forwarding functions, making routers clearly capable of functioning as gateways.

The default gateway set up on a PC is not a physical product but a concept at the network layer. PCs do not have the capability for routing and addressing, so PCs must send all IP packets to a default transit address for forwarding—the default gateway. This gateway can be implemented on a router, a Layer 3 switch, a firewall, or a server, making it unrelated to physical devices.

Home Router

Home Router = Router + Firewall + Switch = Firewall + Switch + NAT

Firewall: A router serves as a basic firewall in many respects: it automatically rejects incoming data that does not correspond to an ongoing exchange between an internal computer and the outside world. For example, if an unknown address probes a port, the router acts as a guard, refusing the request and effectively concealing your computer.

Switch: A home router also functions as a network switch, the hardware that lets computers within an internal network communicate with each other. Without this switching capability, devices could communicate with the broader internet through the router but not with each other.

NAT: We know that a home router allows multiple devices to connect to the internet simultaneously. When devices send requests through a home router, the router must be able to tell which device a response belongs to when it arrives. When you obtain internet access from an Internet service provider (ISP), the ISP assigns a public IP to the router, whereas internal devices use only private IPs. The function of NAT is to convert between public and private IPs and ports. For this, a table is needed to record the mapping between internal and external IPs and ports.

Consider two devices, A and B, within the internal network. Suppose both access the same external IP using the same port. The router then records the following mapping relationship:

(remote ip_r: port_r)–(local ip_a: a_port)

(remote ip_r: port_r)–(local ip_b: b_port)

Suppose, coincidentally, that a_port and b_port are the same. In that case, when the response from the remote side reaches the router, the router cannot tell whether to send the response back to A or to B. To address this, NAT uses a triplet to distinguish: (remote ip_r: port_r)(nat port)(local ip_a: a_port).
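The triplet idea can be sketched as a small translation table in which the router assigns each outgoing connection its own NAT port. This is a simplified model, not how any particular router implements it: the class name, the starting port 40000, and the example addresses are all assumptions, and entry expiry is left out.

```python
class Nat:
    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (local ip, local port, remote ip, remote port) -> nat port
        self.inbound = {}   # (nat port, remote ip, remote port) -> (local ip, local port)

    def translate_out(self, local_ip, local_port, remote_ip, remote_port):
        """Rewrite the source of an outgoing packet to (public ip, nat port)."""
        key = (local_ip, local_port, remote_ip, remote_port)
        if key not in self.outbound:
            nat_port = self.next_port  # allocate a port unique to this connection
            self.next_port += 1
            self.outbound[key] = nat_port
            self.inbound[(nat_port, remote_ip, remote_port)] = (local_ip, local_port)
        return self.public_ip, self.outbound[key]

    def translate_in(self, nat_port, remote_ip, remote_port):
        """Find which internal host a response belongs to, or None if unknown."""
        return self.inbound.get((nat_port, remote_ip, remote_port))

nat = Nat("203.0.113.5")
# Devices A and B both use local port 5000 to reach the same remote server.
a_public = nat.translate_out("192.168.1.10", 5000, "198.51.100.7", 80)
b_public = nat.translate_out("192.168.1.11", 5000, "198.51.100.7", 80)
a_reply = nat.translate_in(a_public[1], "198.51.100.7", 80)
b_reply = nat.translate_in(b_public[1], "198.51.100.7", 80)
```

Because A and B receive different NAT ports, the remote server's responses arrive at different ports on the router and can be mapped back without ambiguity, even though both devices chose the same local port.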