1. Introduction
In the previous article, Hong Ge's introduction should have made it clear where today's topic sits: the Transport Layer of the OSI seven-layer model. This is the layer that provides end-to-end communication between hosts, through protocols such as TCP and UDP.
2. What is TCP?
TCP operates at the Transport Layer, which is one layer above the Network Layer.
It is a connection-oriented, reliable, byte stream-based, full-duplex communication protocol.
After receiving data packets from the upper layer, TCP adds a TCP header and performs some special processing before passing it to the Network Layer.
2.1 TCP Definition
Transmission Control Protocol (TCP) is a connection-oriented, reliable, byte stream-based transport layer communication protocol defined by the IETF's RFC 793. In the simplified OSI model of computer networks, it performs the functions of the fourth layer, the Transport Layer. User Datagram Protocol (UDP) is another important transport protocol in the same layer.
3. TCP Theory
TCP provides a connection-oriented, reliable byte stream service.
Connection-oriented: Both parties need to establish a connection in advance before communication, much like making a call in real life.
- Application data is divided into data blocks that TCP deems most suitable for sending.
- Retransmission mechanism. After sending a segment, TCP starts a timer and waits for the acknowledgment; if none arrives in time, the segment is sent again.
- Checksum on header and data.
- TCP sorts the received data and then forwards it to the Application Layer.
- TCP discards duplicate data at the receiving end.
- TCP also provides flow control.
Establishing a TCP connection takes a three-way handshake, while releasing one takes a four-way handshake. The extra step comes from TCP's half-close behavior: because a TCP connection is full-duplex, each direction has to be shut down separately. The side that closes first can still receive data from the other end after sending FIN; the FIN only tells the peer that this side has no more data to send. Likewise, the passively closing side can keep sending data after it receives the FIN, and stops only once it sends its own FIN.
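The half-close behavior described above can be observed from user space. The following is a minimal sketch, assuming loopback TCP sockets are available on the machine; the function name `half_close_demo` is made up for illustration, and the handshakes themselves are carried out by the operating system, not by this code.

```python
import socket
import threading


def half_close_demo() -> bytes:
    """Client sends data, half-closes, and still receives the server's reply."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    port = listener.getsockname()[1]

    def serve() -> None:
        conn, _ = listener.accept()
        with conn:
            received = b""
            # recv() returns b"" once the client's FIN has arrived.
            while chunk := conn.recv(1024):
                received += chunk
            # The passively closing side can still send after receiving FIN.
            conn.sendall(b"got " + received)
        # Leaving the with-block closes the socket, which sends the server's own FIN.

    worker = threading.Thread(target=serve)
    worker.start()

    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"ping")
        client.shutdown(socket.SHUT_WR)  # sends FIN: "no more data from this side"
        reply = b""
        # The actively closing side can still receive after sending FIN.
        while chunk := client.recv(1024):
            reply += chunk

    worker.join()
    listener.close()
    return reply
```

Calling `half_close_demo()` should return the server's reply even though the client had already sent its FIN, which is the point of closing each direction separately.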
4. What is Connection-oriented, Connectionless?
- Connection-oriented: Connection-oriented protocols require ensuring that both parties are ready before sending data, and then proceed with communication.
- Connectionless: Connectionless protocols do not require this; they send whenever they want.
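The difference can be seen with two small socket experiments. This is only a sketch, assuming loopback networking is available; the function names `udp_send_without_connection` and `tcp_connect_without_listener` are invented for this example.

```python
import socket


def udp_send_without_connection() -> bytes:
    """UDP: the sender transmits immediately, with no handshake beforehand."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(2.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(b"hello", receiver.getsockname())  # no connect(), no accept()
        data, _ = receiver.recvfrom(1024)
        return data
    finally:
        sender.close()
        receiver.close()


def tcp_connect_without_listener() -> bool:
    """TCP: returns True only if a connection was established."""
    placeholder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    placeholder.bind(("127.0.0.1", 0))  # reserves a port but never calls listen()
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(2.0)
    try:
        client.connect(placeholder.getsockname())  # nobody is ready to accept
        return True
    except OSError:
        return False  # refused or timed out: no connection, so no data can be sent
    finally:
        client.close()
        placeholder.close()
```

The UDP datagram goes out without any prior agreement, while the TCP client cannot send anything until the other side is ready to accept the connection.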
5. What is Full Duplex?
Full Duplex is a communication method where both parties can simultaneously send and receive data without needing to switch between sending and receiving like Half Duplex. In Full Duplex communication, data can be transmitted in both directions at the same time, resulting in faster communication speed and higher efficiency.
6. Correspondence between OSI and Packet Details
To make it clearer, Hong Ge brought over a diagram from the previous article for explanation and discussion.
7. Specific Content of TCP Packets
From the diagram below, you can see each field captured in a TCP packet by Wireshark.
8. TCP Packet Format
TCP is a connection-oriented, reliable transport protocol with a complex packet format. The TCP packet format is as follows:
The diagram above is simplified as follows:
Note: Actual TCP segments may vary depending on the length and options of the TCP header.
| Source Port (16 bits) | Destination Port (16 bits) |
| Sequence Number (32 bits) |
| Acknowledgment Number (32 bits) |
| Data Offset (4 bits) | Reserved (6 bits) | Flags (6 bits) | Window Size (16 bits) |
| Checksum (16 bits) | Urgent Pointer (16 bits) |
| Options (optional) |
| Data (optional) |
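To make the layout concrete, here is a minimal sketch that unpacks the fixed 20-byte part of a TCP header with Python's `struct` module. It assumes the field layout shown above (6 reserved bits followed by 6 flag bits); the names `TcpHeader`, `parse_tcp_header`, `flag_names`, and `FLAG_BITS` are made up for this example and do not come from any library.

```python
import struct
from typing import List, NamedTuple

# Flag bit values inside the low 6 bits of the 16-bit offset/reserved/flags word.
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08, "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    data_offset: int      # header length in 32-bit words
    flags: int            # low 6 bits: URG ACK PSH RST SYN FIN
    window: int
    checksum: int
    urgent_pointer: int


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Parse the fixed 20-byte header; options and data are left untouched."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    # "!" = network byte order; H = 16-bit field, I = 32-bit field.
    src, dst, seq, ack, off_flags, window, checksum, urgent = struct.unpack(
        "!HHIIHHHH", segment[:20]
    )
    data_offset = off_flags >> 12   # top 4 bits
    flags = off_flags & 0x3F        # bottom 6 bits
    return TcpHeader(src, dst, seq, ack, data_offset, flags, window, checksum, urgent)


def flag_names(flags: int) -> List[str]:
    """Return the names of the flag bits that are set, in header order."""
    return [name for name, bit in FLAG_BITS.items() if flags & bit]
```

For example, a header built with `struct.pack("!HHIIHHHH", 50000, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)` parses to a Data Offset of 5 (a 20-byte header) with only SYN set.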
Main Field Descriptions:
- Source Port: Occupies 2 bytes, identifies the application that sent the packet.
- Destination Port: Occupies 2 bytes, identifies the application to which the packet is sent. Port numbers distinguish between processes on the same host, while IP addresses distinguish between hosts. Together, the source and destination port numbers and the source and destination IP addresses in the IP header uniquely identify a TCP connection.
- Sequence Number: Occupies 4 bytes. Every byte of the data stream in a TCP connection is numbered, and the value of this field is the sequence number of the first data byte carried in this segment. It lets the receiver put segments back in order, so it mainly solves the problem of out-of-order delivery in the network.
- Acknowledgment Number: Occupies 4 bytes. It is the sequence number of the first byte that the sender of this segment expects to receive next from the other party, so it equals the sequence number of the last successfully received data byte plus one. The field is valid only when the ACK flag (described below) is 1. It is mainly used to solve the packet loss problem.
- Data Offset (Header Length): Occupies 4 bits and gives the length of the TCP header in 32-bit words. The field is necessary because the length of the options is variable. The largest 4-bit value is 15, so the TCP header can be at most 15 × 4 = 60 bytes; without options, its normal length is 20 bytes.
- Header length in practice: The TCP header is between 20 and 60 bytes long because the Options field (for example MSS or window scale) varies in length. The actual header length in bytes is the value of the Data Offset (Header Length) field multiplied by 4.
- Flags: Occupy 6 bits. Multiple flags can be set to 1 at the same time, and they are primarily used to control the TCP state machine. They are URG, ACK, PSH, RST, SYN, and FIN. The meaning of each flag bit is as follows:
- URG: Indicates that the Urgent Pointer field is valid, meaning the segment carries urgent data that the receiver should handle ahead of ordinary data.
- ACK: When this bit is 1, the "Acknowledgment Number" field is valid; when it is 0, the field is ignored. TCP requires that every segment sent after the connection is established has ACK set to 1.
- PSH: This flag bit indicates a Push operation. A Push operation refers to sending the data packet directly to the application program as soon as it reaches the receiving end, instead of queuing it in the buffer.
- RST: When this bit is 1, a serious error has occurred on the TCP connection, so the connection must be torn down immediately and, if needed, a new one established. RST is also used to reset erroneous connections and to reject erroneous or illegal segments.
- SYN: When this bit is 1, the sender wants to establish a connection, and the "Sequence Number" field carries the initial sequence number so that both sides can synchronize sequence numbers. The SYN flag works together with the ACK flag: a connection request has SYN=1, ACK=0, and the segment that accepts the request has SYN=1, ACK=1. SYN segments are also often used for port scanning. The scanner sends a segment with only SYN set; if the target host replies with SYN=1, ACK=1, the port is open. Because such a scan performs only the first step of TCP's three-way handshake and never completes the connection, it is commonly called a half-open scan.
- FIN: When this bit is 1, the sender has finished sending data and wants to release the connection. It ends data transmission in the sender's direction only; the connection is fully closed once both sides have sent a FIN and each FIN has been acknowledged. FIN segments are also often used for port scanning.
- Window: Occupies 2 bytes and is used for flow control. Each side of the communication advertises a window that reflects how much data it can currently accept, which keeps the peer from sending too fast. The value is a byte count that starts from the sequence number in the Acknowledgment Number field, and a 16-bit field allows at most 65535 (2^16-1) bytes. If both the client and the server TCPs support it, the Options field can carry a window scale factor that raises this limit. In this example, a window size of 65535 tells the sender that, starting from the next segment with sequence number 0, the receiving end can accept at most 65535 bytes (leaving aside the window scale option).
- Checksum: Occupies 2 bytes and verifies that the segment arrived complete and unaltered. It is computed over the entire TCP segment, including the TCP header and the data. The field is mandatory: the sender must calculate and store it, and the receiver must verify it. As with UDP datagrams, a 12-byte pseudo-header is placed in front of the TCP segment when the checksum is calculated. The pseudo-header has the same format as the one used for UDP, except that the protocol number in the fourth field changes from 17 to 6 (the protocol number of TCP) and the UDP length in the fifth field becomes the TCP length. The receiver adds the pseudo-header again when it verifies the checksum. With IPv6, the pseudo-header changes accordingly. Segments that fail the checksum are discarded (fields such as the source IP address, source port number, or protocol may have been corrupted).
- Urgent Pointer: Occupies 2 bytes and is meaningful only when the URG flag is set to 1. It is an offset that, added to the Sequence Number, gives the sequence number of the last byte of urgent data.
- Options: Variable length, up to 40 bytes. When no options are used, the TCP header length is 20 bytes. The maximum follows from the Data Offset field: 4 bits counted in units of 4 bytes give a header of at most (2^4-1)*4 = 60 bytes, which leaves (2^4-1)*4-20 = 40 bytes for options. Initially, the TCP protocol specified only one option, the Maximum Segment Size (MSS). MSS is the maximum length of the data field in a segment; it does not include the TCP header. MSS tells the other party's TCP: "the maximum length of the segment data field my buffer can receive is MSS bytes."
- Padding: Added to make the TCP header length a multiple of 4 bytes. When the options do not add up to an integer multiple of 32 bits, padding bits are appended so that the TCP header is an integer multiple of 32 bits.
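The checksum calculation with the 12-byte pseudo-header can be sketched as follows. This is an illustrative sketch for IPv4 only, not a reference implementation; the function names `ones_complement_sum16` and `tcp_checksum` are assumptions made for this example, and the caller is expected to pass a segment whose Checksum field is zero when computing a fresh value.

```python
import socket
import struct


def ones_complement_sum16(data: bytes) -> int:
    """Add the data up as 16-bit big-endian words, folding carries back in."""
    if len(data) % 2:
        data += b"\x00"                 # pad odd-length data with one zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:                  # fold any carry into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over pseudo-header + TCP header + data (IPv4)."""
    pseudo_header = struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip),       # field 1: source IP address
        socket.inet_aton(dst_ip),       # field 2: destination IP address
        0,                              # field 3: all zeros
        6,                              # field 4: protocol number of TCP
        len(segment),                   # field 5: TCP length (header + data)
    )
    return ~ones_complement_sum16(pseudo_header + segment) & 0xFFFF
```

A receiver can run the same function over the segment exactly as it arrived, with the Checksum field filled in: a result of 0 means the check passed.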
8.1 Table Display of TCP Packet Field Descriptions
Some people prefer tables, so Hong Ge has provided one!
Field | Length | Meaning |
---|---|---|
Source Port | 16 bits | Source port, identifies which application sent the data. |
Destination Port | 16 bits | Destination port, identifies which application receives the data. |
Sequence Number | 32 bits | Sequence field. Each byte in the data stream of a TCP connection is numbered. The value in this field indicates the sequence number of the first byte of data sent in this segment. |
Acknowledgment Number | 32 bits | Acknowledgment number, the sequence number of the first byte expected to be received in the next segment from the other party, i.e., the sequence number of the last byte of data successfully received plus 1. This field is valid only when the ACK flag is set to 1. |
Data Offset | 4 bits | Data offset, i.e., header length, indicating how far the start of the data in the TCP segment is from the start of the TCP segment, calculated in units of 32 bits (4 bytes). A header has a maximum of 60 bytes, but is usually 20 bytes when there are no option fields. |
Reserved | 6 bits | Reserved, must be set to 0. |
URG | 1 bit | Indicates if the urgent pointer is valid. It informs the system that there is urgent data in this segment that needs immediate attention (equivalent to high-priority data). |
ACK | 1 bit | Indicates if the acknowledgment number is valid. The Acknowledgment Number field is valid only when the ACK bit is set to 1. The TCP protocol requires that all packets sent after the connection is established have the ACK bit set to 1. |
PSH | 1 bit | Indicates that the receiving end should immediately deliver this segment to the Application Layer. When receiving a TCP packet with PSH=1, it should be delivered to the receiving application process promptly, rather than waiting until the entire buffer is full before delivering. |
RST | 1 bit | Indicates a request to re-establish the connection. When RST=1, it means a serious issue occurred within the TCP connection (due to a crash or other reasons), requiring the connection to be released and a new one established. |
SYN | 1 bit | Synchronize sequence numbers, used to initiate a connection. SYN=1 indicates this is a connection request or a connection-accept segment. |
FIN | 1 bit | Marks the end of data transmission by the sender, used to release a connection. FIN=1 indicates that this segment's sender has finished sending data and requests to release the connection. |
Window | 16 bits | Window: TCP's flow control. The window begins at the value indicated by the Acknowledgment Number field and is the number of bytes the receiving end is willing to accept. The maximum window is 65535 bytes. |
Checksum | 16 bits | Checksum covers the TCP header and TCP data, and is a mandatory field calculated and stored by the sender, with verification by the receiver. Adding a 12-byte pseudo-header before the TCP segment is necessary for Checksum calculation. |
Urgent Pointer | 16 bits | Urgent Pointer, effective when the URG flag is set to 1. It signifies a means for the sender to send urgent data to the other end. The Urgent Pointer specifies how many bytes of urgent data there are in the segment, which are placed at the beginning of the segment's data. |
Options | Variable | Option field. The initial TCP protocol specifies only one option, the maximum segment size (MSS). MSS tells the other party's TCP the maximum length of the segment's data field that the local buffer can receive, in bytes. |
Padding | Variable | The Padding field is used to add bits to make the entire header length a multiple of four bytes. |
Data | Variable | TCP payload. |
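To show how the variable-length Options field is read, here is a minimal sketch of an option walker. It relies on the standard option kinds 0 (End of Option List), 1 (No-Operation), 2 (MSS, 4 bytes in total) and 3 (window scale, 3 bytes in total); the function name `parse_tcp_options` and the dictionary keys are invented for this example.

```python
import struct
from typing import Any, Dict


def parse_tcp_options(options: bytes) -> Dict[str, Any]:
    """Walk the option bytes that follow the fixed 20-byte TCP header."""
    result: Dict[str, Any] = {}
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:                   # End of Option List: the rest is padding
            break
        if kind == 1:                   # No-Operation: a single byte used for alignment
            i += 1
            continue
        if i + 1 >= len(options):
            raise ValueError("truncated option")
        length = options[i + 1]         # total option length: kind + length + value
        if length < 2 or i + length > len(options):
            raise ValueError("malformed option length")
        value = options[i + 2 : i + length]
        if kind == 2 and length == 4:   # Maximum Segment Size
            result["mss"] = struct.unpack("!H", value)[0]
        elif kind == 3 and length == 3: # window scale shift count
            result["window_scale"] = value[0]
        else:                           # keep anything else as raw bytes
            result.setdefault("other", []).append((kind, value))
        i += length
    return result
```

For the option bytes `02 04 05 b4 01 03 03 07`, this yields an MSS of 1460 and a window scale of 7. With that scale factor, an advertised Window value of 65535 stands for `65535 << 7` bytes rather than 65535.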
9. Summary
Today's article gave a detailed introduction to the theory behind TCP packets. It is mostly text, which makes it somewhat difficult to digest. In the next article, Hong Ge plans to cover TCP's three-way handshake and four-way wave, along with hands-on practice in Wireshark. Well, it's getting late, so Hong Ge will end today's explanation and sharing here. Thank you for your patient reading, and I hope it was helpful to you.