Preface
Network engineers are generally familiar with the principles of PMTU and what it offers the network. To those less familiar, PMTU Discovery can sound too good to be true: the path MTU is detected automatically, so end-to-end transmission naturally avoids packet fragmentation. The concept is sound in theory, but reality is often more complicated. Environmental factors and network complexity can prevent PMTU Discovery from working as smoothly as intended.
This article briefly presents a PMTU case. It is worth recording for two reasons. First, such a case is relatively rare in real environments: either the whole process goes smoothly, or the path MTU problem causes packet loss and connection interruption. Second, it is not easy to obtain an actual packet capture of it.
PMTU Concept
Path MTU Discovery (PMTUD), in simple terms, determines the smallest MTU (Maximum Transmission Unit) along the end-to-end path. That smallest link MTU is also the largest packet that can cross the whole path without fragmentation, which is why it is both a minimum and a maximum; a diagram explains it most clearly.
Each segment has its own MTU limit, and end-to-end transmission may pass through many nodes with different MTU limits. If the DF (Don’t Fragment) flag is set, a 1500-byte packet will be dropped at the hop whose MTU is 1200, and a 1100-byte packet will be dropped at the hop whose MTU is 1000.
How can the minimum MTU in the path, the value 1000 in the diagram, be detected automatically? This is the job of PMTU Discovery. The sender sets the Don’t Fragment (DF) flag in the IP header. If a device on the path has a next-hop MTU smaller than the packet length and sees the DF flag, it drops the packet and sends back an ICMP Destination Unreachable message (type 3, code 4, ICMP_FRAG_NEEDED, "fragmentation needed") that carries the acceptable MTU value. The sending host then lowers its estimate of the path MTU, so that the packets it sends fit the smallest MTU on the path.
As mentioned at the beginning, PMTU Discovery does not always work well in practice. The simple reason is that it relies entirely on ICMP. If the intermediate device does not send the ICMP message, or the message is blocked by a security device on the way back, the source never receives it and cannot determine the minimum MTU. Take a TCP connection as an example: it may keep retransmitting the packet, and if it still gets no response, the connection is eventually torn down.
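The drop behaviour described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the list of link MTUs is an assumed path in which only the values 1200 and 1000 come from the text.

```python
from typing import List, Optional


def path_mtu(link_mtus: List[int]) -> int:
    """The path MTU is the smallest link MTU along the path."""
    return min(link_mtus)


def first_drop(packet_len: int, link_mtus: List[int]) -> Optional[int]:
    """Return the index of the first link that drops a DF packet, or None."""
    for index, mtu in enumerate(link_mtus):
        if packet_len > mtu:
            return index  # too large for this link and DF forbids fragmenting
    return None  # the packet fits every link


# Assumed path; only 1200 and 1000 are taken from the article.
links = [1500, 1200, 1500, 1000]

print(path_mtu(links))          # 1000
print(first_drop(1500, links))  # 1 (the MTU 1200 link)
print(first_drop(1100, links))  # 3 (the MTU 1000 link)
print(first_drop(1000, links))  # None
```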
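To make the ICMP message more concrete, here is a minimal sketch of its 8-byte header as laid out in RFC 1191: type, code, checksum, an unused field, and the next-hop MTU. The helper names are hypothetical, and the quoted original packet is reduced to placeholder bytes.

```python
import struct
from typing import Optional

ICMP_DEST_UNREACH = 3  # ICMP type: Destination Unreachable
ICMP_FRAG_NEEDED = 4   # ICMP code: fragmentation needed and DF set


def internet_checksum(data: bytes) -> int:
    """Standard 16-bit one's-complement Internet checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_frag_needed(next_hop_mtu: int, quoted: bytes = b"") -> bytes:
    """Build type 3 / code 4: type, code, checksum, unused, next-hop MTU."""
    header = struct.pack("!BBHHH", ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
                         0, 0, next_hop_mtu)
    checksum = internet_checksum(header + quoted)
    header = struct.pack("!BBHHH", ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
                         checksum, 0, next_hop_mtu)
    return header + quoted


def parse_next_hop_mtu(icmp: bytes) -> Optional[int]:
    """Return the advertised next-hop MTU, or None for any other message."""
    if len(icmp) < 8:
        return None
    icmp_type, code, _checksum, _unused, mtu = struct.unpack("!BBHHH", icmp[:8])
    if icmp_type == ICMP_DEST_UNREACH and code == ICMP_FRAG_NEEDED:
        return mtu
    return None


# Placeholder bytes stand in for the quoted original IP header and data.
message = build_frag_needed(1410, quoted=b"\x45\x00" + bytes(26))
print(parse_next_hop_mtu(message))  # 1410
```

A receiver can validate such a message by recomputing the checksum over the whole ICMP message; for an intact message the result is 0.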
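Both the mechanism and its failure mode can be illustrated with a toy simulation. This is a sketch under stated assumptions, not how any real TCP stack is implemented: the path, the function names, and the retry limit are all hypothetical.

```python
from typing import List, Optional, Tuple


def probe(packet_len: int, link_mtus: List[int],
          icmp_blocked: bool) -> Tuple[str, Optional[int]]:
    """Send one DF packet and report what the sender observes."""
    for mtu in link_mtus:
        if packet_len > mtu:
            if icmp_blocked:
                return ("timeout", None)  # dropped, and no ICMP gets back
            return ("frag_needed", mtu)   # ICMP type 3 code 4 with the MTU
    return ("delivered", None)


def discover(initial_len: int, link_mtus: List[int],
             icmp_blocked: bool = False, max_retries: int = 3) -> Optional[int]:
    """Return the packet size that gets through, or None if the sender gives up."""
    size = initial_len
    retries = 0
    while retries <= max_retries:
        status, mtu = probe(size, link_mtus, icmp_blocked)
        if status == "delivered":
            return size
        if status == "frag_needed":
            size = mtu    # shrink to the advertised next-hop MTU
        else:
            retries += 1  # blind retransmission of the same size
    return None


links = [1500, 1200, 1500, 1000]  # assumed path, as in the diagram's 1200/1000
print(discover(1500, links))                     # 1000
print(discover(1500, links, icmp_blocked=True))  # None
```

With ICMP blocked, the sender never learns why its packets vanish, which is the black-hole situation described above.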
Case Information
The basic information of the packet trace file is as follows:
λ capinfos PMTU.pcapng
File name: PMTU.pcapng
File type: Wireshark/... - pcapng
File encapsulation: Ethernet
File timestamp precision: microseconds (6)
Packet size limit: file hdr: (not set)
Number of packets: 5908
File size: 6296 kB
Data size: 6104 kB
Capture duration: 14.336218 seconds
First packet time: 2022-07-14 12:54:13.944957
Last packet time: 2022-07-14 12:54:28.281175
Data byte rate: 425 kBps
Data bit rate: 3406 kbps
Average packet size: 1033.18 bytes
Average packet rate: 412 packets/s
SHA256: ...
RIPEMD160: ...
SHA1: ...
Strict time order: False
Capture hardware: Intel(R) Core(TM) ...
Capture oper-sys: 64-bit Windows 7 Service Pack 1...
Capture application: Dumpcap (Wireshark) 3.2.18 (v3.2.18-0-gddf8072b7671)
Number of interfaces in file: 1
Interface #0 info:
Name = \Device\NPF_{...}
Description = 无线网络连接
Encapsulation = Ethernet (1 - ether)
Capture length = 262144
Time precision = microseconds (6)
Time ticks per second = 1000000
Time resolution = 0x06
Operating system = 64-bit Windows 7 Service Pack 1...
Number of stat entries = 1
Number of packets = 5908
λ
The trace was captured with Wireshark v3.2.18 on Windows 7 SP1: 5908 packets over about 14.3 seconds, at an average rate of 3406 kbps. The conversation statistics show only one flow, which suggests that the trace file has been filtered.
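As a sanity check, the summary figures can be recomputed from the capinfos output above. The sketch below uses only values printed there; because the data size is derived from the rounded average packet size, the results are approximate, and the integer part of each rate matches the printed value.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
first = datetime.strptime("2022-07-14 12:54:13.944957", FMT)
last = datetime.strptime("2022-07-14 12:54:28.281175", FMT)
duration = (last - first).total_seconds()  # capture duration in seconds

packets = 5908
avg_packet_size = 1033.18  # bytes, as printed by capinfos (rounded)

data_bytes = packets * avg_packet_size  # approximate data size
byte_rate = data_bytes / duration
bit_rate = byte_rate * 8
packet_rate = packets / duration

print(f"duration    : {duration:.6f} s")
print(f"data size   : {data_bytes / 1000:.0f} kB")
print(f"byte rate   : {byte_rate / 1000:.2f} kBps")
print(f"bit rate    : {bit_rate / 1000:.2f} kbps")
print(f"packet rate : {packet_rate:.2f} packets/s")
```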
The expert information looks normal overall, and the number of Warning messages is negligible.
Case Study
The packet trace file actually expands to the following:
The main analysis is as follows:
1. Packets No.1-3 are the TCP three-way handshake between Client and Server. MSS 1400 in the SYN/ACK indicates that the MTU on the server side is 1440 (1400 + 20 TCP header + 20 IP header), which means that the maximum frame length between the two parties is 1454 (1440 + 14 Ethernet header). IRTT is about 103 ms; SACK and WS (window scaling) are supported.
2. Packet No.3 has a length of 54, which is less than 60, indicating that the trace file was captured directly on the Client.
3. Packet No.6: the Client sends a packet with the maximum length of 1454.
4. Packet No.7: an ICMP message is returned from the node 192.168.191.69, indicating that the destination is unreachable (Fragmentation needed).
Data packet No.7 is an ICMP message of type 3 and code 4, indicating fragmentation needed. The message contains the acceptable MTU value of 1410. The subsequent IPv4 header, TCP header and HTTP data are part of the original data packet No.6, and the relevant information can correspond one-to-one with No.6.
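Note that the TCP payload quoted in the ICMP message is 508 bytes, which makes packet No.7 590 bytes in total. After deducting the 14-byte Ethernet header, the IP datagram in No.7 is 576 bytes long (20-byte IP header + 8-byte ICMP header + 20-byte quoted IP header + 20-byte quoted TCP header + 508 bytes of payload). 576 bytes is the minimum datagram size that every IPv4 host must be able to accept, and it is also the length that RFC 1812 says an ICMP error message should not exceed.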
5. After receiving packet No.7, the Client learns that the next-hop MTU is 1410, so it reduces its segment size to 1370 (1410 - 40) and sends packet No.9.
No.9 is 1424 bytes long (1370 + 20 + 20 + 14), and No.11-12 are also 1424 bytes long. The Client Seq advances 230 -> 1600 -> 2970 -> 4340, and the Server acknowledges with ACK num 4340 in No.13. At this point PMTU Discovery has done its job, the data exchange is complete, and the connection is back to normal.
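The size arithmetic in the steps above can be verified with a short sketch. The helper names are hypothetical, and the header sizes assume Ethernet II framing with IPv4 and TCP headers that carry no options on the data packets, which is what the lengths in the trace imply.

```python
ETH_HEADER = 14   # Ethernet II header, no VLAN tag
IP_HEADER = 20    # IPv4 header without options
TCP_HEADER = 20   # TCP header without options
ICMP_HEADER = 8   # ICMP Destination Unreachable header


def mtu_for_mss(mss: int) -> int:
    """IP datagram size needed to carry one full-sized segment."""
    return mss + TCP_HEADER + IP_HEADER


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one datagram of the given MTU."""
    return mtu - IP_HEADER - TCP_HEADER


def frame_for_mss(mss: int) -> int:
    """Frame length as shown by Wireshark for a full-sized segment."""
    return mtu_for_mss(mss) + ETH_HEADER


# Handshake: MSS 1400 -> MTU 1440 -> frame length 1454 (packet No.6)
print(mtu_for_mss(1400), frame_for_mss(1400))  # 1440 1454

# ICMP error No.7: outer IP + ICMP + quoted IP + quoted TCP + 508 bytes payload
icmp_ip_len = IP_HEADER + ICMP_HEADER + IP_HEADER + TCP_HEADER + 508
print(icmp_ip_len, icmp_ip_len + ETH_HEADER)   # 576 590

# After the ICMP message: next-hop MTU 1410 -> segment size 1370 -> frame 1424
new_mss = mss_for_mtu(1410)
print(new_mss, frame_for_mss(new_mss))         # 1370 1424

# Client sequence numbers for three full segments starting at Seq 230
print([230 + i * new_mss for i in range(4)])   # [230, 1600, 2970, 4340]
```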
Note that because No.6 never reached the Server due to the MTU limit, the Client re-segmented the data according to the new MTU. Wireshark’s analysis logic does not account for this: because of the Seq of No.6, it mistakenly marks No.9 as out-of-order and No.11 as a retransmission. In fact, they are not. You can ignore packets No.6-7 and observe again, as shown below. Everything is normal.
Case Summary
In daily work, MTU limits can cause problems that are far more complicated than expected. With a solid grasp of the basic principles and a calm approach, however, they can be resolved.