Server Response Disconnection Analysis: Causes and Solutions

Background

In network troubleshooting, users often report that server responses are intermittently cut off. The simplified network topology for this case is:
Client — Firewall (NAT/PAT) — Internet — Firewall — Load Balancer — Web Server

The following example comes from a discussion on the official Wireshark Q&A forum.

Problem Analysis

The capture file was taken on a web server. What could the problem be?

Normal flow


A relatively short exchange with no abnormalities.

Problem Flow

The problem stream is full of packets colored as Bad TCP and TCP RST, and it shows some very unusual behavior, analyzed briefly below:

1. Server SYN/ACK retransmission

The first three packets in the stream (frames 70-72) show that the TCP three-way handshake has actually completed, yet the server still retransmits its SYN/ACK after 1.3 s. This is a very strange retransmission. If, as the user said, the capture was taken on the server, then the server clearly received the client's third-step ACK but still treated the handshake as incomplete and retransmitted the SYN/ACK.

Based on this behavior and previous experience, my own view was that the state of this connection on the server should be checked further, for example whether it is ESTABLISHED. The experts on the forum, however, pointed to an option that can cause exactly this behavior: TCP_DEFER_ACCEPT.

❤️ Thanks to the experts for opening up a new world of packet analysis. The behavior in this capture file does match that scenario.

The Linux man page tcp(7) describes the option as follows:

TCP_DEFER_ACCEPT (since Linux 2.4): Allow a listener to be awakened only when data arrives on the socket. Takes an integer value (seconds), this can bound the maximum number of attempts TCP will make to complete the connection. This option should not be used in code intended to be portable.

If this option is enabled on a Linux server, the server does not enter the ESTABLISHED state after receiving the final ACK of the handshake. Instead it ignores that ACK, stays in the SYN_RECV state, and waits for the client to send data. If the client's data does not follow quickly, the server, still in SYN_RECV, retransmits the SYN/ACK when its timer expires.
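To make the option concrete, here is a minimal sketch of how a listener might enable TCP_DEFER_ACCEPT from Python. The function name `make_deferred_listener`, the 5-second value, and the loopback address are illustrative assumptions, not details from the capture or from the server in this case. The constant `socket.TCP_DEFER_ACCEPT` is only exposed on Linux, so the sketch falls back to a plain listener elsewhere.

```python
import socket


def make_deferred_listener(host="127.0.0.1", port=0, defer_seconds=5):
    """Create a listening TCP socket and enable TCP_DEFER_ACCEPT if available.

    Returns (socket, effective_value); effective_value is None when the
    option is not supported on this platform.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

    # The constant exists only on Linux builds of Python.
    opt = getattr(socket, "TCP_DEFER_ACCEPT", None)
    effective = None
    if opt is not None:
        try:
            srv.setsockopt(socket.IPPROTO_TCP, opt, defer_seconds)
            # The kernel may round the value, so the read-back can differ
            # from what was requested.
            effective = srv.getsockopt(socket.IPPROTO_TCP, opt)
        except OSError:
            effective = None  # option rejected; continue as a normal listener

    srv.bind((host, port))
    srv.listen(16)
    return srv, effective


if __name__ == "__main__":
    listener, value = make_deferred_listener()
    print("listening on", listener.getsockname(), "defer value:", value)
    listener.close()
```

With the option active, `accept()` on this socket would not return for a client that completes the handshake but sends no data, which is the window in which the SYN/ACK retransmission described above can appear.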

In the normal flow, the client sends its GET request at about 82 ms. In the problem flow, the client does not send the GET request until about 1.5 s, which is why the SYN/ACK was retransmitted. Why the client waited that long cannot be judged from this capture alone; it would require captures taken before and after the load balancer, which are missing here.

2. Server RST

After receiving the client's GET request, the server immediately resets (RST) the connection. To see why, check the TCP sequence and acknowledgment numbers. With the TCP preference "Relative sequence numbers" turned off, the ACK number 2106390967 in frame 74 from the client is clearly different from the 2962498563 implied by frame 72 or 73, and is nowhere near the server's ISN. The server therefore discards this packet and sends the RST in frame 75.
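The acknowledgment check can be sketched with 32-bit modular arithmetic. The rule used here is the RFC 793 acceptability test (SND.UNA < SEG.ACK <= SND.NXT). The two ACK numbers are taken from the text above, while SND.UNA = 2962498562 is an assumption (the expected ACK minus one, that is, the server's ISN if the SYN/ACK is the only thing outstanding).

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def seq_diff(a, b):
    """Signed distance from b to a in 32-bit sequence space."""
    d = (a - b) % MOD
    return d - MOD if d >= MOD // 2 else d


def ack_acceptable(snd_una, seg_ack, snd_nxt):
    """RFC 793 test: SND.UNA < SEG.ACK <= SND.NXT (modulo 2**32)."""
    return seq_diff(seg_ack, snd_una) > 0 and seq_diff(seg_ack, snd_nxt) <= 0


if __name__ == "__main__":
    snd_nxt = 2962498563      # ACK number the server expects (from the text)
    snd_una = snd_nxt - 1     # assumption: only the SYN/ACK is unacknowledged
    frame74_ack = 2106390967  # ACK number seen in frame 74 (from the text)

    print("distance from expected ACK:", seq_diff(frame74_ack, snd_nxt))
    print("frame 74 ACK acceptable:", ack_acceptable(snd_una, frame74_ack, snd_nxt))
    print("expected ACK acceptable:", ack_acceptable(snd_una, snd_nxt, snd_nxt))
```

Under these assumptions the frame 74 ACK is about 856 million sequence numbers away from the expected value, so it fails the test. RFC 793 specifies that a connection in the SYN-RECEIVED state answers an unacceptable ACK with a reset.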

From frame 76 to the end, the client also behaves strangely: it keeps sending ACKs after receiving the RST. One possible explanation is that the RST's sequence number falls outside the range the client considers valid, so the client ignores the RST.
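How a receiver might treat such an RST can be sketched using the rules in RFC 5961, section 3.2. This assumes the client's stack follows that RFC, which the capture does not confirm. The function name `classify_rst` and all numbers in the usage lines are hypothetical, since the text does not give the sequence number of frame 75.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def classify_rst(rcv_nxt, rcv_wnd, seg_seq):
    """Classify an incoming RST according to RFC 5961, section 3.2.

    rcv_nxt: next sequence number the receiver expects
    rcv_wnd: receive window size in bytes
    seg_seq: sequence number carried by the RST segment
    """
    offset = (seg_seq - rcv_nxt) % MOD
    if offset == 0:
        return "reset"          # exact match: the connection is reset
    if offset < rcv_wnd:
        return "challenge_ack"  # in window but not exact: reply with an ACK
    return "drop"               # outside the window: silently discarded


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(classify_rst(rcv_nxt=1000, rcv_wnd=65535, seg_seq=1000))
    print(classify_rst(rcv_nxt=1000, rcv_wnd=65535, seg_seq=1500))
    print(classify_rst(rcv_nxt=1000, rcv_wnd=65535, seg_seq=900_000_000))
```

Only the first case tears the connection down. In the other two the connection survives on the receiver, and which branch applies in this capture cannot be told without the actual numbers from frame 75.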

After retransmitting its GET request repeatedly without a response, the client sends a FIN/ACK to close the connection.

Summary of the problem

The behavior in this case is rather strange. Without captures from other points, such as on the client, before and after the load balancer, and before and after the firewalls, the root cause cannot be determined.

PS: This is only my guess. Setting aside the behavior caused by TCP_DEFER_ACCEPT on the server, the problem may lie with the load balancer garbling the ACK number translation.