TCP Retransmission Troubleshooting: Diagnosing Network Delays

Introduction to TCP Retransmission Troubleshooting

TCP retransmission troubleshooting is essential for diagnosing network problems that affect performance. When a message isn’t delivered, or an acknowledgment (ACK) doesn’t arrive in time, retransmissions occur, potentially leading to slow network speeds and degraded throughput. By leveraging tools like Wireshark to analyze client or server traffic, you can uncover the underlying causes of retransmissions and resolve TCP-related issues effectively. This guide covers various diagnostic methods and solutions through practical packet capture examples.


Diagnostic Process for TCP Retransmissions:

  1. Start capturing data on the relevant port.
  2. Open the Analyze | Expert Info menu.
  3. Under Notes, look for Retransmission.
  4. Click the (+) symbol to expand the retransmission list. Click any line to jump to the corresponding retransmitted packet in the packet list pane.
  5. Next, locate the source of the problem.
  6. See where the retransmissions are coming from, using one of the following:
  • Go through the entries one by one in the Expert Info window and check which packets are retransmitted in the packet list pane (suitable for experienced users).
  • Apply the display filter expert.message == "Retransmission (suspected)" to see all retransmissions in the capture file. The quotation marks must be straight quotes. The exact field name and message text vary between Wireshark versions (newer releases use _ws.expert.message); the filter tcp.analysis.retransmission is an alternative that does not depend on the message text.
  • To restrict the statistics to the filtered packets, select Limit to display filter in the Statistics | Conversations window.
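
To make clear what a retransmission flag means, here is a minimal Python sketch of the core idea: a data segment is suspected to be a retransmission when the same flow has already carried data with the same sequence number. This is a deliberate simplification and not Wireshark's actual heuristic, which also considers timing, ACKs, and keep-alives. The `Segment` record, the function name, and the sample addresses are hypothetical.

```python
from collections import namedtuple

# Hypothetical record type: one TCP segment as exported from a capture.
Segment = namedtuple("Segment", "src sport dst dport seq length")


def find_suspected_retransmissions(segments):
    """Return data segments whose (flow, seq) pair was already seen."""
    seen = set()
    suspected = []
    for seg in segments:
        if seg.length == 0:
            continue  # pure ACKs carry no data, so they are skipped here
        key = (seg.src, seg.sport, seg.dst, seg.dport, seg.seq)
        if key in seen:
            suspected.append(seg)
        else:
            seen.add(key)
    return suspected


# Made-up sample data: the third segment repeats sequence number 1000.
capture = [
    Segment("10.0.0.5", 50000, "192.0.2.10", 80, 1000, 500),
    Segment("10.0.0.5", 50000, "192.0.2.10", 80, 1500, 500),
    Segment("10.0.0.5", 50000, "192.0.2.10", 80, 1000, 500),
]
for seg in find_suspected_retransmissions(capture):
    print(f"suspected retransmission: {seg.src}:{seg.sport} -> "
          f"{seg.dst}:{seg.dport} seq={seg.seq}")
```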

Case 1: Retransmission Across Multiple Destinations

In the following screenshot, you can see multiple retransmissions distributed across several servers, all with destination port 80 (HTTP). The retransmissions are all sent from IP address 10.0.0.5, so either the packets are being lost on the way to the Internet, or the acknowledgments from the web servers are not coming back in time.

[Screenshot: TCP Retransmission Troubleshooting 1]
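
Whether retransmissions are spread over many destinations or concentrated on one can also be checked by grouping the filtered packets. The sketch below assumes the retransmitted packets have already been exported as (source IP, destination IP, destination port) tuples; the function names and the sample addresses other than 10.0.0.5 are hypothetical.

```python
from collections import Counter


def summarize_retransmissions(retransmissions):
    """Count retransmissions per source IP and per (dst IP, dst port)."""
    by_source = Counter(src for src, _dst, _dport in retransmissions)
    by_destination = Counter((dst, dport) for _src, dst, dport in retransmissions)
    return by_source, by_destination


def describe_spread(retransmissions):
    """Give a rough label for how the retransmissions are distributed."""
    by_source, by_destination = summarize_retransmissions(retransmissions)
    if len(by_source) == 1 and len(by_destination) > 1:
        return "one source, many destinations"
    if len(by_destination) == 1:
        return "single destination"
    return "mixed"


# Made-up sample: one client retransmitting to three different web servers.
sample = [
    ("10.0.0.5", "192.0.2.10", 80),
    ("10.0.0.5", "198.51.100.7", 80),
    ("10.0.0.5", "203.0.113.20", 80),
    ("10.0.0.5", "192.0.2.10", 80),
]
print(describe_spread(sample))
```

A result of "one source, many destinations" matches the situation in this case, where the shared path to the Internet is the likely suspect.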

The problem therefore lies on the link to the Internet. How can you find out what it is?

  1. From the Statistics menu, open IO Graph.
  2. In this example, the link load is very low. That points either to a fault on the link or to another, heavily loaded link along the path.
  3. Log in to the communication device, or use an SNMP browser, to look for the cause of the retransmissions: packet loss and errors. Refer to the following screenshot:
[Screenshot: TCP Retransmission Troubleshooting 2]

Case 2: Retransmission to a single connection

If all retransmissions involve the same IP address and the same TCP port number, a slow application may be the cause. See the following screenshot:

For retransmission of a single connection, perform the following operations:

  1. Open Conversations from the Statistics menu and select Limit to display filter. You can see all conversations where retransmissions occurred. In this case, it is a single conversation.
  2. As shown in the figure below, by selecting the IPv4 tab you can see which IP address the retransmission is from:

  3. As shown in the figure below, select the TCP tab to see which port the retransmission is coming from:

To locate the problem, perform the following steps:

  1. Check the IO graph to make sure the link is not busy. A busy link shows traffic close to its bandwidth: for example, if the bandwidth is 10 Mbps and the IO graph shows traffic close to 10 Mbps, the link is highly loaded. A link that is not busy shows many ups and downs, peaks, and idle gaps in the IO graph.
  2. If the link is not busy, the server at IP address 10.1.1.200 may have a problem: 10.90.30.12 sends the vast majority of the retransmissions, so 10.1.1.200 is probably responding slowly.
  3. The packet list pane shows that the application is FTP data. The FTP server may be working in active mode, so the connection was opened on port 2350 and the server then changed the port to 1972. The retransmissions may therefore be caused by slow responses from the FTP software.
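
The busy-link check in step 1 comes down to simple arithmetic on the per-interval byte counts that an IO graph plots. The following Python sketch shows that arithmetic. The 90 percent threshold and the rule that 80 percent of intervals must exceed it are assumptions chosen for illustration, not values from Wireshark, and the function names and sample counts are made up.

```python
def utilization_percent(bytes_per_interval, interval_seconds, bandwidth_bps):
    """Convert byte counts per interval into percent of link bandwidth."""
    return [
        (count * 8 / interval_seconds) / bandwidth_bps * 100
        for count in bytes_per_interval
    ]


def link_looks_busy(bytes_per_interval, interval_seconds, bandwidth_bps,
                    busy_threshold=90.0, busy_fraction=0.8):
    """True if most intervals sit near the bandwidth limit (assumed thresholds)."""
    usage = utilization_percent(bytes_per_interval, interval_seconds, bandwidth_bps)
    if not usage:
        return False
    near_limit = sum(1 for u in usage if u >= busy_threshold)
    return near_limit / len(usage) >= busy_fraction


BANDWIDTH = 10_000_000  # 10 Mbps link, as in the example above

# Made-up one-second samples: a saturated link and a bursty, mostly idle one.
saturated = [1_200_000, 1_190_000, 1_230_000, 1_210_000, 1_220_000]
bursty = [100_000, 900_000, 0, 300_000, 50_000]

print("saturated link busy:", link_looks_busy(saturated, 1, BANDWIDTH))
print("bursty link busy:", link_looks_busy(bursty, 1, BANDWIDTH))
```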

Case 3: Retransmission patterns

An important consideration when looking at TCP retransmissions is whether you can see some pattern in the retransmissions. In the following screenshot, you can see that all the retransmissions are coming from a single connection, between the client and the server’s NetBIOS session service (TCP port 139).

It looks like a simple client/server issue, but the packet list pane shows the following:

You can see that the retransmissions occur periodically, roughly every 30 ms. The cause is that the client is running financial processes in the software, which make the software slow down every 30-36 ms.
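
A pattern like this can be spotted numerically by looking at the gaps between retransmission timestamps. The sketch below treats the retransmissions as periodic when the gaps cluster tightly around their mean. The 20 percent tolerance, the minimum of three gaps, the function names, and the sample timestamps are all assumptions for illustration.

```python
from statistics import mean, pstdev


def retransmission_intervals(timestamps):
    """Gaps in seconds between consecutive retransmission timestamps."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def looks_periodic(timestamps, max_relative_spread=0.2):
    """True if the gaps cluster tightly around their mean (assumed tolerance)."""
    gaps = retransmission_intervals(timestamps)
    if len(gaps) < 3:
        return False  # too few samples to call it a pattern
    average = mean(gaps)
    if average == 0:
        return False
    return pstdev(gaps) / average <= max_relative_spread


# Made-up timestamps in seconds: gaps of 30-34 ms versus irregular gaps.
regular = [0.000, 0.030, 0.062, 0.095, 0.126, 0.160]
irregular = [0.00, 0.01, 0.50, 0.52, 2.00]

print("regular sample periodic:", looks_periodic(regular))
print("irregular sample periodic:", looks_periodic(irregular))
```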

Case 4: Application unresponsiveness causes retransmission

Another possible cause of retransmissions is that the client or server is not responding to requests. In this case, you will see five retransmissions, with the interval between them increasing each time. After five consecutive retransmissions, the sender considers the connection broken (in some cases it sends a reset to close the connection, depending on the software implementation). Once the connection is broken, one of two things can happen:

  • A SYN request is sent to open a new connection. In this case, the user sees the application freeze and then start working again after 15-20 seconds.
  • No SYN is sent, and the user has to re-run the application (or part of it).
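
The gradually increasing intervals described above are the signature of TCP's retransmission timer backing off, which typically doubles the timeout after each unanswered retransmission. As a rough check, the sketch below tests whether each gap between transmissions of the same segment is about twice the previous one. The accepted ratio range of 1.5 to 2.5, the function names, and the sample timestamps are assumptions.

```python
def backoff_ratios(timestamps):
    """Ratio of each gap to the previous gap, for one segment's transmissions."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return [later / earlier for earlier, later in zip(gaps, gaps[1:]) if earlier > 0]


def looks_like_backoff(timestamps, low=1.5, high=2.5):
    """True if every gap is roughly double the one before it (assumed range)."""
    ratios = backoff_ratios(timestamps)
    return bool(ratios) and all(low <= r <= high for r in ratios)


# Made-up timestamps in seconds: the original send followed by five retransmissions.
unanswered = [0.0, 0.3, 0.9, 2.1, 4.5, 9.3]
# Evenly spaced retransmissions, which do not fit the backoff pattern.
evenly_spaced = [0.0, 0.03, 0.06, 0.09, 0.12]

print("unanswered peer pattern:", looks_like_backoff(unanswered))
print("evenly spaced pattern:", looks_like_backoff(evenly_spaced))
```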

The following diagram shows opening a new connection:

Case 5: Retransmission due to delay variation

TCP tolerates delay well, provided the delay stays roughly constant. When the delay varies, and in particular when it rises suddenly beyond what the sender's retransmission timer expects, retransmissions occur. To diagnose whether this is the cause:

  1. First, ping the destination address to measure the delay on the communication link.
  2. Check for delay variation, which may have the following causes:
  • An unstable or busy communication link. In this case, the latency reported by the ping command varies, usually because of low bandwidth.
  • Application overload or insufficient resources. In this case, many retransmissions occur for this application only.
  • An overloaded communication device (CPU, buffers), which adds delay. To check, connect directly to the communication device.
  3. Use Wireshark to diagnose the latency issues.
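
Why delay variation matters becomes clearer when the retransmission timeout (RTO) is computed from round-trip samples. The sketch below follows the smoothing formulas from RFC 6298 (SRTT, RTTVAR, RTO = SRTT + 4 × RTTVAR) and marks the samples that arrive later than the timeout in force at that moment. It is a simplification: a real TCP stack would not take an RTT sample from a retransmitted segment, and the 200 ms floor used in the example is an assumption (the RFC recommends 1 second; some stacks use lower values). The function name and the RTT values are made up.

```python
def rto_series(rtt_samples, alpha=1/8, beta=1/4, k=4, min_rto=1.0):
    """Return (rtt, rto_in_force, would_time_out) for each RTT sample in seconds."""
    srtt = rttvar = rto = None
    results = []
    for rtt in rtt_samples:
        # A sample slower than the current RTO would have triggered a retransmission.
        timed_out = rto is not None and rtt > rto
        results.append((rtt, rto, timed_out))
        if srtt is None:
            # First measurement initializes the estimators.
            srtt = rtt
            rttvar = rtt / 2
        else:
            rttvar = (1 - beta) * rttvar + beta * abs(srtt - rtt)
            srtt = (1 - alpha) * srtt + alpha * rtt
        rto = max(min_rto, srtt + k * rttvar)
    return results


# Made-up samples: a high but constant delay, and a low delay with one sudden spike.
constant_delay = [0.450] * 10
sudden_spike = [0.100] * 10 + [0.450]

for name, samples in (("constant", constant_delay), ("spike", sudden_spike)):
    timeouts = sum(1 for _rtt, _rto, late in rto_series(samples, min_rto=0.2) if late)
    print(f"{name}: {timeouts} sample(s) exceeded the RTO")
```

With these numbers, the constant 450 ms delay never exceeds the timeout, while a single jump from 100 ms to 450 ms does, which matches the behaviour described in this case.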

As a rule of thumb, performance degrades once retransmissions reach 0.5 percent of traffic, and disconnects occur at 5 percent. The exact thresholds depend on the application and its sensitivity to retransmissions.
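
The percentages above can be computed from two packet counts: the number of retransmitted packets and the total number of packets in the same capture. A minimal sketch, using the 0.5 and 5 percent figures from the text as default thresholds; the function names, labels, and sample counts are hypothetical.

```python
def retransmission_rate(retransmitted, total):
    """Retransmitted packets as a percentage of all packets."""
    if total <= 0:
        raise ValueError("total must be positive")
    return retransmitted / total * 100


def classify_rate(percent, degrade_at=0.5, disconnect_at=5.0):
    """Map a retransmission percentage onto the rule-of-thumb thresholds."""
    if percent >= disconnect_at:
        return "disconnects likely"
    if percent >= degrade_at:
        return "performance degraded"
    return "acceptable"


# Made-up counts: 120 retransmissions in a capture of 20,000 packets.
rate = retransmission_rate(120, 20_000)
print(f"{rate:.2f}% -> {classify_rate(rate)}")
```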

Locating TCP retransmission issues

When you see retransmissions on the communication link, perform the following steps:

  1. Locate the problem: is it a specific IP address, a specific connection, a specific application, or something else?
  2. Check whether the problem is caused by the communication link, packet loss, or a slow server or PC. Check whether the application is slow.
  3. If none of these is the cause, check for delay variation.

Working Principle of TCP Retransmissions:

For details about the TCP sequence number and acknowledgment mechanism, please refer to the previous article: Network Basics (X): A detailed explanation of TCP confirmation mechanism. So what causes retransmission? When a segment or its acknowledgment is lost, or the ACK does not arrive in time, the sender does the following two things:

  1. Sends the segment again.
  2. Reduces its throughput.
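
The second step can be illustrated with a toy model of the sender's congestion window. The sketch below counts the window in whole segments per round trip and applies the usual reaction to a retransmission timeout: remember half the current window as the slow-start threshold, then restart from one segment. It is a teaching simplification that assumes timeout-based retransmission only (fast retransmit reacts less drastically); the function name, initial values, and loss position are made up.

```python
def simulate_cwnd(rounds, loss_rounds, initial_cwnd=1, initial_ssthresh=64):
    """Return the congestion window (in segments) at each round trip."""
    cwnd, ssthresh = initial_cwnd, initial_ssthresh
    history = []
    for rnd in range(rounds):
        history.append(cwnd)
        if rnd in loss_rounds:
            ssthresh = max(cwnd // 2, 2)    # remember half the window
            cwnd = 1                        # restart from one segment
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: double per round trip
        else:
            cwnd += 1                       # congestion avoidance: grow linearly
    return history


# A timeout in round 5 throws the window back to one segment.
print(simulate_cwnd(10, {5}))
```

The window is a rough proxy for throughput, so each drop in the printed sequence corresponds to a dip in the sender's sending rate.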

For more information about TCP retransmission, please refer to the previous article: Network Basics (IX): A Detailed Explanation of TCP Retransmission.

The following diagram shows how retransmissions reduce sender throughput (thin red line):