How to Troubleshoot Network Performance Issues [Full Guide]

Prepare to Troubleshoot Network Performance Issues

It’s essential to conduct network tests to pinpoint the problem’s source and then troubleshoot network performance issues like packet loss or latency. The following steps can help distinguish between network and application issues. Benchmarking performance beforehand allows for easy comparison when you observe performance issues.

Before you start troubleshooting, check the following:

Troubleshooting Network Performance Issues on Linux

Install the following tools and follow the steps to help troubleshoot network performance problems and test your network:

Step 1. Examine traceroute or MTR reports by starting from the bottom and working upward

Begin by checking for loss or latency at the final hop or destination, then review the preceding hops. If packet loss or latency persists through the final hop, it could indicate a network or routing problem. If packet loss or latency occurs at one hop along the path, it might be due to control plane rate limiting on that node. Ensure that the last reported hop matches the intended destination in the command; if not, there may be issues caused by a restrictive security group.

Utilize the AWSSupport-SetupIPMonitoringFromVPC tool to assess performance, as it collects crucial metrics to troubleshoot network performance problems. For detailed guidance, refer to Amazon VPC’s Debugging tool for network connectivity.

If you have access to the source or destination instance, check its performance statistics on Linux: assess CPU utilization, memory utilization, and load average.
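As a minimal sketch, these checks could be gathered in one helper. It assumes the common procps tools (uptime, free, top) are installed on the instance; the function name host_stats is hypothetical and not part of any tool.

```shell
# Hypothetical helper: prints load average, memory use, and a CPU snapshot.
# Assumes uptime, free, and top are available on the instance.
host_stats() {
  uptime                      # load average over the last 1, 5, and 15 minutes
  free -m                     # memory utilization in MiB
  top -b -n 1 | head -n 15    # one batch-mode snapshot, trimmed to the first 15 lines
}
# Example: host_stats
```

Run it on both the source and the destination instance if you can reach them, so a saturated host is not mistaken for a network problem.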

Use the MTR command on Linux for continuous, real-time network performance analysis. MTR combines traceroute and ping functionalities and comes preinstalled on most Linux distributions. Alternatively, you can install it from your distribution’s software package manager.

To install MTR, use these commands:

Amazon Linux:
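A likely form of the install, assuming the package is named mtr in the enabled yum repositories. The function wrapper and its name are illustrative; the single line inside is the command itself.

```shell
# Amazon Linux: assumes the package is named "mtr".
install_mtr_amazon_linux() {
  sudo yum install -y mtr
}
# Example: install_mtr_amazon_linux
```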

Ubuntu:
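A likely form of the install, assuming the package is named mtr in the configured apt sources. The function name is illustrative.

```shell
# Ubuntu: assumes the package is named "mtr".
install_mtr_ubuntu() {
  sudo apt-get install -y mtr
}
# Example: install_mtr_ubuntu
```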

To assess your network’s performance using MTR, conduct bidirectional tests between the public IP addresses of your EC2 instances and your on-premises host. Paths on a TCP/IP network may differ when the direction is reversed, so it’s crucial to gather MTR results for both directions. Consider using TCP-based tracing instead of ICMP, as many internet devices deprioritize ICMP-based trace requests.

Examine packet loss carefully. Single-hop packet loss typically isn’t concerning and might result from a control plane policy dropping “ICMP time exceeded” messages. However, sustained packet loss to the destination hop or across multiple hops could indicate a problem.

Note: It’s common to see a few requests time out.

ICMP-based MTR:
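A minimal sketch of an ICMP report run. The cycle count of 200 is an assumed value, the function name is hypothetical, and 203.0.113.10 is a documentation address to replace with the public IP of your EC2 instance or on-premises host.

```shell
# Hypothetical wrapper; $1 is the destination IP.
# -n skips DNS lookups, -c 200 is an assumed cycle count,
# --report prints a summary and exits instead of opening the live view.
mtr_icmp_report() {
  mtr -n -c 200 --report "$1"
}
# Example: mtr_icmp_report 203.0.113.10
```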

TCP-based MTR:
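A minimal sketch of the TCP variant, under the same assumptions: 200 cycles is an assumed value, the function name is hypothetical, and 203.0.113.10 is a placeholder address.

```shell
# Hypothetical wrapper; $1 is the destination IP.
# -T switches the probes from ICMP echo to TCP SYN.
mtr_tcp_report() {
  mtr -n -T -c 200 --report "$1"
}
# Example: mtr_tcp_report 203.0.113.10
```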

The -T argument enables TCP-based MTR, and the --report option sets MTR to report mode. With the -c option, MTR runs for the specified number of cycles, prints statistics, and then exits.

Note that TCP-based MTR tests destination TCP port 80 by default. To specify a different destination TCP port, use the -P option followed by the port number. For example, to run MTR against destination TCP port 443, use the following command:
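One way this might look, with the port as an optional second argument that defaults to 443. The function name, the 200-cycle count, and the example address are assumptions for illustration.

```shell
# Hypothetical wrapper; $1 is the destination IP, $2 the TCP port (default 443).
mtr_tcp_port_report() {
  mtr -n -T -P "${2:-443}" -c 200 --report "$1"
}
# Example: mtr_tcp_port_report 203.0.113.10        (port 443)
# Example: mtr_tcp_port_report 203.0.113.10 8443   (another application port)
```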

Step 2. Test performance using traceroute

The Linux traceroute tool maps the path from a client node to a destination node, recording the response time in milliseconds for each router along the way. It also calculates the time each hop takes to reach its destination.

To install traceroute, use the following commands:

For Amazon Linux:
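A likely form of the install, assuming the package is named traceroute in the enabled yum repositories. The function name is illustrative.

```shell
# Amazon Linux: assumes the package is named "traceroute".
install_traceroute_amazon_linux() {
  sudo yum install -y traceroute
}
# Example: install_traceroute_amazon_linux
```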

Ubuntu:
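A likely form of the install, assuming the package is named traceroute in the configured apt sources. The function name is illustrative.

```shell
# Ubuntu: assumes the package is named "traceroute".
install_traceroute_ubuntu() {
  sudo apt-get install -y traceroute
}
# Example: install_traceroute_ubuntu
```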

Note: Traceroute isn’t necessary if you run an MTR report. MTR provides latency and packet loss statistics to a destination.

Ensure that port 22 or the port you’re testing is open in both directions. When troubleshooting network connectivity with traceroute, execute the command from the client to the server and from the server back to the client. Paths between nodes on a TCP/IP network can vary if the direction is reversed. Opt for a TCP-based trace (using your application port) instead of ICMP, as most internet devices deprioritize ICMP-based trace requests.

For ICMP-based traceroute:
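A minimal sketch, with one assumption worth flagging: Linux traceroute sends UDP probes by default, so this version adds -I to request ICMP echo probes, which usually requires root. The function name and the example address are placeholders.

```shell
# Hypothetical wrapper; $1 is the destination IP or hostname.
# -I selects ICMP echo probes (the Linux default is UDP).
traceroute_icmp() {
  sudo traceroute -I "$1"
}
# Example: traceroute_icmp 203.0.113.10
```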

TCP-based traceroute:
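A minimal sketch of a TCP trace, with the port as an optional second argument that defaults to 22. The function name and the example address are placeholders.

```shell
# Hypothetical wrapper; $1 is the destination, $2 the TCP port (default 22).
# -T sends TCP SYN probes, -p sets the destination port, -n skips DNS lookups.
traceroute_tcp() {
  sudo traceroute -T -p "${2:-22}" -n "$1"
}
# Example: traceroute_tcp 203.0.113.10        (port 22)
# Example: traceroute_tcp 203.0.113.10 443    (an application port)
```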

The arguments -T -p 22 -n perform a TCP-based trace on port 22; -n skips hostname resolution.

Note: You can use your application-specific port for testing. Use that port to find out whether any intermediate devices in the path are dropping your application traffic.

Step 3. Test performance using hping3

Hping3 is a versatile command-line TCP/IP packet assembler and analyzer that measures packet loss and latency over a TCP connection. It supports ICMP echo requests as well as TCP, UDP, and RAW-IP protocols. It also includes a traceroute mode and can send files over a covert channel. Hping3 is valuable for scanning hosts, assisting with penetration testing, testing intrusion detection systems, and transferring files between hosts.

Unlike MTR and traceroute, which capture per-hop latency, hping3 provides end-to-end min/avg/max latency over TCP, along with packet loss statistics. To install hping3, use these commands:

For Amazon Linux 2, install the EPEL release package for RHEL 7, then activate the EPEL repository.

Amazon Linux 2:
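A sketch of the two-step install, assuming amazon-linux-extras is available and offers an epel topic that installs the EPEL release package. The function name is illustrative.

```shell
# Amazon Linux 2: enable EPEL first, then install hping3 from it.
install_hping3_amazon_linux2() {
  sudo amazon-linux-extras install epel -y        # installs the EPEL release package
  sudo yum --enablerepo=epel install -y hping3    # pulls hping3 from the EPEL repository
}
# Example: install_hping3_amazon_linux2
```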

Ubuntu:
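A likely form of the install, assuming the package is named hping3 in the configured apt sources. The function name is illustrative.

```shell
# Ubuntu: assumes the package is named "hping3".
install_hping3_ubuntu() {
  sudo apt-get install -y hping3
}
# Example: install_hping3_ubuntu
```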

The following command sends 50 TCP SYN packets over port 0. By default, hping3 sends TCP headers to the target host’s port 0, with a window size of 64 and without a TCP flag:
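A minimal sketch of that run. The -S flag is what sets SYN on the otherwise flagless default packet. The function name and the example address are placeholders, and hping3 generally needs root to craft packets.

```shell
# Hypothetical wrapper; $1 is the destination IP or hostname.
# -S sets the SYN flag, -c 50 sends 50 packets, -V enables verbose output.
# No -p is given, so the destination port stays at the default of 0.
hping3_syn_default_port() {
  sudo hping3 -S -c 50 -V "$1"
}
# Example: hping3_syn_default_port 203.0.113.10
```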

The following command sends 50 TCP SYN packets over port 22:
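A minimal sketch of the same test against a specific port, passed as an optional second argument that defaults to 22. The function name and the example address are placeholders.

```shell
# Hypothetical wrapper; $1 is the destination, $2 the TCP port (default 22).
hping3_syn_port() {
  sudo hping3 -S -c 50 -V -p "${2:-22}" "$1"
}
# Example: hping3_syn_port 203.0.113.10        (port 22)
# Example: hping3_syn_port 203.0.113.10 443    (an application port)
```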

Note: Be sure that port 22 or the port that you’re testing is open.

Step 4. Test packet capture samples using tcpdump

When diagnosing packet loss or latency issues, it’s advisable to conduct simultaneous packet captures on both your EC2 instance and on-premises host. This approach enables the identification of request and response packets, aiding in the isolation of issues at the networking and application layers. To ensure comprehensive packet capture, it’s recommended to start the packet capture before initiating the traffic flow.

To install tcpdump, execute the following commands:

Amazon Linux:
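A likely form of the install, assuming the package is named tcpdump in the enabled yum repositories. The function name is illustrative.

```shell
# Amazon Linux: assumes the package is named "tcpdump".
install_tcpdump_amazon_linux() {
  sudo yum install -y tcpdump
}
# Example: install_tcpdump_amazon_linux
```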

Ubuntu:
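A likely form of the install, assuming the package is named tcpdump in the configured apt sources. The function name is illustrative.

```shell
# Ubuntu: assumes the package is named "tcpdump".
install_tcpdump_ubuntu() {
  sudo apt-get install -y tcpdump
}
# Example: install_tcpdump_ubuntu
```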

After tcpdump is installed, you can run the following command to capture the TCP port 22 traffic and save it in a pcap file.
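A minimal sketch of such a capture. The function name and the output file name samplecapture.pcap are assumptions; the interface defaults to eth0 and can be overridden.

```shell
# Hypothetical wrapper; $1 is the interface (default eth0),
# $2 the output file (default samplecapture.pcap, an assumed name).
# -s 0 captures full packets, -w writes raw packets to the pcap file,
# and "tcp port 22" is the capture filter.
capture_ssh_traffic() {
  sudo tcpdump -i "${1:-eth0}" -s 0 -w "${2:-samplecapture.pcap}" tcp port 22
}
# Example: capture_ssh_traffic                  (eth0, samplecapture.pcap)
# Example: capture_ssh_traffic ens5 run1.pcap   (custom interface and file)
```

Stop the capture with Ctrl+C once the test traffic has finished, then open the pcap file in an analyzer.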

Note: The tcpdump flag -i specifies the interface on the instance where tcpdump captures the traffic. You might need to change the interface from eth0 to the configured interface in your environment.

How to Troubleshoot Network Performance on Windows

Step 1. Check for ECN capability

1.    Run the following command to determine if Explicit Congestion Notification (ECN) capability is turned on: netsh interface tcp show global

2.    If ECN capability is activated, run the following command to deactivate it: netsh interface tcp set global ecncapability=disabled

3.    If you don’t see an improvement in performance, you can re-activate ECN capability using the following command: netsh interface tcp set global ecncapability=enabled

Step 2. Review hops and troubleshoot TCP port connectivity

First, use MTR or tracert to review hops:

MTR method:

1.    Download and install WinMTR.

2.    Enter the destination IP in the Host section, and then choose Start.

3.    Let the test run for a minute, and then choose Stop.

4.    Choose Copy text to clipboard and paste the output in a text file.

5.    Look for any losses in the % column that are propagated to the destination.

Note: Ignore any hops with the No response from host message. This message indicates that those particular hops aren’t responding to the ICMP probes.

6.    Review hops on the MTR reports using a bottom-up approach. For example, check for loss on the last hop or destination, and then review the preceding hops.

Tracert method:

If you don’t want to install MTR, you can use the tracert command-line utility.

1.    Perform a tracert to the destination URL or IP address, for example: tracert -d hostname or tracert -d ip.

2.    Look for any hop that shows an abrupt spike in round-trip time (RTT). An abrupt spike in RTT might indicate that there’s a node under high load, which in turn induces latency or packet drops in your traffic.

Note: The -d option doesn’t resolve IP addresses to hostnames. Remove -d if IP to hostname resolution is required.

Then, check TCP port connectivity.

Note: Because WinMTR and tracert are both ICMP-based, you can use tracetcp to troubleshoot TCP port connectivity.

1.    Download WinPcap and tracetcp.

2.    Extract the tracetcp ZIP file.

3.    Copy tracetcp.exe to your C drive.

4.    Install WinPcap.

5.    Open the command prompt and change to the root of your C drive using the cd \ command (C:\Users\username>cd \).

6.    Run tracetcp using one of the following commands: tracetcp.exe hostname:port or tracetcp.exe ip:port.

Step 3. Check the Windows Task Manager

If you can access the source or destination instance, check the Windows Task Manager for any issues related to CPU and memory utilization or load average.

Step 4. Take a packet capture

Note: When diagnosing packet loss or latency issues, it’s recommended to perform simultaneous packet captures on both your EC2 instance and your on-premises host. This allows you to identify request and response packets, isolating the issue at the networking and application layers. It’s also advisable to start the packet capture before initiating traffic to ensure all packets are captured for the flow.

  1. Install Wireshark and initiate a packet capture.
  2. Apply the following filter to isolate traffic between specific sources: (ip.addr eq source_IP) && (tcp.flags.syn == 1). This will display all TCP streams initiated by the specified source IP.
  3. Select the row corresponding to the relevant source and destination IP.
  4. Right-click and choose “Follow” > “TCP Stream” from the context menu. This will show the TCP flow between the source and destination IPs for investigation.
  5. Look for retransmissions, duplicate packets, or TCP window size notifications such as “TCP window full” or “Window size zero.” These indications may suggest that TCP buffers are running out of space.

If packet loss is detected or if the number of hops changes significantly from your benchmarks, consult your networking equipment vendor documentation. In multi-homed network environments, conduct these tests using a different ISP.