When all else fails, your network baseline can be one of the most crucial pieces of data you have when troubleshooting slowness on the network. For our purposes, a network baseline consists of a sample of traffic from various points on the network that includes a large chunk of what we would consider “normal” network traffic. The goal of the network baseline is to serve as a basis of comparison when the network or devices on it are not acting correctly.
For example, consider a scenario in which several clients on the network complain of slowness when logging in to a local web application server. If you were to capture this traffic and compare it to a network baseline, you might find that the web server is responding normally but that the external DNS requests resulting from external content embedded into the web application are running twice as slowly as normal.
You might have noticed the slow external DNS server without the aid of a network baseline, but when you are dealing with subtle changes, that may not be the case. Ten DNS queries taking 0.1 seconds longer than normal to process is just as bad as one DNS query taking 1 full second longer than normal, but the former is much harder to detect without a network baseline.
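To make the arithmetic concrete, here is a minimal Python sketch that compares two sets of DNS response times against a baseline. All timing values are invented for illustration, and `total_added_delay` is a hypothetical helper, not a feature of any capture tool.

```python
from statistics import mean


def total_added_delay(baseline_times, current_times):
    """Return the total extra seconds the current samples took,
    measured against the mean of the baseline samples."""
    baseline_mean = mean(baseline_times)
    return sum(t - baseline_mean for t in current_times)


# Hypothetical DNS response times, in seconds.
baseline = [0.05] * 10
ten_slightly_slow = [0.15] * 10        # every query 0.1 s slower than normal
one_very_slow = [0.05] * 9 + [1.05]    # a single query 1 s slower than normal

# Both scenarios add up to about one second of extra waiting.
print(round(total_added_delay(baseline, ten_slightly_slow), 3))
print(round(total_added_delay(baseline, one_very_slow), 3))
```

The single slow query stands out in any capture, while the ten slightly slow ones only become visible once you know what the normal response time is.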
Because no two networks are alike, the components of a network baseline can vary drastically. The following sections provide examples of the components of a network baseline. You may find that all of these items apply to your network infrastructure or that very few of them do. Regardless, you should be able to place each component of your baseline inside one of three basic baseline categories: site, host, and application.
Site Baseline
The purpose of the site baseline is to gain an overall snapshot of the traffic at each physical site on your network. Ideally, this would be every segment of the WAN.
Components of this baseline might include the following:
1. Protocols in use
Use the Protocol Hierarchy Statistics window while capturing traffic at the network edge (router/firewall), so that you can see traffic from all of the devices on the network segment. Later, you can compare against this to find out if normally present protocols are missing or if new protocols have introduced themselves on the network. You can also use this to spot higher-than-normal amounts of certain types of traffic based on protocol.
2. Broadcast traffic
This includes all broadcast traffic on the network segment. Sniffing at any point within the site should let you capture all of the broadcast traffic, allowing you to know who or what normally sends a lot of broadcast traffic out on the network, so you can quickly determine whether you have too much (or not enough) broadcasting going on.
3. Authentication sequences
These include traffic from authentication processes on random clients to all services, such as Active Directory, web applications, and organization-specific software. Authentication is one area where services are commonly slow. The baseline allows you to determine if authentication is to blame for slow communications.
4. Data-transfer rate
This usually consists of a measure of a large data transfer from the site to various other sites in the network. You can use the capture summary and graphing features of Unicorn to determine the transfer rate and consistency of the connection. This is probably the most important site baseline you can have. Whenever any connection entering or leaving the network segment seems slow, you can perform the same data transfer as in your baseline and compare the results. This will tell you if the connection is actually slow and possibly even help you find the area in which the slowness begins.
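One way to put the protocols-in-use component to work is sketched below in Python. It assumes you have copied per-protocol packet counts from a baseline capture and from a current capture into dictionaries by hand. The protocol names, the counts, the function name `compare_protocols`, and the default threshold of 2 are all illustrative assumptions.

```python
def compare_protocols(baseline, current, ratio=2.0):
    """Compare per-protocol packet counts from two captures.

    Returns (missing, new, elevated). `elevated` lists protocols whose
    share of all packets grew by at least `ratio` times. Shares are used
    instead of raw counts because the two captures may differ in length.
    """
    missing = sorted(set(baseline) - set(current))
    new = sorted(set(current) - set(baseline))

    base_total = sum(baseline.values())
    cur_total = sum(current.values())
    elevated = []
    if base_total == 0 or cur_total == 0:
        return missing, new, elevated

    for proto in sorted(set(baseline) & set(current)):
        base_share = baseline[proto] / base_total
        cur_share = current[proto] / cur_total
        if base_share > 0 and cur_share / base_share >= ratio:
            elevated.append(proto)
    return missing, new, elevated


# Hypothetical packet counts per protocol.
baseline_counts = {"tcp": 7000, "udp": 2000, "dns": 500, "arp": 300, "smb": 200}
current_counts = {"tcp": 6000, "udp": 2000, "dns": 400, "arp": 1500, "bittorrent": 100}

missing, new, elevated = compare_protocols(baseline_counts, current_counts)
print("missing:", missing)
print("new:", new)
print("elevated:", elevated)
```

With these sample numbers, SMB has disappeared, BitTorrent has appeared, and ARP takes up five times its usual share of the traffic. Each of these would be worth a closer look in the capture itself.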
Host Baseline
Having a host baseline doesn’t mean that you must baseline every single host within your network. The host baseline should be performed on only high-traffic or mission-critical servers. Basically, if a slow server will result in angry phone calls from management, you should have a baseline of that host.
Components of the host baseline include the following:
1. Protocols in use
This baseline provides a good opportunity to use the Protocol Hierarchy Statistics window while capturing traffic from the host. Later, you can compare against this to find out if normally present protocols are missing or if new protocols have introduced themselves on the host. You can also use this to spot higher-than-normal amounts of certain types of traffic based on protocol.
2. Idle/busy traffic
This baseline simply consists of general captures of normal operating traffic during peak and off-peak times. Knowing the number of connections and amount of bandwidth used by those connections at different points of the day will allow you to determine if slowness is a result of user load or another issue.
3. Startup/shutdown
In order to obtain this baseline, you will need to create a capture of the traffic generated during the startup and shutdown sequences of the host. If the computer refuses to boot, refuses to shut down, or is abnormally slow during either sequence, you can use this to determine if the cause is network-related.
4. Authentication sequences
This baseline requires capturing traffic from authentication processes to all services on the host. Authentication is one area where services are commonly slow. The baseline allows you to determine if authentication is to blame for slow communications.
5. Associations/dependencies
This baseline consists of a longer-duration capture to determine which other hosts this host depends on and which hosts depend on it. You can use the Conversations window to see these associations and dependencies. An example of this is a SQL Server host on which a web server depends. We are not always aware of the underlying dependencies between hosts, so the host baseline can be used to determine these. From there, you can determine if a host is not functioning properly due to a malfunctioning or high-latency dependency.
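The associations/dependencies component can be checked with a short script once the conversation list has been written down. The Python sketch below assumes each conversation is recorded as a hypothetical `(address_a, address_b, packets)` tuple; the addresses, packet counts, and the helper name `peers_of` are invented for illustration.

```python
from collections import Counter


def peers_of(host, conversations):
    """Count the packets exchanged between `host` and each of its peers."""
    peers = Counter()
    for addr_a, addr_b, packets in conversations:
        if addr_a == host:
            peers[addr_b] += packets
        elif addr_b == host:
            peers[addr_a] += packets
    return peers


# Hypothetical conversations: 10.0.0.5 is a web server, 10.0.0.20 its database.
baseline_convs = [
    ("10.0.0.5", "10.0.0.20", 1200),
    ("10.0.0.30", "10.0.0.5", 300),
    ("10.0.0.40", "10.0.0.41", 50),   # unrelated to the web server
]
current_convs = [
    ("10.0.0.5", "10.0.0.30", 280),
]

web_server = "10.0.0.5"
baseline_peers = peers_of(web_server, baseline_convs)
current_peers = peers_of(web_server, current_convs)

# Peers seen in the baseline but absent now are candidate failed dependencies.
missing_peers = sorted(set(baseline_peers) - set(current_peers))
print(missing_peers)
```

In this sample, the web server no longer talks to the database host at all, which points to the dependency rather than to the web server itself.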
Application Baseline
The final network baseline category is the application baseline. This baseline should be performed on all business-critical network-based applications.
The following are the components of the application baseline:
1. Protocols in use
Again, for this baseline, use the Protocol Hierarchy Statistics window in Unicorn, this time while capturing traffic from the host running the application. Later, you can compare against this list to find out if protocols that the application depends on are functioning incorrectly or not at all.
2. Startup/shutdown
This baseline includes a capture of the traffic generated during the startup and shutdown sequences of the application. If the application refuses to start or is abnormally slow during either sequence, you can use this to determine the cause.
3. Associations/dependencies
This baseline requires a longer duration capture in which the Conversations window can be used to determine the other hosts and applications on which this application depends. We are not always aware of the underlying dependencies between applications, so this baseline can be used to determine those. From there, you can determine if an application is not functioning properly due to a malfunctioning or high-latency dependency.
4. Data-transfer rate
You can use the capture summary and graphing features of Unicorn to determine the transfer rate and consistency of the connections to the application server during its normal operation to create this baseline. Whenever the application is reported as being slow, you can use this baseline to determine if the issues being experienced are a result of high utilization or a high user load.
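The data-transfer rate comparison comes down to simple arithmetic on two numbers from a capture: the bytes transferred and the time the transfer took. The following Python sketch shows that calculation. The byte counts, durations, function names, and the 20 percent tolerance are assumptions chosen for illustration, not recommended values.

```python
def throughput_mbps(total_bytes, duration_seconds):
    """Convert a byte count and a duration into megabits per second."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return total_bytes * 8 / duration_seconds / 1_000_000


def is_slower_than_baseline(current_mbps, baseline_mbps, tolerance=0.2):
    """Return True if the current rate falls more than `tolerance`
    (a fraction of the baseline rate) below the baseline."""
    return current_mbps < baseline_mbps * (1 - tolerance)


# Hypothetical figures: the same 150 MB transfer, measured twice.
baseline_rate = throughput_mbps(150_000_000, 12)   # baseline capture
current_rate = throughput_mbps(150_000_000, 20)    # capture taken during the complaint

print(baseline_rate, current_rate)
print(is_slower_than_baseline(current_rate, baseline_rate))
```

Repeating the same transfer that was used for the baseline keeps the comparison fair, since different file sizes and protocols can produce different rates on a perfectly healthy connection.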
Additional Notes on Baselines
Here are a few more points to keep in mind when creating your network baseline:
When creating your baselines, do each one at least three times: once during a low-traffic time (early morning), once during a high-traffic time (mid-afternoon), and once during a no-traffic time (late night).
When possible, avoid capturing directly from the hosts you are baselining. During periods of high traffic, this may put an increased load on the device, hurt its performance, and cause your baseline to be invalid due to dropped packets.
Your baseline will contain some very intimate information about your network, so be sure to secure it. Store it in a safe place where only the appropriate individuals have access. But at the same time, keep it close so that it remains functional for you. Consider keeping it on a USB flash drive or on an encrypted partition.
Keep all .pcap files associated with your baseline and create a “cheat sheet” of the more commonly referenced values, such as associations or average data-transfer rates.
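A cheat sheet can be as simple as a small structured file stored next to the capture files. The Python sketch below builds one as JSON from three baseline runs (low, high, and no traffic). The file names, byte counts, durations, and the helper `build_cheat_sheet` are hypothetical placeholders for your own values.

```python
import json


def build_cheat_sheet(samples):
    """Summarize baseline runs as {period: {"pcap": ..., "avg_mbps": ...}}.

    `samples` maps a period name to a dict with the keys
    "pcap" (file name), "bytes", and "seconds".
    """
    sheet = {}
    for period, sample in samples.items():
        mbps = sample["bytes"] * 8 / sample["seconds"] / 1_000_000
        sheet[period] = {"pcap": sample["pcap"], "avg_mbps": round(mbps, 2)}
    return sheet


# Hypothetical 60-second baseline captures taken at three times of day.
samples = {
    "low_traffic": {"pcap": "site_early_morning.pcap", "bytes": 30_000_000, "seconds": 60},
    "high_traffic": {"pcap": "site_mid_afternoon.pcap", "bytes": 450_000_000, "seconds": 60},
    "no_traffic": {"pcap": "site_late_night.pcap", "bytes": 750_000, "seconds": 60},
}

cheat_sheet = build_cheat_sheet(samples)
print(json.dumps(cheat_sheet, indent=2))
```

Because the cheat sheet describes your network as much as the captures do, it belongs in the same secured location as the .pcap files.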
Final Thoughts
This chapter has focused on troubleshooting slow networks. We’ve covered some of the more useful reliability detection and recovery features of TCP, demonstrated how to locate the source of high latency in network communications, and discussed the importance of a network baseline and some of its components. Using the techniques discussed here, along with some of Unicorn’s graphing and analysis features (as discussed in Chapter 5), you should be well equipped to troubleshoot when you get that call complaining that the network is slow.