What is Network Quality Analysis for Fault Detection

1. What is Network Quality Analysis (NQA)

Network Quality Analysis (NQA) is a real-time network performance detection and statistics technology that collects statistics on network indicators such as response time, jitter, and packet loss rate. NQA can monitor network service quality in real time and help diagnose and locate faults when a network failure occurs.

2. Why Network Quality Analysis (NQA) is needed

As operators roll out more value-added services, both users and operators place ever higher demands on QoS (Quality of Service). Especially now that traditional IP networks carry voice and video services, it has become common for operators and customers to sign an SLA (Service Level Agreement).

To let users see whether the promised bandwidth meets their needs, operators need to provide statistics such as latency, jitter, and packet loss rate so that network performance can be understood in a timely manner. Traditional network performance analysis methods (such as Ping and Tracert) can no longer meet users’ requirements for service diversity and real-time monitoring.

Network Quality Analysis (NQA) can accurately test the network operating status and output statistics. It can monitor the performance of multiple protocols running on the network, allowing operators to collect indicators in real time, such as total HTTP delay, TCP connection delay, DNS resolution delay, file transfer rate, FTP connection delay, and DNS resolution error rate. By tracking these indicators, operators can provide users with different levels of network service. NQA is also an effective tool for network fault diagnosis and location.

3. How NQA works

NQA Client and Server

In Network Quality Analysis (NQA) testing, the two ends of the test are called the client and the server (or the source and the destination). The NQA test is initiated by the client (source). The client constructs a message that complies with the corresponding protocol according to the test type of the test case, adds a timestamp, and then sends it to the server.

The NQA server listens on the specified IP address and port number, processes the test messages sent by the NQA client, and responds to them. The client calculates performance indicators such as connectivity, delay, and packet loss rate based on the messages it sent and received.
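The client-side calculation can be sketched in a few lines of Python. This is only an illustration, not how any particular device implements NQA: the function name `summarize` and the jitter definition (mean absolute difference between consecutive round-trip times) are assumptions made for this example.

```python
from statistics import mean


def summarize(rtts_ms):
    """Summarize one test run.

    rtts_ms holds one entry per probe sent: the round-trip time in
    milliseconds, or None if no reply arrived before the timeout.
    """
    sent = len(rtts_ms)
    replies = [r for r in rtts_ms if r is not None]
    loss_rate = (sent - len(replies)) / sent if sent else 0.0
    # Assumed jitter definition: mean absolute difference between
    # consecutive replies. Real devices may define it differently.
    deltas = [abs(b - a) for a, b in zip(replies, replies[1:])]
    return {
        "sent": sent,
        "received": len(replies),
        "loss_rate": loss_rate,
        "avg_rtt_ms": mean(replies) if replies else None,
        "jitter_ms": mean(deltas) if deltas else 0.0,
    }
```

For example, `summarize([10.0, 12.0, None, 11.0])` describes four probes of which one was lost.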

4. NQA test case processing mechanism

  • ICMP Test

ICMP testing determines the reachability of the destination and calculates the network response time and packet loss rate by sending ICMP messages.

The source sends a constructed ICMP Echo Request message to the destination. After receiving the message, the destination directly responds with an ICMP Echo Reply message to the source.

After receiving the reply, the source calculates the round-trip communication time between the source and the destination as the difference between the time it received the reply and the time it sent the request. This reflects both the performance and the connectivity of the network.
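As a rough sketch of what "constructing" the request involves, the Python below builds an ICMP Echo Request (type 8, code 0) with the standard Internet checksum and computes a round-trip time from two timestamps. The helper names are made up for this example, and actually sending the packet would require a raw socket and elevated privileges, which is left out here.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by ICMP (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes) -> bytes:
    """Build an ICMP Echo Request; the checksum covers header and payload."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)  # checksum field = 0
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, ident, seq) + payload


def rtt_ms(sent_at: float, received_at: float) -> float:
    """Round-trip time in milliseconds from two timestamps in seconds."""
    return (received_at - sent_at) * 1000.0
```

A receiver can validate such a message by running `internet_checksum` over the whole packet, which yields 0 when the packet is intact.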

  • Trace Test

The trace test is used to detect the forwarding path from the source to the destination and record information such as the delay from the source device to each intermediate device along the path.

The process of trace testing is as follows:

  1. The client sends a constructed UDP message to the destination with the TTL set to 1.
  2. The first hop decrements the TTL to 0, discards the message, and returns an ICMP Time Exceeded message.
  3. After receiving the ICMP Time Exceeded message, the client records the IP address of the first-hop device and constructs a new UDP message with a TTL of 2.
  4. The second hop likewise finds that the TTL has expired, discards the message, and returns an ICMP Time Exceeded message.
  5. This continues until the message reaches the destination device, which returns an ICMP Port Unreachable message to the client.

After receiving the ICMP message returned by each hop, the client compiles and prints the forwarding path from the client to the destination, together with information about each device on the path, which reflects the state of the network.
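The TTL-stepping logic above can be modeled without sending any packets. The sketch below is a pure simulation over a known list of hops: the function name, the `max_ttl` parameter, and the addresses used in the usage note are all hypothetical.

```python
def simulate_trace(path, max_ttl=30):
    """Model the trace loop over a known path.

    path lists the devices between client and destination in order,
    with the destination last. Returns (ttl, responder, reply_type)
    tuples in the order the client would record them.
    """
    hops = []
    for ttl in range(1, min(max_ttl, len(path)) + 1):
        responder = path[ttl - 1]  # device that a probe with this TTL reaches
        if ttl == len(path):
            # The destination answers the UDP probe with Port Unreachable.
            hops.append((ttl, responder, "port-unreachable"))
        else:
            # An intermediate hop drops the probe and reports TTL expiry.
            hops.append((ttl, responder, "time-exceeded"))
    return hops
```

Calling `simulate_trace(["10.0.0.1", "10.0.1.1", "10.0.2.2"])` yields two time-exceeded entries followed by one port-unreachable entry, mirroring steps 1 to 5.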

  • TCP Test

The TCP test measures how quickly a TCP connection can be established between the client and the TCP server through the three-way handshake.

The client sends a TCP SYN message, receives the TCP SYN-ACK message from the TCP server, and replies with an ACK message. It calculates the connection setup time as the difference between the time it received the SYN-ACK message and the time it sent the SYN message, which reflects the performance of TCP in the network.
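From an ordinary host, a similar measurement can be approximated by timing a socket connect call, which returns once the handshake has completed. This is a minimal sketch using Python's standard `socket` module, not the mechanism a router uses internally. The function name and the default timeout are assumptions.

```python
import socket
import time


def tcp_connect_time(host, port, timeout=3.0):
    """Return the seconds taken to establish a TCP connection, or None on failure."""
    start = time.perf_counter()
    try:
        # create_connection() returns only after the three-way handshake succeeds.
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return None
```

A `None` result corresponds to a failed connectivity test. A numeric result is the connection setup delay as seen by the application.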

  • UDP Test

Many services in the network are carried over UDP. When service quality deteriorates, it is hard to tell whether the problem lies in the service itself or in the performance of the UDP bearer. NQA’s UDP test can be used to detect performance problems with the UDP bearer.

The source sends a constructed UDP message to the destination, and the destination responds to the source. After receiving the reply, the source calculates the communication time between the source and the destination as the difference between the time it received the reply and the time it sent the message. This reflects the performance of UDP in the network.
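One way to picture the timestamp handling is a probe payload that carries its own sequence number and send time. The layout below (`PROBE_FORMAT`) is invented for this example and is not the NQA wire format. The sketch also assumes the server echoes the payload back unchanged.

```python
import struct
import time

# Hypothetical layout: 4-byte sequence number + 8-byte send time (seconds).
PROBE_FORMAT = "!Id"
PROBE_SIZE = struct.calcsize(PROBE_FORMAT)


def build_probe(seq, sent_at=None):
    """Pack a probe payload stamped with the send time."""
    if sent_at is None:
        sent_at = time.monotonic()
    return struct.pack(PROBE_FORMAT, seq, sent_at)


def rtt_from_reply(reply, received_at=None):
    """Return (seq, rtt_seconds) for a probe echoed back by the server."""
    if received_at is None:
        received_at = time.monotonic()
    seq, sent_at = struct.unpack(PROBE_FORMAT, reply[:PROBE_SIZE])
    return seq, received_at - sent_at
```

Because both timestamps are taken on the source, the two devices do not need synchronized clocks for a round-trip measurement.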

  • DNS Test

The DNS test uses UDP packets as the carrier and simulates a DNS client that sends a domain name resolution request to the specified DNS server. Whether the DNS server is available, and how fast it resolves names, is determined from whether the resolution succeeds and how long it takes.
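To make "simulating a DNS client" concrete, the sketch below builds a minimal A-record query in the standard DNS wire format (RFC 1035) and checks the header of a reply for success. The function names are made up for this example. Sending the query over UDP port 53 and timing the reply are omitted.

```python
import struct


def build_dns_query(name: str, query_id: int) -> bytes:
    """Build a minimal DNS query for an A record."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def reply_is_success(reply: bytes, query_id: int) -> bool:
    """True if the reply matches the query, has RCODE 0, and carries an answer."""
    if len(reply) < 12:
        return False
    rid, flags, _qdcount, ancount = struct.unpack("!HHHH", reply[:8])
    is_response = bool(flags & 0x8000)   # QR bit
    rcode = flags & 0x000F               # 0 means no error
    return rid == query_id and is_response and rcode == 0 and ancount > 0
```

A test would record the time before sending the query and after receiving a reply for which `reply_is_success` returns true.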

  • FTP Test

The FTP test uses TCP packets as the carrier to detect whether a connection can be established with a specified FTP server and the speed of downloading or uploading a specified file from or to the FTP server.

  • HTTP Test

The HTTP test mainly checks whether the client can establish a connection with the specified HTTP server, which determines whether the device provides HTTP service and how long the connection takes to establish.

  • SNMP Test

The SNMP test is mainly used to detect the connectivity and communication speed of the SNMP protocol between the host and the SNMP Agent, using UDP packets as the carrier.

The source sends a constructed request message to the SNMP Agent, and the SNMP Agent responds to the source. After receiving the reply, the source calculates the communication time between the source and the SNMP Agent as the difference between the time it received the reply and the time it sent the request. This reflects the performance of SNMP in the network.

  • LSP Ping Test

The LSP Ping test is used to detect whether two types of LSP paths (LDP and TE) are reachable.

The source end first constructs an MPLS Echo Request message, fills in an address from the 127.0.0.0/8 network segment as the destination address in the IP header, looks up the corresponding LSP according to the configured peer LSR ID, and forwards the message within the MPLS domain along that LSP. The destination end listens on UDP port 3503 and sends an MPLS Echo Reply message.

The source end derives the test results from the reply messages it receives. It calculates the communication time between the source end and the destination end as the difference between the time it received the reply and the time it sent the request, which reflects the connectivity of the LSP.

  • LSP Trace Test

The LSP Trace test is used to detect two types of LSP forwarding paths (LDP and TE) and collect relevant statistics of each device along the path.

The source end first constructs a UDP MPLS Echo Request message, fills in an address from the 127.0.0.0/8 network segment as the destination address in the IP header, and looks up the corresponding LSP. The MPLS Echo Request message should contain the Downstream Mapping TLV (used to carry the downstream information of the LSP at the current node, mainly including the next hop address, outgoing label, etc.).

The TTL of the first MPLS Echo Request message is 1. The message is forwarded within the MPLS domain along the specified LSP, and the node where the TTL expires returns an MPLS Echo Reply message. The source end continues to send MPLS Echo Request messages with increasing TTLs and repeats this process until every LSR on the LSP has responded, at which point the LSP Trace test is complete.

After receiving the reply from each LSR, the source end compiles the LSP forwarding path from the source end to the destination end, together with the relevant information about each device on the path.

  • PWE3 Ping Test

The PWE3 (Pseudo-Wire Emulation Edge to Edge) Ping test is used to check whether the PW path based on MPLS forwarding is reachable.

The source sends an MPLS Echo Request message and forwards it through the PW. When the message reaches the remote PE, the remote PE returns an MPLS Echo Reply message. The source derives the test results from the replies it receives and calculates the communication time between the source and the destination as the difference between the time it received the reply and the time it sent the request, which reflects the connectivity of the PW path.

  • PWE3 Trace Test

The PWE3 Trace test is used to detect the MPLS-based PW forwarding path and collect relevant statistical information on each device along the path.

PWE3 Trace works by having the source end send successive MPLS Echo Request messages with TTL values increasing from 1 to a certain value, so that each node on the path returns an MPLS Echo Reply message when the TTL expires. The source end can thus collect information about each node on the PW, which reveals the PW forwarding path from the source end to the destination end as well as the relevant statistics of each device on the path.

5. Typical Applications of NQA

5.1. Static Routing and NQA Association

Static routing itself has no detection mechanism. If a link that is not directly connected to the local device fails, the static route is not automatically removed from the IP routing table, and administrator intervention is required. As a result, traffic cannot be switched to another link in time, which may cause long service interruptions.

An effective way to monitor the link used by a static route is therefore needed. The existing linkage between static routes and BFD cannot be used in some scenarios, because it requires both interconnected devices to support BFD. The linkage between static routes and Network Quality Analysis (NQA), by contrast, only requires one of the interconnected devices to support NQA.


Static routing and NQA linkage network

An NQA test instance monitors the status of the link used by the static route. Based on the NQA test results, the device determines whether the static route is active, which avoids communication interruption or service quality degradation.

Take the above figure as an example. There are two links, primary and backup, from RouterA to RouterD. RouterA acts as an NQA client to detect the link status to RouterD:

  • If the NQA test instance detects a primary link failure, RouterA sets the static route to the “inactive” state.
  • If the NQA test instance detects that the primary link is restored, RouterA sets the static route to the “active” state.
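The linkage behavior in the two cases above can be modeled as a small state holder. This is a conceptual sketch only, not device configuration: the class, the function, and the prefix and next-hop addresses in the usage note are hypothetical. It also assumes that a single test result flips the route state, whereas a real device may require several consecutive failures.

```python
class TrackedStaticRoute:
    """A static route whose state follows the result of an NQA test instance."""

    def __init__(self, prefix, next_hop):
        self.prefix = prefix
        self.next_hop = next_hop
        self.active = True

    def on_nqa_result(self, success):
        # Failure marks the route inactive; recovery marks it active again.
        self.active = bool(success)
        return self.active


def select_route(routes):
    """Return the first active route; routes are ordered primary first."""
    for route in routes:
        if route.active:
            return route
    return None
```

With a primary route via `10.1.1.2` and a backup via `10.2.2.2`, a failed test result on the primary makes `select_route` return the backup, and a later successful result switches traffic back to the primary.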