Understanding TTL (Time to Live) in Network Troubleshooting and Diagnostics

Overview

When it comes to network troubleshooting, Time to Live (TTL) is a crucial factor to consider. But what exactly is Time to Live, and why is it so significant? In networking, Time to Live manages the number of hops a data packet can take across routers before being discarded. This mechanism prevents data loops and assists in diagnosing network issues through tools like traceroute, which map paths and detect anomalies such as firewalls or routing changes. Understanding Time to Live’s role in network troubleshooting can provide valuable insights into network performance and stability.

What Is Time to Live (TTL) and Its Role in Networking?

TTL is defined in the IPv4 protocol as follows:

[Figure: IP protocol header]

The RFC specifies an 8-bit field, that is, a value from 0 to 255.

In practice, TTL counts the number of router hops on a packet’s path through the network.
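To make the 8-bit field concrete, here is a minimal Python sketch that reads the TTL out of a raw IPv4 header, where it is the single byte at offset 8. The function name parse_ttl and the hand-built sample header are my own illustrations, not taken from any capture in this article.

```python
import struct


def parse_ttl(ipv4_header: bytes) -> int:
    """Return the TTL from a raw IPv4 header (the single byte at offset 8)."""
    if len(ipv4_header) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    if ipv4_header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    return ipv4_header[8]


# Hand-built 20-byte header, for illustration only (checksum is not computed).
sample_header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45,                  # version 4, header length of 5 32-bit words
    0,                     # type of service
    40,                    # total length
    0x1234,                # identification
    0,                     # flags and fragment offset
    64,                    # TTL
    6,                     # protocol (6 = TCP)
    0,                     # header checksum (left as 0 in this sketch)
    bytes([10, 0, 0, 1]),  # source address (made up)
    bytes([10, 0, 0, 2]),  # destination address (made up)
)

print(parse_ttl(sample_header))
```

Because the field is one byte, it can never hold anything outside 0-255, which is exactly the range mentioned above.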

The Time-to-Live field, or TTL, sets an upper limit on the number of routers through which a datagram can pass. It is initialized by the sender to some value (64 is recommended [RFC1122], although 128 or 255 is not uncommon) and decremented by 1 by every router that forwards the datagram. When this field reaches 0, the datagram is thrown away, and the sender is notified with an ICMP message (see Chapter 8). This prevents packets from getting caught in the network forever should an unwanted routing loop occur.

The above words can be summarized as follows:

1. Every time a data packet passes through a router, the TTL value decreases by 1.

2. When the TTL value reaches 0, the router holding the packet discards it and returns an ICMP message to inform the sender that the TTL is exhausted.

3. The initial TTL value is usually 64, 128, or 255.

4. The purpose of TTL is to prevent data packets from entering an infinite loop and wasting network bandwidth.
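The four points above can be sketched as a tiny simulation. This is only a model under simple assumptions: the function name forward is hypothetical, every router decrements by exactly 1, and a routing loop is represented by a very long path.

```python
def forward(initial_ttl: int, routers_on_path: int):
    """Simulate a packet crossing routers_on_path routers before its destination.

    Returns ("delivered", remaining_ttl) or ("dropped", hop_number).
    """
    if not 1 <= initial_ttl <= 255:
        raise ValueError("TTL is an 8-bit field; a sender uses 1-255")
    ttl = initial_ttl
    for hop in range(1, routers_on_path + 1):
        ttl -= 1                      # rule 1: each router decrements TTL by 1
        if ttl == 0:                  # rule 2: this router drops the packet
            return ("dropped", hop)   # ...and would send ICMP Time Exceeded
    return ("delivered", ttl)


print(forward(64, 10))     # a normal path: the packet arrives
print(forward(3, 10))      # TTL too small: dropped partway
print(forward(64, 10000))  # a loop modeled as a very long path
```

The last call shows point 4: even if the path never ends, the packet is gone after at most 64 routers.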

But you may have this question: TTL, that is, Time to Live, looks like a time limit measured in units of time, so why is it actually the maximum number of routers a packet is allowed to pass through?

The TTL field was originally specified to be the maximum lifetime of an IP datagram in seconds, but routers were also always required to decrement the value by at least 1. Because virtually no routers today hold on to a datagram longer than 1s under normal operation, the earlier rule is now ignored or forgotten, and in IPv6 the field has been renamed to its de facto use: Hop Limit.

Oh, I see. TTL did indicate time at first, but in practice it turned out that under normal conditions a router does not hold a data packet for more than 1 second, so the meaning of TTL changed. IPv6 drops the pretense and simply calls the field Hop Limit.

So yes, what TTL really means is Hop Limit.

Usage 1: Using Time to Live for Operating System Fingerprinting

The default TTL value differs across operating systems: 128 for Windows and 64 for Unix-like systems.

Therefore, we can infer the operating system of the other end from the TTL in the captured packets.

This should be a Unix-like operating system.

[Figure: 3-way handshake]

This should be a Windows operating system.
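The guess above can be written down as a small heuristic. This is a rough sketch, not a real fingerprinting tool: it assumes the sender used one of the common initial values (64, 128, 255) and that the path is shorter than the gap between them. The names guess_os and COMMON_INITIAL_TTLS are my own.

```python
COMMON_INITIAL_TTLS = (64, 128, 255)

# Mapping taken from the defaults discussed above; 255 is left unlabelled.
OS_FAMILY = {64: "Unix-like", 128: "Windows", 255: "unknown"}


def guess_os(observed_ttl: int):
    """Guess (os_family, hops_travelled) from the TTL seen in a capture.

    Heuristic: round the observed TTL up to the nearest common initial value.
    """
    if not 1 <= observed_ttl <= 255:
        raise ValueError("observed TTL must be between 1 and 255")
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return (OS_FAMILY[initial], initial - observed_ttl)


print(guess_os(52))   # probably started at 64
print(guess_os(117))  # probably started at 128
```

Keep in mind that the default TTL can be changed by an administrator, so the result is a hint rather than proof.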

Usage 2: Identifying Packet Source Using Time to Live Values

For example, if you are given a packet capture file containing HTTP requests and responses and you open it with Wireshark, how can you determine whether the file is captured from the HTTP client or the HTTP server?

Talk is cheap, show me the Wireshark pcap.

This requires the knowledge above:

The default TTL value for different operating systems is different. Windows is 128, and Unix-like systems are 64.

It is initialized by the sender to some value (64 is recommended [RFC1122], although 128 or 255 is not uncommon).

If a packet in your capture still carries an undecremented initial value (TTL=64 or TTL=128), the capture was most likely taken on the host that sent that packet. For example:

[Figure: 3-way handshake]

Open the details of packet No. 1 and look at the TTL in the IP header: TTL=128. You can then reasonably assume that these packets were captured on the machine with src=10.5.28.229.
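The same reasoning can be sketched in a few lines of Python. The function name likely_capture_hosts is hypothetical, and in the sample data only 10.5.28.229 with TTL=128 comes from the capture above; the peer address 192.0.2.10 and its TTL of 52 are made up for illustration.

```python
COMMON_INITIAL_TTLS = (64, 128, 255)


def likely_capture_hosts(packets):
    """Return source IPs whose packets still carry an undecremented TTL.

    packets is an iterable of (src_ip, ttl) pairs. A source whose TTL equals
    a common initial value has probably not crossed any router yet, so the
    capture was probably taken on (or right next to) that host.
    """
    hosts = []
    for src_ip, ttl in packets:
        if ttl in COMMON_INITIAL_TTLS and src_ip not in hosts:
            hosts.append(src_ip)
    return hosts


packets = [
    ("10.5.28.229", 128),  # packet No. 1 from the capture above
    ("192.0.2.10", 52),    # hypothetical reply from the peer
    ("10.5.28.229", 128),
]
print(likely_capture_hosts(packets))
```

One caveat: a peer on the same subnet also arrives with an undecremented TTL, so in that case both ends would be listed and the heuristic cannot decide.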

Usage 3: How Time to Live Helps Detect Firewalls in the Network

Before getting into this section, here is one piece of background knowledge:

On the public Internet, the TTL seen within the same connection may change;

On an intranet, the TTL on the same link basically does not change.

What does this mean?

On an intranet, the network topology is relatively stable and simple. On the public Internet, however, the IP packets of the same TCP connection may take different routes. This is also where the name “Internet” comes from: there are many routes from point A to point B, and a given path may not be available every time.

In other words, if the TTL changes within a connection in an intranet environment, it should be taken seriously, because it is very likely caused by an intranet firewall.

The following is an example from Geek Time’s “Network Troubleshooting Case Study”. Specifically, the three-way handshake succeeds at the start of the connection, but the subsequent TLS four-way handshake fails when the client receives an RST.

[Figure: TLS handshake failed]

This capture was taken on the client. First, let’s check the SYN+ACK packet sent by the server.

Then check the RST packet that appears to have been sent by the server.

Aha, comparing the TTL values immediately exposes the flaw. This RST was sent by a device in the middle, the firewall, which interrupted the TLS four-way handshake.
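The comparison I did by eye can be sketched as a small check: take the first TTL seen from each source as a baseline and flag any later packet from the same source whose TTL differs. The function name ttl_mismatches is hypothetical, and the address and TTL values in the sample are invented, since the original screenshots are not reproduced here.

```python
def ttl_mismatches(packets, tolerance=0):
    """Flag packets whose TTL differs from the first TTL seen from that source.

    packets is a list of (src_ip, ttl, label) tuples in capture order.
    Returns a list of (label, src_ip, ttl, baseline_ttl) for suspicious ones.
    """
    baseline = {}
    flagged = []
    for src_ip, ttl, label in packets:
        if src_ip not in baseline:
            baseline[src_ip] = ttl  # first packet from this source
        elif abs(ttl - baseline[src_ip]) > tolerance:
            flagged.append((label, src_ip, ttl, baseline[src_ip]))
    return flagged


# Invented values: both packets claim to come from the same server.
capture = [
    ("10.0.0.20", 59, "SYN+ACK"),
    ("10.0.0.20", 59, "ACK"),
    ("10.0.0.20", 62, "RST"),  # closer to us than the real server
]
print(ttl_mismatches(capture))
```

On the public Internet a small tolerance would be needed, because routes (and therefore TTLs) can legitimately change within one connection.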

Usage 4: Time to Live in Network Tools Like Traceroute

Diagnostic tools such as traceroute rely on the TTL field of the IP protocol.

First, let me briefly introduce what traceroute does, how it differs from probing with ping, and how it works.

You should be familiar with ping. We generally use ping plus a host name to determine whether the host can be reached, as well as the round-trip time (RTT) to it.

So what about traceroute?

It goes a step further and lists in detail every router that the packet passes through on the way to the host.

The primary difference between ping and traceroute is that while ping simply tells you if a server is reachable and the time it takes to transmit and receive data, traceroute details the precise route info, router by router, as well as the time it took for each hop.

ping works by sending ICMP packets, and traceroute can do the same (the -I option in the example below selects ICMP probes). The reason traceroute can list statistics for each router along the way is inseparable from the TTL field of the IP protocol.

Let’s look at an example:

$ traceroute -I -q 1 www.baidu.com
traceroute: Warning: www.baidu.com has multiple addresses; using 110.242.68.3
traceroute to www.a.shifen.com (110.242.68.3), 64 hops max, 72 byte packets
 1  bogon (172.24.159.253)  8.894 ms
 2  36.110.17.17 (36.110.17.17)  141.932 ms
 3  *
 4  *
 5  *
 6  *
 7  219.158.41.1 (219.158.41.1)  67.226 ms
 8  *
 9  *
10  110.242.66.182 (110.242.66.182)  24.004 ms
11  221.194.45.130 (221.194.45.130)  32.467 ms
12  *
13  *
14  *
15  *
16  110.242.68.3 (110.242.68.3)  38.774 ms

traceroute basically lists every router that has to be crossed to reach www.baidu.com. How does it do that?

Let’s look directly at the capture file.

[Figure: ICMP request]

Aha, so that’s how it works. The ICMP probe packets sent from this machine have an IP-layer TTL that increases step by step, starting from 1.

Then it relies on the TTL principle:

Each router decrements the TTL by 1, and when the TTL reaches 0, that router is responsible for returning an ICMP Time Exceeded packet.

Traceroute sends packets with TTL values that gradually increase from packet to packet, starting with TTL value of one. Routers decrement TTL values of packets by one when routing and discard packets whose TTL value has reached zero, returning the ICMP error message ICMP Time Exceeded

Now look at the ICMP reply packet.

[Figure: ICMP reply]

Sure enough, Time-to-live exceeded was returned. traceroute uses this gradually increasing TTL to probe the path between the local host and the destination host.
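The whole procedure can be modeled without touching the network. The sketch below is a pure simulation, not how the real traceroute is implemented: send_probe and traceroute are hypothetical names, the path is a plain list of made-up node names, and every router is assumed to answer.

```python
def send_probe(path, ttl):
    """Simulate one probe along path (routers first, destination last).

    Returns (responder, reply_type).
    """
    for position, node in enumerate(path, start=1):
        if position == len(path):
            return (node, "Echo Reply")     # the probe reached the destination
        ttl -= 1                            # a router decrements the TTL
        if ttl == 0:
            return (node, "Time Exceeded")  # this router drops the probe


def traceroute(path, max_hops=64):
    """Send probes with TTL = 1, 2, 3, ... until the destination answers."""
    if not path:
        raise ValueError("path must contain at least the destination")
    hops = []
    for ttl in range(1, max_hops + 1):
        responder, reply = send_probe(path, ttl)
        hops.append((ttl, responder, reply))
        if reply == "Echo Reply":
            break
    return hops


# Made-up path: two routers, then the destination host.
for hop in traceroute(["router-a", "router-b", "destination"]):
    print(hop)
```

Each probe is one hop "longer" than the previous one, so the routers reveal themselves one at a time through their Time Exceeded replies.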

But you may still have doubts:

1. traceroute -I -q 1 www.baidu.com tells traceroute to probe the path with ICMP packets. Are there other ways to probe?

2. Why do some of the routers in the middle show up only as *? Is something wrong?

3. Some hops in the middle fail, so why can the hops after them still be displayed normally?

……

At this point, we have understood that TTL is one of the underlying mechanisms that make traceroute possible, but there is still a lot to explore about traceroute itself. I will write a dedicated article on it later, so stay tuned.

Summary

I hope this introduction has given you a basic understanding of TTL and what it is used for.

Do you know of any other application scenarios? Feel free to share them with me~