How to Fix Hosts File Issues: DNS Caching Troubleshooting


Why is Hosts File Not Working?

My job involves simulating real-world failures, such as network packet loss, latency, CPU load, and full disks. Recently, while working on the “Domain name access is unavailable” feature, I discovered that in some cases it did not work as expected. To explain the problem clearly, let me first outline how I implemented the feature. The approach is straightforward: it directly modifies the /etc/hosts file. For instance, to simulate www.baidu.com being unavailable, add the following entry:

127.0.0.1 odin.xiaojukeji.com www.baidu.com #chaosblade

As a result, when you access it through a browser or terminal using curl www.baidu.com, an error will be reported.
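The hosts edit described above can be sketched in Go as a pair of pure functions that add and remove a marked entry. This is only an illustration of the idea, not the code of the actual fault-injection tool: the names `addBlock` and `removeBlock` are hypothetical, and the `#chaosblade` marker is borrowed from the example entry so that injected lines can be found and removed later. The functions work on the file content as a string, so reading and writing /etc/hosts is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// marker tags injected lines so they can be removed later.
const marker = "#chaosblade"

// addBlock returns hosts content with "127.0.0.1 <domain> #chaosblade" appended.
func addBlock(hosts, domain string) string {
	if hosts != "" && !strings.HasSuffix(hosts, "\n") {
		hosts += "\n"
	}
	return hosts + "127.0.0.1 " + domain + " " + marker + "\n"
}

// removeBlock drops every line that ends with the marker.
func removeBlock(hosts string) string {
	var kept []string
	for _, line := range strings.Split(hosts, "\n") {
		if !strings.HasSuffix(strings.TrimSpace(line), marker) {
			kept = append(kept, line)
		}
	}
	return strings.Join(kept, "\n")
}

func main() {
	original := "127.0.0.1 localhost\n"
	injected := addBlock(original, "www.baidu.com")
	fmt.Print(injected)
	// Removing the marked line restores the original content.
	fmt.Println(removeBlock(injected) == original)
}
```

Tagging each injected line with a marker keeps the cleanup step simple: recovery does not need to remember which domains were blocked.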

Fix Hosts File Issues

The problem I encountered is that, in some cases, even with the above setting in place, requests still reached Baidu’s server instead of the 127.0.0.1 I had configured.

The following sections discuss two failure scenarios, ping and go-http: how each failure appears, the packet captures, and the causes.

Ping to Fix Hosts File Issues

Reproducing the Problem

Our goal is to map www.baidu.com to the local address 127.0.0.1.

Before modifying the /etc/hosts file, we first run ping www.baidu.com on the target machine:

~$ ping www.baidu.com
PING www.baidu.com (110.242.68.4) 56(84) bytes of data.
64 bytes from 110.242.68.4: icmp_seq=1 ttl=55 time=3.32 ms
64 bytes from 110.242.68.4: icmp_seq=2 ttl=55 time=4.39 ms
64 bytes from 110.242.68.4: icmp_seq=3 ttl=55 time=2.39 ms
64 bytes from 110.242.68.4: icmp_seq=4 ttl=55 time=3.66 ms

Without interrupting the ping command above, make the domain name unavailable by modifying the /etc/hosts file (add 127.0.0.1 www.baidu.com #chaosblade).

After the /etc/hosts file is modified, the output of the running ping www.baidu.com does not change: the replies still come from 110.242.68.4.

But if you open another terminal and run the same command, the result is as expected!

~$ ping www.baidu.com
PING www.baidu.com (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.014 ms
64 bytes from localhost (127.0.0.1): icmp_seq=2 ttl=64 time=0.007 ms
64 bytes from localhost (127.0.0.1): icmp_seq=3 ttl=64 time=0.011 ms
64 bytes from localhost (127.0.0.1): icmp_seq=4 ttl=64 time=0.009 ms
64 bytes from localhost (127.0.0.1): icmp_seq=5 ttl=64 time=0.007 ms
64 bytes from localhost (127.0.0.1): icmp_seq=6 ttl=64 time=0.006 ms

Is there a cache in the implementation?

Cause Analysis

It depends on how the ping command is implemented.

Reading the source code shows the following: before entering the loop that continuously sends ICMP packets, ping resolves the domain name specified by the user only once. Therefore, even if the /etc/hosts file is modified while ping is running, the running ping process will not notice the change.
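The resolve-once behavior can be mimicked in a few lines of Go. This is a simplified sketch, not ping's actual source: `resolver`, `probeResolveOnce`, `probeResolveEachTime`, and `changingResolver` are hypothetical names, and the resolver is passed in as a function so that a hosts change in the middle of a run can be simulated without touching the network.

```go
package main

import "fmt"

// resolver maps a host name to an IP address string.
type resolver func(host string) (string, error)

// probeResolveOnce resolves before the loop, as described above for ping:
// every probe goes to the address obtained at startup.
func probeResolveOnce(host string, n int, resolve resolver) ([]string, error) {
	ip, err := resolve(host)
	if err != nil {
		return nil, err
	}
	targets := make([]string, 0, n)
	for i := 0; i < n; i++ {
		targets = append(targets, ip) // a real tool would send a packet to ip here
	}
	return targets, nil
}

// probeResolveEachTime resolves inside the loop, so a change is picked up.
func probeResolveEachTime(host string, n int, resolve resolver) ([]string, error) {
	targets := make([]string, 0, n)
	for i := 0; i < n; i++ {
		ip, err := resolve(host)
		if err != nil {
			return nil, err
		}
		targets = append(targets, ip)
	}
	return targets, nil
}

// changingResolver simulates a hosts edit: the answer switches to 127.0.0.1
// once the first `before` lookups have been served.
func changingResolver(before int) resolver {
	calls := 0
	return func(string) (string, error) {
		calls++
		if calls > before {
			return "127.0.0.1", nil
		}
		return "110.242.68.4", nil
	}
}

func main() {
	once, _ := probeResolveOnce("www.baidu.com", 4, changingResolver(2))
	each, _ := probeResolveEachTime("www.baidu.com", 4, changingResolver(2))
	fmt.Println("resolve once:     ", once)
	fmt.Println("resolve each time:", each)
}
```

With the simulated change after two lookups, the first variant keeps targeting 110.242.68.4 for all four probes, while the second switches to 127.0.0.1 from the third probe onward. This matches what the two terminals showed: the process that was already running never looks the name up again.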

Fix Hosts File Issue: http-go

Reproducing the Problem

Later I discovered that a similar situation occurs when making requests from Go.

In the following example, the program sends an HTTP GET request to the Baidu homepage, with an interval of 3s between requests. While the program is running, modify the /etc/hosts file and see whether the requests can still be sent and the responses received normally.

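// Test code
package main

import (
	"fmt"
	"io"
	"net/http"
	"testing"
	"time"
)

func TestHTTPGet(t *testing.T) {
	url := "http://www.baidu.com"
	for i := 0; i < 50; i++ {
		resp, err := http.Get(url)
		if err != nil {
			// resp is nil on error, so skip this iteration
			t.Error(err)
			time.Sleep(3 * time.Second)
			continue
		}
		body, err := io.ReadAll(resp.Body)
		// Close inside the loop; a defer would hold every body open until the test returns
		resp.Body.Close()
		if err != nil {
			t.Error(err)
		}
		fmt.Println(string(body))
		time.Sleep(3 * time.Second)
	}
}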

Packet capture display 1: details


Packet capture display 2: Statistics

It was found that the Go http library did not start sending requests for www.baidu.com to 127.0.0.1 after the hosts file was changed.

Cause Analysis

The packet capture shows that these requests use a persistent (long-lived) connection: the source port is 58443 for every request. This is consistent with how HTTP/1.1 persistent connections behave.

You may wonder about this, because nothing in the request headers asks for a persistent connection.

Indeed, there is no such header, but that is only because persistent connections are enabled by default in HTTP/1.1.
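The default can be written down as a small decision function. This is a sketch of the persistence rule from the HTTP/1.1 specification, covering HTTP/1.x only. It is not code taken from Go's net/http, and `hasToken` and `isPersistentHTTP1` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"strings"
)

// hasToken reports whether a comma-separated Connection header value
// contains the given token, compared case-insensitively.
func hasToken(header, token string) bool {
	for _, part := range strings.Split(header, ",") {
		if strings.EqualFold(strings.TrimSpace(part), token) {
			return true
		}
	}
	return false
}

// isPersistentHTTP1 reports whether an HTTP/1.<minor> connection stays open
// after the current exchange, given the Connection header value.
func isPersistentHTTP1(minor int, connectionHeader string) bool {
	if hasToken(connectionHeader, "close") {
		return false // an explicit "close" always wins
	}
	if minor >= 1 {
		return true // HTTP/1.1: persistent by default, no header needed
	}
	return hasToken(connectionHeader, "keep-alive") // HTTP/1.0: opt-in only
}

func main() {
	fmt.Println(isPersistentHTTP1(1, ""))           // HTTP/1.1, no Connection header
	fmt.Println(isPersistentHTTP1(1, "close"))      // HTTP/1.1, Connection: close
	fmt.Println(isPersistentHTTP1(0, ""))           // HTTP/1.0, no Connection header
	fmt.Println(isPersistentHTTP1(0, "Keep-Alive")) // HTTP/1.0, Connection: Keep-Alive
}
```

The first case is the one seen in the capture: an HTTP/1.1 request with no Connection header still ends up on a persistent connection.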

Other ways to reproduce

With the Go code above, you have to run it and capture packets to confirm that the requests use a persistent connection. Because the connection is reused, no new DNS query is performed even after the /etc/hosts file is modified, so the client keeps using the IP from the first DNS query instead of the configured 127.0.0.1.

Is there a way to know that the DNS query was performed only once without capturing packets? Of course.

We can use the standard library's net/http/httptrace package.

package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptrace"
	"testing"
	"time"
)

func createHTTPTraceRequest(ctx context.Context, url string) (*http.Request, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}

	// Print a log line for each connection and DNS event
	trace := &httptrace.ClientTrace{
		// GotConn fires when the underlying connection used to send the request is obtained
		GotConn: func(info httptrace.GotConnInfo) {
			fmt.Printf("GotConn info:%+v\n", info)
		},
		// DNSStart fires when a DNS query begins
		DNSStart: func(info httptrace.DNSStartInfo) {
			fmt.Printf("DNSStart info:%+v\n", info)
		},
		// DNSDone fires when the DNS query completes
		DNSDone: func(info httptrace.DNSDoneInfo) {
			fmt.Printf("DNSDone info:%+v\n", info)
		},
	}
	traceCtx := httptrace.WithClientTrace(ctx, trace)
	req = req.WithContext(traceCtx)
	return req, nil
}

func TestWithTraceHTTP(t *testing.T) {
	url := "http://www.baidu.com"
	for i := 0; i < 50; i++ {
		// Build the request
		req, err := createHTTPTraceRequest(context.Background(), url)
		if err != nil {
			t.Fatal(err)
		}
		// Send the request
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			// resp is nil on error, so skip this iteration
			t.Error(err)
			time.Sleep(3 * time.Second)
			continue
		}
		// Read the body to EOF and close it so the connection can be reused
		_, err = io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			t.Error(err)
		}
		time.Sleep(3 * time.Second)
	}
}

The output is as follows:

DNSStart info:{Host:www.baidu.com}
DNSDone info:{Addrs:[{IP:110.242.68.3 Zone:} {IP:110.242.68.4 Zone:} {IP:2408:871a:2100:2:0:ff:b09f:237 Zone:} {IP:2408:871a:2100:3:0:ff:b025:348d Zone:}] Err:<nil> Coalesced:false}
GotConn info:{Conn:0x14000186000 Reused:false WasIdle:false IdleTime:0s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001353958s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001554083s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.00088375s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.00107s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001237583s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001469083s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.002459209s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.00587425s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001838042s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.007866292s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.00083075s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000906125s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001565917s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.002019208s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001439292s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000538625s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001538042s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001537375s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000745167s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000566916s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001680083s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001349s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001143166s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.00183775s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001327459s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001370041s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001351875s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001454833s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001327166s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000833042s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001082209s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001355333s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.002008875s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001410666s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000633791s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001041583s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000719583s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000613334s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000894167s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000837292s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001611s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001473875s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000864875s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000621208s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.000998083s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001407s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.00110875s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.00126275s}
GotConn info:{Conn:0x14000186000 Reused:true WasIdle:true IdleTime:3.001923709s}

If you look at the above log information carefully, you will find the following facts:

1. Only one DNS query was performed across all the requests, which is the root cause of the /etc/hosts modification not taking effect as expected.

2. Only the first request creates a new underlying connection; all subsequent requests reuse that connection.

How do you know?

In the GotConn lines, Reused:false indicates a new connection and Reused:true indicates reuse of the underlying connection.
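Since the name is only resolved when a new connection is dialed, one way to make a long-running Go client notice a hosts change is to stop reusing connections. The sketch below compares a default transport with one that sets `DisableKeepAlives`, counting freshly dialed connections through the same `GotConn` hook. It is a minimal illustration, not a recommendation for production: `countNewConns` and `demo` are hypothetical names, a local `httptest` server stands in for the real site, and a new connection per request costs extra latency. Calling `CloseIdleConnections` on the client after changing the hosts file is a lighter alternative.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
)

// countNewConns sends n GET requests one after another and returns how many
// of them were served over a freshly dialed (not reused) connection.
func countNewConns(client *http.Client, url string, n int) (int, error) {
	fresh := 0
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) {
			if !info.Reused {
				fresh++
			}
		},
	}
	ctx := httptrace.WithClientTrace(context.Background(), trace)
	for i := 0; i < n; i++ {
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
		if err != nil {
			return fresh, err
		}
		resp, err := client.Do(req)
		if err != nil {
			return fresh, err
		}
		// Drain and close the body so the connection can return to the idle pool.
		_, err = io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
		if err != nil {
			return fresh, err
		}
	}
	return fresh, nil
}

// demo runs three requests with and without keep-alive against a local server.
func demo() (withKeepAlive, withoutKeepAlive int, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close()

	keepAlive := &http.Client{Transport: &http.Transport{}}
	defer keepAlive.CloseIdleConnections()
	noKeepAlive := &http.Client{Transport: &http.Transport{DisableKeepAlives: true}}

	if withKeepAlive, err = countNewConns(keepAlive, srv.URL, 3); err != nil {
		return 0, 0, err
	}
	withoutKeepAlive, err = countNewConns(noKeepAlive, srv.URL, 3)
	return withKeepAlive, withoutKeepAlive, err
}

func main() {
	a, b, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("new connections with keep-alive: %d, without: %d\n", a, b)
}
```

With keep-alive, only the first of the three requests should dial a new connection. With keep-alive disabled, every request dials again, and each dial is a point where the name is looked up anew.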

Summary

When the operating system receives a DNS query request, it first checks the hosts file at /etc/hosts to see whether the domain name has an entry there. If no entry is found, it calls the DNS service to perform a recursive query. Typically, after modifying the /etc/hosts file, we expect the change to take effect immediately, and it usually does. However, there are special cases where the modification does not take effect right away, such as the ping command and the Go HTTP requests described above: the running ping process resolved the name only once at startup, and the HTTP client keeps reusing a connection that was established before the change.
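The lookup order described above can be sketched as follows. This is a simplified model, not how any particular resolver is implemented: `parseHosts` and `lookup` are hypothetical names, the DNS step is a function passed in by the caller, only the first entry for a name is kept, and the hosts-before-DNS order is assumed (on Linux it is typically configured in /etc/nsswitch.conf).

```go
package main

import (
	"fmt"
	"strings"
)

// parseHosts builds a name -> IP table from hosts-file content.
// Text after '#' is a comment; the first entry for a name wins.
func parseHosts(content string) map[string]string {
	table := make(map[string]string)
	for _, line := range strings.Split(content, "\n") {
		if i := strings.Index(line, "#"); i >= 0 {
			line = line[:i]
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue // blank line, comment-only line, or address without names
		}
		for _, name := range fields[1:] {
			name = strings.ToLower(name)
			if _, seen := table[name]; !seen {
				table[name] = fields[0]
			}
		}
	}
	return table
}

// lookup consults the hosts table first and falls back to DNS only on a miss.
func lookup(host string, hosts map[string]string, dns func(string) (string, error)) (string, error) {
	if ip, ok := hosts[strings.ToLower(host)]; ok {
		return ip, nil
	}
	return dns(host)
}

func main() {
	hosts := parseHosts("127.0.0.1 localhost\n127.0.0.1 www.baidu.com #chaosblade\n")
	// fakeDNS stands in for a real DNS query; 192.0.2.1 is a documentation address.
	fakeDNS := func(string) (string, error) { return "192.0.2.1", nil }

	ip, _ := lookup("www.baidu.com", hosts, fakeDNS)
	fmt.Println("www.baidu.com ->", ip)
	ip, _ = lookup("example.com", hosts, fakeDNS)
	fmt.Println("example.com ->", ip)
}
```

A fresh lookup sees the new hosts entry at once. The two cases in this article differ only in that no fresh lookup happens.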

I think we can call it a caching problem.

Ha, this is one of the two hardest problems in the world of computers:

  • naming
  • caching