Understanding Tracker Requests: A Comprehensive Guide to HTTP GET in BitTorrent Protocol

0. Review

Previous article:

Review of previous content:

  • BitTorrent is a protocol used for distributing files; it breaks down the files to be distributed into fragments and passes them between nodes;
  • BitTorrent uses metadata files to describe the files to be distributed, and the metadata files use bencode encoding;
  • The data structure of metadata files (torrent files)
  • Data verification performs SHA-1 hash calculations on fragments for comparison;

1. Tracker

Tracker GET Request

First, it is important to know that a Tracker request is based on an HTTP request, typically using the GET method. A Tracker GET request should include the following information:

  1. info_hash (Hash): The 20-byte SHA-1 hash of the bencoded info value from the metadata file. The hash is computed over the exact encoded bytes, so the encoding rules (such as sorted dictionary keys) must be followed strictly. The value is raw binary and must be URL-escaped in the request.
  2. peer_id (Peer Identifier): A 20-byte string that identifies the downloader, usually generated according to client-specific rules when a new download task is created.
  3. ip (IP Address), optional: The IP address (or DNS name) of this peer, typically only used when the Tracker and downloader are on the same device.
  4. port (Port Number): The port the downloader is listening on. BEP 3 describes the usual behavior as follows: the downloader tries to listen on port 6881, and if that port is busy, it tries ports 6882, then 6883, and so on, up to 6889. If all of these are busy, it gives up. Nowadays, many downloaders have their own default port or choose a random one.
  5. uploaded (Uploaded Amount): The total amount uploaded so far, encoded in decimal ASCII.
  6. downloaded (Downloaded Amount): The total amount downloaded so far, encoded in decimal ASCII.
  7. left (Remaining Amount to be Downloaded): The number of bytes still to be downloaded, encoded in decimal ASCII. This cannot be calculated from the downloaded amount and the file length, because the transfer may have been resumed, and some already downloaded data may have failed integrity checks and need to be downloaded again.
  8. event (Event), optional: One of started, completed, or stopped. An empty value is equivalent to omitting the key and indicates one of the periodic requests made at the regular interval. started is sent when a download first begins. completed is sent when the download finishes; it is not sent if the file was already complete when the downloader started. stopped is sent when the downloader stops downloading.

To illustrate, we take the example torrent from the metadata file section and send a request to a tracker server deployed on a local network (deployment process omitted).

First, calculate the hash value of the info section, as follows:

Info content:

{
    "length": 1373744,
    "name": "ChromeSetup.exe",
    "piece length": 524288,
    "pieces": b"L\xb2k\xd9\x83\xa4\x84\x84\x00g\xeb\xf7\x1d\xfe3\xa2\xd9\x95\x0f\\\xa6\xb2E\xcd!^\xe3\xed\x8a\x85\xe7>(\x99\x9dU\x06g%b\x08@\xc9\x9fG\xb8S\x8f\x067K#3\xa7\xbf\xb8`N\xac3"
}

Encode it with the previously mentioned encode_bencode function, compute the SHA-1 of the encoded bytes, and URL-encode the resulting digest:

%E7%D6%A1%A7%88-%E0%11%0E%3C%BB%FBP%91%FB%DE%EBg%1E%C1
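The encode, hash, and URL-encode steps can be sketched in Python roughly as follows. The bencode function here is a minimal, hypothetical stand-in for the encode_bencode function from the previous article (whose exact implementation is not shown here), and the helper names info_hash and url_encode_hash are illustrative only.

```python
import hashlib
from urllib.parse import quote


def bencode(value) -> bytes:
    """Minimal bencoder (illustrative stand-in for encode_bencode)."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be emitted in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(info: dict) -> bytes:
    """Return the raw 20-byte SHA-1 digest of the bencoded info dictionary."""
    return hashlib.sha1(bencode(info)).digest()


def url_encode_hash(digest: bytes) -> str:
    """Percent-encode every byte that is not an unreserved URL character."""
    return quote(digest, safe="")
```

Passing the info dictionary shown above to info_hash and the result to url_encode_hash yields a string in the same form as the value below; bytes that happen to be unreserved characters (such as - or P) are left unescaped, which is why the encoded hash is not purely %XX triplets.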

Construct a request that starts a download, following the Tracker request structure:

  • info_hash=%e7%d6%a1%a7%88-%e0%11%0e%3c%bb%fbP%91%fb%de%ebg%1e%c1
  • peer_id=-None-ARandomString-, making sure it’s 20 characters long
  • ip=127.0.0.1
  • port=6881
  • uploaded=0
  • downloaded=0
  • left=1373744, nothing downloaded yet, left equals the file size
  • event=started, start downloading

The final constructed request is as follows:

{TrackerURL}?info_hash=%e7%d6%a1%a7%88-%e0%11%0e%3c%bb%fbP%91%fb%de%ebg%1e%c1&peer_id=-None-ARandomString-&ip=127.0.0.1&port=6881&uploaded=0&downloaded=0&left=1373744&event=started
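Assembling such a URL by hand is error-prone, mainly because info_hash is raw binary. A minimal sketch of a builder is shown below; build_announce_url and its parameter names are hypothetical, and the tracker address in the usage note is a placeholder rather than a real tracker.

```python
from urllib.parse import quote


def build_announce_url(tracker_url: str, info_hash: bytes, peer_id: bytes,
                       port: int, uploaded: int, downloaded: int, left: int,
                       event: str = "", ip: str = "") -> str:
    """Build a Tracker GET URL from the request fields described above."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must both be 20 bytes long")
    params = [
        # Binary fields are percent-encoded byte by byte.
        ("info_hash", quote(info_hash, safe="")),
        ("peer_id", quote(peer_id, safe="")),
    ]
    if ip:  # optional field
        params.append(("ip", quote(ip, safe="")))
    params += [
        ("port", str(port)),
        ("uploaded", str(uploaded)),
        ("downloaded", str(downloaded)),
        ("left", str(left)),
    ]
    if event:  # omitted for the regular periodic requests
        params.append(("event", event))
    query = "&".join(f"{key}={value}" for key, value in params)
    separator = "&" if "?" in tracker_url else "?"
    return f"{tracker_url}{separator}{query}"
```

For example, calling build_announce_url("http://tracker.example/announce", digest, b"-None-ARandomString-", 6881, 0, 0, 1373744, event="started") with the 20-byte digest of the example torrent produces a URL of the same shape as the one above, with the hex digits of the escapes in upper case.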

Tracker GET Response

Sending the request above yields the following Tracker response:

[Image: Tracker response content]

Decode it to get the return dictionary:

{
    "complete": 0,
    "downloaded": 0,
    "incomplete": 1,
    "interval": 1863,
    "mininterval": 931,
    "peers": b'\n\x00\xb29\x1a\xe1'
}
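The decoding step can be sketched with a small recursive bencode decoder. This is a minimal illustration rather than the decoder used in this series; the names bdecode and _decode are hypothetical, and keys are returned as bytes objects instead of the plain strings shown in the dictionary above.

```python
def bdecode(data: bytes):
    """Decode a single bencoded value; dictionary keys are returned as bytes."""
    value, end = _decode(data, 0)
    if end != len(data):
        raise ValueError("trailing data after bencoded value")
    return value


def _decode(data: bytes, i: int):
    """Decode the value starting at offset i; return (value, next offset)."""
    lead = data[i:i + 1]
    if lead == b"i":                      # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if lead == b"l":                      # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = _decode(data, i)
            items.append(item)
        return items, i + 1
    if lead == b"d":                      # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = _decode(data, i)
            value, i = _decode(data, i)
            result[key] = value
        return result, i + 1
    if lead.isdigit():                    # byte string: <length>:<bytes>
        colon = data.index(b":", i)
        length = int(data[i:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"invalid bencode data at offset {i}")
```

A caller would typically check the decoded dictionary for the failure reason key first and only then read interval and peers.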

This is what a successful response looks like; next, let’s look at the Tracker’s response content in detail.

If an error occurs, the response only needs to contain a failure reason key with a human-readable explanation; no other keys are required.

If it’s a successful response, the response content should include:

  • interval: Interval in seconds after which the downloader should make the next request, under normal circumstances.
  • peers: A list of peers, where each entry is a dictionary containing:
    • peer id: Peer ID, string
    • ip: IP address or DNS name, string
    • port: Port number, integer

Comparing this with the earlier test response, the peers value returned by our Tracker is clearly not in the standard format. The reason is the compact peer list format described in BEP 23. In the compact format, each peer is represented by 6 bytes, a 4-byte IPv4 address followed by a 2-byte port number (both in network byte order), and the Peer ID is no longer included.

Note that because the compact format is recommended, many Trackers support only this response mode. A downloader, however, must support both formats.

Analyzing the response above, we can conclude that the Tracker expects the next request in 1863 seconds, and that the peers list decodes to:

[{'ip': '10.0.178.57', 'port': 6881}]
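Since a downloader has to accept both peer list formats, the parsing can be sketched as below. The function names are hypothetical, and the standard-format branch assumes the bencode decoder returns dictionary keys and strings as bytes; adjust it if your decoder returns str.

```python
import socket
import struct


def parse_compact_peers(blob: bytes) -> list:
    """Split a compact peers string into 6-byte entries (IPv4 + port)."""
    if len(blob) % 6 != 0:
        raise ValueError("compact peers length must be a multiple of 6")
    peers = []
    for offset in range(0, len(blob), 6):
        # 4 raw address bytes followed by a big-endian unsigned 16-bit port.
        ip_bytes, port = struct.unpack_from(">4sH", blob, offset)
        peers.append({"ip": socket.inet_ntoa(ip_bytes), "port": port})
    return peers


def normalize_peers(peers) -> list:
    """Accept either the compact string or the standard list of dictionaries."""
    if isinstance(peers, bytes):
        return parse_compact_peers(peers)
    normalized = []
    for peer in peers:
        entry = {"ip": peer[b"ip"].decode("ascii"), "port": peer[b"port"]}
        if b"peer id" in peer:  # only present in the standard format
            entry["peer id"] = peer[b"peer id"]
        normalized.append(entry)
    return normalized
```

Applied to the peers value from the test response, b'\n\x00\xb29\x1a\xe1', parse_compact_peers returns the single entry shown above: the first four bytes are 10, 0, 178, 57 and the last two bytes, 0x1AE1, are port 6881.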

2. Peers Handshake

The BitTorrent protocol is peer-to-peer, with no concept of server and client. Every node (Peer) is the same, and the way they transmit data to each other is consistent.

Using TCP connections as an example, nodes first establish a TCP connection and then begin a handshake, with handshake data as follows:

  • 1 byte for the protocol name length, fixed at 19 (0x13);
  • 19 bytes for the protocol name, fixed at BitTorrent protocol. Note: all integers sent later in the protocol are encoded as 4-byte big-endian values;
  • 8 reserved bytes following the protocol name, used to flag extension protocols; if no extensions are in use, they should all be 0;
  • Info hash: the 20-byte SHA-1 result described earlier. Both sides should send the same value here, and if they do not, the connection is severed. The one exception is a downloader serving multiple downloads on a single port: it may wait for the incoming connection to send its hash first and then respond with the same hash if that download is in its list;
  • Peer ID: the 20-byte peer ID, the same one reported in Tracker requests. If the Tracker returned the node list in the standard format, the initiating side should verify that the received Peer ID matches the expected one and disconnect if it does not;

Both parties send the data above and verify what they receive from the other side, which completes the handshake. Afterwards, a zero-length keep-alive message is typically sent every two minutes, although a much shorter timeout can be applied while data is expected.
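The handshake layout above adds up to 1 + 19 + 8 + 20 + 20 = 68 bytes. A minimal sketch of building and parsing it is shown below; the function names are hypothetical, and no extension bits are set in the reserved bytes.

```python
PSTR = b"BitTorrent protocol"


def build_handshake(info_hash: bytes, peer_id: bytes,
                    reserved: bytes = bytes(8)) -> bytes:
    """Serialize the 68-byte handshake described above."""
    if len(info_hash) != 20 or len(peer_id) != 20 or len(reserved) != 8:
        raise ValueError("info_hash/peer_id must be 20 bytes, reserved 8 bytes")
    return bytes([len(PSTR)]) + PSTR + reserved + info_hash + peer_id


def parse_handshake(data: bytes) -> dict:
    """Split a received handshake into its fields, checking the protocol name."""
    if len(data) != 68:
        raise ValueError("handshake must be exactly 68 bytes")
    if data[0] != len(PSTR) or data[1:20] != PSTR:
        raise ValueError("unexpected protocol name")
    return {
        "reserved": data[20:28],
        "info_hash": data[28:48],
        "peer_id": data[48:68],
    }
```

After parsing, the receiver would compare info_hash with the torrent it is serving and, when the Tracker supplied peer IDs, compare peer_id with the expected value, closing the connection on a mismatch.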

3. Peers Data Transmission

It is recommended to read this in conjunction with Brief Analysis of Bittorrent Protocol (Part Three) Peer Data Transfer Example.

After completing the handshake, both parties can begin exchanging data. Every message starts with a length prefix, and a keep-alive is simply a message of length zero. All other messages begin with a single byte that identifies the message type:

Indicator   Description
0           choke
1           unchoke
2           interested
3           not interested
4           have
5           bitfield
6           request
7           piece
8           cancel

The first four messages (choke, unchoke, interested, and not interested) carry no payload and have the following meanings:

  • Choked or Unchoked: This indicates whether a peer is willing to send data to the other side. While a peer is choking the other side, it sends it no data until the choke is lifted.
  • Interested or Not Interested: This indicates whether a peer wants data from the other side. A peer that is interested in another peer’s data will request data blocks from it once it is unchoked.

When the connection is established, the default state is choked and not interested.

  • have: After the downloader finishes downloading a fragment and verifies its hash, it informs other nodes via have. The payload is the integer index of that fragment.
  • bitfield: bitfield is sent at most once, as the first message after the handshake. It tells the other node which fragments the sender already has, using one bit per fragment, with the high bit of the first byte corresponding to fragment 0. If the sender has no fragments when the connection is established, it may skip the bitfield message; sending it is not mandatory.
  • request and piece: A node asks other nodes for data with request and provides data to others with piece. The request payload contains the integer index of the fragment, the starting byte offset within it, and the length of the requested data. The piece payload contains the fragment index, the starting offset, and the data itself; on the wire it is preceded by the length prefix and the type byte (7).
  • cancel: cancel has the same payload as the request message. For efficient downloading, a downloader may request the same data from multiple nodes simultaneously. Once the data has been received and verified, it tells the other nodes to stop sending via cancel.
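The message framing described above can be sketched as follows: a 4-byte big-endian length prefix, one type byte, then the payload. The function names are hypothetical, and decode_message assumes it is handed exactly one complete frame (reading frames off a TCP stream is left out).

```python
import struct

# Type bytes from the table above.
(CHOKE, UNCHOKE, INTERESTED, NOT_INTERESTED,
 HAVE, BITFIELD, REQUEST, PIECE, CANCEL) = range(9)

KEEP_ALIVE = b"\x00\x00\x00\x00"  # length prefix of zero, nothing else


def encode_message(msg_id: int, payload: bytes = b"") -> bytes:
    """Prefix the type byte and payload with their combined length."""
    return struct.pack(">IB", len(payload) + 1, msg_id) + payload


def encode_have(index: int) -> bytes:
    return encode_message(HAVE, struct.pack(">I", index))


def encode_request(index: int, begin: int, length: int) -> bytes:
    # cancel uses the same payload with a different type byte.
    return encode_message(REQUEST, struct.pack(">III", index, begin, length))


def encode_piece(index: int, begin: int, block: bytes) -> bytes:
    return encode_message(PIECE, struct.pack(">II", index, begin) + block)


def decode_message(frame: bytes):
    """Return (msg_id, payload); a keep-alive yields (None, b'')."""
    (length,) = struct.unpack_from(">I", frame, 0)
    if len(frame) != 4 + length:
        raise ValueError("frame length does not match its prefix")
    if length == 0:
        return None, b""
    return frame[4], frame[5:]


def pieces_in_bitfield(bitfield: bytes, piece_count: int) -> list:
    """List the fragment indexes whose bit is set (high bit of byte 0 is index 0)."""
    return [i for i in range(piece_count)
            if bitfield[i // 8] & (0x80 >> (i % 8))]
```

For instance, encode_have(2) produces the nine bytes 00 00 00 05 04 00 00 00 02: a length of 5, the type byte 4, and the fragment index 2.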

Tracker and Peer Node Section Finished

This concludes the second part, covering the Tracker and peer nodes. Links to practical analysis and extension protocol content will be added here later:
