1. Introduction: TLS Fingerprint Blocking
In previous projects, I found that some websites respond differently depending on the client used. For example, accessing the site with a browser works fine, but a Python script or a curl request gets blocked. I tried replicating the browser's request exactly, but this still did not resolve the issue. This behavior is often caused by TLS fingerprint blocking.
Site for testing fingerprint blocking: https://ascii2d.net
Recently, I read an article by an expert titled “Bypassing Cloudflare Fingerprint Shield”, which was very insightful. The issue I ran into earlier appears to be of the same kind: when writing crawlers and hitting similar fingerprint shields (anti-crawling mechanisms), my past workaround was to use Selenium to drive a real browser. This time I picked up a new approach.
The content is mainly divided into two parts: 1. Bypassing TLS fingerprint recognition, 2. Bypassing Akamai fingerprint (HTTP/2 fingerprint) recognition.
2. TLS Fingerprint Blocking
2.1. What is TLS Fingerprint Blocking?
A TLS fingerprint is a technique for identifying the client behind a TLS (Transport Layer Security) connection.
It works by examining the parameters the client offers during the TLS handshake, chiefly the **protocol version, cipher suites, and extensions** in the ClientHello message. Since different TLS implementations offer different combinations of these values, comparing TLS fingerprints can indicate whether the traffic comes from the expected kind of client.
TLS fingerprints can help detect security threats such as spoofing, man-in-the-middle attacks, and espionage, and can also be used for device and application identification and management.
The principle behind TLS fingerprint recognition is the ja3 algorithm: https://github.com/salesforce/ja3
[Figure: ja3 algorithm]
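As a rough illustration of the ja3 algorithm, the sketch below joins the five ClientHello fields into the ja3 string and takes its MD5 digest. The function names `ja3_string` and `ja3_hash` are made up for this example, and the input values are assumed to have already been extracted from a captured ClientHello.

```python
import hashlib

# GREASE values (RFC 8701) are placeholder numbers that some clients insert
# at random; JA3 ignores them so the fingerprint stays stable.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the ja3 text: five comma-separated fields, values joined by '-'."""
    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    return ",".join(
        [str(version), join(ciphers), join(extensions), join(curves), join(point_formats)]
    )


def ja3_hash(ja3_text):
    """The ja3 hash is the MD5 hex digest of the ja3 text."""
    return hashlib.md5(ja3_text.encode("ascii")).hexdigest()


if __name__ == "__main__":
    text = ja3_string(771, [4865, 4866], [0, 10, 11], [29, 23], [0])
    print(text)
    print(ja3_hash(text))
```

Because the hash covers the order of the values as well as the values themselves, two clients offering the same cipher suites in a different order produce different fingerprints.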
2.2. Testing TLS Fingerprint Blocking
Testing the fingerprint differences (ja3_hash) between different clients.
For in-depth analysis, you can use Wireshark to capture and analyze TLS packets.
Test site: https://tls.browserleaks.com/json
- curl 7.79.1
- curl 7.68.0
- Chrome 112.0.5615.137 (Official Build) (x86_64)
- Burp Chromium 103.0.5060.114 (Official Build) (x86_64)
- Python 2.11.1
[Screenshots: tls.browserleaks.com result for each client above]
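The same comparison can be scripted. The following is a minimal sketch using only the standard library; `fetch_fingerprint`, `summarize`, and `same_tls_fingerprint` are hypothetical helper names, and the field names `ja3_hash`, `ja3_text`, and `akamai_hash` are the ones this article reads from the test site's JSON.

```python
import json
import urllib.request

TEST_URL = "https://tls.browserleaks.com/json"


def fetch_fingerprint(url=TEST_URL, timeout=10):
    """Fetch the JSON report the test site returns for the current client."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize(report):
    """Keep only the fingerprint fields discussed in this article."""
    keys = ("ja3_hash", "ja3_text", "akamai_hash")
    return {key: report.get(key) for key in keys}


def same_tls_fingerprint(report_a, report_b):
    """True when both reports carry the same, non-empty ja3_hash."""
    hash_a = report_a.get("ja3_hash")
    return hash_a is not None and hash_a == report_b.get("ja3_hash")
```

Calling `print(summarize(fetch_fingerprint()))` shows the fingerprint of the Python interpreter running the script, which can then be compared with the values a browser gets from the same URL.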
It is apparent that the fingerprints differ between clients. A brief explanation of the last (Python) ja3_text follows. It consists of five comma-separated fields:
- The first value, 771: the TLS version offered in the ClientHello (771 = 0x0303, i.e. TLS 1.2).
- The second value, 4866-4867-4865-49196-49200-49195-49199-163-159-162-158-49327-49325-49188-49192-49162-49172-49315-49311-107-106-57-56-49326-49324-49187-49191-49161-49171-49314-49310-103-64-51-50-52393-52392-49245-49249-49244-49248-49267-49271-49266-49270-52394-49239-49235-49238-49234-196-195-190-189-136-135-69-68-157-156-49313-49309-49312-49308-61-60-53-47-49233-49232-192-186-132-65-255: the cipher suites, i.e. the encryption algorithms supported by the client, in the order offered.
- The third value, 0-11-10-35-22-23-13-43-45-51-21: the TLS extensions the client sends, such as SNI (0, server_name).
- The fourth value, 29-23-30-25-24: the supported elliptic curves (the supported_groups extension), e.g. 29 is x25519.
- The fifth value, 0-1-2: the supported elliptic curve point formats.
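To make the five fields easier to read, a ja3_text can be split and decoded with small lookup tables. This is only a sketch: `parse_ja3_text` is a name invented here, and the tables contain just a handful of entries from the IANA TLS registries, so unknown numbers are passed through unchanged.

```python
# Partial lookup tables (IANA TLS registries); extend as needed.
TLS_VERSIONS = {768: "SSL 3.0", 769: "TLS 1.0", 770: "TLS 1.1", 771: "TLS 1.2"}
EXTENSIONS = {
    0: "server_name",
    10: "supported_groups",
    11: "ec_point_formats",
    13: "signature_algorithms",
    43: "supported_versions",
    51: "key_share",
}
GROUPS = {23: "secp256r1", 24: "secp384r1", 25: "secp521r1", 29: "x25519", 30: "x448"}


def parse_ja3_text(ja3_text):
    """Split a ja3_text into its five fields and name the values we know."""
    parts = ja3_text.split(",")
    if len(parts) != 5:
        raise ValueError("a ja3_text must have exactly five comma-separated fields")

    def ints(field):
        # An empty field means the client sent nothing for that part.
        return [int(v) for v in field.split("-")] if field else []

    version = int(parts[0])
    return {
        "tls_version": TLS_VERSIONS.get(version, str(version)),
        "cipher_suites": ints(parts[1]),
        "extensions": [EXTENSIONS.get(v, str(v)) for v in ints(parts[2])],
        "curves": [GROUPS.get(v, str(v)) for v in ints(parts[3])],
        "point_formats": ints(parts[4]),
    }


if __name__ == "__main__":
    print(parse_ja3_text("771,4866-4867-4865,0-11-10,29-23-30-25-24,0-1-2"))
```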
2.3. Bypassing TLS Fingerprint Blocking
Since we know the principle, bypassing comes down to masquerading as a legitimate client. In simple terms, it means disguising the ja3_text value so that the request isn’t blocked, primarily by modifying the list of supported cipher suites.
2.3.1. Method Zero: Use Native urllib to Bypass TLS Fingerprint Blocking
import ssl
import urllib.request

url = 'https://tls.browserleaks.com/json'

# Baseline: the default urllib TLS fingerprint
req = urllib.request.Request(url)
resp = urllib.request.urlopen(req)
print(resp.read().decode())

# Forge the TLS fingerprint by changing the offered cipher suites.
# In an OpenSSL cipher string, ":" separates entries and "+" combines criteria.
context = ssl.create_default_context()
context.set_ciphers("ECDHE-RSA-AES128-GCM-SHA256:ECDHE+AESGCM")
req = urllib.request.Request(url)
resp = urllib.request.urlopen(req, context=context)
print(resp.read().decode())
[Screenshot: urllib result]
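To see what `set_ciphers()` actually changes without sending any request, the offered suites can be listed with `SSLContext.get_ciphers()`. A minimal sketch, in which the helper name `cipher_names` and the cipher string are chosen only for illustration:

```python
import ssl


def cipher_names(context, include_tls13=True):
    """Return the names of the cipher suites a context will offer, in order."""
    return [
        c["name"]
        for c in context.get_ciphers()
        if include_tls13 or c["protocol"] != "TLSv1.3"
    ]


default_ctx = ssl.create_default_context()

custom_ctx = ssl.create_default_context()
custom_ctx.set_ciphers("ECDHE+AESGCM")

if __name__ == "__main__":
    print("default:", cipher_names(default_ctx))
    print("custom: ", cipher_names(custom_ctx))
```

Note that `set_ciphers()` only controls the TLS 1.2 and older suites. The TLS 1.3 suites stay in the list, and the extensions and curves are untouched, so this method changes only the cipher field of the ja3_text.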
2.3.2. Method One: Use Other Established Libraries
You can try the curl_cffi library, which focuses on emulating various fingerprints.
Python binding for curl-impersonate via cffi. An HTTP client that can impersonate browser TLS/ja3/HTTP2 fingerprints.
In addition, you can also try pyhttpx or pycurl.
pip install --upgrade curl_cffi
Test code:
from curl_cffi import requests
print("edge99:", requests.get("https://tls.browserleaks.com/json", impersonate="edge99").json().get("ja3_hash"))
print("chrome110:", requests.get("https://tls.browserleaks.com/json", impersonate="chrome110").json().get("ja3_hash"))
print("safari15_3:", requests.get("https://tls.browserleaks.com/json", impersonate="safari15_3").json().get("ja3_hash"))
# Support proxy
proxies = {"https": "http://localhost:7890"}
r = requests.get("https://tls.browserleaks.com/json", impersonate="chrome101", proxies=proxies)
print(r.json().get("ja3_hash"))
The effect is as follows:
[Screenshot: curl_cffi result]
The supported browser spoof list is as follows:
# curl_cffi.requests.session.BrowserType
class BrowserType(str, Enum):
    edge99 = "edge99"
    edge101 = "edge101"
    chrome99 = "chrome99"
    chrome100 = "chrome100"
    chrome101 = "chrome101"
    chrome104 = "chrome104"
    chrome107 = "chrome107"
    chrome110 = "chrome110"
    chrome99_android = "chrome99_android"
    safari15_3 = "safari15_3"
    safari15_5 = "safari15_5"
2.3.3. Method Two: Add a Client Proxy Layer
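Here, Burp sits between the script and the server as a proxy and completes the TLS handshake on the script’s behalf, so the server sees Burp’s TLS fingerprint. This works provided Burp’s own TLS fingerprint is not blocked.
[Screenshot: burp]
Burp’s TLS fingerprint can be modified in the following way:
[Screenshot: Modify burp TLS fingerprint]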
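On the script side, this method only requires sending traffic through the proxy. The sketch below uses urllib and assumes Burp is listening on its default address `127.0.0.1:8080`; `build_proxied_opener` and the `ca_file` parameter (a path to Burp's exported CA certificate in PEM format) are hypothetical names introduced for this example.

```python
import ssl
import urllib.request


def build_proxied_opener(proxy_url="http://127.0.0.1:8080", ca_file=None):
    """Build a urllib opener that routes HTTP and HTTPS through a local proxy.

    Burp re-signs server certificates with its own CA, so verification
    needs that CA certificate (ca_file) to succeed.
    """
    context = ssl.create_default_context(cafile=ca_file)
    return urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url}),
        urllib.request.HTTPSHandler(context=context),
    )


# Usage (requires a running proxy, so it is left commented out):
# opener = build_proxied_opener(ca_file="burp-ca.pem")  # hypothetical path
# print(opener.open("https://tls.browserleaks.com/json").read().decode())
```

The TLS handshake the target server sees is then the one made by Burp, not the one made by Python.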
2.3.4. Method Three: Modify the Underlying Code of Requests
The Requests library delegates its SSL/TLS handling to the urllib3 library, so modifying the underlying code means changing the urllib3 code.
Check the installation location of urllib3
python3 -c "import urllib3; print(urllib3.__file__)"
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/urllib3/__init__.py
Modify relevant SSL code, typically located at site-packages/urllib3/util/ssl_.py
DEFAULT_CIPHERS = ":".join(
    [
        "ECDHE+AESGCM",
        "ECDHE+CHACHA20",
        "DHE+AESGCM",
        "DHE+CHACHA20",
        "ECDH+AESGCM",
        "DH+AESGCM",
        "ECDH+AES",
        "DH+AES",
        "RSA+AESGCM",
        "RSA+AES",
        "!aNULL",
        "!eNULL",
        "!MD5",
        "!DSS",
    ]
)
There is a lot of room to experiment here. As a script kiddie, I mostly just delete entries and rearrange their order, as shown below:
[Screenshot: comparison (vs)]
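Deleting and rearranging entries can also be tried out without editing the installed urllib3 files. The sketch below is one possible way to do it with the standard `ssl` module: it drops and shuffles entries of the cipher list shown above and checks that OpenSSL still accepts the result. `rearranged_cipher_string` and `context_with` are names invented for this example.

```python
import random
import ssl

# Selection entries from the list above; their order drives the offered order.
BASE_CIPHERS = [
    "ECDHE+AESGCM", "ECDHE+CHACHA20", "DHE+AESGCM", "DHE+CHACHA20",
    "ECDH+AESGCM", "DH+AESGCM", "ECDH+AES", "DH+AES", "RSA+AESGCM", "RSA+AES",
]
# Exclusions are kept at the end, as in the original list.
EXCLUSIONS = ["!aNULL", "!eNULL", "!MD5", "!DSS"]


def rearranged_cipher_string(drop=(), seed=None):
    """Delete the entries listed in `drop`, shuffle the rest, and join with ':'."""
    rng = random.Random(seed)
    kept = [c for c in BASE_CIPHERS if c not in drop]
    rng.shuffle(kept)
    return ":".join(kept + EXCLUSIONS)


def context_with(cipher_string):
    """Return an SSLContext using the given cipher string."""
    ctx = ssl.create_default_context()
    ctx.set_ciphers(cipher_string)  # raises ssl.SSLError if no cipher matches
    return ctx


if __name__ == "__main__":
    ciphers = rearranged_cipher_string(drop=("RSA+AES",), seed=1)
    print(ciphers)
    print([c["name"] for c in context_with(ciphers).get_ciphers()])
```

Each different ordering yields a different cipher field in the ja3_text and therefore a different ja3_hash.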
3. Akamai Fingerprint (HTTP/2 Fingerprint)
3.1. What is Akamai Fingerprint
Akamai Fingerprint is a technology provided by Akamai Technologies to prevent malicious bots and automated attacks, based on browser fingerprint recognition technology.
Browser fingerprinting is a technique for identifying web browsers by collecting and analyzing their attributes and behaviors, such as user-agent strings, plugins, fonts, language, screen resolution, and more. It has been widely used in Internet security to detect and identify malicious bots, fraudulent actions, phishing, etc.
Akamai Fingerprint incorporates browser fingerprinting and combines it with other security technologies to identify and block automated attacks. It can identify and verify the browsers accessing the site without affecting the user experience, preventing automated attacks, account abuse, and data leaks.
You can view detailed fingerprints on https://tls.peet.ws/api/all, which mainly include the following:
HTTP2
Fingerprint is: 1:65536,2:0,3:1000,4:6291456,6:262144|15663105|0|m,a,s,p
- `1:65536`: `HEADER_TABLE_SIZE`, the maximum size of the header compression (HPACK) table, here 64KB (65536 bytes).
- `2:0`: `ENABLE_PUSH`, whether server push is allowed. 0 means the client has disabled server push.
- `3:1000`: `MAX_CONCURRENT_STREAMS`, the maximum number of streams the peer may have open at the same time, here 1000.
- `4:6291456`: `INITIAL_WINDOW_SIZE`, the initial flow-control window of each stream, i.e. how many bytes the peer may send on a stream before it has to wait for a `WINDOW_UPDATE`. Here it is 6MB (6291456 bytes).
- `6:262144|15663105|0|m,a,s,p`: the last setting followed by the three remaining parts of the fingerprint, separated by vertical bars '|'. Their specific meanings are as follows:
  - `6:262144`: `MAX_HEADER_LIST_SIZE`, the maximum total header size the sender is prepared to accept, here 256KB (262144 bytes).
  - `15663105`: the increment of the `WINDOW_UPDATE` frame the client sends, i.e. the client enlarges its receive window by 15663105 bytes.
  - `0`: the `PRIORITY` frames part. 0 means the client sent no `PRIORITY` frames.
  - `m,a,s,p`: the order of the pseudo-headers (the headers starting with ':'), each encoded by its first letter and separated by commas: `:method`, `:authority`, `:scheme`, `:path`.
Details can be found in Passive Fingerprinting of HTTP/2 Clients.
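The structure described above can be made concrete with a small parser. This is a sketch under the assumption that the fingerprint has the four '|'-separated parts shown in the example; `parse_akamai_fingerprint` is a name invented here, and the settings names follow the HTTP/2 SETTINGS identifiers.

```python
# HTTP/2 SETTINGS identifiers and their names.
SETTINGS_NAMES = {
    1: "HEADER_TABLE_SIZE",
    2: "ENABLE_PUSH",
    3: "MAX_CONCURRENT_STREAMS",
    4: "INITIAL_WINDOW_SIZE",
    5: "MAX_FRAME_SIZE",
    6: "MAX_HEADER_LIST_SIZE",
}
PSEUDO_HEADERS = {"m": ":method", "a": ":authority", "s": ":scheme", "p": ":path"}


def parse_akamai_fingerprint(text):
    """Split 'SETTINGS|WINDOW_UPDATE|PRIORITY|pseudo-header order' into parts."""
    settings_part, window_update, priority, pseudo_order = text.split("|")

    settings = {}
    # Settings are 'id:value' pairs; accept ',' or ';' as the separator.
    for item in settings_part.replace(";", ",").split(","):
        if not item:
            continue
        key, value = item.split(":")
        settings[SETTINGS_NAMES.get(int(key), key)] = int(value)

    return {
        "settings": settings,
        "window_update": int(window_update),
        "priority_frames": priority,  # "0" means no PRIORITY frames were sent
        "pseudo_header_order": [PSEUDO_HEADERS.get(c, c) for c in pseudo_order.split(",")],
    }


if __name__ == "__main__":
    example = "1:65536,2:0,3:1000,4:6291456,6:262144|15663105|0|m,a,s,p"
    print(parse_akamai_fingerprint(example))
```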
3.2. Testing Akamai Fingerprint
Test site: https://tls.browserleaks.com/json
- curl
- Chrome
- Python
[Screenshots: tls.browserleaks.com result for each client above]
It can be seen that the Python requests call gets an empty result: the crawler is kept out.
3.3. Bypassing Akamai Fingerprint
Bypassing again means forging the specific fields that make up the fingerprint.
3.3.1. Method One: Use Other Established Libraries
Again, use the curl_cffi library, which focuses on emulating various fingerprints.
Python binding for curl-impersonate via cffi. An HTTP client that can impersonate browser TLS/ja3/HTTP2 fingerprints.
pip install --upgrade curl_cffi
Test code:
from curl_cffi import requests
print("edge99:", requests.get("https://tls.browserleaks.com/json", impersonate="edge99").json().get("akamai_hash"))
print("chrome110:", requests.get("https://tls.browserleaks.com/json", impersonate="chrome110").json().get("akamai_hash"))
print("safari15_3:", requests.get("https://tls.browserleaks.com/json", impersonate="safari15_3").json().get("akamai_hash"))
The effect is as follows:
[Screenshot: akamai result]
4. Final Effect
https://ascii2d.net has Cloudflare’s fingerprint shield, denying crawlers. Let’s test it.
A direct curl request is blocked:
[Screenshot: banned]
Bypass:
from curl_cffi import requests
req = requests.get("https://ascii2d.net", impersonate="chrome110")
print(req.text)
The page can now be accessed normally:
[Screenshot: normal]