Browser Fingerprinting - What It Is and How to Avoid Detection
Understand how websites use browser fingerprinting to detect scrapers and learn techniques to avoid fingerprint-based detection.
Browser fingerprinting goes far beyond User-Agent strings. Websites collect dozens of signals to create a unique identifier for your browser, making it very difficult to hide automated tools.
What Gets Fingerprinted?
Anti-bot systems collect and hash these signals:
- Canvas fingerprint: rendering a hidden canvas element produces pixel data that varies with the GPU, driver, and installed fonts
- WebGL fingerprint: GPU vendor, renderer, and capabilities
- TLS/JA3 fingerprint: the TLS version, cipher suites, and extensions your client offers during the handshake
- Audio context: differences in audio processing across hardware
- Screen and window: resolution, color depth, device pixel ratio
- Fonts: which system fonts are installed
- Navigator properties: plugins, languages, platform, hardware concurrency
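To make "collect and hash" concrete, here is a minimal sketch of how a set of collected signals could be collapsed into a single identifier. The function name `fingerprint_id`, the choice of SHA-256, and every signal value are assumptions made for illustration; real anti-bot vendors use their own signal sets and hashing schemes.

```python
import hashlib
import json


def fingerprint_id(signals: dict) -> str:
    """Collapse collected signals into one stable identifier (illustrative only)."""
    # Sort keys so the same signals always serialize to the same string
    canonical = json.dumps(signals, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical signal values, not taken from any real browser
signals = {
    "canvas_hash": "a3f1c9",
    "webgl_renderer": "ANGLE (NVIDIA GeForce GTX 1060)",
    "screen": [1920, 1080, 24],
    "fonts": ["Arial", "Calibri", "Segoe UI"],
    "hardware_concurrency": 8,
}

baseline = fingerprint_id(signals)
# Changing a single signal produces a different identifier
altered = fingerprint_id({**signals, "hardware_concurrency": 4})
print(baseline != altered)
```

Because the identifier depends on every signal at once, spoofing one value (for example the User-Agent) while leaving the rest untouched does not make a scraper blend in; it simply produces another distinctive, and often internally inconsistent, fingerprint.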
How Headless Browsers Get Caught
Headless Chrome driven by an automation tool leaks several detectable signals:
// Properties that commonly reveal automation or headless mode
navigator.webdriver // true when the browser is controlled by automation
navigator.plugins.length // often 0 in headless
navigator.languages // may be empty
window.chrome // may be missing in headless
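As a rough sketch of how a detector might combine these checks, the function below takes a dictionary of probed values and lists the ones that look automated. The name `headless_indicators`, the dictionary keys, and both sample probes are hypothetical; production detectors weigh many more signals and rarely rely on simple boolean rules.

```python
def headless_indicators(probe: dict) -> list[str]:
    """Return the names of probed properties that look automated (illustrative rules)."""
    flags = []
    if probe.get("webdriver") is True:
        flags.append("navigator.webdriver")
    if probe.get("plugins_length", 0) == 0:
        flags.append("navigator.plugins")
    if not probe.get("languages"):
        flags.append("navigator.languages")
    if not probe.get("has_chrome_object", False):
        flags.append("window.chrome")
    return flags


# Hypothetical probe results, for illustration only
headless_probe = {
    "webdriver": True,
    "plugins_length": 0,
    "languages": [],
    "has_chrome_object": False,
}
desktop_probe = {
    "webdriver": False,
    "plugins_length": 5,
    "languages": ["en-US", "en"],
    "has_chrome_object": True,
}

print(headless_indicators(headless_probe))
print(headless_indicators(desktop_probe))
```

Running the same rules against values read from your own automated browser shows which of the four properties still need patching.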
Patching Fingerprints with playwright-stealth
The playwright-stealth package patches common detection points:
from playwright.sync_api import sync_playwright
# stealth_sync is the playwright-stealth 1.x entry point; later releases may expose a different API
from playwright_stealth import stealth_sync

def scrape_with_stealth(url):
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        # Keep viewport, screen, user agent, locale, and timezone consistent with each other
        context = browser.new_context(
            viewport={"width": 1920, "height": 1080},
            screen={"width": 1920, "height": 1080},
            user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
            locale="en-US",
            timezone_id="America/New_York",
        )
        page = context.new_page()
        # Apply stealth patches before navigating so they are in place when the page loads
        stealth_sync(page)
        page.goto(url, wait_until="networkidle")
        content = page.content()
        browser.close()
        return content

html = scrape_with_stealth("https://bot.sannysoft.com")
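Install with pip install playwright-stealth. Playwright itself also needs pip install playwright followed by playwright install chromium to download the browser.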
Fixing TLS Fingerprints
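Python's requests library has a distinctive TLS fingerprint that does not match any browser, so a spoofed User-Agent header alone will not hide it. Use curl_cffi (pip install curl_cffi) to impersonate a real browser's handshake: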
from curl_cffi import requests
# Impersonate Chrome's TLS fingerprint
session = requests.Session(impersonate="chrome124")
response = session.get("https://tls.browserleaks.com/json")
print(response.json()["ja3_hash"])
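For context on what a JA3 hash represents: JA3 joins five ClientHello fields (TLS version, cipher suites, extensions, elliptic curves, and point formats) into a comma-separated string of decimal values and takes its MD5 digest. The sketch below follows that definition; the helper name `ja3_hash` and the field values are made up for illustration and do not correspond to any real browser.

```python
import hashlib


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 string and its MD5 digest from ClientHello fields."""
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    ja3_string = ",".join(fields)
    return ja3_string, hashlib.md5(ja3_string.encode("ascii")).hexdigest()


# Hypothetical ClientHello values, for illustration only
ja3_string, digest = ja3_hash(771, [4865, 4866, 4867], [0, 10, 11, 13], [29, 23, 24], [0])
# Offering the same ciphers in a different order changes the hash
_, reordered_digest = ja3_hash(771, [4867, 4866, 4865], [0, 10, 11, 13], [29, 23, 24], [0])

print(ja3_string)
print(digest != reordered_digest)
```

Since the hash covers the order as well as the content of each list, an HTTP client cannot match a browser's value by changing headers; it has to reproduce the handshake itself, which is what the impersonate option is for.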
Testing Your Fingerprint
Use these sites to check what your scraper leaks:
- bot.sannysoft.com: comprehensive headless detection tests
- browserleaks.com: canvas, WebGL, font fingerprints
- tls.browserleaks.com: TLS/JA3 fingerprint analysis
The Simplest Solution
Rather than fighting fingerprint detection yourself, you can use a managed service such as ScraperAPI or ScrapingAnt, which maintain real browser pools with authentic fingerprints.
Key Takeaways
- Fingerprinting catches what IP rotation and UA rotation miss
- Always use playwright-stealth or equivalent patches with headless browsers
- Fix your TLS fingerprint with curl_cffi for HTTP-level scraping
- Test your fingerprint before scraping production targets