
Browser Fingerprinting - What It Is and How to Avoid Detection

Understand how websites use browser fingerprinting to detect scrapers and learn techniques to avoid fingerprint-based detection.

Anti-Detection · #8 · Advanced · 2 min read

Browser fingerprinting goes far beyond User-Agent strings. Websites collect dozens of signals and combine them into a near-unique identifier for your browser, which makes automated tools hard to disguise.

What Gets Fingerprinted?

Anti-bot systems collect and hash these signals:

  • Canvas fingerprint: rendering a hidden canvas element produces pixel data that varies by GPU and driver
  • WebGL fingerprint: GPU vendor, renderer, and capabilities
  • TLS/JA3 fingerprint: the TLS version, cipher suites, extensions, and curves your client offers in its handshake
  • Audio context: differences in audio processing across hardware
  • Screen and window: resolution, color depth, device pixel ratio
  • Fonts: which system fonts are installed
  • Navigator properties: plugins, languages, platform, hardware concurrency
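
To make the "collect and hash" step concrete, here is a minimal Python sketch of how a set of signals could be collapsed into one identifier. The field names and values are illustrative assumptions, and real anti-bot vendors use their own (usually undisclosed) schemes; the point is only that the same signals always produce the same ID, and any changed signal produces a different one.

```python
import hashlib
import json


def fingerprint_id(signals: dict) -> str:
    """Collapse a dict of collected signals into one stable identifier."""
    # Sort keys and fix separators so the same signals always serialize
    # to the same bytes, regardless of the order they were collected in.
    canonical = json.dumps(signals, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative values only; a real collector gathers these in the browser.
visitor = {
    "canvas_hash": "a3f1c9",
    "webgl_renderer": "ANGLE (Intel, Intel(R) UHD Graphics 620)",
    "screen": [1920, 1080, 24],
    "fonts": ["Arial", "Calibri", "Segoe UI"],
    "languages": ["en-US", "en"],
    "hardware_concurrency": 8,
}

print(fingerprint_id(visitor))
```

Because the ID depends on every field, rotating only the User-Agent or the IP address leaves the rest of the fingerprint, and therefore most of the evidence, unchanged.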

How Headless Browsers Get Caught

Headless Chrome leaks several detectable signals:

// Properties that commonly give away automated or headless Chrome
navigator.webdriver           // true when the browser is driven by automation
navigator.plugins.length      // 0 in older headless builds
navigator.languages           // may be empty
window.chrome                 // missing in older headless builds
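
On the detection side, these checks are usually combined into a score or a list of flags. The Python sketch below shows one way that could look, assuming the four values above have already been collected in the page and posted to the server as a dict. The key names (webdriver, plugins_length, languages, has_window_chrome) are hypothetical, and real detectors weigh many more signals.

```python
def headless_indicators(signals: dict) -> list[str]:
    """Return a description of every check that looks like automation."""
    flags = []
    if signals.get("webdriver") is True:
        flags.append("navigator.webdriver is true")
    # A missing value is treated the same as a suspicious one.
    if signals.get("plugins_length", 0) == 0:
        flags.append("no plugins reported")
    if not signals.get("languages"):
        flags.append("navigator.languages is empty")
    if not signals.get("has_window_chrome", False):
        flags.append("window.chrome is missing")
    return flags


# Illustrative payloads, not captured from a real browser.
desktop = {
    "webdriver": False,
    "plugins_length": 5,
    "languages": ["en-US", "en"],
    "has_window_chrome": True,
}
unpatched = {
    "webdriver": True,
    "plugins_length": 0,
    "languages": [],
    "has_window_chrome": False,
}

print(headless_indicators(desktop))
print(headless_indicators(unpatched))
```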

Patching Fingerprints with playwright-stealth

The playwright-stealth package patches common detection points:

from playwright.sync_api import sync_playwright
from playwright_stealth import stealth_sync

def scrape_with_stealth(url):
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context(
            viewport={"width": 1920, "height": 1080},
            screen={"width": 1920, "height": 1080},
            user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
            locale="en-US",
            timezone_id="America/New_York",
        )
        page = context.new_page()

        # Apply stealth patches before any navigation
        stealth_sync(page)

        try:
            page.goto(url, wait_until="networkidle")
            return page.content()
        finally:
            # Close the browser even if navigation fails or times out
            browser.close()

html = scrape_with_stealth("https://bot.sannysoft.com")

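Install with pip install playwright-stealth, then run playwright install chromium to download the browser binary. The stealth_sync import above follows the package's 1.x API; newer releases may expose a different entry point, so check the README for the version you install.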
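
To see what a stealth patch amounts to under the hood, here is a hand-rolled sketch that overrides a few of the leaking properties with Playwright's add_init_script. This is not how playwright-stealth itself is implemented; the script contents and the apply_manual_patches helper are assumptions made for illustration, and a real stealth library covers far more surface.

```python
# Hypothetical hand-rolled patches; real stealth libraries cover far more.
STEALTH_INIT_JS = """
Object.defineProperty(Navigator.prototype, 'webdriver', { get: () => false });
Object.defineProperty(Navigator.prototype, 'languages', { get: () => ['en-US', 'en'] });
if (!window.chrome) { window.chrome = { runtime: {} }; }
"""


def apply_manual_patches(context) -> None:
    """Register the patch script on a Playwright BrowserContext."""
    # add_init_script runs the script in every page and frame of the
    # context, before any of the page's own scripts execute.
    context.add_init_script(script=STEALTH_INIT_JS)
```

Call apply_manual_patches(context) right after browser.new_context(...) and before context.new_page(). Overrides like these can themselves be detected, so treat them as a starting point and verify the result against a test page.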

Fixing TLS Fingerprints

Python's requests library negotiates TLS through urllib3 and OpenSSL, so its handshake (and the JA3 hash derived from it) does not match any real browser. Use curl_cffi to impersonate a real browser:

from curl_cffi import requests

# Impersonate Chrome's TLS fingerprint
session = requests.Session(impersonate="chrome124")
response = session.get("https://tls.browserleaks.com/json", timeout=30)
response.raise_for_status()
print(response.json()["ja3_hash"])
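
For context on what that hash represents: a JA3 fingerprint is the MD5 of five comma-separated ClientHello fields (TLS version, cipher suites, extensions, elliptic curves, and point formats), each written as dash-joined decimal values. The sketch below builds one from made-up handshake values; the numbers are illustrative and do not correspond to any real browser, and it assumes GREASE values have already been filtered out, as the JA3 method requires.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Build the JA3 string: five comma-separated fields, values dash-joined."""
    fields = [
        str(version),
        "-".join(str(v) for v in ciphers),
        "-".join(str(v) for v in extensions),
        "-".join(str(v) for v in curves),
        "-".join(str(v) for v in point_formats),
    ]
    return ",".join(fields)


def ja3_hash(ja3: str) -> str:
    """Return the MD5 hex digest that is reported as the JA3 hash."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


# Made-up ClientHello values for illustration (771 is TLS 1.2, 0x0303).
example = ja3_string(
    version=771,
    ciphers=[4865, 4866, 4867, 49195, 49199],
    extensions=[0, 23, 65281, 10, 11, 35, 16],
    curves=[29, 23, 24],
    point_formats=[0],
)

print(example)
print(ja3_hash(example))
```

Since the order of ciphers and extensions is part of the input, two clients that support the same features but list them differently end up with different hashes, which is why an HTTP library stands out even when its headers look like a browser's.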

Testing Your Fingerprint

Use these sites to check what your scraper leaks:

  • bot.sannysoft.com: comprehensive headless detection tests
  • browserleaks.com: canvas, WebGL, font fingerprints
  • tls.browserleaks.com: TLS/JA3 fingerprint analysis
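
One practical way to use these pages is to record the reported values once from a regular desktop browser and once from your scraper, then compare the two. The helper below is a small sketch of that comparison, assuming both runs have been saved as flat dicts; apart from ja3_hash, the key names and all values are placeholders.

```python
def diff_fingerprints(baseline: dict, candidate: dict) -> dict:
    """Map each differing key to a (baseline_value, candidate_value) pair."""
    diffs = {}
    for key in sorted(set(baseline) | set(candidate)):
        expected = baseline.get(key)  # None when the key is absent
        actual = candidate.get(key)
        if expected != actual:
            diffs[key] = (expected, actual)
    return diffs


# Placeholder values; substitute what the test pages report for you.
real_browser = {"ja3_hash": "aaa111", "webdriver": False, "plugins_length": 5}
scraper = {"ja3_hash": "bbb222", "webdriver": False}

for key, (expected, actual) in diff_fingerprints(real_browser, scraper).items():
    print(f"{key}: browser={expected!r} scraper={actual!r}")
```

Any key that shows up in the diff is a signal a detector could use to tell the scraper apart from the browser it claims to be.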

The Simplest Solution

Rather than fighting fingerprint detection yourself, you can use a managed service such as ScraperAPI or ScrapingAnt, which maintain real browser pools with authentic fingerprints.

Key Takeaways

  • Fingerprinting catches what IP rotation and UA rotation miss
  • Use playwright-stealth or equivalent patches with headless browsers, and verify the result on a test page
  • Fix your TLS fingerprint with curl_cffi for HTTP-level scraping
  • Test your fingerprint before scraping production targets