4.32 · advanced · 6 min read

Browser Fingerprinting: A Complete Map

Every dimension along which anti-bot vendors fingerprint your scraper. A reference map for what you're up against, and what's hardest to spoof.

What you’ll learn

  • Enumerate the main fingerprint surfaces: TLS, HTTP, headers, JS APIs, canvas, audio, fonts.
  • Rank them by how easy each is for scrapers to fix.
  • Pick the right level of fingerprint hardening for a target.

Anti-bot vendors build a fingerprint from dozens of signals. Most are stable per browser/OS combo, hard to spoof coherently, and cheap to check. This lesson maps the territory.

The fingerprint hierarchy

From easiest-to-fix (top) to hardest-to-fix (bottom):

Layer                          | Examples                                      | Difficulty to spoof
Headers                        | User-Agent, Accept-Language, Sec-CH-UA        | Trivial
HTTP/2 settings                | SETTINGS frame, HPACK, frame order            | Moderate
TLS / JA3 / JA4                | Cipher suite order, extensions, ALPN          | Moderate-hard
JavaScript navigator           | userAgent, languages, plugins, platform       | Moderate (browser only)
WebGL / Canvas                 | GPU vendor, canvas pixel output               | Hard
Audio context                  | AudioContext sample-level output              | Hard
Fonts                          | Available system fonts                        | Hard
Hardware concurrency / memory  | navigator.hardwareConcurrency, deviceMemory   | Hard
Battery / sensors              | battery API, accelerometer                    | Hard
Timing                         | Render frames, micro-benchmarks               | Very hard

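Each layer is observable by the server, either directly on the wire (headers, HTTP/2, TLS) or through JavaScript that reports back. Sophisticated anti-bot systems combine many layers into a "naturalness score" based on whether the signals form a coherent combination.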

Layer 1: Headers

The cheapest and most-checked layer.

Visible signals:

  • User-Agent. The fingerprint floor.
  • Accept, Accept-Language, Accept-Encoding. Should match the UA's browser defaults.
  • Sec-CH-UA, Sec-CH-UA-Mobile, Sec-CH-UA-Platform. Client hints: Chromium-based browsers send them; Safari and Firefox do not.
  • Sec-Fetch-Site, Sec-Fetch-Mode, Sec-Fetch-Dest, Sec-Fetch-User. Browser navigation context.
  • Order of headers. Chrome sends them in a specific order; raw HTTP libraries typically use a different one.

Coherence matters most. Claiming UA "Chrome on Mac" but omitting Sec-CH-UA (which Chrome sends) is a bot signal.
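The mismatch described above can be sketched as a small server-side check. This is a minimal illustration, not any vendor's actual rule set: the function name `headerMismatches` is hypothetical, header names are assumed to arrive lowercased (as Node's http module delivers them), and the request is assumed to be HTTPS, since Chrome only sends client hints on secure connections.

```javascript
// Hypothetical check: does the claimed browser match the client-hint
// headers that came with it? Returns a list of human-readable problems.
function headerMismatches(headers) {
  const ua = headers["user-agent"] || "";
  const hasClientHints = "sec-ch-ua" in headers;

  // Chromium-based UAs contain "Chrome/"; Safari's UA has "Safari/" but no "Chrome/".
  const claimsChromium = /Chrome\//.test(ua);
  const claimsSafari = /Safari\//.test(ua) && !claimsChromium;

  const problems = [];
  if (claimsChromium && !hasClientHints) {
    problems.push("Chromium UA without Sec-CH-UA");
  }
  if (claimsSafari && hasClientHints) {
    problems.push("Safari UA with Sec-CH-UA");
  }
  if (!("accept-language" in headers)) {
    problems.push("missing Accept-Language");
  }
  return problems;
}
```

An empty result means these particular checks passed, not that the request looks human; real systems layer many more rules on top.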

We cover header rotation thoroughly in §4.33.

Layer 2: HTTP/2 fingerprinting

HTTP/2 connections start with a SETTINGS frame. The values, their order, and the subsequent HEADERS frame's compression (HPACK) all vary per client:

  • Chrome's SETTINGS values are well-known.
  • curl's are different.
  • Python's httpx (with HTTP/2 enabled) differs from real browsers; requests does not speak HTTP/2 at all, which is a signal in itself.

Anti-bot vendors fingerprint this; Akamai is well known for HTTP/2 signature scoring. Tools like curl-impersonate reproduce these signatures for select browsers.
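To make the idea concrete, here is a sketch of how the observable HTTP/2 properties can be flattened into one comparable string. The layout (SETTINGS, then WINDOW_UPDATE, then PRIORITY, then pseudo-header order) follows one common public notation; the function name `http2Fingerprint` and the input shape are assumptions for illustration, and the values in the usage note are placeholders rather than measurements from a real browser.

```javascript
// Builds a fingerprint string of the form
//   SETTINGS|WINDOW_UPDATE|PRIORITY|pseudo-header-order
// from values a server observed at the start of an HTTP/2 connection.
function http2Fingerprint({ settings, windowUpdate, priorityFrames, pseudoHeaders }) {
  // Order matters: settings stay in the order the client sent them.
  const s = settings.map(([id, value]) => `${id}:${value}`).join(";");
  // "00" stands for "no WINDOW_UPDATE frame seen".
  const wu = windowUpdate ?? "00";
  // "0" stands for "no PRIORITY frames seen".
  const p = priorityFrames.length ? priorityFrames.join(",") : "0";
  // ":method" -> "m", ":authority" -> "a", ":scheme" -> "s", ":path" -> "p"
  const ps = pseudoHeaders.map((h) => h[1]).join(",");
  return [s, wu, p, ps].join("|");
}
```

Two clients that send the same settings but order their pseudo-headers differently (say `:method, :authority, :scheme, :path` versus `:method, :path, :authority, :scheme`) produce different strings, which is exactly what makes the layer useful for detection.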

Layer 3: TLS, JA3 and JA4

The TLS ClientHello message contains:

  • TLS version.
  • Cipher suites (in client preference order).
  • Extensions (e.g. SNI, ALPN).
  • Supported groups (elliptic curves) and point formats.

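JA3 (and the newer JA4) reduces these fields to a short fingerprint string. Each browser build has a recognizable fingerprint:

  • Chrome 120 on Windows: one known fingerprint. Recent Chrome versions shuffle extension order per connection, so the JA3 hash varies; JA4 sorts extensions first and stays stable.
  • Safari 17 on Mac: a different one.
  • Python httpx with its native TLS stack: yet another, easily identified as "not a browser."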
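For reference, JA3 works by joining five ClientHello fields as decimal numbers, dropping GREASE placeholder values, and taking the MD5 of the resulting string. The sketch below follows that recipe for Node.js; the `hello` object shape is an assumption made for this example (a real implementation would parse these fields from the raw handshake bytes).

```javascript
const { createHash } = require("crypto");

// GREASE values (0x0a0a, 0x1a1a, ... 0xfafa) are random placeholders
// that browsers insert; JA3 ignores them.
const isGrease = (v) => (v & 0x0f0f) === 0x0a0a && (v >> 8) === (v & 0xff);

// hello: { version, ciphers, extensions, curves, pointFormats }, all numeric.
function ja3(hello) {
  const field = (values) => values.filter((v) => !isGrease(v)).join("-");
  const ja3String = [
    hello.version,
    field(hello.ciphers),      // order preserved: it is part of the fingerprint
    field(hello.extensions),
    field(hello.curves),
    field(hello.pointFormats),
  ].join(",");
  const hash = createHash("md5").update(ja3String).digest("hex");
  return { ja3String, hash };
}
```

Because field order is preserved, two clients offering the same cipher suites in a different preference order hash to different values.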

To defeat it, use a TLS library that lets you customize the ClientHello (e.g. curl-cffi, tls-client), or run a real browser via Playwright.

We cover TLS fingerprinting in detail in §4.34.

Layer 4: JavaScript navigator

Once the page loads JS, the browser exposes:

navigator.userAgent
navigator.languages
navigator.platform
navigator.hardwareConcurrency
navigator.deviceMemory
navigator.maxTouchPoints
navigator.plugins
navigator.mimeTypes

These should all match the claimed UA. A "Chrome on Mac" with navigator.platform = "Linux x86_64" is caught instantly.

Headless Chrome leaks signals (navigator.webdriver = true, a missing window.chrome runtime object). Stealth plugins (puppeteer-extra-plugin-stealth, playwright-stealth) patch these.
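Seen from the detector's side, the checks above might look like the following sketch. It takes a plain snapshot of navigator values, as a fingerprinting script could report them. The function name `navigatorLeaks` and the `hasChromeObject` field are hypothetical, and the three rules are only a sample of what real scripts test.

```javascript
// nav: { userAgent, platform, webdriver, hasChromeObject }
function navigatorLeaks(nav) {
  const leaks = [];
  if (nav.webdriver === true) leaks.push("navigator.webdriver is true");

  // Expected navigator.platform prefix for the OS named in the UA string
  // ("MacIntel", "Win32", "Linux x86_64", ...).
  const expected = [
    [/Macintosh/, "Mac"],
    [/Windows NT/, "Win"],
    [/Linux|X11/, "Linux"],
  ].find(([pattern]) => pattern.test(nav.userAgent));
  if (expected && !(nav.platform || "").startsWith(expected[1])) {
    leaks.push(`platform "${nav.platform}" does not match UA`);
  }

  if (/Chrome\//.test(nav.userAgent) && !nav.hasChromeObject) {
    leaks.push("Chrome UA but no window.chrome object");
  }
  return leaks;
}
```

Running your own scraper's navigator values through a check like this before pointing it at a target is a cheap way to catch the obvious mismatches.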

Layer 5: Canvas fingerprinting

A canvas element draws text or shapes. The pixel-level output varies by:

  • GPU vendor and driver.
  • Operating system.
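  • Browser anti-aliasing settings.

// Draw a short string and serialize the resulting pixels.
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d');
ctx.fillText('Hello', 10, 10);           // glyph rasterization differs across GPU/OS/browser
const fingerprint = canvas.toDataURL();  // base64 PNG; differs per device class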

Real machines produce stable, varied canvas hashes. Headless environments often produce identical or unusual hashes, which is itself a flagging signal.

Mitigation: a real browser on a real machine. Spoofing canvas in headless Chrome with random noise per request actually backfires: real users have stable fingerprints, so canvases that change on every request stand out.
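Why per-request noise backfires becomes obvious once you picture the server's bookkeeping. The sketch below is a hypothetical tracker (the class name and the `maxDistinct` threshold are assumptions) that remembers which canvas hashes each session has reported and flags sessions whose hash keeps changing.

```javascript
class CanvasStabilityTracker {
  constructor(maxDistinct = 1) {
    this.maxDistinct = maxDistinct; // distinct hashes tolerated per session
    this.seen = new Map();          // sessionId -> Set of canvas hashes
  }

  // Records one observation; returns true if the session now looks suspicious.
  observe(sessionId, canvasHash) {
    if (!this.seen.has(sessionId)) this.seen.set(sessionId, new Set());
    const hashes = this.seen.get(sessionId);
    hashes.add(canvasHash);
    return hashes.size > this.maxDistinct;
  }
}
```

A real browser reports the same hash on every page view and never trips the check; a scraper that adds fresh noise on each request trips it on its second request.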

The Catalog108 challenge at /challenges/antibot/canvas-fingerprint exercises this exact layer.

Layer 6: WebGL

Similar to canvas, but through the WebGL API. It reports:

  • GPU vendor and renderer (via the WEBGL_debug_renderer_info extension).
  • Driver version.
  • Supported extensions.
  • Render results of test scenes.

Again, headless or VM environments produce telltale signatures. Real browsers on real hardware look natural.

Layer 7: Audio fingerprinting

An OscillatorNode rendered through the Web Audio API produces sample data whose exact floating-point values differ by browser and OS, which yields a distinctive fingerprint. It is used less than canvas but is still part of the broader fingerprint set.
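A commonly described way to collect this signal is to render an oscillator offline and reduce a slice of the output to a single number. The sketch below splits that into a browser-only rendering step and a pure summarizing step. The oscillator settings, the compressor node, and the slice bounds (4500 to 5000) are assumptions chosen for illustration, not values taken from any specific vendor script.

```javascript
// Browser-only: renders one second of audio without playing it.
async function renderSamples() {
  const ctx = new OfflineAudioContext(1, 44100, 44100); // mono, 44100 frames, 44.1 kHz
  const oscillator = ctx.createOscillator();
  oscillator.type = "triangle";
  oscillator.frequency.value = 10000;
  const compressor = ctx.createDynamicsCompressor();    // amplifies implementation differences
  oscillator.connect(compressor);
  compressor.connect(ctx.destination);
  oscillator.start(0);
  const buffer = await ctx.startRendering();
  return buffer.getChannelData(0);                      // Float32Array of samples
}

// Pure: collapses a slice of samples into one number used as the fingerprint.
function summarizeSamples(samples, from = 4500, to = 5000) {
  let sum = 0;
  for (let i = from; i < to && i < samples.length; i++) {
    sum += Math.abs(samples[i]);
  }
  return sum;
}
```

In a page, the two steps chain as `summarizeSamples(await renderSamples())`; the resulting number is stable on one machine and differs slightly across browser and OS combinations.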

Layer 8: Fonts

document.fonts.check() (or fallback measurement of rendered text width) reveals which fonts are installed. Headless Chrome on Linux has a distinctive font set; a real Mac user has another. The font list alone can tell them apart.
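The width-measurement fallback works like this: measure a test string in a generic font, then again with a candidate font placed ahead of that generic font. If the width changes, the candidate must be installed. The sketch below is a simplified version with hypothetical function names; the measuring function is injectable so the logic can run outside a browser.

```javascript
// Browser-only measurer: width of a test string in the given CSS font family.
function canvasMeasure(fontFamily) {
  const ctx = document.createElement("canvas").getContext("2d");
  ctx.font = `72px ${fontFamily}`;
  return ctx.measureText("mmmmmmmmmmlli").width;
}

// Returns the candidates whose rendered width differs from the fallback font.
function detectFonts(candidates, measure = canvasMeasure) {
  const baseline = measure("monospace");
  return candidates.filter(
    (name) => measure(`"${name}", monospace`) !== baseline
  );
}
```

A single baseline misses any font that happens to match the fallback's width, so real scripts usually repeat the comparison against several generic families.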

Layer 9: Hardware concurrency, memory, screen

navigator.hardwareConcurrency  // CPU cores
navigator.deviceMemory  // GB of RAM (rounded)
screen.width, screen.height
window.devicePixelRatio

Implausible combinations (96 cores, 0.5 GB RAM, 800x600 screen) flag immediately. Spoofing is possible in Playwright but coherence requires care.
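A plausibility check of this kind can be as simple as a handful of cross-field rules. The sketch below is illustrative only: the function name `implausibleHardware` and every threshold in it are assumptions, not rules any vendor is known to use.

```javascript
// profile: { hardwareConcurrency, deviceMemory, screenWidth, screenHeight }
function implausibleHardware(profile) {
  const { hardwareConcurrency, deviceMemory, screenWidth, screenHeight } = profile;
  const flags = [];

  if (!Number.isInteger(hardwareConcurrency) || hardwareConcurrency < 1) {
    flags.push("invalid core count");
  }
  // Server-class core counts paired with phone-class memory.
  if (hardwareConcurrency >= 32 && deviceMemory <= 1) {
    flags.push("many cores, very little RAM");
  }
  // Powerful CPU paired with a tiny display, typical of default VM screens.
  if (hardwareConcurrency >= 16 && screenWidth * screenHeight <= 800 * 600) {
    flags.push("many cores, tiny screen");
  }
  return flags;
}
```

The example from the text (96 cores, 0.5 GB RAM, 800x600) trips two of these rules, while a typical laptop profile (8 cores, 8 GB, 1920x1080) trips none.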

Layer 10: Behavioral

The most expensive to fake:

  • Mouse movement patterns (real users curve; bots move linearly).
  • Scroll velocity profiles.
  • Time-on-page distribution.
  • Click timing.

Some anti-bot systems collect this telemetry and score the session. Defeating them requires actually simulating human-like behavior, which is slow, expensive, and fragile.
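One simple behavioral feature is how straight a mouse path is. The sketch below computes the ratio of straight-line distance to distance actually travelled; it is a single illustrative metric under the assumption that points arrive as `{ x, y }` objects, whereas real scoring combines many such features with timing data.

```javascript
// Returns a value in (0, 1]: 1.0 means the pointer moved in a perfectly
// straight line; lower values mean a more curved or wandering path.
function pathStraightness(points) {
  let travelled = 0;
  for (let i = 1; i < points.length; i++) {
    travelled += Math.hypot(
      points[i].x - points[i - 1].x,
      points[i].y - points[i - 1].y
    );
  }
  if (travelled === 0) return 1; // no movement recorded
  const first = points[0];
  const last = points[points.length - 1];
  return Math.hypot(last.x - first.x, last.y - first.y) / travelled;
}
```

A scripted move from A to B with evenly interpolated points scores 1.0 every time, while human paths score lower and vary from move to move.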

Practical hardening levels

How hard you should fight depends on the target:

Target tier            | Hardening
Internal API           | None, just use a UA
Static catalogue       | Headers + reasonable rate
Mid-tier e-commerce    | + TLS fingerprint (curl-cffi or browser), + residential IPs
Cloudflare-protected   | + browser via Playwright + stealth + good IPs
Major social platform  | + mobile proxies + behavioral simulation + sometimes commercial unblockers

Match the level to the actual problem. Over-hardening burns engineering time on protections you didn't need.

The "coherence" principle

The single most important fingerprint heuristic is coherence. Any of these mismatches can flag you:

  • UA = "Safari Mac", but Sec-CH-UA = present (Chromium only).
  • UA = "Chrome Windows", but TLS JA3 = "curl".
  • IP = residential US, but Accept-Language = "ru-RU".
  • Canvas fingerprint = randomized every request (real browsers are stable).

Spoofing one layer well beats randomizing many layers badly. A coherent fake Chrome on a residential IP works; a random jumble of signals does not.

Hands-on lab

Visit /challenges/antibot/canvas-fingerprint on Catalog108, our first-party scraping sandbox. The page renders a canvas, computes its hash, and sends it to the server.

  1. Hit it with plain curl and note the response.
  2. Hit it with Playwright (default headless) and note the response.
  3. Hit it with Playwright using playwright-stealth and note the response.

Each step shows which layers that tool covers. Working out which signal flagged you is the iteration loop of fingerprint hardening.
