Browser Fingerprinting, A Complete Map

Every dimension along which anti-bot vendors fingerprint your scraper. A reference map for what you're up against, and what's hardest to spoof.

Anti-bot vendors build a fingerprint from dozens of signals. Most are stable per browser/OS combo, hard to spoof coherently, and cheap to check. This lesson maps the territory.

The fingerprint hierarchy

Roughly from easiest to spoof (top) to hardest (bottom):

Layer	Examples	Difficulty to spoof
Headers	User-Agent, Accept-Language, Sec-CH-UA	Trivial
HTTP/2 settings	SETTINGS frame, HPACK, frame order	Moderate
TLS / JA3 / JA4	Cipher suite order, extensions, ALPN	Moderate-hard
JavaScript navigator	userAgent, languages, plugins, platform	Moderate (browser only)
WebGL / Canvas	GPU vendor, canvas pixel output	Hard
Audio context	AudioContext sample-level output	Hard
Fonts	Available system fonts	Hard
Timing	Render frames, micro-benchmarks	Very hard
Hardware concurrency / memory	navigator.hardwareConcurrency, deviceMemory	Hard
Battery / sensors	battery API, accelerometer	Hard

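Each layer is observable by the server, either directly on the wire (headers, HTTP/2, TLS) or through JavaScript that reports back. Sophisticated anti-bot systems combine many layers into a "naturalness score" based on whether the signals are coherent with one another.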

Layer 1: Headers

The cheapest and most-checked layer.

Visible signals:

User-Agent. The fingerprint floor.
Accept, Accept-Language, Accept-Encoding. Should match the UA's browser defaults.
Sec-CH-UA, Sec-CH-UA-Mobile, Sec-CH-UA-Platform. Client hints: Chromium-based browsers send them; Safari and Firefox do not.
Sec-Fetch-Site, Sec-Fetch-Mode, Sec-Fetch-Dest, Sec-Fetch-User. Browser navigation context.
Order of headers. Chrome sends them in a specific order; most raw HTTP libraries use a different one.

Coherence matters most. Claiming UA "Chrome on Mac" but omitting Sec-CH-UA (which Chrome sends) is a bot signal.
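To make the coherence idea concrete, here is a minimal sketch of the kind of check a server could run. The function name headerCoherenceIssues and the specific rules are illustrative assumptions, not any vendor's actual logic. It also assumes an HTTPS request, since Chrome only sends client hints on secure origins.

```javascript
// Hypothetical server-side check: do the client-hint headers agree with the UA?
// Header names are expected in lowercase, as Node's http module delivers them.
function headerCoherenceIssues(headers) {
  const ua = headers["user-agent"] || "";
  const hasClientHints = "sec-ch-ua" in headers;
  // Chromium-based UAs contain "Chrome/<version>"; Safari's UA does not.
  const claimsChromium = /Chrome\/\d+/.test(ua);
  const claimsSafari = /Safari\/\d+/.test(ua) && !claimsChromium;

  const issues = [];
  if (claimsChromium && !hasClientHints) {
    issues.push("UA claims Chrome but Sec-CH-UA is missing");
  }
  if (claimsSafari && hasClientHints) {
    issues.push("UA claims Safari but Sec-CH-UA is present");
  }
  if (!("accept-language" in headers)) {
    issues.push("Accept-Language is missing");
  }
  return issues;
}
```

An empty result means only that these three rules passed; a real scorer would weigh many more signals, including header order.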

We cover header rotation thoroughly in §4.33.

Layer 2: HTTP/2 fingerprinting

HTTP/2 connections start with a SETTINGS frame. The values, their order, and the subsequent HEADERS frame's compression (HPACK) all vary per client:

Chrome's SETTINGS values are well-known.
curl's are different.
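Python's httpx (with HTTP/2 enabled) differs from real browsers; requests speaks only HTTP/1.1, which is itself a signal when the User-Agent claims a modern browser.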

Anti-bot vendors fingerprint this; Akamai famously uses HTTP/2 signature scoring. Tools like curl-impersonate reimplement these signatures for select browsers.
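As a rough illustration of how these observations can be condensed into one comparable string, the sketch below follows the general shape of the publicly described Akamai-style format (SETTINGS, WINDOW_UPDATE, PRIORITY, pseudo-header order, separated by "|"). The function name http2Fingerprint and the input object are assumptions for this example, and the sample values are illustrative rather than an authoritative Chrome signature.

```javascript
// Build a compact HTTP/2 fingerprint string from values observed on a connection.
// Shape: SETTINGS | WINDOW_UPDATE increment | PRIORITY frames | pseudo-header order
function http2Fingerprint({ settings, windowUpdate, priorities, pseudoHeaders }) {
  // SETTINGS keep the order the client sent them in; order is part of the signal.
  const settingsPart = settings.map(([id, value]) => `${id}:${value}`).join(";");
  // "00" marks a client that sent no WINDOW_UPDATE frame.
  const windowPart = windowUpdate === undefined ? "00" : String(windowUpdate);
  // Priorities arrive pre-formatted here; "0" marks none.
  const priorityPart = priorities.length > 0 ? priorities.join(",") : "0";
  // ":method" -> "m", ":authority" -> "a", and so on.
  const pseudoPart = pseudoHeaders.map((name) => name[1]).join(",");
  return [settingsPart, windowPart, priorityPart, pseudoPart].join("|");
}

// Illustrative input only; real values must be captured from actual traffic.
const exampleFingerprint = http2Fingerprint({
  settings: [[1, 65536], [2, 0], [4, 6291456], [6, 262144]],
  windowUpdate: 15663105,
  priorities: [],
  pseudoHeaders: [":method", ":authority", ":scheme", ":path"],
});
// exampleFingerprint === "1:65536;2:0;4:6291456;6:262144|15663105|0|m,a,s,p"
```

Two clients that send the same SETTINGS in a different order, or the same pseudo-headers in a different sequence, produce different strings, which is why an HTTP library with correct headers can still be distinguished from a browser.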

Layer 3: TLS, JA3 and JA4

The TLS ClientHello message contains:

TLS version.
Cipher suites (in client preference order).
Extensions (e.g. SNI, ALPN), in the order sent.
Supported groups (elliptic curves) and EC point formats.

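JA3 joins these fields into a string and hashes it with MD5; the newer JA4 uses a more structured format and sorts extensions before hashing. Each client has a recognizable fingerprint:

Chrome 120 Windows: a specific fingerprint (stable under JA4; its JA3 varies because recent Chrome versions shuffle extension order per connection).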
Safari 17 Mac: a different hash.
Python httpx + native TLS: yet another, easily identified as "not a browser."
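The sketch below shows how a JA3-style string is assembled from ClientHello fields, and why extension order matters. The JA3 hash is the MD5 of this string; hashing is left out to keep the example dependency-free. The function names and the sample field values are assumptions for illustration, and sortedExtensionString only borrows the sorting idea from JA4 rather than implementing the JA4 format.

```javascript
// GREASE values (RFC 8701) look like 0x0a0a, 0x1a1a, ... 0xfafa; JA3 ignores them.
const isGrease = (v) => (v & 0x0f0f) === 0x0a0a && (v >> 8) === (v & 0xff);

// JA3 input: version,ciphers,extensions,groups,pointFormats
// Each list is joined with "-" in the order the client sent it.
function ja3String({ version, ciphers, extensions, groups, pointFormats }) {
  const field = (values) => values.filter((v) => !isGrease(v)).join("-");
  return [
    version,
    field(ciphers),
    field(extensions),
    field(groups),
    field(pointFormats),
  ].join(",");
}

// Sorting extensions first makes the string insensitive to extension order,
// which is the idea JA4 uses to cope with clients that shuffle extensions.
function sortedExtensionString(hello) {
  const sorted = [...hello.extensions].sort((a, b) => a - b);
  return ja3String({ ...hello, extensions: sorted });
}

// Illustrative ClientHello fields (decimal IDs), not a real browser's.
const sampleHello = {
  version: 771,
  ciphers: [0x0a0a, 4865, 4866],
  extensions: [0, 23, 65281],
  groups: [29, 23],
  pointFormats: [0],
};
// ja3String(sampleHello) === "771,4865-4866,0-23-65281,29-23,0"
```

Swapping two entries in extensions changes the output of ja3String but not of sortedExtensionString, so a client that shuffles extensions has many JA3 values but one order-insensitive fingerprint.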

To defeat it, use a TLS library that lets you customize the ClientHello (e.g. curl-cffi, tls-client), or run a real browser via Playwright.

We cover TLS fingerprinting in detail in §4.34.

Layer 4: JavaScript navigator

Once the page loads JS, the browser exposes:

navigator.userAgent
navigator.languages
navigator.platform
navigator.hardwareConcurrency
navigator.deviceMemory
navigator.maxTouchPoints
navigator.plugins
navigator.mimeTypes

These should all match the claimed UA. A "Chrome on Mac" with navigator.platform = "Linux x86_64" is caught instantly.

Headless Chrome leaks signals (navigator.webdriver = true, a missing window.chrome object). Stealth plugins (puppeteer-extra-plugin-stealth, playwright-stealth) patch these.
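A minimal sketch of how a detector might cross-check these properties follows. It takes a plain snapshot object (for example, one a page script built from navigator and posted back), so the function name navigatorIssues and the snapshot shape are assumptions for this example.

```javascript
// Cross-check a navigator snapshot against the OS named in its own UA string.
function navigatorIssues(nav) {
  const issues = [];
  const ua = nav.userAgent || "";
  const platform = nav.platform || "";

  // [pattern in the UA, expected navigator.platform prefix]
  const expectations = [
    [/Macintosh/, /^Mac/],
    [/Windows NT/, /^Win/],
    [/X11; Linux/, /^Linux/],
  ];
  for (const [uaPattern, platformPattern] of expectations) {
    if (uaPattern.test(ua) && !platformPattern.test(platform)) {
      issues.push(`platform "${platform}" does not match the UA's OS`);
    }
  }
  // Automation flag that stealth plugins try to hide.
  if (nav.webdriver === true) {
    issues.push("navigator.webdriver is true");
  }
  // Real browsers report at least one preferred language.
  if (Array.isArray(nav.languages) && nav.languages.length === 0) {
    issues.push("navigator.languages is empty");
  }
  return issues;
}
```

For a snapshot with a Macintosh UA and platform "Linux x86_64", the function returns one platform mismatch, which is the "caught instantly" case described above.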

Layer 5: Canvas fingerprinting

A canvas element draws text or shapes. The pixel-level output varies by:

GPU vendor and driver.
Operating system.
Browser anti-aliasing settings.

const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d');
ctx.fillText("Hello", 10, 10);
const fingerprint = canvas.toDataURL();  // unique per device class

Real machines produce stable, varied canvas hashes. Headless environments often produce identical or unusual hashes, which is itself a flagging signal.

Mitigation: a real browser on a real machine. Spoofing canvas in headless Chrome with random noise per request actually backfires: real users have stable fingerprints, so a canvas that changes on every request stands out.
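To see why per-request noise backfires, consider a sketch of the server side. The class CanvasStabilityTracker and the use of FNV-1a as the hash are assumptions chosen to keep the example small; a real system would more likely use a cryptographic hash and persistent storage.

```javascript
// FNV-1a (32-bit): tiny dependency-free hash, used here only to shorten the data URL.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Remembers every canvas hash reported by a session.
class CanvasStabilityTracker {
  constructor() {
    this.hashesBySession = new Map();
  }

  // Records one report; returns true once a session has shown more than one hash.
  record(sessionId, canvasDataUrl) {
    const seen = this.hashesBySession.get(sessionId) ?? new Set();
    seen.add(fnv1a(canvasDataUrl));
    this.hashesBySession.set(sessionId, seen);
    return seen.size > 1;
  }
}
```

A real browser reports the same data URL on every page view, so record keeps returning false. A scraper that injects fresh noise each time trips the check on its second request.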

The Catalog108 challenge at /challenges/antibot/canvas-fingerprint exercises this exact layer.

Layer 6: WebGL

Similar to canvas but uses the GPU's WebGL API. Reports:

GPU vendor and renderer strings (via the WEBGL_debug_renderer_info extension).
Driver version.
Supported extensions.
Render results of test scenes.

Again, headless or VM environments produce telltale signatures. Real browsers on real hardware look natural.

Layer 7: Audio fingerprinting

An OscillatorNode in the Web Audio API renders audio samples whose exact floating-point values differ slightly between browser/OS combinations, which yields a distinctive fingerprint. It is used less than canvas but is still part of the broader fingerprint set.

Layer 8: Fonts

document.fonts.check() (or the fallback of measuring rendered text width) reveals which fonts are installed. Headless Chrome on Linux has a distinctive font set; a real Mac user has another. The resulting list distinguishes one environment from another.
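The width-measurement fallback can be sketched as follows. The measuring function is injected so the logic can be exercised outside a browser; detectFonts and browserMeasureWidth are hypothetical names, and the test string and 72px size are arbitrary choices.

```javascript
// Returns the candidates that appear to be installed.
// measureWidth(fontFamily) must return the rendered width of a fixed test string.
function detectFonts(
  candidates,
  measureWidth,
  fallbacks = ["monospace", "serif", "sans-serif"]
) {
  // Width of the test string in each generic fallback font alone.
  const baseline = new Map(fallbacks.map((f) => [f, measureWidth(f)]));
  return candidates.filter((font) =>
    // If the candidate is missing, the browser falls back and the width is unchanged.
    fallbacks.some((f) => measureWidth(`"${font}", ${f}`) !== baseline.get(f))
  );
}

// One possible browser-side measurer, backed by a canvas (defined, not run here).
function browserMeasureWidth(fontFamily) {
  const ctx = document.createElement("canvas").getContext("2d");
  ctx.font = `72px ${fontFamily}`;
  return ctx.measureText("mmmmmmmmmmlli").width;
}
```

In a page, detectFonts(["Helvetica Neue", "Segoe UI"], browserMeasureWidth) would return whichever of the two changes the measured width. Checking against several fallbacks reduces false negatives when a candidate happens to match one fallback's width.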

Layer 9: Hardware concurrency, memory, screen

navigator.hardwareConcurrency  // CPU cores
navigator.deviceMemory  // GB of RAM (rounded)
screen.width, screen.height
window.devicePixelRatio

Implausible combinations (96 cores, 0.5 GB RAM, 800x600 screen) flag immediately. Spoofing is possible in Playwright but coherence requires care.
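A plausibility check along these lines might look like the sketch below. The list of deviceMemory values reflects how the API rounds and clamps what it reports; the other thresholds are illustrative assumptions, and hardwareIssues is a hypothetical name.

```javascript
// The Device Memory API reports a power of two, clamped to this range.
const DEVICE_MEMORY_VALUES = [0.25, 0.5, 1, 2, 4, 8];

function hardwareIssues({ cores, memoryGb, width, height, pixelRatio }) {
  const issues = [];
  if (!Number.isInteger(cores) || cores < 1) {
    issues.push("hardwareConcurrency is not a positive integer");
  }
  // memoryGb may be undefined: Firefox and Safari do not expose deviceMemory.
  if (memoryGb !== undefined && !DEVICE_MEMORY_VALUES.includes(memoryGb)) {
    issues.push("deviceMemory is not a value the API can report");
  }
  // Assumed thresholds below; tune against real traffic.
  if (cores >= 32 && memoryGb !== undefined && memoryGb <= 1) {
    issues.push("many cores with very little memory");
  }
  if (cores >= 16 && pixelRatio === 1 && width * height < 1024 * 768) {
    issues.push("tiny low-density screen on a many-core machine");
  }
  return issues;
}
```

The 96-core, 0.5 GB, 800x600 example trips both combination rules, while a spoofed deviceMemory of 16 fails on its own because no browser would report it.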

Layer 10: Behavioral

The most expensive to fake:

Mouse movement patterns (real users curve; bots move linearly).
Scroll velocity profiles.
Time-on-page distribution.
Click timing.

Some anti-bot systems collect this telemetry and score the session. Defeating them requires actually simulating human-like behavior, which is slow, expensive, and fragile.
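One simple behavioral feature is how straight a mouse path is. The sketch below computes the ratio of direct distance to distance travelled. It is a single illustrative feature with a hypothetical name (pathStraightness), not a description of any vendor's scoring model.

```javascript
// points: [{ x, y }, ...] sampled from mousemove events, in order.
// Returns a value in (0, 1]; 1 means a perfectly straight path.
function pathStraightness(points) {
  if (points.length < 2) return 1;
  let travelled = 0;
  for (let i = 1; i < points.length; i++) {
    travelled += Math.hypot(
      points[i].x - points[i - 1].x,
      points[i].y - points[i - 1].y
    );
  }
  const first = points[0];
  const last = points[points.length - 1];
  const direct = Math.hypot(last.x - first.x, last.y - first.y);
  // A pointer that never moved is treated as straight.
  return travelled === 0 ? 1 : direct / travelled;
}
```

A scripted move that interpolates linearly between two points scores essentially 1, while a hand-driven path that arcs toward its target scores lower. Real scoring would combine this with velocity, pauses, and timing, which is what makes this layer expensive to fake.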

Practical hardening levels

How hard you should fight depends on the target:

Target tier	Hardening
Internal API	None; just send a sensible UA
Static catalogue	Headers + reasonable rate
Mid-tier e-commerce	+ TLS fingerprint (curl-cffi or browser), + residential IPs
Cloudflare-protected	+ browser via Playwright + stealth + good IPs
Major social platform	+ mobile proxies + behavioral simulation + sometimes commercial unblockers

Match the level to the actual problem. Over-hardening burns engineering time on protections you didn't need.

The "coherence" principle

The single most important fingerprint heuristic: coherence. Any of these mismatches flags you instantly:

UA = "Safari Mac", but Sec-CH-UA is present (Chromium-based browsers only).
UA = "Chrome Windows", but TLS JA3 = "curl".
IP = residential US, but Accept-Language = "ru-RU".
Canvas fingerprint = randomized every request (real browsers are stable).

Spoofing one layer well beats randomizing many layers badly. A coherent fake Chrome on a residential IP works; a random jumble of signals does not.

Hands-on lab

Visit /challenges/antibot/canvas-fingerprint on Catalog108. The page renders a canvas, computes its hash, and sends it to the server.

Hit it with plain curl and note the response.
Hit it with Playwright (default headless) and note the response.
Hit it with Playwright using playwright-stealth and note the response.

Each step shows which layers each tool covers. Working out which signal flagged you is the iteration loop of fingerprint hardening.

Quiz, check your understanding

Which is the SINGLE most important property of a successful fingerprint spoof?