Browser Fingerprinting: A Complete Map
Every dimension along which anti-bot vendors fingerprint your scraper. A reference map for what you're up against, and what's hardest to spoof.
What you’ll learn
- Enumerate the main fingerprint surfaces: TLS, HTTP, headers, JS APIs, canvas, audio, fonts.
- Rank them by how easy each is for scrapers to fix.
- Pick the right level of fingerprint hardening for a target.
Anti-bot vendors build a fingerprint from dozens of signals. Most are stable per browser/OS combo, hard to spoof coherently, and cheap to check. This lesson maps the territory.
The fingerprint hierarchy
Roughly from easiest-to-fix (top) to hardest-to-fix (bottom):
| Layer | Examples | Difficulty to spoof |
|---|---|---|
| Headers | User-Agent, Accept-Language, Sec-CH-UA | Trivial |
| HTTP/2 settings | SETTINGS frame, HPACK, frame order | Moderate |
| TLS / JA3 / JA4 | Cipher suite order, extensions, ALPN | Moderate-hard |
| JavaScript navigator | userAgent, languages, plugins, platform | Moderate (browser only) |
| WebGL / Canvas | GPU vendor, canvas pixel output | Hard |
| Audio context | AudioContext sample-level output | Hard |
| Fonts | Available system fonts | Hard |
| Timing | Render frames, micro-benchmarks | Very hard |
| Hardware concurrency / memory | navigator.hardwareConcurrency, deviceMemory | Hard |
| Battery / sensors | battery API, accelerometer | Hard |
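Every layer is observable: the server sees the network layers directly, and page scripts report the in-browser ones back to it. Sophisticated anti-bot systems combine many layers into a "naturalness score" that reflects how coherent the signals are with one another.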
Layer 1: Headers
The cheapest and most-checked layer.
Visible signals:
- User-Agent. The fingerprint floor.
- Accept, Accept-Language, Accept-Encoding. Should match the UA's browser defaults.
- Sec-CH-UA, Sec-CH-UA-Mobile, Sec-CH-UA-Platform. Client hints: Chrome sends them; Safari does not.
- Sec-Fetch-Site, Sec-Fetch-Mode, Sec-Fetch-Dest, Sec-Fetch-User. Browser navigation context.
- Order of headers. Chrome sends them in a specific order; raw HTTP libraries don't.
Coherence matters most. Claiming UA "Chrome on Mac" but omitting Sec-CH-UA (which Chrome sends) is a bot signal.
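A check like this is cheap to write on the server side, which is why it is so common. The sketch below is a minimal illustration of the idea, not any vendor's actual rule set: header_mismatches and PLATFORM_UA_TOKENS are hypothetical names, and the rules are simplified assumptions.

```python
# Maps a Sec-CH-UA-Platform value to a token expected in the User-Agent.
# (Illustrative subset; a real rule set would cover more platforms.)
PLATFORM_UA_TOKENS = {"macOS": "Macintosh", "Windows": "Windows NT", "Linux": "Linux"}


def header_mismatches(headers):
    """Return the reasons a header set looks incoherent (empty list = coherent)."""
    h = {name.lower(): value for name, value in headers.items()}
    ua = h.get("user-agent", "")
    claims_chrome = "Chrome/" in ua
    issues = []

    # Chrome sends client hints; a Chrome UA without them is suspicious.
    if claims_chrome and "sec-ch-ua" not in h:
        issues.append("UA claims Chrome but Sec-CH-UA is missing")
    # The reverse: client hints arriving with a non-Chromium UA.
    if not claims_chrome and "sec-ch-ua" in h:
        issues.append("Sec-CH-UA sent by a UA that is not Chromium-based")

    # The platform hint should agree with the OS named in the UA string.
    platform = h.get("sec-ch-ua-platform", "").strip('"')
    token = PLATFORM_UA_TOKENS.get(platform)
    if token and token not in ua:
        issues.append(f"Sec-CH-UA-Platform says {platform} but the UA does not")

    if "accept-language" not in h:
        issues.append("Accept-Language is missing")
    return issues
```

Run your own outgoing headers through a check like this before a target does it for you.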
We cover header rotation thoroughly in §4.33.
Layer 2: HTTP/2 fingerprinting
HTTP/2 connections start with a SETTINGS frame. The values, their order, and the subsequent HEADERS frame's compression (HPACK) all vary per client:
- Chrome's SETTINGS values are well-known.
- curl's are different.
- Python's httpx (with HTTP/2 enabled) sends different values again; requests speaks only HTTP/1.1, which is itself unlike a modern browser.
Anti-bot vendors fingerprint this; Akamai famously uses HTTP/2 signature scoring. Tools like curl-impersonate reimplement these signatures for select browsers.
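To see why order matters as much as values, here is a sketch that folds a client's opening frames into one comparable string. The layout (SETTINGS, window update, priorities, pseudo-header order, joined with pipes) follows the general shape of publicly described HTTP/2 fingerprints; h2_signature is a hypothetical helper, and the numbers are illustrative rather than captured from a specific browser version.

```python
def h2_signature(settings, window_update, pseudo_headers, priorities=()):
    """Build a fingerprint string from the first frames of an HTTP/2 connection.

    settings       -- (id, value) pairs in the order the client sent them
    window_update  -- increment from the initial WINDOW_UPDATE frame
    pseudo_headers -- pseudo-header names in the order they appear in HEADERS
    priorities     -- optional pre-formatted PRIORITY frame descriptions
    """
    settings_part = ";".join(f"{sid}:{value}" for sid, value in settings)
    priority_part = ",".join(priorities) if priorities else "0"
    # Keep only the first letter of each pseudo-header: m, a, s, p.
    header_part = ",".join(name.lstrip(":")[0] for name in pseudo_headers)
    return "|".join([settings_part, str(window_update), priority_part, header_part])


# Two hypothetical clients sending the same SETTINGS values in a different order.
client_a = h2_signature(
    [(1, 65536), (2, 0), (4, 6291456), (6, 262144)],
    15663105,
    [":method", ":authority", ":scheme", ":path"],
)
client_b = h2_signature(
    [(2, 0), (1, 65536), (4, 6291456), (6, 262144)],
    15663105,
    [":method", ":path", ":authority", ":scheme"],
)
```

The two strings differ even though every individual value matches, which is why copying a browser's SETTINGS values without its frame and header order is not enough.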
Layer 3: TLS, JA3 and JA4
The TLS ClientHello message contains:
- TLS version.
- Cipher suites (in client preference order).
- Extensions (e.g. ALPN list, SNI).
- Elliptic curves (supported groups) and point formats.
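JA3 (and the newer JA4) hashes these fields into a fingerprint string. Each client stack has a recognizable fingerprint:
- Chrome 120 on Windows: one fingerprint.
- Safari 17 on Mac: a different one.
- Python httpx + native TLS: yet another, easily identified as "not a browser."
Recent Chrome versions shuffle extension order on each connection, so their raw JA3 hash varies; JA4 sorts extensions before hashing, which keeps the fingerprint stable.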
To defeat it, use a TLS library that lets you customize the ClientHello (e.g. curl-cffi, tls-client), or run a real browser via Playwright.
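JA3 itself is simple: join five ClientHello fields with commas, join the values inside each field with dashes, drop GREASE placeholders, and take the MD5 of the result. The sketch below implements that recipe; the field values passed in are made up for illustration, not a real browser's ClientHello.

```python
import hashlib

# GREASE values (RFC 8701) are random placeholders that JA3 ignores:
# 0x0A0A, 0x1A1A, ... 0xFAFA.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def ja3(version, ciphers, extensions, curves, point_formats):
    """Return (ja3_string, ja3_md5) for the given ClientHello fields."""
    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    raw = ",".join([
        str(version),
        join(ciphers),
        join(extensions),
        join(curves),
        join(point_formats),
    ])
    return raw, hashlib.md5(raw.encode("ascii")).hexdigest()


# Same fields, different extension order: the hash changes.
raw_a, hash_a = ja3(771, [0x1A1A, 4865, 4866], [0, 10, 11], [29, 23], [0])
raw_b, hash_b = ja3(771, [0x1A1A, 4865, 4866], [10, 0, 11], [29, 23], [0])
```

Because order is part of the input, a client cannot match a browser's JA3 by offering the same ciphers in its own preferred order.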
We cover TLS fingerprinting in detail in §4.34.
Layer 4: JavaScript navigator
Once the page loads JS, the browser exposes:
navigator.userAgent
navigator.languages
navigator.platform
navigator.hardwareConcurrency
navigator.deviceMemory
navigator.maxTouchPoints
navigator.plugins
navigator.mimeTypes
These should all match the claimed UA. A "Chrome on Mac" with navigator.platform = "Linux x86_64" is caught instantly.
Headless Chrome leaks signals (navigator.webdriver = true, missing chrome runtime). Stealth plugins (puppeteer-extra-plugin-stealth, playwright-stealth) patch these.
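The cross-checks a detection script runs on these values are straightforward. Here is a minimal sketch, assuming the page has already collected the navigator properties into a dict; navigator_mismatches and the snapshot layout are hypothetical, and the rules cover only the cases named above.

```python
# Expected navigator.platform prefix for each OS token found in the UA string.
UA_TOKEN_TO_PLATFORM_PREFIX = {
    "Macintosh": "Mac",      # e.g. "MacIntel"
    "Windows NT": "Win",     # e.g. "Win32"
    "Linux": "Linux",        # e.g. "Linux x86_64"
}


def navigator_mismatches(nav):
    """Return reasons a navigator snapshot contradicts itself."""
    ua = nav.get("userAgent", "")
    platform = nav.get("platform", "")
    issues = []

    for token, prefix in UA_TOKEN_TO_PLATFORM_PREFIX.items():
        if token in ua and not platform.startswith(prefix):
            issues.append(f"UA says {token} but platform is {platform!r}")
            break

    if nav.get("webdriver"):
        issues.append("navigator.webdriver is true")
    if not nav.get("languages"):
        issues.append("navigator.languages is empty")
    return issues


snapshot = {
    "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
                 "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "platform": "Linux x86_64",
    "webdriver": True,
    "languages": ["en-US", "en"],
}
issues = navigator_mismatches(snapshot)
```

Patching navigator.webdriver alone would still leave the platform mismatch, which is the point of checking properties against each other rather than one at a time.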
Layer 5: Canvas fingerprinting
A canvas element draws text or shapes. The pixel-level output varies by:
- GPU vendor and driver.
- Operating system.
- Browser anti-aliasing settings.
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d');
ctx.fillText("Hello", 10, 10);
const fingerprint = canvas.toDataURL(); // unique per device class
Real machines produce stable, varied canvas hashes. Headless environments often produce identical or unusual hashes, which is itself a flagging signal.
Mitigation: a real browser on a real machine. Spoofing canvas in headless Chrome with random noise per request actually backfires: real users have stable fingerprints, so a canvas that changes on every request stands out.
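Detecting per-request noise takes almost no server-side code. A minimal sketch, assuming the server can group requests by some session identifier; CanvasStabilityTracker is a hypothetical class, and the one-hash-per-session threshold is an assumption.

```python
from collections import defaultdict


class CanvasStabilityTracker:
    """Flags sessions whose canvas hash keeps changing."""

    def __init__(self, max_distinct=1):
        # A real browser should report the same hash for the whole session.
        self.max_distinct = max_distinct
        self._seen = defaultdict(set)

    def observe(self, session_id, canvas_hash):
        """Record one reported hash; return True if the session looks randomized."""
        self._seen[session_id].add(canvas_hash)
        return len(self._seen[session_id]) > self.max_distinct


tracker = CanvasStabilityTracker()
stable = [tracker.observe("real-user", "9f2c") for _ in range(3)]
noisy = [tracker.observe("noisy-bot", h) for h in ("a1", "b2", "c3")]
```

The real user is never flagged; the noisy client is flagged from its second request onward.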
The Catalog108 challenge at /challenges/antibot/canvas-fingerprint exercises this exact layer.
Layer 6: WebGL
Similar to canvas but uses the GPU's WebGL API. Reports:
- GPU vendor (WEBGL_debug_renderer_info).
- Driver version.
- Supported extensions.
- Render results of test scenes.
Again, headless or VM environments produce telltale signatures. Real browsers on real hardware look natural.
Layer 7: Audio fingerprinting
An OscillatorNode in the Web Audio API renders audio whose sample-level output can be read back. Browser/OS differences yield distinctive fingerprints. Used less than canvas but still part of the broader fingerprint set.
Layer 8: Fonts
document.fonts.check() (or fallback measurement of rendered text width) reveals which fonts are installed. Headless Chrome on Linux has a distinctive font set; a real Mac user has another. The font list alone is enough to tell them apart.
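Once the page has a list of detected fonts, comparing it to the OS the client claims is a set operation. The sketch below is illustrative only: OS_MARKER_FONTS is a small hand-picked sample of fonts that commonly ship with each OS, not an authoritative list, and both function names are hypothetical.

```python
from typing import Optional, Set

# A few fonts that commonly ship with each OS (illustrative sample only).
OS_MARKER_FONTS = {
    "Windows": {"Segoe UI", "Calibri", "Consolas"},
    "macOS": {"Helvetica Neue", "Menlo", "Apple Color Emoji"},
    "Linux": {"DejaVu Sans", "Liberation Sans", "Noto Sans"},
}


def guess_os_from_fonts(detected: Set[str]) -> Optional[str]:
    """Return the OS whose marker fonts overlap most with the detected set."""
    scores = {name: len(markers & detected) for name, markers in OS_MARKER_FONTS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def font_os_mismatch(claimed_os: str, detected: Set[str]) -> bool:
    """True when the fonts point to a different OS than the one claimed."""
    guessed = guess_os_from_fonts(detected)
    return guessed is not None and guessed != claimed_os


headless_linux_fonts = {"DejaVu Sans", "Liberation Sans", "Arial"}
```

A client claiming macOS with headless_linux_fonts installed is flagged; the same font set with a Linux claim is not.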
Layer 9: Hardware concurrency, memory, screen
navigator.hardwareConcurrency // CPU cores
navigator.deviceMemory // GB of RAM (rounded)
screen.width, screen.height
window.devicePixelRatio
Implausible combinations (96 cores, 0.5 GB RAM, 800x600 screen) flag immediately. Spoofing is possible in Playwright but coherence requires care.
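A plausibility check over these values might look like the sketch below. Every threshold is an assumption chosen for illustration, hardware_flags is a hypothetical name, and the power-of-two test reflects the rounding of deviceMemory mentioned above.

```python
import math


def hardware_flags(cores, device_memory_gb, width, height):
    """Return reasons a reported hardware profile looks implausible."""
    flags = []

    # deviceMemory is rounded, so anything that is not a power of two was made up.
    if device_memory_gb <= 0 or not math.log2(device_memory_gb).is_integer():
        flags.append("deviceMemory is not a power of two")
    # Assumed ceiling for consumer devices.
    if cores > 64:
        flags.append("core count unusually high for a consumer device")
    # Many cores with almost no memory does not describe real hardware.
    if cores >= 16 and device_memory_gb < 2:
        flags.append("many cores but very little memory")
    # Assumed floor for a desktop screen.
    if width < 1024 or height < 720:
        flags.append("screen unusually small for a desktop browser")
    return flags


suspicious = hardware_flags(cores=96, device_memory_gb=0.5, width=800, height=600)
ordinary = hardware_flags(cores=8, device_memory_gb=8, width=1920, height=1080)
```

The 96-core, 0.5 GB, 800x600 profile trips three rules at once, while the ordinary laptop profile trips none.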
Layer 10: Behavioral
The most expensive to fake:
- Mouse movement patterns (real users curve; bots move linearly).
- Scroll velocity profiles.
- Time-on-page distribution.
- Click timing.
Some anti-bot systems collect this telemetry and score the session. Defeating them requires actually simulating human-like behavior, which is slow, expensive, and fragile.
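The "bots move linearly" signal can be reduced to one number: straight-line distance divided by distance actually travelled. The sketch below computes it and generates a curved path with a quadratic Bezier for comparison. Both function names are hypothetical, and real scoring uses far more features than this single ratio.

```python
import math


def path_straightness(points):
    """Ratio of start-to-end distance to travelled distance (1.0 = straight line)."""
    if len(points) < 2:
        return 1.0
    travelled = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if travelled == 0:
        return 1.0
    return math.dist(points[0], points[-1]) / travelled


def bezier_path(start, control, end, steps=20):
    """Sample a quadratic Bezier curve, a simple way to bend a mouse path."""
    path = []
    for i in range(steps + 1):
        t = i / steps
        x = (1 - t) ** 2 * start[0] + 2 * (1 - t) * t * control[0] + t ** 2 * end[0]
        y = (1 - t) ** 2 * start[1] + 2 * (1 - t) * t * control[1] + t ** 2 * end[1]
        path.append((x, y))
    return path


linear = path_straightness([(0, 0), (50, 50), (100, 100)])
curved = path_straightness(bezier_path((0, 0), (100, 0), (100, 100)))
```

The linear path scores 1.0 and the curved one scores noticeably lower. A curve alone does not make a path look human, though: timing, jitter, and overshoot matter too, which is part of why this layer is so expensive to fake.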
Practical hardening levels
How hard you should fight depends on the target:
| Target tier | Hardening |
|---|---|
| Internal API | None, just use a UA |
| Static catalogue | Headers + reasonable rate |
| Mid-tier e-commerce | + TLS fingerprint (curl-cffi or browser), + residential IPs |
| Cloudflare-protected | + browser via Playwright + stealth + good IPs |
| Major social platform | + mobile proxies + behavioral simulation + sometimes commercial unblockers |
Match the level to the actual problem. Over-hardening burns engineering time on protections you didn't need.
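The table reads as a ladder: each row with a leading + keeps everything above it and adds more. A small sketch makes that explicit. Treating the rows as cumulative is an assumption about how to read the table, and the tier keys and measures_for are hypothetical names.

```python
# The internal-API row stands alone; the remaining rows accumulate.
INTERNAL_API = ["plain User-Agent"]

HARDENING_LADDER = [
    ("static_catalogue", ["browser-like headers", "reasonable request rate"]),
    ("mid_tier_ecommerce", ["TLS fingerprint (curl-cffi or browser)", "residential IPs"]),
    ("cloudflare_protected", ["browser via Playwright", "stealth patches", "good IPs"]),
    ("major_social_platform", ["mobile proxies", "behavioral simulation",
                               "commercial unblocker (sometimes)"]),
]


def measures_for(tier):
    """Return every measure needed for a tier, including those from lower tiers."""
    if tier == "internal_api":
        return list(INTERNAL_API)
    measures = []
    for name, added in HARDENING_LADDER:
        measures.extend(added)
        if name == tier:
            return measures
    raise KeyError(f"unknown tier: {tier}")
```

Calling measures_for("mid_tier_ecommerce") returns four measures; the top tier returns ten, which is a fair picture of the cost gap between them.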
The "coherence" principle
The single most important fingerprint heuristic: coherence. Any of these mismatches flags you instantly:
- UA = "Safari Mac", but Sec-CH-UA = present (Chrome only).
- UA = "Chrome Windows", but TLS JA3 = "curl".
- IP = residential US, but Accept-Language = "ru-RU".
- Canvas fingerprint = randomized every request (real browsers are stable).
Spoofing one layer well beats randomizing many layers badly. A coherent fake Chrome on a residential IP works; a random jumble of signals does not.
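The four mismatches above translate directly into cross-layer checks. The sketch below assumes each layer has already been reduced to a simple label (which browser the UA claims, which client the TLS fingerprint matches, and so on). The session layout and coherence_flags are hypothetical, and a real system would weight these signals rather than treat each as decisive.

```python
def coherence_flags(session):
    """Return the cross-layer contradictions found in one session record."""
    flags = []
    ua_browser = session["ua_browser"]

    # Headers vs headers: only Chromium-based browsers send Sec-CH-UA.
    if session["sends_client_hints"] != (ua_browser == "chrome"):
        flags.append("client hints do not match the claimed browser")

    # Headers vs TLS: the handshake should look like the claimed browser.
    if session["tls_browser"] != ua_browser:
        flags.append("TLS fingerprint does not match the claimed browser")

    # Headers vs network: compare the first Accept-Language region to the IP country.
    first_tag = session["accept_language"].split(",")[0].split(";")[0].strip()
    region = first_tag.split("-")[-1].upper() if "-" in first_tag else None
    if region and region != session["ip_country"]:
        flags.append("Accept-Language region differs from IP country")

    # Canvas: a real browser reports the same hash on every request.
    if len(set(session["canvas_hashes"])) > 1:
        flags.append("canvas fingerprint changes between requests")
    return flags


incoherent = {
    "ua_browser": "safari", "sends_client_hints": True, "tls_browser": "curl",
    "ip_country": "US", "accept_language": "ru-RU,ru;q=0.9",
    "canvas_hashes": ["a1", "b2"],
}
coherent = {
    "ua_browser": "chrome", "sends_client_hints": True, "tls_browser": "chrome",
    "ip_country": "US", "accept_language": "en-US,en;q=0.9",
    "canvas_hashes": ["9f2c", "9f2c"],
}
```

The incoherent session trips all four checks; the coherent one trips none, even though it is just as fake.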
Hands-on lab
Visit /challenges/antibot/canvas-fingerprint on Catalog108. The page renders a canvas, computes its hash, and sends it to the server.
- Hit it with a plain curl and note the response.
- Hit it with Playwright (default headless) and note the response.
- Hit it with Playwright using playwright-stealth and note the response.
Each step shows what layers each tool covers. Working out which signal flagged you is the iteration loop of fingerprint hardening.