User-Agent and Header Rotation (the Right Way)
Header rotation done badly is worse than not rotating. This lesson covers the principles that produce headers an anti-bot vendor can't distinguish from a real browser's.
What you’ll learn
- Build coherent header bundles per browser version.
- Match header order and casing to the claimed UA.
- Avoid the classic random-UA pitfalls.
A scraper that picks a random UA per request and pairs it with default Python/PHP headers is worse than one that just uses the default UA. Anti-bot detection looks for coherence; random UAs over inconsistent headers are a louder signal than no rotation at all.
The right model: coherent bundles
Every "browser version on platform" emits a bundle of headers in a specific order. Capture the whole bundle, rotate at the bundle level.
import random

# Each entry is one coherent "browser version on platform" bundle.
BUNDLES = [
    {
        # Safari 17 on macOS
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.5",
        "Accept-Encoding": "gzip, deflate, br",
        # Safari does NOT send Sec-CH-UA
    },
    {
        # Chrome 120 on Windows
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.9",
        "Accept-Encoding": "gzip, deflate, br",
        "Sec-CH-UA": '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
        "Sec-CH-UA-Mobile": "?0",
        "Sec-CH-UA-Platform": '"Windows"',
        "Sec-Fetch-Dest": "document",
        "Sec-Fetch-Mode": "navigate",
        "Sec-Fetch-Site": "none",
        "Sec-Fetch-User": "?1",
        "Upgrade-Insecure-Requests": "1",
    },
]


def get_bundle():
    # Return a copy so callers can adjust headers without altering BUNDLES.
    return dict(random.choice(BUNDLES))
Use one whole bundle per session. Mixing Safari's UA with Chrome's Sec-CH-UA is incoherent and flagged.
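A quick sanity check can catch the most obvious mismatches before a bundle goes into rotation. The sketch below is an illustration, not a complete validator: `bundle_problems` is a hypothetical helper, it treats any UA without a `Chrome/` token as non-Chromium, and it only compares the "Google Chrome" brand version, so Edge or Opera bundles would need their own rules.

```python
import re


def bundle_problems(bundle):
    """Return a list of coherence problems found in one header bundle."""
    problems = []
    ua = bundle.get("User-Agent", "")
    ch_ua = bundle.get("Sec-CH-UA")

    # Assumption: a UA without "Chrome/<major>" is a non-Chromium browser.
    chrome = re.search(r"Chrome/(\d+)", ua)

    if chrome is None:
        # Safari and Firefox send no Sec-CH-UA* client hints at all.
        if any(name.lower().startswith("sec-ch-ua") for name in bundle):
            problems.append("non-Chromium UA sent with Sec-CH-UA headers")
        return problems

    if ch_ua is None:
        problems.append("Chrome UA without Sec-CH-UA")
    else:
        hinted = re.search(r'"Google Chrome";v="(\d+)"', ch_ua)
        if hinted is None or hinted.group(1) != chrome.group(1):
            problems.append("Sec-CH-UA version does not match UA version")
    return problems
```

Running it over every entry in your bundle list at startup, and refusing to start if any list is non-empty, is cheaper than discovering the mismatch from a block page.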
Header order matters
Chrome sends headers in a specific order. Python's requests and httpx do not reproduce it: each library emits its own default headers plus yours in whatever order the underlying dict holds them. Some anti-bot vendors check the order, so a Chrome UA arriving with headers in a non-Chrome order is a bot signal.
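httpx keeps the order you supply if you pass the headers as a list of tuples. It still fills in its own defaults for any header you leave out, so list every header explicitly:
import httpx

headers = [
    ("Host", "..."),
    ("Connection", "keep-alive"),
    ("Cache-Control", "max-age=0"),
    ("Upgrade-Insecure-Requests", "1"),
    ("User-Agent", "Mozilla/5.0 ..."),
    ("Accept", "text/html..."),
    ("Sec-Fetch-Site", "none"),
    # ... remaining headers, in the order Chrome sends them
]
client = httpx.Client(headers=headers)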
The exact Chrome header order is publicly documented (e.g. by peet.ws/tools/http2). Match it.
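If your bundles are stored as dicts, one way to apply a captured order is to sort them against a reference list just before sending. This is a minimal sketch: `order_like` is a hypothetical helper, and the `REFERENCE_ORDER` shown is only the partial sequence from the snippet above, a placeholder to be replaced with an order you captured from a real browser.

```python
# Placeholder order taken from the partial example above; replace it with
# the full sequence from your own real-browser capture.
REFERENCE_ORDER = [
    "Host",
    "Connection",
    "Cache-Control",
    "Upgrade-Insecure-Requests",
    "User-Agent",
    "Accept",
    "Sec-Fetch-Site",
]


def order_like(reference_order, headers):
    """Return headers as a list of (name, value) tuples in reference order.

    Names are matched case-insensitively. Headers missing from the
    reference keep their relative order and go last.
    """
    rank = {name.lower(): i for i, name in enumerate(reference_order)}
    # sorted() is stable, so unknown headers stay in their original order.
    return sorted(headers.items(), key=lambda kv: rank.get(kv[0].lower(), len(rank)))
```

The result is a list of tuples, which is the shape the httpx example above expects for `headers=`.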
Where to get accurate bundles
- Record headers from a real browser. Open DevTools → Network → right-click a request → "Copy as cURL." Convert to your scraper's syntax.
- Use a bundled list maintained by tools like fake-useragent, the user-agents npm package, or whatismybrowser.com user-agent listings, but always verify by replaying through a real browser; lists go stale.
- Tools like curl-impersonate ship coherent bundles internally.
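Converting a "Copy as cURL" command by hand is error-prone, so a small parser helps. The sketch below is a hypothetical helper (`headers_from_curl`) that assumes the bash-style output, where each header appears as `-H 'Name: value'` and lines end in a backslash continuation. It keeps names and order exactly as they appear in the command, which may not match the casing or order on the wire, so still verify against a packet capture.

```python
import shlex


def headers_from_curl(command):
    """Extract (name, value) header tuples from a bash-style curl command."""
    # Join backslash-newline continuations so shlex sees a single line.
    tokens = shlex.split(command.replace("\\\n", " "))
    headers = []
    it = iter(tokens)
    for token in it:
        if token in ("-H", "--header"):
            raw = next(it, "")
            # Split on the first colon only; values may contain colons.
            name, sep, value = raw.partition(":")
            if sep:
                headers.append((name.strip(), value.strip()))
    return headers
```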
Reseed Sec-Fetch-* per navigation context
Sec-Fetch-* headers depend on how the page was loaded:
- Direct URL bar entry: Sec-Fetch-Site: none, Sec-Fetch-Mode: navigate, Sec-Fetch-User: ?1.
- Same-origin link click: Sec-Fetch-Site: same-origin.
- Cross-origin navigation: Sec-Fetch-Site: cross-site.
- XHR/fetch from JS: Sec-Fetch-Dest: empty, Sec-Fetch-Mode: cors.
A scraper that always sends Sec-Fetch-Site: none looks as if every request were typed into a fresh tab, which is unusual. Vary the values based on the navigation chain.
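One way to vary these values is to derive them from the URL the session was on before the request. The sketch below uses two hypothetical helpers and makes a deliberate simplification: real browsers also emit `same-site` for different origins under the same registrable domain, which needs the Public Suffix List, so here anything that is not same-origin is reported as `cross-site`. It also assumes every navigation is user-initiated, which is why `Sec-Fetch-User: ?1` is always included.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def _origin(url):
    """Return the (scheme, host, port) triple that defines a web origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def _site(target_url, previous_url):
    # Simplification: no "same-site" bucket (that needs the Public Suffix List).
    if previous_url is None:
        return "none"
    if _origin(previous_url) == _origin(target_url):
        return "same-origin"
    return "cross-site"


def sec_fetch_for_navigation(target_url, previous_url=None):
    """Sec-Fetch-* headers for a top-level, user-initiated page load."""
    return {
        "Sec-Fetch-Site": _site(target_url, previous_url),
        "Sec-Fetch-Mode": "navigate",
        "Sec-Fetch-Dest": "document",
        "Sec-Fetch-User": "?1",  # assumes the navigation was user-initiated
    }


def sec_fetch_for_xhr(target_url, page_url):
    """Sec-Fetch-* headers for an XHR/fetch call made from page_url."""
    return {
        "Sec-Fetch-Site": _site(target_url, page_url),
        "Sec-Fetch-Mode": "cors",
        "Sec-Fetch-Dest": "empty",
    }
```

Tracking the previous URL per session is enough to drive both helpers; the first request of a session passes no previous URL and gets `none`.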
Accept-Language by region
A residential proxy in Germany should send Accept-Language: de-DE,de;q=0.9,en;q=0.5, not the default en-US. This is one of the most-checked coherence pairs.
Per-region bundles:
LANG_BY_COUNTRY = {
    "US": "en-US,en;q=0.9",
    "DE": "de-DE,de;q=0.9,en;q=0.5",
    "FR": "fr-FR,fr;q=0.9,en;q=0.5",
    "JP": "ja,en-US;q=0.5,en;q=0.3",
}

# proxy_country is the exit country of the proxy this session uses.
bundle["Accept-Language"] = LANG_BY_COUNTRY.get(proxy_country, "en-US,en;q=0.5")
PHP Symfony version
Same concept:
use Symfony\Component\HttpClient\HttpClient;

// One coherent Chrome 120 on Windows bundle.
$bundle = [
    'User-Agent' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
    'Accept-Language' => 'en-US,en;q=0.9',
    'Accept-Encoding' => 'gzip, deflate, br',
    'Sec-CH-UA' => '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
    'Sec-CH-UA-Mobile' => '?0',
    'Sec-CH-UA-Platform' => '"Windows"',
    'Sec-Fetch-Dest' => 'document',
    'Sec-Fetch-Mode' => 'navigate',
    'Sec-Fetch-Site' => 'none',
    'Upgrade-Insecure-Requests' => '1',
];

$client = HttpClient::create([
    'headers' => $bundle,
]);
Symfony's HttpClient preserves explicit header order. For coherence with Chrome's order, build the array in the exact sequence Chrome emits.
Caps and casing
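HTTP header names are case-insensitive, but some anti-bot vendors check the casing you actually send. Over HTTP/1.1, Chrome sends User-Agent with capitalized words, while some libraries default to lowercase (user-agent). Over HTTP/2 every header name is lowercase by specification, so casing is only a distinguishing signal on HTTP/1.1 connections. Verify by sniffing your own traffic with mitmproxy or Wireshark.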
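Once you have a raw HTTP/1.1 request captured from your scraper, comparing its header names against the names from a real-browser capture can be automated. The sketch below is a hypothetical helper, `casing_mismatches`; it assumes the capture is the plain-text request head and that `expected_names` comes from your own browser recording, since the correct casing is not the same for every header.

```python
def casing_mismatches(raw_request, expected_names):
    """Return (sent, expected) pairs for header names whose casing differs."""
    expected = {name.lower(): name for name in expected_names}
    mismatches = []
    lines = raw_request.replace("\r\n", "\n").split("\n")
    for line in lines[1:]:  # lines[0] is the request line, e.g. "GET / HTTP/1.1"
        if not line:
            break  # a blank line ends the header block
        sent = line.partition(":")[0]
        want = expected.get(sent.lower())
        # Headers absent from the expected list are ignored here.
        if want is not None and sent != want:
            mismatches.append((sent, want))
    return mismatches
```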
Pitfalls
1. Mixing browsers
import random

UAS = ["...Chrome...", "...Safari...", "...Firefox..."]
chrome_ch = '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"'

ua = random.choice(UAS)
headers = {"User-Agent": ua, "Sec-CH-UA": chrome_ch}  # WRONG when ua is Safari or Firefox
Sec-CH-UA is tied to Chrome, but the UA is randomized across browsers: an obvious mismatch.
2. Pretending to be old Chrome
UA strings for Chrome 90 paired with Sec-CH-UA values for Chrome 120 fail. Keep each bundle internally consistent, or rotate only among modern bundles.
3. Forgetting Accept-Encoding
If your Accept-Encoding header advertises br (brotli) or zstd but your client can't decode it, the response may arrive in an encoding you can't read. httpx decodes br only when a Brotli package is installed (the httpx[brotli] extra), aiohttp likewise needs a Brotli package, and some Python HTTP stacks support neither out of the box. Test that decoded responses match expected content.
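A startup check can flag bundles that advertise encodings your environment cannot decode. This is a sketch under stated assumptions: `undecodable_codings` is a hypothetical helper, and the module names in `DECODER_MODULES` (`brotli`, `brotlicffi`, `zstandard`) are the packages commonly used for these codecs; confirm which ones your HTTP client actually looks for. gzip and deflate are treated as always available because the standard library's zlib handles them.

```python
import importlib.util

# Assumption: these are the importable module names your HTTP client relies on.
DECODER_MODULES = {
    "br": ("brotli", "brotlicffi"),
    "zstd": ("zstandard",),
}


def _installed(module_name):
    return importlib.util.find_spec(module_name) is not None


def undecodable_codings(accept_encoding, is_installed=_installed):
    """Return the codings in an Accept-Encoding value with no decoder installed."""
    codings = [part.split(";")[0].strip().lower() for part in accept_encoding.split(",")]
    missing = []
    for coding in codings:
        modules = DECODER_MODULES.get(coding)
        if modules is not None and not any(is_installed(m) for m in modules):
            missing.append(coding)
    return missing
```

If the list is non-empty, install the missing decoder rather than trimming Accept-Encoding, since removing a coding the claimed browser always sends breaks the bundle's coherence.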
4. The "fake-useragent" trap
pip install fake-useragent gives you UA strings: only UAs, not the accompanying headers. Using it alone for rotation creates the incoherence problem this lesson is about.
A practical pattern
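import random


class SessionHeaderManager:
    """Assigns each session one header bundle and keeps it for the session's lifetime."""

    def __init__(self):
        self.session_headers = {}

    def get(self, session_id, country="US"):
        if session_id not in self.session_headers:
            # Copy the bundle: mutating the shared BUNDLES entry would leak
            # one session's Accept-Language into every other session.
            bundle = dict(random.choice(BUNDLES))
            bundle["Accept-Language"] = LANG_BY_COUNTRY.get(country, LANG_BY_COUNTRY["US"])
            self.session_headers[session_id] = bundle
        return self.session_headers[session_id]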
One bundle per session, never changing. Across sessions, selection is spread over the available bundles. This is what a population of real users looks like.
Hands-on lab
Against /challenges/antibot/header-fingerprint:
- Send a request with default httpx headers and observe the response.
- Send the same request with a coherent Chrome 120 Windows bundle, headers in Chrome's order.
- Now corrupt the bundle: keep the Chrome UA but replace Sec-CH-UA with what Firefox sends, which is nothing, so remove the header. Observe the response.
You'll see exactly which signal the challenge detects. Use that intuition to harden production scrapers: fewer, better fakes beat many random ones.
Practice this lesson on Catalog108, our first-party scraping sandbox.