User-Agents, Why They Matter, How to Set Them
The User-Agent header is the single biggest tell that you're a scraper. Learn what it's for, what real browsers send, and how to use it strategically.
What you’ll learn
- Read and decode a User-Agent string.
- Set a realistic UA on Python `requests` calls and sessions.
- Rotate User-Agents across requests when needed.
- Understand when a UA alone is enough, and when it isn't.
The User-Agent (UA) header is the client introducing itself to the server. It's the first thing many anti-bot systems look at. Get it wrong and you're blocked before any of your other tactics get a chance.
Anatomy of a UA string
A real Chrome on macOS sends:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
It looks chaotic. It's historical: every browser pretends to be every other browser for backwards compatibility. Real fields you can extract:
| Token | Meaning |
|---|---|
| `Mozilla/5.0` | Legacy compatibility marker; every browser sends it |
| `Macintosh; Intel Mac OS X 10_15_7` | OS platform |
| `AppleWebKit/537.36` | Rendering engine |
| `Chrome/120.0.0.0` | The actual browser and version |
| `Safari/537.36` | Legacy compat; WebKit-derived browsers all claim Safari |
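The platform and browser tokens can be pulled out with a couple of regexes. A minimal sketch (the patterns are illustrative, not a full UA parser):

```python
import re

UA = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
      "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")

# The first parenthesised group is the platform token.
platform = re.search(r"\(([^)]*)\)", UA).group(1)

# The first browser-name/version pair that matches is the real browser,
# because Chrome appears before the legacy Safari token.
browser = re.search(r"(Chrome|Firefox|Safari)/([\d.]+)", UA)

print(platform)                             # Macintosh; Intel Mac OS X 10_15_7
print(browser.group(1), browser.group(2))   # Chrome 120.0.0.0
```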
The default `python-requests/2.31.0` lacks all of this structure and is instantly recognisable as a non-browser. Many sites silently block or downgrade content for that UA.
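You can see the giveaway without making a single request: a `requests.Session` carries its default UA in its headers (the version number will match your installed `requests`):

```python
import requests

# What requests sends when you don't set a UA yourself.
s = requests.Session()
print(s.headers["User-Agent"])  # e.g. python-requests/2.31.0
```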
Set a realistic UA
```python
import requests

UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"

r = requests.get(
    "https://practice.scrapingcentral.com/products",
    headers={"User-Agent": UA},
    timeout=10,
)
```
Or, with a session:

```python
s = requests.Session()
s.headers["User-Agent"] = UA
```
That single line solves about 30% of "my scraper gets a different page than my browser" complaints.
Where to get realistic UA strings
Three reliable sources:
- Your own browser: visit https://practice.scrapingcentral.com/ in Chrome/Firefox, open DevTools → Network tab → click any request → Request Headers → copy the `User-Agent` value.
- Public UA lists: user-agents.net, useragentstring.com, kept up to date with current browser versions.
- The `fake-useragent` library:

```python
from fake_useragent import UserAgent

ua = UserAgent()
print(ua.chrome)  # a current Chrome UA string
print(ua.random)  # a random UA from the pool
```
Use a UA that matches a current browser version. UAs claiming Chrome 50 (released 2016) are even more suspicious than the default `python-requests` UA.
Rotating UAs across requests
When scraping at volume, rotate UAs to look like many distinct users:
```python
import random
import requests

UA_POOL = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:122.0) Gecko/20100101 Firefox/122.0",
]

for url in urls:  # urls: your list of target URLs
    headers = {"User-Agent": random.choice(UA_POOL)}
    r = requests.get(url, headers=headers, timeout=10)
```
Caveat: rotation alone doesn't fool serious anti-bot systems. If the same IP rotates UAs on every request, that is itself a signal: real users don't change UA mid-session. Rotate UAs only when you're also rotating IPs (proxies, covered in Lesson 1.7), or rotate slowly: one UA per session, not per request.
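The slow-rotation variant can be sketched like this: pin one UA to each `requests.Session`, so the UA (and cookies) stay stable for a whole "user" and only change between sessions:

```python
import itertools
import requests

UA_POOL = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]

ua_cycle = itertools.cycle(UA_POOL)

def new_session() -> requests.Session:
    """One session = one 'user': pin a single UA for its whole lifetime."""
    s = requests.Session()
    s.headers["User-Agent"] = next(ua_cycle)
    return s

session = new_session()
# for url in batch:                     # batch: the URLs this "user" visits
#     r = session.get(url, timeout=10)  # same UA and cookies throughout
```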
When UA is enough, and when it isn't
UA helps against:
- Lazy blocklists that match `python-requests`, `curl`, `Go-http-client`, etc.
- Content-negotiation servers that serve mobile vs desktop HTML based on UA.
- Naive scrape-detection dashboards looking at top-N UAs by request volume.
UA alone is NOT enough against:
- TLS fingerprinting (JA3, JA4): your UA says Chrome but your TLS handshake says Python.
- HTTP/2 fingerprinting: header order and casing leak the underlying library.
- Behavioural analysis: no mouse movement, perfectly even request timing.
- JavaScript challenges: Cloudflare's "checking your browser" page runs JS that `requests` can't execute.
The Production sub-path covers TLS and HTTP/2 fingerprint matching in depth (libraries like `curl_cffi` and `tls-client`). For static scraping, a realistic UA + a session + reasonable rate limiting handles most public sites.
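For a taste of what that looks like, here is a hedged sketch assuming the third-party `curl_cffi` package (`pip install curl_cffi`) and its `impersonate` parameter, which matches a real browser's TLS/HTTP2 handshake, not just the UA header:

```python
from curl_cffi import requests as cf_requests

r = cf_requests.get(
    "https://practice.scrapingcentral.com/products",
    impersonate="chrome",  # mimic a recent Chrome build's TLS fingerprint
    timeout=10,
)
print(r.status_code)
```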
Match the rest of the browser too
A consistent fingerprint means all headers agree. If you claim Chrome on Windows, send Chrome-on-Windows-style headers:
```python
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    "Accept-Encoding": "gzip, deflate, br",
    "Sec-Ch-Ua": '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
    "Sec-Ch-Ua-Mobile": "?0",
    "Sec-Ch-Ua-Platform": '"Windows"',
    "Sec-Fetch-Dest": "document",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "none",
    "Sec-Fetch-User": "?1",
    "Upgrade-Insecure-Requests": "1",
}
```
Yes, that's a lot. But for any site that's actually checking, the full set is the difference between "served the real page" and "served a blank shell." The `Sec-Ch-Ua` family is Chrome-only; Firefox doesn't send those headers. Match the browser, not just the UA token.
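One way to keep the set consistent is a small helper (hypothetical, for illustration) that emits the client-hint headers only when the UA claims Chrome:

```python
CHROME_WIN = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
FIREFOX_MAC = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:122.0) Gecko/20100101 Firefox/122.0"

def browser_headers(ua: str) -> dict:
    """Build a header set consistent with the claimed browser.
    Firefox never sends Sec-Ch-Ua, so those hints are Chrome-only."""
    headers = {
        "User-Agent": ua,
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.9",
        "Accept-Encoding": "gzip, deflate, br",
    }
    if "Chrome/" in ua:
        headers.update({
            "Sec-Ch-Ua": '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
            "Sec-Ch-Ua-Mobile": "?0",
            "Sec-Ch-Ua-Platform": '"Windows"' if "Windows" in ua else '"macOS"',
        })
    return headers

print("Sec-Ch-Ua" in browser_headers(CHROME_WIN))   # True
print("Sec-Ch-Ua" in browser_headers(FIREFOX_MAC))  # False
```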
How Catalog108's UA-blocklist challenge works
The lab at `/challenges/antibot/ua-blocklist` rejects requests with bot-like UAs (`python-requests`, `curl`, `wget`, `Go-http-client`, the default Java UA) with a 403. Sending a realistic browser UA passes. Use it as a controlled environment to verify your UA strategy works before pointing it at a real target.
Hands-on lab
Hit `/challenges/antibot/ua-blocklist` first with no UA override and confirm you get a 403. Then add a realistic Chrome UA via `headers={"User-Agent": "..."}` and confirm you get a 200. Finally, try a few outdated UAs (Chrome 50, IE 6) and see what the server does. The lab logs what UA you sent in the response body for inspection.
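The three steps can be sketched as a script; `run_lab` hits the live lab only when you call it (the Chrome 50 string below is an illustrative outdated UA):

```python
import requests

BASE = "https://practice.scrapingcentral.com"
CHROME_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
             "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")
OLD_UA = ("Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 "
          "(KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36")

def run_lab() -> None:
    """Walk the lab's three steps; call this yourself against the live target."""
    url = BASE + "/challenges/antibot/ua-blocklist"

    r = requests.get(url, timeout=10)   # 1. default python-requests UA
    print(r.status_code)                # expect 403

    r = requests.get(url, headers={"User-Agent": CHROME_UA}, timeout=10)
    print(r.status_code)                # 2. expect 200
    print(r.text)                       # the lab echoes the UA you sent

    r = requests.get(url, headers={"User-Agent": OLD_UA}, timeout=10)
    print(r.status_code)                # 3. see how Chrome 50 fares
```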
Quiz, check your understanding
Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.