
Lesson 1.5 · Beginner · 5 min read

User-Agents, Why They Matter, How to Set Them

The User-Agent header is the single biggest tell that you're a scraper. Learn what it's for, what real browsers send, and how to use it strategically.

What you’ll learn

  • Read and decode a User-Agent string.
  • Set a realistic UA on Python `requests` calls and sessions.
  • Rotate User-Agents across requests when needed.
  • Understand when a UA alone is enough, and when it isn't.

The User-Agent (UA) header is the client introducing itself to the server. It's the first thing many anti-bot systems look at. Get it wrong and you're blocked before any of your other tactics get a chance.

Anatomy of a UA string

A real Chrome on macOS sends:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36

It looks chaotic because it's historical: every browser pretends to be every other browser for backwards compatibility. The fields you can actually extract:

  • Mozilla/5.0 → legacy compatibility marker; every browser sends it
  • Macintosh; Intel Mac OS X 10_15_7 → OS platform
  • AppleWebKit/537.36 → rendering engine
  • Chrome/120.0.0.0 → the actual browser and version
  • Safari/537.36 → legacy compatibility; WebKit-derived browsers all claim Safari

The default python-requests/2.31.0 has none of this structure and is instantly recognisable as a non-browser. Many sites silently block that UA or serve it downgraded content.
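The tokens above can be pulled apart with a couple of regular expressions. A minimal sketch, using the Chrome-on-macOS string from earlier:

```python
import re

UA = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
      "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")

# The first parenthesised group is the OS/platform token.
platform = re.search(r"\((.*?)\)", UA).group(1)

# The Browser/version token names the actual browser.
browser = re.search(r"(Chrome|Firefox)/([\d.]+)", UA)

print(platform)                            # Macintosh; Intel Mac OS X 10_15_7
print(browser.group(1), browser.group(2))  # Chrome 120.0.0.0
```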

Set a realistic UA

import requests

UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"

r = requests.get(
  "https://practice.scrapingcentral.com/products",
  headers={"User-Agent": UA},
  timeout=10,
)

Or, with a session:

s = requests.Session()
s.headers["User-Agent"] = UA

That single line resolves a large share of "my scraper gets a different page than my browser" complaints.

Where to get realistic UA strings

Three reliable sources:

  1. Your own browser: visit https://practice.scrapingcentral.com/ in Chrome/Firefox, open DevTools → Network tab → click any request → Request Headers → copy the User-Agent value.
  2. Public UA lists: sites such as user-agents.net and useragentstring.com are kept up to date with current browser versions.
  3. The fake-useragent library:
from fake_useragent import UserAgent
ua = UserAgent()
print(ua.chrome)
print(ua.random)

Use a UA that matches a current browser version. UAs claiming Chrome 50 (released 2016) are even more suspicious than the default python-requests UA.

Rotating UAs across requests

When scraping at volume, rotate UAs to look like many distinct users:

import random, requests

UA_POOL = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36",
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:122.0) Gecko/20100101 Firefox/122.0",
]

urls = ["https://practice.scrapingcentral.com/products"]  # your target URLs

for url in urls:
  headers = {"User-Agent": random.choice(UA_POOL)}
  r = requests.get(url, headers=headers, timeout=10)

Caveat: rotation alone doesn't fool serious anti-bot systems. If the same IP rotates UAs on every request, that is its OWN signal: real users don't change UA mid-session. Rotate UAs only when you're also rotating IPs (proxies, covered in Lesson 1.7), or rotate slowly: one UA per session, not per request.
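The "one UA per session" pattern can look like this; a sketch, not a full crawler:

```python
import random
import requests

UA_POOL = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]

def new_session() -> requests.Session:
    """Pick one UA when the session is created; every request on that
    session then carries the same UA, the way a real user would."""
    s = requests.Session()
    s.headers["User-Agent"] = random.choice(UA_POOL)
    return s

s = new_session()
# All requests made with `s` now send a single, consistent UA.
```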

When UA is enough, and when it isn't

UA helps against:

  • Lazy blocklists that match python-requests, curl, Go-http-client, etc.
  • Content negotiation servers that serve mobile vs desktop HTML based on UA.
  • Naive scrape-detection dashboards looking at top-N UAs by request volume.

UA alone is NOT enough against:

  • TLS fingerprinting (JA3, JA4): your UA says Chrome but your TLS handshake says Python.
  • HTTP/2 fingerprinting: header order and casing leak the underlying library.
  • Behavioural analysis: no mouse movement, perfectly even request timing.
  • JavaScript challenges: Cloudflare's "checking your browser" page runs JavaScript that a plain HTTP client can't execute.

The Production sub-path covers TLS and HTTP/2 fingerprint matching in depth (libraries like curl_cffi and tls-client). For static scraping, a realistic UA + a session + reasonable rate limiting handles most public sites.

Match the rest of the browser too

A consistent fingerprint means all headers agree. If you claim Chrome on Windows, send Chrome-on-Windows-style headers:

headers = {
  "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
  "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
  "Accept-Language": "en-US,en;q=0.9",
  "Accept-Encoding": "gzip, deflate, br",
  "Sec-Ch-Ua": '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
  "Sec-Ch-Ua-Mobile": "?0",
  "Sec-Ch-Ua-Platform": '"Windows"',
  "Sec-Fetch-Dest": "document",
  "Sec-Fetch-Mode": "navigate",
  "Sec-Fetch-Site": "none",
  "Sec-Fetch-User": "?1",
  "Upgrade-Insecure-Requests": "1",
}

Yes, that's a lot. But for any site that's actually checking, the full set is the difference between being served the real page and being served a blank shell. Note that the Sec-Ch-Ua family is Chromium-only; Firefox doesn't send those headers. Match the whole browser, not just the UA token.
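As a guard against mismatches, you can cross-check the header set before sending. `platform_consistent` is a hypothetical helper written for this lesson, not part of any library:

```python
def platform_consistent(headers: dict) -> bool:
    """Check that the OS claimed in Sec-Ch-Ua-Platform also appears in the
    User-Agent's platform token (hypothetical helper, not a library API)."""
    ua = headers.get("User-Agent", "")
    platform = headers.get("Sec-Ch-Ua-Platform", "").strip('"')
    # Substrings a matching UA should contain for each claimed platform.
    markers = {"Windows": "Windows NT", "macOS": "Macintosh", "Linux": "X11; Linux"}
    return platform in markers and markers[platform] in ua

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Sec-Ch-Ua-Platform": '"Windows"',
}
print(platform_consistent(headers))  # True
```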

How Catalog108's UA-blocklist challenge works

The lab at /challenges/antibot/ua-blocklist rejects requests with bot-like UAs (python-requests, curl, wget, Go-http-client, default Java UA) with a 403. Sending a realistic browser UA passes. Use it as a controlled environment to verify your UA strategy works before pointing it at a real target.

Hands-on lab

Hit /challenges/antibot/ua-blocklist first with no UA override and confirm you get a 403. Then add a realistic Chrome UA via headers={"User-Agent": "..."} and confirm you get a 200. Finally, try a few outdated UAs (Chrome 50, IE 6) and see what the server does. The lab echoes the UA you sent in the response body for inspection.


Quiz, check your understanding

Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.

Question 1 of 8

What does `Mozilla/5.0` at the start of every modern browser's User-Agent indicate?
