Scraping Central is reader-supported. When you buy through links on our site, we may earn an affiliate commission.

Web Scraping Headers and Cookies - Complete Guide

Master HTTP headers and cookies for web scraping. Learn which headers to set, how to manage cookies, and how to avoid detection.

Proper HTTP headers and cookie management are fundamental to successful web scraping. They determine whether a website treats your scraper as a legitimate browser or blocks it as a bot.

Essential HTTP Headers

User-Agent

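Usually the first header a site checks. By default, the requests library identifies itself as python-requests/<version>, which many sites block on sight, so set a realistic browser User-Agent instead.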

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
}

Full Browser-Like Headers

headers = {
    # Use a complete User-Agent string; a truncated one looks unlike any real browser
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
    # requests can only decode "br" if the brotli package is installed; drop it otherwise
    "Accept-Encoding": "gzip, deflate, br",
    "Connection": "keep-alive",
    "Upgrade-Insecure-Requests": "1",
    "Sec-Fetch-Dest": "document",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "none",
    "Sec-Fetch-User": "?1",
    "DNT": "1",
}

Rotating User Agents

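Rotate between several user agents so that repeated requests do not all present the same browser identity. Keep the other headers plausible for whichever browser the User-Agent claims to be.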

import random

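# Full User-Agent strings, as a real Chrome browser would send them
user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
]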

headers["User-Agent"] = random.choice(user_agents)
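One possible refinement, sketched here under the assumption that you want a fresh header set per request or per session: build a new dictionary each time instead of mutating a shared one. The names USER_AGENTS, BASE_HEADERS, and build_headers are illustrative, not part of any library.

```python
import random

# Illustrative pool of complete Chrome-style User-Agent strings.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
]

# Headers that stay the same regardless of which User-Agent is chosen.
BASE_HEADERS = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
}


def build_headers(rng=random):
    """Return a new header dict with a randomly chosen User-Agent.

    BASE_HEADERS is copied, so the shared template is never modified.
    """
    headers = dict(BASE_HEADERS)
    headers["User-Agent"] = rng.choice(USER_AGENTS)
    return headers


# A seeded random.Random makes the choice reproducible while debugging.
headers = build_headers(random.Random(0))
print(headers["User-Agent"])
```

Passing the result as headers=build_headers() to each request, or once per session, keeps every request self-consistent.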

Cookie Management

Using Sessions

import requests

session = requests.Session()

# First request sets cookies
session.get("https://example.com")

# Subsequent requests include cookies automatically
resp = session.get("https://example.com/data")
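To see what a session would actually send, you can prepare a request without dispatching it. The sketch below assumes a cookie named session_id on example.com (a made-up value standing in for whatever the server would set) and inspects the outgoing headers offline.

```python
import requests

session = requests.Session()

# Headers set on the session are merged into every request it makes.
session.headers.update({
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.5",
})

# Stand-in for a cookie the server would normally set (hypothetical value).
session.cookies.set("session_id", "abc123", domain="example.com", path="/")

# Build the request without sending it, so nothing touches the network.
prepared = session.prepare_request(
    requests.Request("GET", "https://example.com/data")
)

print(prepared.headers["User-Agent"])
print(prepared.headers.get("Cookie"))
```

This is a handy way to confirm that session-level headers and stored cookies end up on the request before you point the scraper at a real site.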

Manual Cookie Handling

cookies = {
    "session_id": "abc123",
    "consent": "accepted",
}

resp = requests.get("https://example.com", cookies=cookies)
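If you copy a Cookie header value out of your browser's DevTools, it arrives as a single "name=value; name2=value2" string rather than a dictionary. A minimal sketch of converting it with the standard library follows; cookie_header_to_dict is an illustrative helper name, and unusual cookie values may need extra handling.

```python
from http.cookies import SimpleCookie


def cookie_header_to_dict(raw):
    """Parse a 'name=value; name2=value2' Cookie header string into a dict."""
    jar = SimpleCookie()
    jar.load(raw)
    # Each entry is a Morsel; .value holds the decoded cookie value.
    return {name: morsel.value for name, morsel in jar.items()}


cookies = cookie_header_to_dict("session_id=abc123; consent=accepted")
print(cookies)
```

The resulting dictionary can be passed directly as the cookies= argument shown above.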

Headers That Get You Blocked

| Header Problem | Risk Level | Effect |
| --- | --- | --- |
| No User-Agent | Critical | Instant block |
| No Accept-Language | Medium | Flagged as suspicious |
| Wrong Referer | Medium | May get redirected |
| No Sec-Fetch headers | Low-Medium | Detected by advanced systems |
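The checks in this table can be turned into a quick pre-flight audit of a header dictionary. The following is only a sketch: audit_headers is a hypothetical helper, the risk labels are copied from the table, and a wrong Referer cannot be detected without knowing the site's navigation flow, so it is left out.

```python
def audit_headers(headers):
    """Return warnings for the missing-header problems listed in the table."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    present = {name.lower() for name in headers}
    warnings = []
    if "user-agent" not in present:
        warnings.append("critical: no User-Agent")
    if "accept-language" not in present:
        warnings.append("medium: no Accept-Language")
    if not any(name.startswith("sec-fetch-") for name in present):
        warnings.append("low-medium: no Sec-Fetch headers")
    return warnings


for warning in audit_headers({"User-Agent": "Mozilla/5.0"}):
    print(warning)
```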

The Easy Way: Let ScraperAPI Handle It

ScraperAPI automatically sets appropriate headers, manages cookies, and rotates fingerprints.

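API_KEY = "YOUR_API_KEY"  # placeholder for your ScraperAPI key

# No need to manage headers or cookies manually.
# params= URL-encodes the target URL for you.
resp = requests.get(
    "http://api.scraperapi.com",
    params={"api_key": API_KEY, "url": "https://example.com"},
)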

Best Practices

  1. Always set a User-Agent: the bare minimum for any scraper.
  2. Use requests.Session(): it maintains cookies across requests automatically.
  3. Match header order: some sites check the order in which headers arrive.
  4. Use a service such as ScrapingAnt or ScraperAPI to automate header management.
  5. Check response codes: a 403 often means the request was flagged, and headers are a common cause.
  6. Copy headers from your browser: use DevTools to see exactly what headers your browser sends.
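As a rough illustration of the "check response codes" advice, the sketch below maps a status code to a suggested next step. The diagnose function and its messages are assumptions made for this example. The mapping is a starting point rather than a rule, since sites use status codes inconsistently.

```python
def diagnose(status_code):
    """Map an HTTP status code to a rough hint about what to check next."""
    if 200 <= status_code < 300:
        return "ok"
    if status_code == 403:
        return "forbidden: review headers and cookies"
    if status_code == 429:
        return "rate limited: slow down and retry later"
    if 500 <= status_code < 600:
        return "server error: retry later"
    return "unexpected status: inspect the response body"


# With requests, the code is available as resp.status_code.
for code in (200, 403, 429, 503):
    print(code, diagnose(code))
```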