
3.6 · beginner · 4 min read

Copy as cURL → Working Python Request

Take a captured browser request and translate it to clean Python in under 60 seconds. The single most-used micro-skill of API scraping.

What you’ll learn

  • Convert a captured curl command to idiomatic Python `requests` code.
  • Recognise which curl flags map to which `requests` arguments.
  • Use `curlconverter` and `httpie` for fast conversion.
  • Avoid the four common translation bugs (JSON vs data, params encoding, cookies, compression).

This is the lesson where you start writing real scrapers. You captured a curl in F12; now you convert it to Python you can loop, retry, and ship.

The translation is mechanical once you've done it a few times. Get the mapping into muscle memory.

The captured curl

After right-clicking → Copy as cURL on Catalog108's /products page, you might get:

curl 'https://practice.scrapingcentral.com/api/products?page=1&category=mugs' \
  -H 'accept: application/json' \
  -H 'accept-encoding: gzip, deflate, br' \
  -H 'accept-language: en-US,en;q=0.9' \
  -H 'cookie: session=abc123' \
  -H 'referer: https://practice.scrapingcentral.com/products' \
  -H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36' \
  --compressed

The Python translation, line by line

import requests

r = requests.get(
    "https://practice.scrapingcentral.com/api/products",
    params={"page": 1, "category": "mugs"},
    headers={
        "Accept": "application/json",
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
        "Referer": "https://practice.scrapingcentral.com/products",
    },
    cookies={"session": "abc123"},
)
data = r.json()
print(len(data["products"]), "products")

What changed:

  • Query string (?page=1&category=mugs) → params= dict. requests URL-encodes for you.
  • -H headers → headers= dict. Drop the accept-encoding line (requests sends its own and decompresses gzip/deflate automatically; br too if Brotli support is installed).
  • -H 'cookie: ...' → cookies= dict (parse out the name=value pairs).
  • --compressed → no-op; requests decompresses by default.
  • Method: curl defaults to GET, and so does requests.get. If the curl was a POST (-X POST), use requests.post.
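You can verify the params= translation without touching the network: prepare the request (but don't send it) and inspect the URL requests builds. A quick sanity check, using the same endpoint as above:

```python
import requests

# Prepare (but don't send) the request to see the final URL.
req = requests.Request(
    "GET",
    "https://practice.scrapingcentral.com/api/products",
    params={"page": 1, "category": "mugs"},
).prepare()

print(req.url)
# https://practice.scrapingcentral.com/api/products?page=1&category=mugs
```

The prepared URL should match the query string from the captured curl exactly.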

POST body translations

If the captured curl was a POST:

curl 'https://practice.scrapingcentral.com/api/auth/login' \
  -X POST \
  -H 'content-type: application/json' \
  --data-raw '{"email":"student@practice.scrapingcentral.com","password":"practice123"}'

Python:

r = requests.post(
    "https://practice.scrapingcentral.com/api/auth/login",
    json={
        "email": "student@practice.scrapingcentral.com",
        "password": "practice123",
    },
)
token = r.json()["access_token"]
token = r.json()["access_token"]

Two notes:

  • Use json= (NOT data=) when the content-type is application/json. requests will serialize and set the header for you.
  • Use data= for form-encoded bodies (-d 'email=...&password=...', content-type application/x-www-form-urlencoded).
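You can see the difference without a server: preparing the same payload with json= and with data= yields different bodies and Content-Type headers. A sketch, reusing the practice login endpoint from above:

```python
import requests

url = "https://practice.scrapingcentral.com/api/auth/login"
payload = {"email": "student@practice.scrapingcentral.com", "password": "practice123"}

# json= serializes the dict to a JSON body and sets the header for you.
as_json = requests.Request("POST", url, json=payload).prepare()
# data= form-encodes the same dict instead.
as_form = requests.Request("POST", url, data=payload).prepare()

print(as_json.headers["Content-Type"])  # application/json
print(as_form.headers["Content-Type"])  # application/x-www-form-urlencoded
print(as_form.body)  # email=...&password=... (form-encoded, not JSON)
```

Sending the form-encoded variant to a JSON endpoint is exactly the bug described in the next section.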

Mapping table: curl flags to requests arguments

curl → requests
-H 'name: value' → headers={"Name": "value"}
-b 'cookie=value' or -H 'cookie: ...' → cookies={"cookie": "value"}
--data-raw '...' (form-encoded) → data="..."
--data-raw '{"a":1}' (JSON) → json={"a": 1}
-X POST / -X PUT → requests.post(...) / requests.put(...)
-u user:pass → auth=("user", "pass")
--compressed → (default; ignore)
-L (follow redirects) → (default; pass allow_redirects=False to disable)
--proxy http://... → proxies={"http": "...", "https": "..."}
-k (insecure TLS) → verify=False (don't use in production)
--max-time 10 → timeout=10
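As one worked row from the table: -u user:pass becomes auth=("user", "pass"), and requests builds the Basic Authorization header at prepare time, so you can inspect it without sending anything. The credentials here are placeholders:

```python
import requests

# curl -u user:pass ...  →  auth=("user", "pass")
req = requests.Request(
    "GET",
    "https://practice.scrapingcentral.com/api/products",
    auth=("user", "pass"),
).prepare()

print(req.headers["Authorization"])  # Basic dXNlcjpwYXNz
```

That header value is just base64("user:pass"), which is why Basic auth over plain HTTP is no better than sending the password in the clear.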

The four most common bugs

  1. data= vs json=. Beginners use data= with a dict for a JSON endpoint. requests form-encodes the dict, the server receives email=...&password=... instead of the JSON it expects, and rejects the request with a 400. Fix: use json= for JSON endpoints.

  2. Manually building query strings. Don't:

# bad, easy to forget URL-encoding
url = f"https://...?page={page}&category={category}"

Use params= instead. It encodes spaces, ampersands, and unicode correctly.
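To see why, prepare a request with values that would break a hand-built query string (the category and search values here are hypothetical):

```python
import requests

# Values with spaces, an ampersand, and a non-ASCII character.
req = requests.Request(
    "GET",
    "https://practice.scrapingcentral.com/api/products",
    params={"category": "coffee mugs", "q": "café & tea"},
).prepare()

print(req.url)
# Spaces become +, the & inside the value becomes %26, é becomes %C3%A9.
```

An f-string version of the same URL would leave the raw & in place and silently split the value into two parameters.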

  3. Hard-coding cookies. The session cookie expires. Either capture a fresh login flow in your scraper (see lesson 3.16) or use a requests.Session() that handles cookies after a login call.

  4. Missing the auth flow. The captured curl works because the session was already logged in. Your scraper has no session yet. Run the login first:

s = requests.Session()
s.post(".../api/auth/login", json={"email": "...", "password": "..."})
# now s has the cookies set; subsequent calls reuse them
r = s.get(".../api/products")

Tool-assisted conversion

For unfamiliar curls, paste into curlconverter.com for instant Python (and 12 other languages). For local conversion:

pip install curlconverter
echo "curl 'https://practice.scrapingcentral.com/api/products' -H 'accept: application/json'" \
  | curlconverter -l python

The output is verbose: it copies every header verbatim. Prune what's not needed (see lesson 3.8). Use auto-converters for speed, but read the output critically.

A 60-second loop

Combine the workflow from F12 with this lesson and you've got a complete scraper in about a minute:

  1. Open /products in browser, Network → Fetch/XHR, reload.
  2. Right-click /api/products → Copy as cURL.
  3. Paste in terminal, confirm it works.
  4. Translate to Python (manually for now, it builds intuition).
  5. Wrap in a loop:

import requests

def fetch_page(page):
    r = requests.get(
        "https://practice.scrapingcentral.com/api/products",
        params={"page": page, "per_page": 50},
    )
    return r.json()["products"]

all_products = []
for page in range(1, 6):
    all_products.extend(fetch_page(page))
print(len(all_products))

  6. Done.
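The intro promised code you can loop, retry, and ship. Retrying is out of scope for this lesson, but here is a minimal sketch using requests' adapter mounting with urllib3's Retry; the retry count, backoff, and status codes are illustrative choices, not prescribed values:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry transient failures (429, 5xx) up to 3 times with exponential backoff.
session = requests.Session()
retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[429, 500, 502, 503])
session.mount("https://", HTTPAdapter(max_retries=retry))

# Use session.get(...) exactly like requests.get(...) from here on;
# every request through this session inherits the retry policy.
```

A Session also reuses TCP connections across the pagination loop, which is faster than calling requests.get five times.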

Hands-on lab

Open /api/products in DevTools, copy as cURL, and translate to Python by hand. Run it. Then change one query parameter (per_page=50) and confirm the response is different. Then loop over page=1..5 and collect everything. You've just built a paginated API scraper; the same skill scales to thousands of records and dozens of endpoints.


Quiz: check your understanding

Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.

Question 1 of 8

Which `requests` argument should you use for a POST with `content-type: application/json`?
