Copy as cURL → Working Python Request
Take a captured browser request and translate it to clean Python in under 60 seconds. The single most-used micro-skill of API scraping.
What you’ll learn
- Convert a captured curl command to idiomatic Python `requests` code.
- Recognise which curl flags map to which `requests` arguments.
- Use `curlconverter` and `httpie` for fast conversion.
- Avoid the four common translation bugs (`data=` vs `json=`, hand-built query strings, hard-coded cookies, missing auth).
This is the lesson where you start writing real scrapers. You captured a curl in F12; now you convert it to Python you can loop, retry, and ship.
The translation is mechanical once you've done it a few times. Get the mapping into muscle memory.
The captured curl
After right-clicking → Copy as cURL on Catalog108's /products page, you might get:
curl 'https://practice.scrapingcentral.com/api/products?page=1&category=mugs' \
-H 'accept: application/json' \
-H 'accept-encoding: gzip, deflate, br' \
-H 'accept-language: en-US,en;q=0.9' \
-H 'cookie: session=abc123' \
-H 'referer: https://practice.scrapingcentral.com/products' \
-H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36' \
--compressed
The Python translation, line by line
import requests

r = requests.get(
    "https://practice.scrapingcentral.com/api/products",
    params={"page": 1, "category": "mugs"},
    headers={
        "Accept": "application/json",
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
        "Referer": "https://practice.scrapingcentral.com/products",
    },
    cookies={"session": "abc123"},
)

data = r.json()
print(len(data["products"]), "products")
What changed:
- Query string (`?page=1&category=mugs`) → `params=` dict. `requests` URL-encodes it for you (see the sketch below).
- `-H` headers → `headers=` dict. Drop the `accept-encoding` line; `requests` negotiates gzip/deflate on its own (and br when the brotli package is installed).
- `-H 'cookie: ...'` → `cookies=` dict (parse out the `name=value` pairs).
- `--compressed` → no-op; `requests` decompresses by default.
- Method: curl defaults to GET, and so does `requests.get`. If the curl was a POST (`-X POST`), use `requests.post`.
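You can watch the encoding happen without sending anything. A quick sketch; the `category` value with a space is contrived to show the point:

```python
import requests

# Build the request but don't send it: .prepare() exposes the final
# URL that params= produces.
req = requests.Request(
    "GET",
    "https://practice.scrapingcentral.com/api/products",
    params={"page": 1, "category": "coffee mugs"},  # note the space
).prepare()

print(req.url)
# the space comes out URL-encoded; no manual quoting needed
```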
POST body translations
If the captured curl was a POST:
curl 'https://practice.scrapingcentral.com/api/auth/login' \
-X POST \
-H 'content-type: application/json' \
--data-raw '{"email":"student@practice.scrapingcentral.com","password":"practice123"}'
Python:
r = requests.post(
    "https://practice.scrapingcentral.com/api/auth/login",
    json={
        "email": "student@practice.scrapingcentral.com",
        "password": "practice123",
    },
)
token = r.json()["access_token"]
Two notes:
- Use `json=` (not `data=`) when the content-type is `application/json`; `requests` serializes the dict and sets the header for you.
- Use `data=` for form-encoded bodies (`-d 'email=...&password=...'`, content-type `application/x-www-form-urlencoded`). The two styles are contrasted in the sketch below.
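To make the difference concrete, here are the two body styles side by side against the same practice endpoint; only the keyword argument changes:

```python
import requests

payload = {
    "email": "student@practice.scrapingcentral.com",
    "password": "practice123",
}

# JSON body: requests serializes the dict and sets
# Content-Type: application/json for you.
requests.post("https://practice.scrapingcentral.com/api/auth/login",
              json=payload)

# Form body: requests URL-encodes the dict and sets
# Content-Type: application/x-www-form-urlencoded.
requests.post("https://practice.scrapingcentral.com/api/auth/login",
              data=payload)
```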
Mapping table: curl flags → requests arguments
| curl | requests |
|---|---|
| `-H 'name: value'` | `headers={"Name": "value"}` |
| `-b 'cookie=value'` or `-H 'cookie: ...'` | `cookies={"cookie": "value"}` |
| `--data-raw '...'` (form-encoded) | `data="..."` |
| `--data-raw '{"a":1}'` (JSON) | `json={"a": 1}` |
| `-X POST` / `-X PUT` | `requests.post(...)` / `requests.put(...)` |
| `-u user:pass` | `auth=("user", "pass")` |
| `--compressed` | (default; ignore) |
| `-L` (follow redirects) | (default; pass `allow_redirects=False` to disable) |
| `--proxy http://...` | `proxies={"http": "...", "https": "..."}` |
| `-k` (insecure TLS) | `verify=False` (don't use in production) |
| `--max-time 10` | `timeout=10` |
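Several rows combined: a hypothetical curl carrying auth, a proxy, and a timeout, and its translation. Every value here is a placeholder, not something captured from the practice site:

```python
import requests

# curl -u user:pass --proxy http://127.0.0.1:8080 --max-time 10 \
#   'https://practice.scrapingcentral.com/api/products'
r = requests.get(
    "https://practice.scrapingcentral.com/api/products",
    auth=("user", "pass"),                # -u user:pass
    proxies={
        "http": "http://127.0.0.1:8080",  # --proxy (both schemes)
        "https": "http://127.0.0.1:8080",
    },
    timeout=10,                           # --max-time 10
)
print(r.status_code)
```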
The four most common bugs
- `data=` vs `json=`. Beginners use `data=` with a dict for a JSON endpoint. `requests` form-encodes the dict, the server receives `email=...&password=...` under a form content-type instead of the JSON it expects, and rejects it with a 400. Fix: `json=` for JSON endpoints.
- Manually building query strings. Don't:

  # bad, easy to forget URL-encoding
  url = f"https://...?page={page}&category={category}"

  Use `params=` instead. It encodes spaces, ampersands, and unicode correctly.
- Hard-coding cookies. The session cookie expires. Either capture a fresh login flow in your scraper (see lesson 3.16) or use a `requests.Session()` that carries the cookies after a login call.
- Missing the auth flow. The captured curl works because the browser session was already logged in; your scraper has no session yet. Run the login first (a hardened version of this flow follows the list):
s = requests.Session()
s.post(".../api/auth/login", json={"email": "...", "password": "..."})
# now s has the cookies set; subsequent calls reuse them
r = s.get(".../api/products")
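If you want that flow to fail loudly instead of silently parsing an error page, here is a minimal hardened sketch using the full practice URLs from earlier (the timeout values are arbitrary):

```python
import requests

s = requests.Session()

login = s.post(
    "https://practice.scrapingcentral.com/api/auth/login",
    json={"email": "student@practice.scrapingcentral.com",
          "password": "practice123"},
    timeout=10,
)
login.raise_for_status()  # raise immediately on a 4xx/5xx login

# The session now carries the auth cookies.
r = s.get("https://practice.scrapingcentral.com/api/products", timeout=10)
r.raise_for_status()
print(len(r.json()["products"]), "products")
```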
Tool-assisted conversion
For unfamiliar curls, paste into curlconverter.com for instant Python (and 12 other languages). For local conversion:
pip install curlconverter
echo "curl 'https://practice.scrapingcentral.com/api/products' -H 'accept: application/json'" \
| curlconverter -l python
The output is verbose: it copies every header verbatim. Prune what's not needed (see lesson 3.8). Use auto-converters for speed, but read the output critically.
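As a rough illustration of that pruning step (this is not literal converter output; the commented-out headers are typical browser noise):

```python
import requests

headers = {
    "accept": "application/json",  # keep: tells the API what you want
    # "accept-language": "en-US,en;q=0.9",  # usually droppable
    # "sec-ch-ua": '"Chromium";v="120"',    # browser fingerprint noise
    # "sec-fetch-mode": "cors",             # ditto
    "user-agent": "Mozilla/5.0 ...",        # keep (or set your own)
}

r = requests.get(
    "https://practice.scrapingcentral.com/api/products",
    headers=headers,
)
```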
A 60-second loop
Combine the workflow from F12 with this lesson and you've got a complete scraper in about a minute:
- Open `/products` in the browser, Network → Fetch/XHR, reload.
- Right-click `/api/products` → Copy as cURL.
- Paste in a terminal, confirm it works.
- Translate to Python (manually for now; it builds intuition).
- Wrap in a loop:
import requests

def fetch_page(page):
    r = requests.get(
        "https://practice.scrapingcentral.com/api/products",
        params={"page": page, "per_page": 50},
    )
    return r.json()["products"]

all_products = []
for page in range(1, 6):
    all_products.extend(fetch_page(page))
print(len(all_products))
- Done.
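One refinement once the page count isn't known up front: stop when a page comes back short. A sketch, assuming the API honors `per_page` and returns fewer items only on the last page:

```python
import requests

def fetch_all(per_page=50):
    """Page through /api/products until a short page signals the end."""
    products, page = [], 1
    while True:
        r = requests.get(
            "https://practice.scrapingcentral.com/api/products",
            params={"page": page, "per_page": per_page},
            timeout=10,
        )
        batch = r.json()["products"]
        products.extend(batch)
        if len(batch) < per_page:  # short page => no more pages
            return products
        page += 1

print(len(fetch_all()))
```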
Hands-on lab
Open `/api/products` in DevTools, copy as cURL, and translate to Python by hand. Run it. Then change one query parameter (`per_page=50`) and confirm the response changes. Then loop over `page=1..5` and collect everything. You've just built a paginated API scraper; the same skill scales to thousands of records and dozens of endpoints.
Practice this lesson on Catalog108, our first-party scraping sandbox.
Open lab target → `/api/products`
Quiz: check your understanding
Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.