Copy as cURL and Why It's a Superpower
Right-click → Copy → Copy as cURL. The single most time-saving habit in web scraping. Here's how to use it, what to strip, and how to translate to Python or PHP.
What you’ll learn
- Capture any browser request as a one-line curl command.
- Identify the minimum viable subset of headers: what to keep, what to throw away.
- Translate a captured curl command to Python `requests` and PHP Guzzle.
- Use the result as the starting point for a programmatic scraper, not a one-off.
The single most useful workflow in scraping:
- Reproduce the data fetch in your browser.
- Network panel → right-click the request → Copy → Copy as cURL (or "Copy as cURL (bash)" on Windows).
- Paste into your terminal. It runs. You see the JSON.
- Translate to your language. Ship.
You skip every "what headers does this endpoint need," "do I need to log in first," "what's the auth flow" question. The browser already worked it out; you're just reading off the answer.
What Copy as cURL gives you
A captured request might look like:
```bash
curl 'https://practice.scrapingcentral.com/api/products?page=2&category=mugs' \
  -H 'accept: application/json' \
  -H 'accept-language: en-US,en;q=0.9' \
  -H 'cookie: session=abc123; csrf=xyz789' \
  -H 'referer: https://practice.scrapingcentral.com/products' \
  -H 'sec-ch-ua: "Chromium";v="120"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "macOS"' \
  -H 'sec-fetch-dest: empty' \
  -H 'sec-fetch-mode: cors' \
  -H 'sec-fetch-site: same-origin' \
  -H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 ...' \
  --compressed
```
Eleven headers, most of them noise. The art is figuring out which are load-bearing.
Minimum viable request: strip everything
Run the curl as-is. It works. Now start deleting headers one at a time and re-running. When the response changes, you've found a required header.
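If re-running by hand gets tedious, the elimination can be scripted. A minimal sketch in `requests`, assuming a GET endpoint; the header values are abbreviated placeholders from the capture above, and comparing status codes alone is crude (a header can also change the response body):

```python
import requests

URL = "https://practice.scrapingcentral.com/api/products?page=2&category=mugs"

# The captured header set (values abbreviated; paste in your own).
captured = {
    "accept": "application/json",
    "accept-language": "en-US,en;q=0.9",
    "cookie": "session=abc123; csrf=xyz789",
    "referer": "https://practice.scrapingcentral.com/products",
    "user-agent": "Mozilla/5.0 ...",
}

baseline = requests.get(URL, headers=captured)

# Drop one header at a time; a changed status code flags a load-bearing header.
for name in captured:
    trimmed = {k: v for k, v in captured.items() if k != name}
    r = requests.get(URL, headers=trimmed)
    verdict = "required" if r.status_code != baseline.status_code else "droppable"
    print(f"{name}: {verdict} ({r.status_code})")
```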
The pattern almost always reduces to a small subset:
| Header | Usually required? | Notes |
|---|---|---|
| Cookie | Yes, when auth is involved | Strip if the endpoint is public |
| User-Agent | Sometimes | Some endpoints 403 without a browser-shaped UA |
| Referer | Occasionally | Anti-leech endpoints check this |
| Accept | Rarely | Default is usually fine |
| Authorization | Yes, for bearer auth | The real auth, separate from cookies |
| X-CSRF-Token | On POST forms only | |
| Origin | On cross-origin POSTs | CORS preflight |
The `sec-ch-ua-*` headers are Client Hints; most servers ignore them. The `sec-fetch-*` headers are usually ignorable too. `accept-language` and `accept-encoding` are nice-to-haves. Delete one, re-test, repeat.
Aim for the smallest set that still works. A 3-header request is faster, less fingerprintable, and easier to maintain than a 15-header one.
Translating to Python requests
Once you've trimmed the curl, the Python is mechanical:
```python
import requests

r = requests.get(
    "https://practice.scrapingcentral.com/api/products",
    params={"page": 2, "category": "mugs"},
    headers={
        "Accept": "application/json",
        "Cookie": "session=abc123",
    },
)
print(r.status_code, r.json())
```
Notes:
- Query parameters go in `params`, not in the URL string. Cleaner, auto-encoded.
- `headers` is a dict; copy each `-H` line, splitting on `:`.
- `--data-raw '{"foo":"bar"}'` becomes `json={"foo": "bar"}` (or `data=...` for form-encoded).
- `requests` decompresses gzip/br automatically; ignore `--compressed`.
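The `--data-raw` mapping in practice: a minimal sketch against a hypothetical `/api/cart` endpoint (the URL and payload are illustrative, not part of the practice site's documented API):

```python
import requests

# Hypothetical endpoint; the point is the mapping:
# --data-raw '{"sku":"MUG-01","qty":1}'  ->  json={"sku": "MUG-01", "qty": 1}
r = requests.post(
    "https://practice.scrapingcentral.com/api/cart",
    headers={
        "Accept": "application/json",
        "Cookie": "session=abc123",
        "X-CSRF-Token": "xyz789",  # POSTs often require the CSRF token (see table above)
    },
    json={"sku": "MUG-01", "qty": 1},  # requests sets Content-Type: application/json
)
print(r.status_code, r.json())
```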
Translating to PHP Guzzle
Same process:
```php
use GuzzleHttp\Client;

$client = new Client(['base_uri' => 'https://practice.scrapingcentral.com']);

$res = $client->get('/api/products', [
    'query' => ['page' => 2, 'category' => 'mugs'],
    'headers' => [
        'Accept' => 'application/json',
        'Cookie' => 'session=abc123',
    ],
]);

$data = json_decode($res->getBody()->getContents(), true);
```
Both translations are mostly typing. The hard work (discovering the URL, headers, and auth) happened in DevTools.
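Even the "copy each `-H` line, split on `:`" step is scriptable. A minimal sketch, assuming the single-quoted form Chrome emits; `headers_from_curl` is a hypothetical helper, not a library function:

```python
import shlex

def headers_from_curl(curl_cmd: str) -> dict[str, str]:
    """Pull every -H 'name: value' flag out of a Copy-as-cURL command."""
    tokens = shlex.split(curl_cmd)
    headers = {}
    for i, tok in enumerate(tokens):
        if tok == "-H" and i + 1 < len(tokens):
            # Split only on the first colon: header values may contain colons.
            name, _, value = tokens[i + 1].partition(":")
            headers[name.strip()] = value.strip()
    return headers

print(headers_from_curl(
    "curl 'https://practice.scrapingcentral.com/api/products' "
    "-H 'accept: application/json' -H 'cookie: session=abc123'"
))
# {'accept': 'application/json', 'cookie': 'session=abc123'}
```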
Tools that automate the translation
When you're learning, type the translation yourself; it builds intuition. After that, automate:
- curlconverter.com: paste a curl command, get Python / PHP / Node / Java / Ruby back. Free, in-browser.
- `pip install curlconverter`, then `curlconverter -l python <(echo "<curl>")` for a local version.
- Most IDEs have a "Convert curl" plugin.
The auto-converters are good. Read the output and prune unnecessary headers manually; they tend to copy everything verbatim.
The "Copy as cURL" pitfalls
Three to watch:
- Cookies expire. The captured cookie works until the session times out, so a scraper that ships with the cookie hard-coded will silently break within a day. Either capture the auth flow too (Sub-Path 3) or refresh the cookie manually; a defensive sketch follows this list.
- CSRF tokens are single-use. A POST that worked once with a captured CSRF token will fail on the next call; the server expects a new token per request. Capture the flow that issues the token, not just the final POST.
- Anti-bot endpoints check fingerprints, not just headers. Some endpoints look at the TLS fingerprint, header order, and header casing. Your curl call passes; your Python call fails because Python orders headers differently. This falls under TLS fingerprinting, covered in Sub-Path 3.
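For the first pitfall, the cheapest mitigation is to keep the cookie out of the code and fail loudly when it dies. A minimal sketch; `SCRAPER_SESSION` is a hypothetical environment variable name:

```python
import os
import sys

import requests

# Hypothetical env var: rotate the cookie without touching code.
cookie = os.environ.get("SCRAPER_SESSION")
if not cookie:
    sys.exit("Set SCRAPER_SESSION (copy the session cookie from DevTools).")

r = requests.get(
    "https://practice.scrapingcentral.com/api/products",
    headers={"Cookie": f"session={cookie}"},
)
# Expired sessions usually surface as a 401/403 or a redirect to a login page;
# requests follows redirects by default, so r.url reveals where we landed.
if r.status_code in (401, 403) or "login" in r.url:
    sys.exit("Session expired; capture a fresh cookie.")
print(r.json())
```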
Variants of "Copy as"
DevTools also offers:
- Copy as fetch: produces a JS `fetch()` call. Useful for pasting back into the Console to re-test in the same browser context.
- Copy as PowerShell: Windows-native shell command (uses `Invoke-WebRequest`).
- Copy as cURL (cmd): Windows cmd-friendly escaping.
- Copy response: just the body. Useful when prototyping a parser without re-fetching.
Mac and Linux: use the plain "Copy as cURL." Windows users: "Copy as cURL (bash)" if you have WSL, otherwise "(cmd)."
Putting it together: a 60-second workflow
- Visit `practice.scrapingcentral.com/products` in a browser.
- DevTools → Network → Fetch/XHR.
- Reload. Find the `/api/products` call.
- Right-click → Copy as cURL.
- Paste in terminal. Run. See JSON.
- Delete `sec-ch-ua-*`, `sec-fetch-*`, `accept-language`, `accept-encoding`. Re-run. Still works.
- Translate to Python. Five lines.
- Add a loop over `?page=N` (sketched after this list). Twenty lines.
- Done. The scraper is built. Time elapsed: under 5 minutes.
Hands-on lab
Open practice.scrapingcentral.com/products in your browser, find the XHR that loads the catalog, and Copy as cURL. Run it in your terminal, confirm you get JSON. Then start stripping headers one at a time until you have the minimum set. Translate that minimum to Python requests. Loop over ?page=1..N. You've just built your first real API-style scraper without touching HTML.