Raw cURL in PHP: Foundations Every PHP Dev Must Know
The libcurl bindings ship with every PHP install. Master them, and every HTTP library you use later makes more sense.
What you’ll learn
- Initialise a cURL handle and set options with `curl_setopt`.
- Execute requests and read response data and headers.
- Handle errors, timeouts, and HTTP status codes.
- Understand why higher-level clients (Guzzle, Symfony) exist.
PHP ships with bindings to libcurl, the same library the curl command-line tool wraps. It's verbose, but it's everywhere: every PHP install has it, every host enables it, and every higher-level client builds on it. You should know how to use it raw at least once.
The four-step pattern
Every cURL call in PHP follows the same shape:
<?php
// 1. Init
$ch = curl_init('https://practice.scrapingcentral.com/products');
// 2. Configure
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; learning-scraper)');
// 3. Execute
$body = curl_exec($ch);
// 4. Close
curl_close($ch);
echo substr($body, 0, 500);
That's it. The verbosity comes from the dozens of options you can set; the core pattern is always init → setopt → exec → close.
The options that matter
| Option | Purpose |
|---|---|
| `CURLOPT_URL` | Target URL (also settable via `curl_init()`) |
| `CURLOPT_RETURNTRANSFER` | Return the body from `curl_exec` instead of printing it |
| `CURLOPT_TIMEOUT` | Total time limit in seconds |
| `CURLOPT_CONNECTTIMEOUT` | Time limit for establishing the connection |
| `CURLOPT_USERAGENT` | The User-Agent header |
| `CURLOPT_FOLLOWLOCATION` | Follow 3xx redirects automatically |
| `CURLOPT_MAXREDIRS` | Cap on redirect chain length |
| `CURLOPT_HTTPHEADER` | Array of custom request headers |
| `CURLOPT_COOKIEJAR` | File to write Set-Cookie data to |
| `CURLOPT_COOKIEFILE` | File to read cookies from |
| `CURLOPT_POST` | Switch to the POST method |
| `CURLOPT_POSTFIELDS` | Body for POST (array → multipart/form-data, string → sent as-is) |
| `CURLOPT_HEADER` | Include response headers in the returned body |
| `CURLOPT_NOBODY` | HEAD request, headers only |
| `CURLOPT_SSL_VERIFYPEER` | Verify the server's TLS cert (default true; leave it on) |
| `CURLOPT_PROXY` | Route through a proxy |
CURLOPT_RETURNTRANSFER is the one you forget once and then never forget again. Without it, curl_exec prints the response directly to output and returns just true/false. With it (which is what you want 99% of the time), the body is returned as a string.
Inspect what you got back
$body = curl_exec($ch);
$info = curl_getinfo($ch);
$error = curl_error($ch);
$errno = curl_errno($ch);
print_r($info);
// Array
// (
// [url] => https://practice.scrapingcentral.com/products
// [http_code] => 200
// [total_time] => 0.234
// [primary_ip] => 185.93.228.150
// [size_download] => 14823
// [content_type] => text/html; charset=UTF-8
// ...
// )
curl_getinfo($ch) returns an associative array with everything you need: status code, final URL, timing, content type, byte counts, IP info. Use it before curl_close.
curl_errno($ch) returns 0 on success. Non-zero with a non-empty curl_error($ch) means the request failed at the network or protocol layer (DNS, connection refused, timeout, TLS), not a 4xx/5xx (which IS a successful HTTP exchange that returned an error status).
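That distinction is worth encoding explicitly in your own code. A minimal sketch (the function name and return labels are mine, not part of the cURL API):

```php
<?php
// Classify the outcome of a cURL exchange.
// $errno comes from curl_errno(), $httpCode from curl_getinfo()['http_code'].
function classify_result(int $errno, int $httpCode): string {
    if ($errno !== 0) {
        return 'transport-error';   // DNS failure, connection refused, timeout, TLS...
    }
    if ($httpCode >= 400) {
        return 'http-error';        // the exchange succeeded; the server said no
    }
    return 'ok';
}

echo classify_result(28, 0), "\n";    // 28 = CURLE_OPERATION_TIMEDOUT → transport-error
echo classify_result(0, 404), "\n";   // → http-error
echo classify_result(0, 200), "\n";   // → ok
```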
Headers in, headers out
Send custom headers:
curl_setopt($ch, CURLOPT_HTTPHEADER, [
'Accept: application/json',
'X-Requested-With: XMLHttpRequest',
'Authorization: Bearer ' . $token,
]);
Note the format: an array of "Name: value" strings, not an associative array. This is a quirk of the cURL bindings.
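If you prefer keeping headers as an associative array in your own code, converting at the boundary takes a few lines; a sketch (the helper name is mine):

```php
<?php
// Convert ['Accept' => 'application/json'] into ['Accept: application/json'],
// the flat "Name: value" format CURLOPT_HTTPHEADER expects.
function to_curl_headers(array $assoc): array {
    $lines = [];
    foreach ($assoc as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }
    return $lines;
}

print_r(to_curl_headers(['Accept' => 'application/json', 'X-Token' => 'abc']));
```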
Read response headers, two options:
- Set `CURLOPT_HEADER => true` and `curl_exec` returns headers + body concatenated. You then have to split them yourself (the `header_size` field from `curl_getinfo` tells you where the split point is).
- Set `CURLOPT_HEADERFUNCTION` to a callback that's invoked once per header line. Cleaner for programmatic access:
$headers = [];
curl_setopt($ch, CURLOPT_HEADERFUNCTION, function ($ch, $line) use (&$headers) {
$headers[] = $line;
return strlen($line);
});
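The collected lines are raw CRLF-terminated strings, including the status line and a blank terminator line. Turning them into a lookup map takes a few more lines; a sketch (the helper name is mine; header names are lowercased because HTTP header names are case-insensitive, and repeated headers like Set-Cookie overwrite earlier values in this simple version):

```php
<?php
function parse_header_lines(array $lines): array {
    $headers = [];
    foreach ($lines as $line) {
        $line = trim($line);
        // Skip the status line ("HTTP/1.1 200 OK") and the blank terminator:
        // neither contains a colon-separated name/value pair.
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[strtolower(trim($name))] = trim($value);
    }
    return $headers;
}

$parsed = parse_header_lines([
    "HTTP/1.1 200 OK\r\n",
    "Content-Type: text/html; charset=UTF-8\r\n",
    "\r\n",
]);
echo $parsed['content-type'];  // text/html; charset=UTF-8
```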
POST with form data
$ch = curl_init('https://practice.scrapingcentral.com/account/login');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query([
'username' => 'student@practice.scrapingcentral.com',
'password' => 'practice123',
]));
$body = curl_exec($ch);
http_build_query() URL-encodes the array into username=...&password=... form. Pass it as a string to CURLOPT_POSTFIELDS and cURL sends Content-Type: application/x-www-form-urlencoded automatically.
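You can see the encoding without any network call; `http_build_query` is pure string manipulation:

```php
<?php
// Reserved characters are percent-encoded ('@' → %40); with the default
// RFC 1738 encoding, spaces become '+'.
echo http_build_query([
    'username' => 'student@practice.scrapingcentral.com',
    'password' => 'practice123',
]);
// username=student%40practice.scrapingcentral.com&password=practice123
```

Note that passing the PHP array itself to CURLOPT_POSTFIELDS makes cURL send multipart/form-data instead, which many form endpoints don't expect; building the string yourself keeps the request as plain application/x-www-form-urlencoded.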
For JSON, set the header AND pass a JSON string:
$payload = json_encode(['name' => 'New product', 'price' => 9.99]);
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
curl_setopt($ch, CURLOPT_POSTFIELDS, $payload);
Cookies, the file-based approach
Persist cookies across multiple cURL calls:
$cookieFile = tempnam(sys_get_temp_dir(), 'cookies');
// First request, log in, write cookies to file
$ch = curl_init('https://practice.scrapingcentral.com/account/login');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(['username' => '...', 'password' => '...']));
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_exec($ch);
curl_close($ch);
// Second request, read cookies from file
$ch = curl_init('https://practice.scrapingcentral.com/dashboard');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);
$dashboard = curl_exec($ch);
Inelegant compared to Python's Session, but it works. You can also use one cURL handle for the entire session: set both COOKIEJAR and COOKIEFILE to the same file (or set CURLOPT_COOKIEFILE to the empty string to enable an in-memory cookie engine), and reuse the handle across calls with curl_setopt($ch, CURLOPT_URL, $newUrl).
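The jar file cURL writes uses the Netscape cookie format: one cookie per tab-separated line (domain, subdomain flag, path, secure flag, expiry, name, value), with `#` comment lines. A rough inspection-only parser, assuming well-formed lines (the helper name is mine):

```php
<?php
function parse_cookie_jar(string $contents): array {
    $cookies = [];
    foreach (explode("\n", $contents) as $line) {
        $line = trim($line);
        // Skip blanks and pure comment lines. Lines prefixed "#HttpOnly_"
        // are real cookies and still contain tabs, so they pass through.
        if ($line === '' || ($line[0] === '#' && strpos($line, "\t") === false)) {
            continue;
        }
        $fields = explode("\t", $line);
        if (count($fields) === 7) {
            $cookies[$fields[5]] = $fields[6];  // name => value
        }
    }
    return $cookies;
}

$jar = ".practice.scrapingcentral.com\tTRUE\t/\tFALSE\t0\tsession\tabc123";
print_r(parse_cookie_jar($jar));  // [session] => abc123
```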
When to leave raw cURL behind
Raw cURL is fine for:
- Quick scripts where adding a Composer dep feels like overkill.
- Hosts where you genuinely can't install Composer packages.
- Debugging, understanding what higher-level libraries actually do.
For anything bigger, switch to Guzzle (Lesson 1.9) or Symfony HttpClient (Lesson 1.10). They wrap cURL but add structure, retries, middleware, async, and dramatically cleaner ergonomics.
A reusable wrapper
If you must stay on raw cURL, at least extract a helper:
function http_get(string $url, array $headers = [], int $timeout = 10): array {
$ch = curl_init($url);
curl_setopt_array($ch, [
CURLOPT_RETURNTRANSFER => true,
CURLOPT_FOLLOWLOCATION => true,
CURLOPT_TIMEOUT => $timeout,
CURLOPT_USERAGENT => 'Mozilla/5.0 (compatible; my-scraper)',
CURLOPT_HTTPHEADER => $headers,
]);
$body = curl_exec($ch);
$info = curl_getinfo($ch);
$err = curl_error($ch);
curl_close($ch);
if ($body === false) {
throw new RuntimeException("cURL error: $err");
}
return ['body' => $body, 'status' => $info['http_code'], 'url' => $info['url']];
}
That gives you a Python-requests-shaped API in 15 lines.
Hands-on lab
Fetch /products with raw cURL. Print the HTTP status code from curl_getinfo, the content type, and the first 500 bytes of the body. Then add a CURLOPT_USERAGENT and confirm the response changes if you switch between a realistic browser UA and a deliberate bot-like one like 'curl/7.0'.
Practice this lesson on Catalog108, our first-party scraping sandbox; the lab target is /products.

Quiz: check your understanding
Pass mark is 70%. Pick the best answer; you'll see the explanation right after.