Lesson 1.8 · Beginner · 5 min read

Raw cURL in PHP: Foundations Every PHP Dev Must Know

The libcurl bindings are available on nearly every PHP install. Master them, and every HTTP library you use later makes more sense.

What you’ll learn

  • Initialise a cURL handle and set options with `curl_setopt`.
  • Execute requests and read response data and headers.
  • Handle errors, timeouts, and HTTP status codes.
  • Understand why higher-level clients (Guzzle, Symfony) exist.

PHP ships with bindings to libcurl, the same library that the curl command-line tool wraps. The API is verbose, but it's everywhere: nearly every PHP install has it, nearly every host enables it, and most higher-level clients build on it. You should know how to use it raw at least once.

The four-step pattern

Every cURL call in PHP follows the same shape:

<?php
// 1. Init
$ch = curl_init('https://practice.scrapingcentral.com/products');

// 2. Configure
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; learning-scraper)');

// 3. Execute
$body = curl_exec($ch);

// 4. Close (a no-op since PHP 8.0, where the handle is freed automatically)
curl_close($ch);

echo substr($body, 0, 500);

That's it. The verbosity comes from the dozens of options you can set; the core pattern is always init → setopt → exec → close.

The options that matter

Option | Purpose
--- | ---
CURLOPT_URL | Target URL (also settable in curl_init())
CURLOPT_RETURNTRANSFER | Return the body from curl_exec instead of printing it
CURLOPT_TIMEOUT | Total time limit for the request, in seconds
CURLOPT_CONNECTTIMEOUT | Time limit for establishing the connection, in seconds
CURLOPT_USERAGENT | The User-Agent header
CURLOPT_FOLLOWLOCATION | Follow 3xx redirects automatically
CURLOPT_MAXREDIRS | Cap on redirect chain length
CURLOPT_HTTPHEADER | Array of custom request headers
CURLOPT_COOKIEJAR | File to write received cookies to
CURLOPT_COOKIEFILE | File to read cookies from
CURLOPT_POST | Switch to POST method
CURLOPT_POSTFIELDS | Body for POST (array → multipart/form-data, string → sent as-is, form-urlencoded by default)
CURLOPT_HEADER | Include response headers in the returned body
CURLOPT_NOBODY | HEAD request, headers only
CURLOPT_SSL_VERIFYPEER | Verify the server's TLS cert (enabled by default; leave it on)
CURLOPT_PROXY | Route through a proxy

CURLOPT_RETURNTRANSFER is the one you forget once and then never forget again. Without it, curl_exec prints the response directly to output and returns just true/false. With it (which is what you want 99% of the time), the body is returned as a string.

Inspect what you got back

$body  = curl_exec($ch);
$info  = curl_getinfo($ch);
$error = curl_error($ch);
$errno = curl_errno($ch);

print_r($info);
// Array
// (
//  [url] => https://practice.scrapingcentral.com/products
//  [http_code] => 200
//  [total_time] => 0.234
//  [primary_ip] => 185.93.228.150
//  [size_download] => 14823
//  [content_type] => text/html; charset=UTF-8
//  ...
// )

curl_getinfo($ch) returns an associative array with everything you need: status code, final URL, timing, content type, byte counts, IP info. Use it before curl_close.
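
As a small illustration, the fields most scrapers care about can be pulled out of that array with a helper. This is only a sketch: `summarize_info` is a hypothetical name, not part of the cURL extension, and the sample array simply mirrors the output shown above.

```php
<?php
// Hypothetical helper: reduce the curl_getinfo() array to a one-line summary.
// Missing or null fields fall back to neutral defaults.
function summarize_info(array $info): string
{
  return sprintf(
    '%d %s (%s, %d bytes, %.3fs)',
    $info['http_code'] ?? 0,
    $info['url'] ?? '',
    $info['content_type'] ?? 'unknown',
    $info['size_download'] ?? 0,
    $info['total_time'] ?? 0.0
  );
}

// Sample values copied from the print_r() output above.
$info = [
  'url'           => 'https://practice.scrapingcentral.com/products',
  'http_code'     => 200,
  'total_time'    => 0.234,
  'size_download' => 14823,
  'content_type'  => 'text/html; charset=UTF-8',
];

echo summarize_info($info), PHP_EOL;
```

A line like this is handy in logs, where dumping the whole array for every request would be too noisy.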

curl_errno($ch) returns 0 on success. A non-zero value, accompanied by a non-empty curl_error($ch), means the request failed at the network or protocol layer (DNS, connection refused, timeout, TLS). A 4xx or 5xx response is not such a failure: it is a completed HTTP exchange that happened to return an error status, so check http_code separately.
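
That distinction can be made explicit in code. The sketch below uses a hypothetical `classify_result` helper (the name and the three labels are illustrative assumptions) that takes the value of `curl_errno($ch)` and the `http_code` field from `curl_getinfo($ch)`:

```php
<?php
// Hypothetical helper: separate transport failures from HTTP error statuses.
// $errno comes from curl_errno($ch); $status from curl_getinfo($ch)['http_code'].
function classify_result(int $errno, int $status): string
{
  if ($errno !== 0) {
    return 'transport_error'; // DNS, refused connection, timeout, TLS
  }
  if ($status >= 400) {
    return 'http_error';      // the exchange completed; the server returned 4xx/5xx
  }
  return 'ok';
}

// 28 is libcurl's error number for a timed-out operation.
echo classify_result(28, 0), PHP_EOL;
echo classify_result(0, 404), PHP_EOL;
echo classify_result(0, 200), PHP_EOL;
```

Keeping the two checks separate matters for retries: a timeout is usually worth retrying, while a 404 usually is not.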

Headers in, headers out

Send custom headers:

curl_setopt($ch, CURLOPT_HTTPHEADER, [
  'Accept: application/json',
  'X-Requested-With: XMLHttpRequest',
  'Authorization: Bearer ' . $token,
]);

Note the format: an array of "Name: value" strings, not an associative array. This is a quirk of the cURL bindings.

To read response headers, you have two options:

  1. Set CURLOPT_HEADER => true and curl_exec returns headers + body concatenated. You then have to split them yourself.
  2. Set CURLOPT_HEADERFUNCTION to a callback that's invoked once per header line. Cleaner for programmatic access:
$headers = [];
curl_setopt($ch, CURLOPT_HEADERFUNCTION, function ($ch, $line) use (&$headers) {
  $headers[] = $line;
  return strlen($line);
});
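
Either way you end up with raw text rather than a lookup table. The sketch below shows one possible way to finish the job for both options. `split_response` and `parse_header_lines` are hypothetical helpers, not part of the cURL extension; the header size passed to `split_response` is assumed to come from `curl_getinfo($ch, CURLINFO_HEADER_SIZE)`.

```php
<?php
// Option 1 (hypothetical helper): cut the combined response at the header size
// reported by curl_getinfo($ch, CURLINFO_HEADER_SIZE).
function split_response(string $raw, int $headerSize): array
{
  return [substr($raw, 0, $headerSize), substr($raw, $headerSize)];
}

// Option 2 (hypothetical helper): turn collected header lines into a map.
// Names are lower-cased because HTTP header names are case-insensitive.
// Repeated headers (e.g. Set-Cookie) overwrite each other in this sketch.
function parse_header_lines(array $lines): array
{
  $parsed = [];
  foreach ($lines as $line) {
    $line = trim($line);
    // Skip the status line ("HTTP/1.1 200 OK") and the blank terminator.
    if ($line === '' || strpos($line, ':') === false) {
      continue;
    }
    [$name, $value] = explode(':', $line, 2);
    $parsed[strtolower(trim($name))] = trim($value);
  }
  return $parsed;
}

// Sample lines shaped like what a CURLOPT_HEADERFUNCTION callback collects.
$lines = [
  "HTTP/1.1 200 OK\r\n",
  "Content-Type: text/html; charset=UTF-8\r\n",
  "Content-Length: 14823\r\n",
  "\r\n",
];
print_r(parse_header_lines($lines));
```

Note that with CURLOPT_FOLLOWLOCATION enabled, the headers of every hop in the redirect chain are delivered, so a real parser may want to reset its state whenever a new status line arrives.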

POST with form data

$ch = curl_init('https://practice.scrapingcentral.com/account/login');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query([
  'username' => 'student@practice.scrapingcentral.com',
  'password' => 'practice123',
]));
$body = curl_exec($ch);

http_build_query() URL-encodes the array into username=...&password=... form. Pass it as a string to CURLOPT_POSTFIELDS and cURL sends Content-Type: application/x-www-form-urlencoded automatically.
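
To see what actually goes on the wire, it helps to print the encoded string on its own. The snippet below is a standalone illustration using the same practice credentials as above; the second call uses a made-up `q` field purely to show how spaces and ampersands are escaped. Passing the array itself to CURLOPT_POSTFIELDS, rather than this string, would make cURL send multipart/form-data instead.

```php
<?php
$fields = [
  'username' => 'student@practice.scrapingcentral.com',
  'password' => 'practice123',
];

// Reserved characters such as "@" are percent-encoded.
$encoded = http_build_query($fields);
echo $encoded, PHP_EOL;
// username=student%40practice.scrapingcentral.com&password=practice123

// Spaces and ampersands inside values are escaped too, so they cannot
// break the key=value&key=value structure.
$search = http_build_query(['q' => 'fish & chips']);
echo $search, PHP_EOL;
// q=fish+%26+chips
```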

For JSON, set the header AND pass a JSON string:

$payload = json_encode(['name' => 'New product', 'price' => 9.99]);
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
curl_setopt($ch, CURLOPT_POSTFIELDS, $payload);
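
One thing to watch: `json_encode` returns false when it cannot encode the data, and that false would otherwise be handed straight to CURLOPT_POSTFIELDS. A minimal sketch of a safer encoding step follows; `json_body` is a hypothetical helper name, and it assumes PHP 7.3 or later for JSON_THROW_ON_ERROR.

```php
<?php
// Hypothetical helper: encode a payload and fail loudly on bad input.
function json_body(array $data): string
{
  // JSON_THROW_ON_ERROR makes json_encode throw JsonException
  // instead of silently returning false.
  return json_encode($data, JSON_THROW_ON_ERROR);
}

$payload = json_body(['name' => 'New product', 'price' => 9.99]);
echo $payload, PHP_EOL;

// A lone continuation byte is not valid UTF-8, so this cannot be encoded.
$failed = false;
try {
  json_body(['name' => "\xB1\x31"]);
} catch (JsonException $e) {
  $failed = true;
  echo 'Encoding failed: ', $e->getMessage(), PHP_EOL;
}
```

The resulting `$payload` string is what you would pass to CURLOPT_POSTFIELDS alongside the Content-Type header shown above.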

Cookies: the file-based approach

Persist cookies across multiple cURL calls:

$cookieFile = tempnam(sys_get_temp_dir(), 'cookies');

// First request: log in and write the cookies to the file
$ch = curl_init('https://practice.scrapingcentral.com/account/login');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(['username' => '...', 'password' => '...']));
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_exec($ch);
curl_close($ch);

// Second request: read the cookies back from the file
$ch = curl_init('https://practice.scrapingcentral.com/dashboard');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);
$dashboard = curl_exec($ch);

It's inelegant compared to the Session object in Python's requests library, but it works. You can also use one cURL handle for the entire session: set both CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE to the same file (or set CURLOPT_COOKIEFILE to '' to keep cookies in memory only), and reuse the handle across calls with curl_setopt($ch, CURLOPT_URL, $newUrl).
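
The single-handle variant might look like the sketch below. It assumes PHP 8.0 or later (for the `CurlHandle` type); `make_session` and `session_get` are hypothetical helper names, and the request at the end is left commented out because it needs network access.

```php
<?php
// Hypothetical helper: one handle that keeps cookies in memory for its lifetime.
function make_session(): CurlHandle
{
  $ch = curl_init();
  curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_TIMEOUT        => 10,
    CURLOPT_COOKIEFILE     => '', // empty string: enable the cookie engine, no file
  ]);
  return $ch;
}

// Hypothetical helper: point the shared handle at a new URL and fetch it.
function session_get(CurlHandle $ch, string $url): string
{
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_HTTPGET, true); // switch back to GET after any earlier POST
  $body = curl_exec($ch);
  if ($body === false) {
    throw new RuntimeException('cURL error: ' . curl_error($ch));
  }
  return $body;
}

$session = make_session();
// $dashboard = session_get($session, 'https://practice.scrapingcentral.com/dashboard');
```

Because options persist on a reused handle, anything set for one request (method, body, headers) stays in effect for the next unless you reset it, which is why the sketch switches back to GET explicitly.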

When to leave raw cURL behind

Raw cURL is fine for:

  • Quick scripts where adding a Composer dep feels like overkill.
  • Hosts where you genuinely can't install Composer packages.
  • Debugging: understanding what higher-level libraries actually do.

For anything bigger, switch to Guzzle (Lesson 1.9) or Symfony HttpClient (Lesson 1.10). They wrap cURL but add structure, retries, middleware, async, and dramatically cleaner ergonomics.

A reusable wrapper

If you must stay on raw cURL, at least extract a helper:

function http_get(string $url, array $headers = [], int $timeout = 10): array {
  $ch = curl_init($url);
  curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_TIMEOUT        => $timeout,
    CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; my-scraper)',
    CURLOPT_HTTPHEADER     => $headers,
  ]);
  $body = curl_exec($ch);
  $info = curl_getinfo($ch);
  $err  = curl_error($ch);
  curl_close($ch);
  if ($body === false) {
    throw new RuntimeException("cURL error: $err");
  }
  return ['body' => $body, 'status' => $info['http_code'], 'url' => $info['url']];
}

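That gives you a Python-requests-shaped API in under 20 lines. Note that it throws only on transport errors; a 4xx or 5xx response comes back normally, with the code in 'status'.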

Hands-on lab

Fetch /products with raw cURL. Print the HTTP status code and content type from curl_getinfo, plus the first 500 bytes of the body. Then set CURLOPT_USERAGENT and check whether the response changes when you switch between a realistic browser UA and a deliberately bot-like one such as 'curl/7.0'.

Practice this lesson on Catalog108, our first-party scraping sandbox.

