Copy as cURL → Working PHP Request (Guzzle)
Same captured curl, translated to idiomatic PHP with Guzzle. The minimum-viable client a senior PHP scraper writes.
What you’ll learn
- Convert curl headers, query strings, and bodies to Guzzle options.
- Use base_uri, query, headers, json, and form_params correctly.
- Handle cookies via Guzzle's CookieJar.
- Recognise the most common Guzzle translation bugs.
This is the PHP companion to the previous lesson. Guzzle is the de facto standard HTTP client in the PHP ecosystem and is widely used in Laravel, Symfony, and standalone projects. The translation mechanics differ from Python's requests, but the workflow is identical.
The captured curl, again
curl 'https://practice.scrapingcentral.com/api/products?page=1&category=mugs' \
-H 'accept: application/json' \
-H 'cookie: session=abc123' \
-H 'referer: https://practice.scrapingcentral.com/products' \
-H 'user-agent: Mozilla/5.0' \
--compressed
The Guzzle translation
<?php
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;

// One client per target site; base_uri lets each call use a relative path.
$client = new Client([
    'base_uri' => 'https://practice.scrapingcentral.com',
    'timeout' => 10,
]);

$res = $client->get('/api/products', [
    'query' => ['page' => 1, 'category' => 'mugs'],
    'headers' => [
        'Accept' => 'application/json',
        'User-Agent' => 'Mozilla/5.0',
        'Referer' => 'https://practice.scrapingcentral.com/products',
        'Cookie' => 'session=abc123',
    ],
]);

// Passing true decodes JSON objects as associative arrays.
$data = json_decode($res->getBody()->getContents(), true);
echo count($data['products']) . " products\n";
What changed:
- base_uri set once on the client → URLs in calls become relative paths. Cleaner.
- Query string in the captured URL → query option (associative array).
- Each -H line → entry in the headers option.
- --compressed can be dropped: Guzzle's decode_content option defaults to true, so gzip- or deflate-encoded responses are decoded automatically. To ask the server for compression explicitly, send an Accept-Encoding header.
- json_decode(..., true) returns an associative array (the true argument switches the result from objects to arrays).
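The query and json_decode points can be checked without touching the network. The sketch below uses only built-in PHP functions. It assumes Guzzle encodes an array passed as query the way http_build_query does with RFC 3986 encoding; the extra q parameter and the JSON payload are made-up samples, not real data from the practice site.

```php
<?php
// What an array passed as 'query' looks like once encoded.
// Assumption: RFC 3986 encoding, so a space becomes %20 rather than +.
// The 'q' parameter is hypothetical, added only to show the escaping.
$query = ['page' => 1, 'category' => 'mugs', 'q' => 'blue mug'];
$encoded = http_build_query($query, '', '&', PHP_QUERY_RFC3986);
echo $encoded, "\n";

// Hypothetical response body, shaped like the products endpoint.
$json = '{"products":[{"id":1,"name":"Mug"}],"page":1}';

$asArray = json_decode($json, true); // nested associative arrays
$asObject = json_decode($json);      // nested stdClass objects

echo $asArray['products'][0]['name'], "\n";
echo $asObject->products[0]->name, "\n";
```

Mixing the two access styles is a frequent source of "Cannot use object of type stdClass as array" errors, so it helps to pick one and use it everywhere.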
POST body translations
For a JSON POST:
curl 'https://.../api/auth/login' \
-X POST \
-H 'content-type: application/json' \
--data-raw '{"email":"...","password":"..."}'
PHP:
$res = $client->post('/api/auth/login', [
    'json' => [
        'email' => 'student@practice.scrapingcentral.com',
        'password' => 'practice123',
    ],
]);
// Cast the body stream to a string before decoding; this also works under strict_types.
$token = json_decode((string) $res->getBody(), true)['access_token'];
For a form-encoded POST (application/x-www-form-urlencoded):
$res = $client->post('/api/login-form', [
    'form_params' => [
        'email' => '...',
        'password' => '...',
    ],
]);
For multipart (multipart/form-data, e.g. file upload):
$res = $client->post('/api/upload', [
    'multipart' => [
        ['name' => 'file', 'contents' => fopen('/path/to/file', 'r')],
        ['name' => 'description', 'contents' => 'My file'],
    ],
]);
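To see what json and form_params actually put on the wire, the requests can be captured instead of sent. This is a minimal sketch using Guzzle's MockHandler and history middleware, so nothing leaves the machine. The example.test host, the /json and /form paths, and the field values are placeholders chosen for illustration.

```php
<?php
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\Psr7\Response;

// Two canned responses, so no real network call is made.
$mock = new MockHandler([new Response(200), new Response(200)]);
$stack = HandlerStack::create($mock);

// Record every request that reaches the handler.
$history = [];
$stack->push(Middleware::history($history));

// Placeholder host; nothing is actually contacted.
$client = new Client(['handler' => $stack, 'base_uri' => 'https://example.test']);

$fields = ['email' => 'a@example.test', 'password' => 'secret'];
$client->post('/json', ['json' => $fields]);
$client->post('/form', ['form_params' => $fields]);

// Print the Content-Type and raw body of each captured request.
foreach ($history as $entry) {
    $request = $entry['request'];
    echo $request->getHeaderLine('Content-Type'), "\n";
    echo (string) $request->getBody(), "\n";
}
```

Comparing this output with the captured curl's content-type header and --data-raw value is a quick way to confirm the right option was chosen.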
Mapping table, curl flags to Guzzle options
| curl | Guzzle option |
|---|---|
| -H 'name: value' | 'headers' => ['Name' => 'value'] |
| -b / -H 'cookie: ...' | 'cookies' => $jar (see below) or set Cookie header |
| --data-raw 'a=b' (form) | 'form_params' => ['a' => 'b'] |
| --data-raw '{"a":1}' (JSON) | 'json' => ['a' => 1] |
| -X POST / -X PUT | $client->post(...) / $client->put(...) |
| -u user:pass | 'auth' => ['user', 'pass'] |
| -L (follow redirects) | 'allow_redirects' => true (default) |
| --max-time 10 | 'timeout' => 10 |
| -k (insecure TLS) | 'verify' => false (don't ship this) |
| --proxy http://... | 'proxy' => 'http://...' |
| ?a=1&b=2 (query string) | 'query' => ['a' => 1, 'b' => 2] |
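As a worked use of the table, here is a hypothetical curl command that combines several of these flags, followed by the options array it would map to. The host, path, credentials, and proxy address are placeholders, not taken from the practice site. The array is plain PHP, so it can be inspected before being passed as the second argument to $client->post().

```php
<?php
// Hypothetical captured command (placeholders throughout):
//   curl 'https://example.test/api/search?a=1&b=2' \
//     -X POST -u user:pass -L --max-time 10 \
//     --proxy http://127.0.0.1:8080 \
//     -H 'accept: application/json' \
//     -H 'content-type: application/json' \
//     --data-raw '{"term":"mugs"}'

$options = [
    'query' => ['a' => 1, 'b' => 2],               // ?a=1&b=2
    'auth' => ['user', 'pass'],                    // -u user:pass
    'allow_redirects' => true,                     // -L (already the default)
    'timeout' => 10,                               // --max-time 10
    'proxy' => 'http://127.0.0.1:8080',            // --proxy
    'headers' => ['Accept' => 'application/json'], // -H 'accept: ...'
    'json' => ['term' => 'mugs'],                  // JSON body; sets Content-Type too
];

// With a Guzzle client in scope, this would be sent as:
//   $res = $client->post('/api/search', $options);
foreach ($options as $name => $value) {
    echo $name, ' => ', json_encode($value), "\n";
}
```

The content-type header from the capture is deliberately not copied into headers, because the json option already sets it.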
CookieJar for session-aware scraping
When the scraper needs to log in, then make subsequent requests with the issued cookies, use Guzzle's CookieJar:
use GuzzleHttp\Client;
use GuzzleHttp\Cookie\CookieJar;

$jar = new CookieJar();
$client = new Client([
    'base_uri' => 'https://practice.scrapingcentral.com',
    'cookies' => $jar,
]);

// Login, sets cookies on $jar
$client->post('/api/auth/login', [
    'json' => [
        'email' => 'student@practice.scrapingcentral.com',
        'password' => 'practice123',
    ],
]);

// Subsequent calls automatically include the cookies
$res = $client->get('/api/auth/me');
// Cast the body stream to a string before decoding.
$me = json_decode((string) $res->getBody(), true);
print_r($me);
A Client that shares one CookieJar across requests is the closest PHP equivalent of Python's requests.Session().
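The cookie round trip can be observed offline. The sketch below swaps the real server for Guzzle's MockHandler: the first canned response sets a session cookie, and the history middleware records what the second request sends. The cookie value and both response bodies are invented for the demonstration and say nothing about what the practice site really returns.

```php
<?php
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Cookie\CookieJar;
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\Psr7\Response;

// Canned responses (hypothetical): the "login" sets a cookie, the second returns JSON.
$mock = new MockHandler([
    new Response(200, ['Set-Cookie' => 'session=abc123; Path=/'], '{}'),
    new Response(200, ['Content-Type' => 'application/json'], '{"ok":true}'),
]);
$stack = HandlerStack::create($mock);

// Record the requests after the cookie middleware has run.
$history = [];
$stack->push(Middleware::history($history));

$jar = new CookieJar();
$client = new Client([
    'handler' => $stack,
    'base_uri' => 'https://practice.scrapingcentral.com',
    'cookies' => $jar,
]);

$client->post('/api/auth/login', [
    'json' => ['email' => 'student@practice.scrapingcentral.com', 'password' => 'practice123'],
]);
$client->get('/api/auth/me');

// The jar captured the cookie from the first response...
echo $jar->getCookieByName('session')->getValue(), "\n";
// ...and the second request carried it back in its Cookie header.
echo $history[1]['request']->getHeaderLine('Cookie'), "\n";
```

Dumping the jar this way is also a handy debugging step against a live site when a login appears to succeed but later calls come back unauthenticated.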
Four most common Guzzle translation bugs
- json vs body. Use 'json' => $array to send a JSON body. 'body' => json_encode($array) works but forces you to set Content-Type manually. Stick with json for JSON endpoints.
- form_params vs multipart. form_params produces application/x-www-form-urlencoded (cheap, key=value). multipart produces multipart/form-data (heavier, supports files). Pick based on what the server expects.
- Hard-coded Cookie header vs CookieJar. Setting 'Cookie' => 'session=abc' in headers works for one-shot calls but doesn't update across login redirects. Use CookieJar when there's a real session.
- Forgetting that ->getContents() consumes the stream. After $res->getBody()->getContents(), the stream pointer is at the end, so a second getContents() returns an empty string. Cache the string in a variable.
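The last bug is easy to reproduce with nothing but core PHP. In this sketch a php://temp stream stands in for the response body, on the assumption that the PSR-7 body behaves like the underlying resource when it is read twice; the JSON payload is a made-up sample.

```php
<?php
// A plain PHP stream standing in for the response body (an assumption for
// illustration; a real Guzzle body wraps a resource much like this one).
$stream = fopen('php://temp', 'r+');
fwrite($stream, '{"products":[]}');
rewind($stream);

$first = stream_get_contents($stream);  // reads everything up to the end
$second = stream_get_contents($stream); // pointer is already at the end

var_dump($first);  // the full JSON string
var_dump($second); // an empty string

// Safer pattern: read once, keep the string, decode from the variable.
rewind($stream);
$raw = stream_get_contents($stream);
$data = json_decode($raw, true);
echo count($data['products']), "\n";

fclose($stream);
```

Casting the body with (string) is a common alternative, since PSR-7 requires a stream's string conversion to attempt a seek to the start before reading.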
A complete, production-shaped sample
<?php
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Cookie\CookieJar;
use GuzzleHttp\Exception\RequestException;

class Catalog108 {
    private Client $client;
    private CookieJar $jar;

    public function __construct() {
        $this->jar = new CookieJar();
        $this->client = new Client([
            'base_uri' => 'https://practice.scrapingcentral.com',
            'timeout' => 10,
            'cookies' => $this->jar,
            'headers' => ['Accept' => 'application/json'],
        ]);
    }

    public function login(string $email, string $password): array {
        $res = $this->client->post('/api/auth/login', [
            'json' => ['email' => $email, 'password' => $password],
        ]);
        return json_decode($res->getBody()->getContents(), true);
    }

    public function products(int $page = 1, int $perPage = 12): array {
        $res = $this->client->get('/api/products', [
            'query' => ['page' => $page, 'per_page' => $perPage],
        ]);
        return json_decode($res->getBody()->getContents(), true);
    }
}

$api = new Catalog108();
$api->login('student@practice.scrapingcentral.com', 'practice123');
$page = $api->products(1, 50);
echo count($page['products']) . " products on page 1\n";
This is the shape a real PHP scraper takes, covered in much more detail in lesson 3.12.
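The sample imports RequestException without using it. One way the failure path might look is sketched here, with a mocked 401 so it runs without a network call. The error body and the wrong password are invented for the demonstration, and how the real login endpoint reports failures is not confirmed by this lesson.

```php
<?php
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Exception\RequestException;
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Psr7\Response;

// Mocked 401 (hypothetical body) so the failure path runs offline.
$mock = new MockHandler([
    new Response(401, ['Content-Type' => 'application/json'], '{"error":"invalid credentials"}'),
]);
$client = new Client([
    'handler' => HandlerStack::create($mock),
    'base_uri' => 'https://practice.scrapingcentral.com',
    'timeout' => 10,
]);

$status = null;
try {
    $client->post('/api/auth/login', [
        'json' => ['email' => 'student@practice.scrapingcentral.com', 'password' => 'wrong'],
    ]);
} catch (RequestException $e) {
    // 4xx and 5xx responses throw while the http_errors option is on (the default).
    $status = $e->hasResponse() ? $e->getResponse()->getStatusCode() : null;
    echo 'Login failed with status ', $status ?? 'n/a', "\n";
}
```

In a class like Catalog108, the same try/catch would sit inside login() and products(), turning transport errors into whatever the calling code expects, such as a retry or a logged skip.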
Hands-on lab
Composer-init a new project: composer require guzzlehttp/guzzle. Copy as cURL on /api/products, translate to PHP using the mapping table, run it, confirm you get JSON. Then write a for ($page = 1; $page <= 5; $page++) loop. You've just built the PHP equivalent of the Python scraper from lesson 3.6.
Practice this lesson on Catalog108, our first-party scraping sandbox.
Open lab target → /api/products
Quiz, check your understanding
Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.