PHP Sessions, Cookies, and Headers, Hands-On
Concretely: how PHP scrapers persist cookies across requests with Guzzle, Symfony HttpClient, and raw cURL, and how to inspect and override request headers.
What you’ll learn
- Use Guzzle's CookieJar, SessionCookieJar, and FileCookieJar.
- Add cookie support to Symfony HttpClient via header management or BrowserKit.
- Persist cookies between cURL handles via COOKIEJAR/COOKIEFILE.
- Inspect what your scraper actually sent: request URL, headers, and body.
Sessions and cookies are the single area where Python requests is dramatically more ergonomic than any PHP HTTP client. This lesson is a concrete tour of how each PHP option handles them, so you can pick the right one for your project and avoid the painful mistakes.
The site we'll hit
/challenges/static/cookies/set-on-visit sets a visit_token cookie on first request and requires it on a subsequent endpoint. If you don't carry the cookie forward, the second call returns 401. Classic scraper-trap shape.
Guzzle: the easy path
<?php
require 'vendor/autoload.php';
use GuzzleHttp\Client;
use GuzzleHttp\Cookie\CookieJar;
$jar = new CookieJar();
$client = new Client([
    'base_uri' => 'https://practice.scrapingcentral.com',
    'cookies' => $jar, // also: 'cookies' => true for an auto-created jar
    'headers' => ['User-Agent' => 'Mozilla/5.0 ...'],
]);
// 1. Visit page that sets the cookie
$client->get('/challenges/static/cookies/set-on-visit');
// 2. Subsequent request automatically carries the cookie
$response = $client->get('/challenges/static/cookies/set-on-visit/check');
// 3. Inspect what's in the jar
print_r($jar->toArray());
The CookieJar is the abstraction. Three flavors ship with Guzzle:
| Class | Behaviour |
|---|---|
| CookieJar | In-memory; vanishes when the script exits |
| SessionCookieJar | Backed by PHP's $_SESSION; survives across page loads in a web app |
| FileCookieJar | Backed by a file path; survives across CLI runs |
use GuzzleHttp\Cookie\FileCookieJar;
$jar = new FileCookieJar('/tmp/cookies.json', true); // storeSessionCookies=true
$client = new Client(['cookies' => $jar]);
storeSessionCookies=true is non-obvious: by default, session cookies (those without an explicit expiry) are dropped on save. Set it true if you need them to persist.
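To make the rule above concrete, here is a minimal plain-PHP sketch of the filtering it describes. It is not Guzzle's actual implementation: `cookies_to_persist` is a hypothetical helper, and the sample cookies are made up, with keys modeled on the arrays that `$jar->toArray()` returns (`Name`, `Value`, `Expires`).

```php
<?php
// Sketch of the save-time rule: a cookie is written to disk only if it has an
// explicit expiry, unless session cookies are explicitly allowed.
// cookies_to_persist() is a hypothetical helper, not part of Guzzle.
function cookies_to_persist(array $cookies, bool $storeSessionCookies): array
{
    return array_values(array_filter(
        $cookies,
        fn(array $c): bool => !empty($c['Expires']) || $storeSessionCookies
    ));
}

// Sample data, made up for illustration.
$cookies = [
    ['Name' => 'visit_token', 'Value' => 'abc123', 'Expires' => null],       // session cookie
    ['Name' => 'remember_me', 'Value' => 'yes',    'Expires' => 1893456000], // explicit expiry
];

$default     = cookies_to_persist($cookies, false);
$withSession = cookies_to_persist($cookies, true);

echo count($default) . " cookie(s) kept by default\n";
echo count($withSession) . " cookie(s) kept with storeSessionCookies on\n";
```

If a login flow stops working on the second CLI run even though the cookie file exists, a session cookie silently dropped at save time is a likely cause worth checking.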
Symfony HttpClient: the manual path
Symfony HttpClient has no CookieJar built in. Two practical workarounds:
Option 1: manage the Cookie header manually
use Symfony\Component\HttpClient\HttpClient;
$client = HttpClient::create(['base_uri' => 'https://practice.scrapingcentral.com']);
$response = $client->request('GET', '/challenges/static/cookies/set-on-visit');
// Extract cookies from Set-Cookie response header
$setCookies = $response->getHeaders()['set-cookie'] ?? [];
$cookieJar = [];
foreach ($setCookies as $line) {
    if (preg_match('/^([^=]+)=([^;]+)/', $line, $m)) {
        $cookieJar[$m[1]] = $m[2];
    }
}
// Build a Cookie: header for the next request
$cookieHeader = implode('; ', array_map(
    fn($k, $v) => "$k=$v",
    array_keys($cookieJar),
    array_values($cookieJar)
));
$response = $client->request('GET', '/challenges/static/cookies/set-on-visit/check', [
    'headers' => ['Cookie' => $cookieHeader],
]);
Tedious, but you control it completely.
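The parsing and header-building steps above can be folded into two small functions so the jar survives more than one round trip. This is a sketch in plain PHP: `merge_set_cookies` and `cookie_header` are hypothetical names, and the parser deliberately ignores attributes such as Domain, Path, and Expires, so it only suits a single-host flow like this one.

```php
<?php
// Merge the name=value pair from each Set-Cookie line into an existing jar.
// Attributes after the first ';' (Path, Expires, ...) are ignored on purpose.
function merge_set_cookies(array $jar, array $setCookieLines): array
{
    foreach ($setCookieLines as $line) {
        // [^;]* (rather than [^;]+) also accepts cookies with an empty value.
        if (preg_match('/^\s*([^=;\s]+)=([^;]*)/', $line, $m)) {
            $jar[$m[1]] = trim($m[2]);
        }
    }
    return $jar;
}

// Serialize the jar into the value of a Cookie request header.
function cookie_header(array $jar): string
{
    $pairs = [];
    foreach ($jar as $name => $value) {
        $pairs[] = "$name=$value";
    }
    return implode('; ', $pairs);
}

// Example with made-up Set-Cookie lines from two consecutive responses.
$jar = merge_set_cookies([], ['visit_token=abc123; Path=/; HttpOnly']);
$jar = merge_set_cookies($jar, ['lang=en; Path=/', 'visit_token=def456; Path=/']);

echo cookie_header($jar), "\n"; // visit_token=def456; lang=en
```

With Symfony HttpClient, the second argument would be `$response->getHeaders()['set-cookie'] ?? []` after each response, and the result of `cookie_header($jar)` goes into the `Cookie` entry of the next request's `headers` option.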
Option 2: use BrowserKit / HttpBrowser
Far cleaner. HttpBrowser wraps an HttpClient and adds cookie + history tracking automatically:
use Symfony\Component\BrowserKit\HttpBrowser;
use Symfony\Component\HttpClient\HttpClient;
$browser = new HttpBrowser(HttpClient::create());
$browser->request('GET', 'https://practice.scrapingcentral.com/challenges/static/cookies/set-on-visit');
$browser->request('GET', 'https://practice.scrapingcentral.com/challenges/static/cookies/set-on-visit/check');
print_r($browser->getCookieJar()->all());
Browser-style API in 5 lines. We cover BrowserKit in detail in Lesson 1.19.
Raw cURL: cookie file shuffling
$cookieFile = tempnam(sys_get_temp_dir(), 'scr');
function curl_session_get(string $url, string $cookieFile): string {
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEJAR => $cookieFile, // write cookies here when the handle is released
        CURLOPT_COOKIEFILE => $cookieFile, // read cookies from here before sending
        CURLOPT_USERAGENT => 'Mozilla/5.0',
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        // curl_exec() returns false on transport errors; surface them instead of returning ''
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    curl_close($ch);
    return $body;
}
curl_session_get('https://practice.scrapingcentral.com/challenges/static/cookies/set-on-visit', $cookieFile);
$check = curl_session_get('https://practice.scrapingcentral.com/challenges/static/cookies/set-on-visit/check', $cookieFile);
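Or even better: reuse one cURL handle for the entire flow and change the URL between requests with curl_setopt($ch, CURLOPT_URL, $newUrl). As long as the cookie engine is enabled on that handle (setting CURLOPT_COOKIEFILE, even to an empty string, enables it), cookies are remembered in memory without touching the filesystem.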
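A minimal sketch of that single-handle flow, assuming the cURL extension is installed. The helper names `session_handle`, `session_get`, and `session_cookies` are hypothetical. Setting `CURLOPT_COOKIEFILE` to an empty string switches on cURL's in-memory cookie engine without reading any file.

```php
<?php
// Create one handle with the in-memory cookie engine enabled.
function session_handle()
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => '', // empty string: enable cookies, read no file
        CURLOPT_USERAGENT      => 'Mozilla/5.0',
    ]);
    return $ch;
}

// Point the shared handle at a new URL and run the request.
function session_get($ch, string $url): string
{
    curl_setopt($ch, CURLOPT_URL, $url);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $body;
}

// Cookies currently held in memory, one Netscape-format line per cookie.
function session_cookies($ch): array
{
    return curl_getinfo($ch, CURLINFO_COOKIELIST);
}
```

Call `session_handle()` once, then `session_get($ch, $url)` for the set-on-visit page and again for the check endpoint. `print_r(session_cookies($ch))` shows what the handle is carrying at any point.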
Inspecting what your scraper actually sent
Things diverge silently. Always know how to confirm what went on the wire.
Guzzle: history middleware
use GuzzleHttp\Client;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
$history = [];
$stack = HandlerStack::create();
$stack->push(Middleware::history($history));
$client = new Client([
    'base_uri' => 'https://practice.scrapingcentral.com', // needed so relative paths resolve
    'handler' => $stack,
]);
$client->get('/products?page=2');
foreach ($history as $entry) {
    $req = $entry['request'];
    echo $req->getMethod() . " " . $req->getUri() . "\n";
    foreach ($req->getHeaders() as $name => $values) {
        echo "  $name: " . implode(', ', $values) . "\n";
    }
}
Every request and response flows through the history middleware. Indispensable for debugging.
Symfony HttpClient: getInfo()
$response = $client->request('GET', '/products');
$response->getContent(); // forces the request to actually fire
print_r($response->getInfo());
// includes: url, http_code, total_time, response_headers, etc.
cURL: curl_getinfo (already covered in 1.8)
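For cookies specifically, the file written by `CURLOPT_COOKIEJAR` can be read back to see what cURL stored. This is a minimal sketch. It assumes the tab-separated Netscape cookie file format that cURL uses (domain, subdomain flag, path, secure flag, expiry, name, value). `parse_cookie_file` is a hypothetical helper and the sample content is made up.

```php
<?php
// Parse the text of a Netscape-format cookie file into a list of arrays.
function parse_cookie_file(string $contents): array
{
    $cookies = [];
    foreach (preg_split('/\R/', $contents) as $line) {
        $httpOnly = false;
        // cURL marks HttpOnly cookies with a "#HttpOnly_" prefix on the domain.
        if (strncmp($line, '#HttpOnly_', 10) === 0) {
            $httpOnly = true;
            $line = substr($line, 10);
        } elseif ($line === '' || $line[0] === '#') {
            continue; // blank line or comment
        }
        $fields = explode("\t", $line);
        if (count($fields) !== 7) {
            continue; // not a cookie line
        }
        [$domain, , $path, $secure, $expires, $name, $value] = $fields;
        $cookies[] = [
            'name'     => $name,
            'value'    => $value,
            'domain'   => $domain,
            'path'     => $path,
            'secure'   => $secure === 'TRUE',
            'httpOnly' => $httpOnly,
            'expires'  => (int) $expires, // 0 means session cookie
        ];
    }
    return $cookies;
}

// Made-up sample in the same shape cURL writes.
$sample = "# Netscape HTTP Cookie File\n"
    . "#HttpOnly_practice.scrapingcentral.com\tFALSE\t/\tTRUE\t0\tvisit_token\tabc123\n";

print_r(parse_cookie_file($sample));
```

In a real run the input would be `file_get_contents($cookieFile)`, read after the handle has been released, since cURL only writes the jar at that point.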
Default headers, set once, apply everywhere
Guzzle:
$client = new Client([
    'headers' => [
        'User-Agent' => 'Mozilla/5.0 ...',
        'Accept-Language' => 'en-US,en;q=0.9',
    ],
]);
Symfony HttpClient:
$client = HttpClient::create([
    'headers' => [
        'User-Agent' => 'Mozilla/5.0 ...',
        'Accept-Language' => 'en-US,en;q=0.9',
    ],
]);
cURL: you set them per handle via CURLOPT_HTTPHEADER. To share, wrap the setup in a helper function.
Per-request override
All three clients let you override headers and cookies on individual requests without modifying the client defaults. This is useful when one request needs different auth, language, or referer.
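For raw cURL, the helper-function advice and per-request overrides can be combined. This is a sketch with hypothetical helper names `header_lines` and `curl_with_defaults`: default headers live in one array, and any header passed per call replaces the default with the same name, compared case-insensitively.

```php
<?php
// Headers applied to every request unless overridden.
const DEFAULT_HEADERS = [
    'User-Agent'      => 'Mozilla/5.0 ...',
    'Accept-Language' => 'en-US,en;q=0.9',
];

// Merge per-request headers over the defaults and format them as the
// "Name: value" lines that CURLOPT_HTTPHEADER expects.
function header_lines(array $defaults, array $overrides = []): array
{
    $merged = [];
    foreach ([$defaults, $overrides] as $set) {
        foreach ($set as $name => $value) {
            // Key by lowercase name so "accept-language" replaces "Accept-Language".
            $merged[strtolower($name)] = "$name: $value";
        }
    }
    return array_values($merged);
}

// Build a handle that carries the defaults plus any per-request overrides.
function curl_with_defaults(string $url, array $overrides = [])
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HTTPHEADER     => header_lines(DEFAULT_HEADERS, $overrides),
    ]);
    return $ch;
}

// One request asks for French content; the User-Agent default is untouched.
print_r(header_lines(DEFAULT_HEADERS, ['accept-language' => 'fr-FR']));
```

In Guzzle and Symfony HttpClient the equivalent is a `headers` entry in the options array of the individual request, which takes precedence over the client-level `headers` for that call only.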
Pick your default
For most PHP scraping work, the right default looks like this:
- Guzzle if you want the most ergonomic, batteries-included experience and don't care about being in the Symfony ecosystem.
- Symfony HttpClient + HttpBrowser if you're already in Symfony or want clean async + browser-style cookie handling.
- Raw cURL only when neither is installable.
Hands-on lab
Visit /challenges/static/cookies/set-on-visit twice without sharing cookies and confirm that the second request fails or gives a "no token" response. Then implement the same flow with Guzzle's CookieJar and confirm that the second request now succeeds. Finally, inspect $jar->toArray() to see exactly which cookies the server set and what their attributes are.
Practice this lesson on Catalog108, our first-party scraping sandbox. Lab target: /challenges/static/cookies/set-on-visit