1.11 · intermediate · 5 min read

PHP Sessions, Cookies, and Headers, Hands-On

Concretely: how PHP scrapers persist cookies across requests with Guzzle, Symfony HttpClient, and raw cURL, and how to inspect and override request headers.

What you’ll learn

  • Use Guzzle's CookieJar, SessionCookieJar, and FileCookieJar.
  • Add cookie support to Symfony HttpClient via header management or BrowserKit.
  • Persist cookies between cURL handles via COOKIEJAR/COOKIEFILE.
  • Inspect what your scraper actually sent: request URL, headers, and body.

Sessions and cookies are the one area where Python's requests library is dramatically more ergonomic than any PHP HTTP client. This lesson is a concrete tour of how each PHP option handles them, so you can pick the right one for your project and avoid the painful mistakes.

The site we'll hit

/challenges/static/cookies/set-on-visit sets a visit_token cookie on first request and requires it on a subsequent endpoint. If you don't carry the cookie forward, the second call returns 401. Classic scraper-trap shape.

Guzzle: the easy path

<?php
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Cookie\CookieJar;

$jar = new CookieJar();
$client = new Client([
  'base_uri' => 'https://practice.scrapingcentral.com',
  'cookies'  => $jar,  // also: 'cookies' => true for an auto-created jar
  'headers'  => ['User-Agent' => 'Mozilla/5.0 ...'],
]);

// 1. Visit page that sets the cookie
$client->get('/challenges/static/cookies/set-on-visit');

// 2. Subsequent request automatically carries the cookie
$response = $client->get('/challenges/static/cookies/set-on-visit/check');

// 3. Inspect what's in the jar
print_r($jar->toArray());

The CookieJar is the abstraction. Three flavors ship with Guzzle:

Class              Behaviour
CookieJar          In-memory; vanishes when the script exits
SessionCookieJar   Backed by PHP's $_SESSION; survives across page loads in a web app
FileCookieJar      Backed by a file path; survives across CLI runs

use GuzzleHttp\Cookie\FileCookieJar;

$jar = new FileCookieJar('/tmp/cookies.json', true);  // storeSessionCookies=true
$client = new Client(['cookies' => $jar]);

storeSessionCookies=true is non-obvious: by default, session cookies (those without an explicit expiry) are dropped when the jar is saved. Set it to true if you need them to persist.
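
To see the effect of that flag without touching the network, here is a minimal sketch, assuming Guzzle 7 is installed via Composer. The cookie value, the example.test domain, and the temp-file prefix are made up for illustration.

```php
<?php
require 'vendor/autoload.php';

use GuzzleHttp\Cookie\FileCookieJar;
use GuzzleHttp\Cookie\SetCookie;

// Hypothetical session cookie: no Expires or Max-Age, so it has no explicit expiry.
$sessionCookie = new SetCookie([
  'Name'   => 'visit_token',
  'Value'  => 'abc123',
  'Domain' => 'example.test',
  'Path'   => '/',
]);

$dropFile = tempnam(sys_get_temp_dir(), 'jar');
$keepFile = tempnam(sys_get_temp_dir(), 'jar');

// Default behaviour: session cookies are skipped when the jar is saved.
$dropJar = new FileCookieJar($dropFile);
$dropJar->setCookie($sessionCookie);
$dropJar->save($dropFile);

// storeSessionCookies=true: session cookies are written to disk as well.
$keepJar = new FileCookieJar($keepFile, true);
$keepJar->setCookie($sessionCookie);
$keepJar->save($keepFile);

// Reload both files, as a later CLI run would.
$droppedCount = count((new FileCookieJar($dropFile, true))->toArray());
$keptCount    = count((new FileCookieJar($keepFile, true))->toArray());

echo "default: $droppedCount cookie(s); storeSessionCookies=true: $keptCount cookie(s)\n";
```

Under those assumptions, the default jar should reload empty while the second jar still holds visit_token.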

Symfony HttpClient: the manual path

Symfony HttpClient has no CookieJar built in. Two practical workarounds:

Option 1: manage the Cookie header manually

use Symfony\Component\HttpClient\HttpClient;

$client = HttpClient::create(['base_uri' => 'https://practice.scrapingcentral.com']);

$response = $client->request('GET', '/challenges/static/cookies/set-on-visit');

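// Extract cookies from the Set-Cookie response headers.
// Only the leading name=value pair is kept; attributes such as Path are ignored.
$setCookies = $response->getHeaders()['set-cookie'] ?? [];
$cookieJar = [];
foreach ($setCookies as $line) {
  // [^;]* (not [^;]+) so that cookies with an empty value still match
  if (preg_match('/^([^=]+)=([^;]*)/', $line, $m)) {
    $cookieJar[$m[1]] = $m[2];
  }
}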

// Build a Cookie: header for the next request
$cookieHeader = implode('; ', array_map(
  fn($k, $v) => "$k=$v",
  array_keys($cookieJar),
  array_values($cookieJar)
));

$response = $client->request('GET', '/challenges/static/cookies/set-on-visit/check', [
  'headers' => ['Cookie' => $cookieHeader],
]);

Tedious, but you control it completely.
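
If the flow spans more than two requests, the extract-and-rebuild steps above have to be repeated after every response. One way to keep that manageable is a pair of small helpers. This is only a sketch: it deliberately ignores Domain, Path, and expiry, and the function names are invented for illustration, not part of Symfony.

```php
<?php
// Hypothetical helpers; the names are illustrative only.

/** Merge raw Set-Cookie header lines into a name => value map. */
function absorb_set_cookies(array $jar, array $setCookieLines): array {
  foreach ($setCookieLines as $line) {
    // Keep only the leading name=value pair; later cookies overwrite earlier ones.
    if (preg_match('/^\s*([^=;\s]+)=([^;]*)/', $line, $m)) {
      $jar[$m[1]] = trim($m[2]);
    }
  }
  return $jar;
}

/** Render the map as the value of a Cookie request header. */
function cookie_header(array $jar): string {
  $pairs = [];
  foreach ($jar as $name => $value) {
    $pairs[] = "$name=$value";
  }
  return implode('; ', $pairs);
}

// Simulated Set-Cookie lines from two consecutive responses.
$jar = [];
$jar = absorb_set_cookies($jar, ['visit_token=abc123; Path=/; HttpOnly']);
$jar = absorb_set_cookies($jar, ['lang=en; Path=/', 'visit_token=def456; Path=/']);

echo cookie_header($jar), "\n";  // visit_token=def456; lang=en
```

With Symfony HttpClient you would pass $response->getHeaders()['set-cookie'] ?? [] to absorb_set_cookies() after each response and send cookie_header($jar) with the next request.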

Option 2: use BrowserKit / HttpBrowser

Far cleaner. HttpBrowser wraps an HttpClient and adds cookie + history tracking automatically:

use Symfony\Component\BrowserKit\HttpBrowser;
use Symfony\Component\HttpClient\HttpClient;

$browser = new HttpBrowser(HttpClient::create());
$browser->request('GET', 'https://practice.scrapingcentral.com/challenges/static/cookies/set-on-visit');
$browser->request('GET', 'https://practice.scrapingcentral.com/challenges/static/cookies/set-on-visit/check');

print_r($browser->getCookieJar()->all());

Browser-style API in 5 lines. We cover BrowserKit in detail in Lesson 1.19.

Raw cURL: cookie file shuffling

$cookieFile = tempnam(sys_get_temp_dir(), 'scr');

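function curl_session_get(string $url, string $cookieFile): string {
  $ch = curl_init($url);
  curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_COOKIEJAR      => $cookieFile,  // where received cookies are written
    CURLOPT_COOKIEFILE     => $cookieFile,  // where stored cookies are read from
    CURLOPT_USERAGENT      => 'Mozilla/5.0',
  ]);
  $body = curl_exec($ch);
  if ($body === false) {
    // curl_exec() returns false on failure, which would violate the string return type
    $error = curl_error($ch);
    curl_close($ch);
    throw new RuntimeException("cURL request failed: $error");
  }
  curl_close($ch);  // cookies reach $cookieFile once the handle is released
  return $body;
}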

curl_session_get('https://practice.scrapingcentral.com/challenges/static/cookies/set-on-visit', $cookieFile);
$check = curl_session_get('https://practice.scrapingcentral.com/challenges/static/cookies/set-on-visit/check', $cookieFile);

Better still, reuse one cURL handle for the entire flow and change the URL between calls with curl_setopt($ch, CURLOPT_URL, $newUrl). As long as the cookie engine is enabled on the handle (setting CURLOPT_COOKIEFILE to an empty string is enough), cookies are remembered in memory without touching the filesystem.
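
A minimal sketch of that single-handle approach follows. The helper name curl_fetch_sequence and the shape of its return value are hypothetical; the empty-string CURLOPT_COOKIEFILE is what turns on cURL's in-memory cookie engine.

```php
<?php
// Hypothetical helper: fetch several URLs in order on one shared cURL handle.
function curl_fetch_sequence(array $urls): array {
  $ch = curl_init();
  curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_COOKIEFILE     => '',  // empty string: enable the cookie engine, read no file
    CURLOPT_USERAGENT      => 'Mozilla/5.0',
    CURLOPT_TIMEOUT        => 10,
  ]);

  $results = [];
  foreach ($urls as $url) {
    curl_setopt($ch, CURLOPT_URL, $url);  // same handle, so earlier cookies are sent again
    $body = curl_exec($ch);
    $results[$url] = [
      'status' => curl_getinfo($ch, CURLINFO_RESPONSE_CODE),
      'body'   => $body === false ? null : $body,
      'error'  => $body === false ? curl_error($ch) : null,
    ];
  }
  return $results;
}

$base = 'https://practice.scrapingcentral.com/challenges/static/cookies/set-on-visit';
foreach (curl_fetch_sequence([$base, "$base/check"]) as $url => $result) {
  echo $result['status'], ' ', $url, "\n";
}
```

The trade-off is that the cookies live only as long as the handle. If they need to survive the script, add CURLOPT_COOKIEJAR as in the file-based version.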

Inspecting what your scraper actually sent

What you think you sent and what actually went out can diverge silently. Always know how to confirm what went on the wire.

Guzzle: history middleware

use GuzzleHttp\Client;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;

// $history is filled by reference with one entry per request
$history = [];
$stack = HandlerStack::create();
$stack->push(Middleware::history($history));

$client = new Client([
  'base_uri' => 'https://practice.scrapingcentral.com',  // needed to resolve the relative path below
  'handler'  => $stack,
]);
$client->get('/products?page=2');

foreach ($history as $entry) {
  $req = $entry['request'];
  echo $req->getMethod() . " " . $req->getUri() . "\n";
  foreach ($req->getHeaders() as $name => $values) {
    echo "  $name: " . implode(', ', $values) . "\n";
  }
}

Every request and response flows through the history middleware. Indispensable for debugging.

Symfony HttpClient: getInfo()

$response = $client->request('GET', '/products');
$response->getContent();  // blocks until the response has fully arrived

print_r($response->getInfo());
// includes: url, http_code, total_time, response_headers, etc.
// getInfo('debug') adds a detailed log of the HTTP transaction

cURL: curl_getinfo (already covered in 1.8)
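
For reference, one way to see the outgoing request with raw cURL is the CURLINFO_HEADER_OUT option. The following is a minimal sketch against the lab URL, not necessarily the exact approach used in Lesson 1.8.

```php
<?php
$ch = curl_init('https://practice.scrapingcentral.com/challenges/static/cookies/set-on-visit');
curl_setopt_array($ch, [
  CURLOPT_RETURNTRANSFER => true,
  CURLINFO_HEADER_OUT    => true,  // ask cURL to keep a copy of the outgoing request headers
  CURLOPT_USERAGENT      => 'Mozilla/5.0',
  CURLOPT_TIMEOUT        => 10,
]);
$body = curl_exec($ch);

if ($body === false) {
  echo 'Request failed: ', curl_error($ch), "\n";
} else {
  echo curl_getinfo($ch, CURLINFO_EFFECTIVE_URL), "\n";
  echo curl_getinfo($ch, CURLINFO_HEADER_OUT);  // request line plus headers as sent
}
```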

Default headers, set once, apply everywhere

Guzzle:

$client = new Client([
  'headers' => [
    'User-Agent'      => 'Mozilla/5.0 ...',
    'Accept-Language' => 'en-US,en;q=0.9',
  ],
]);

Symfony HttpClient:

$client = HttpClient::create([
  'headers' => [
    'User-Agent'      => 'Mozilla/5.0 ...',
    'Accept-Language' => 'en-US,en;q=0.9',
  ],
]);

cURL: you set them per handle via CURLOPT_HTTPHEADER. To share, wrap the setup in a helper function.
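
Such a helper could look like the sketch below. The function names and the default values are illustrative assumptions; the only cURL-specific fact relied on is that CURLOPT_HTTPHEADER takes a list of "Name: value" strings.

```php
<?php
// Hypothetical helper: merge shared defaults with per-call headers and
// return the "Name: value" list that CURLOPT_HTTPHEADER expects.
function build_curl_headers(array $overrides = []): array {
  $defaults = [
    'User-Agent'      => 'Mozilla/5.0',
    'Accept-Language' => 'en-US,en;q=0.9',
  ];

  // Header names are case-insensitive, so merge on lowercased keys.
  $merged = [];
  foreach ([$defaults, $overrides] as $set) {
    foreach ($set as $name => $value) {
      $merged[strtolower($name)] = "$name: $value";
    }
  }
  return array_values($merged);
}

// Hypothetical factory: every handle it returns carries the shared headers.
function curl_with_defaults(string $url, array $headers = []) {
  $ch = curl_init($url);
  curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HTTPHEADER     => build_curl_headers($headers),
  ]);
  return $ch;  // the caller runs curl_exec() on it
}

print_r(build_curl_headers(['accept-language' => 'de-DE']));
```

Keeping the merge in a pure function makes the defaults easy to check without sending a request.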

Per-request override

All three clients let you override headers and cookies on individual requests without modifying the client defaults, which is useful when one request needs different auth, language, or referer.
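
As an illustration with Guzzle, here is a minimal sketch, assuming Guzzle 7 via Composer. It uses Guzzle's MockHandler so that no real request is sent; example.test is a placeholder host and /products is just a sample path.

```php
<?php
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\Psr7\Response;

// MockHandler answers from a queue, so nothing leaves the machine.
$history = [];
$stack = HandlerStack::create(new MockHandler([new Response(200), new Response(200)]));
$stack->push(Middleware::history($history));

$client = new Client([
  'handler'  => $stack,
  'base_uri' => 'https://example.test',  // placeholder host
  'headers'  => ['Accept-Language' => 'en-US,en;q=0.9'],
]);

$client->get('/products');  // uses the client default
$client->get('/products', [
  'headers' => ['Accept-Language' => 'de-DE'],  // override for this call only
]);

// Print the Accept-Language header each recorded request carried.
foreach ($history as $entry) {
  echo $entry['request']->getHeaderLine('Accept-Language'), "\n";
}
```

The first recorded request should carry the client default and the second the override. The same per-request options array also accepts a 'cookies' key if one call needs a different jar.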

Pick your default

For most PHP scraping work, the right default looks like this:

  • Guzzle if you want the most ergonomic, batteries-included experience and don't care about being in the Symfony ecosystem.
  • Symfony HttpClient + HttpBrowser if you're already in Symfony or want clean async + browser-style cookie handling.
  • Raw cURL only when neither is installable.

Hands-on lab

Visit /challenges/static/cookies/set-on-visit twice without sharing cookies and confirm that the second request fails or gives a "no token" response. Then implement the same flow with Guzzle's CookieJar and confirm that the second request now succeeds. Finally, inspect $jar->toArray() to see exactly which cookies the server set and what their attributes are.

Practice this lesson on Catalog108, our first-party scraping sandbox.

