
1.10 · intermediate · 5 min read

Symfony HttpClient: A Modern, Async-Ready Alternative

Symfony's HTTP client is the modern PHP alternative to Guzzle: chunked streaming, native HTTP/2, async by default, and tight integration with the rest of Symfony.

What you’ll learn

  • Install and configure `symfony/http-client`.
  • Send requests with the lazy/streaming model: `request()` returns immediately, and your code blocks only when it reads the response.
  • Stream many requests concurrently via `stream()`.
  • Decide between Guzzle and Symfony HttpClient for your project.

Symfony's HttpClient is the newer kid on the PHP HTTP-library block. Its design philosophy differs from Guzzle's in two important ways: responses are lazy (request() returns immediately, and your code waits only when it actually reads the response), and responses are streamed (you can process them chunk by chunk as bytes arrive). For scrapers fetching many pages concurrently or working with large payloads, this makes concurrent fetching and incremental processing straightforward.

Install

composer require symfony/http-client

It works fine outside a Symfony app; you don't need the full framework.

Two API levels

The library offers two interfaces:

  • HttpClientInterface: the low-level, async-first API (what we focus on here).
  • Psr18Client: a wrapper that implements PSR-18 (the synchronous HTTP client standard).

Most direct scraper code uses HttpClientInterface. The PSR-18 wrapper exists for compatibility with libraries that expect a PSR-18 client.
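As a rough illustration of the PSR-18 wrapper, the sketch below sends one request through Psr18Client. It assumes `psr/http-client` and a PSR-17 implementation such as `nyholm/psr7` are installed alongside `symfony/http-client`. To stay runnable offline it wraps a MockHttpClient with a canned response; the example.test URL is a placeholder, and for live requests you would pass HttpClient::create() instead.

```php
<?php
require 'vendor/autoload.php';

use Symfony\Component\HttpClient\MockHttpClient;
use Symfony\Component\HttpClient\Psr18Client;
use Symfony\Component\HttpClient\Response\MockResponse;

// Offline stand-in for a real transport (assumption for this sketch).
$inner = new MockHttpClient(new MockResponse('<html>ok</html>', ['http_code' => 200]));

// Psr18Client is both a PSR-18 client and a PSR-17 request factory.
$psr18 = new Psr18Client($inner);

$request  = $psr18->createRequest('GET', 'https://example.test/products');
$response = $psr18->sendRequest($request); // returns a PSR-7 response

$status = $response->getStatusCode();
$body   = (string) $response->getBody();

echo $status, ' ', $body, "\n";
```

Any library that type-hints Psr\Http\Client\ClientInterface can be handed `$psr18` directly.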

A first request

<?php
require 'vendor/autoload.php';

use Symfony\Component\HttpClient\HttpClient;

$client = HttpClient::create([
  'base_uri' => 'https://practice.scrapingcentral.com',
  'timeout'  => 10,
  'headers'  => [
    'User-Agent'      => 'Mozilla/5.0 (compatible; my-scraper)',
    'Accept-Language' => 'en-US,en;q=0.9',
  ],
]);

$response = $client->request('GET', '/products');

echo $response->getStatusCode();
echo $response->getHeaders()['content-type'][0];
$body = $response->getContent();
echo substr($body, 0, 500);

Looks similar to Guzzle. But there's an important twist hiding underneath.

The lazy/streaming model

$client->request(...) does not wait for the response. It returns immediately with a ResponseInterface object that stands in for a response still in flight. Your code blocks only when it needs data: when you call getStatusCode(), getHeaders(), or getContent(), or when you iterate via stream().

This is why Symfony HttpClient is sometimes called "async by default": you can issue many requests up front, and the client advances all pending transfers together whenever you wait on any one of them:

$responses = [];
for ($page = 1; $page <= 5; $page++) {
  // request() returns immediately; the transfers proceed concurrently
  $responses[] = $client->request('GET', "/products?page=$page");
}

// Now read them; waiting on one response also advances the others
foreach ($responses as $i => $response) {
  echo "page " . ($i + 1) . ": " . $response->getStatusCode() . "\n";
}

That loop runs 5 requests concurrently with zero explicit async syntax. Compared with Guzzle's Pool, or with Python's requests (which is fully synchronous), this is arguably the most concise concurrent-fetch API in the PHP ecosystem.

Streaming responses with stream()

For true streaming (processing bytes as they arrive without buffering the full response), use $client->stream($responses):

$responses = [];
for ($page = 1; $page <= 10; $page++) {
  $responses[] = $client->request('GET', "/products?page=$page");
}

foreach ($client->stream($responses) as $response => $chunk) {
  if ($chunk->isFirst()) {
    echo "Started: " . $response->getInfo('url') . "\n";
  } elseif ($chunk->isLast()) {
    echo "Finished: " . $response->getInfo('url') . " ({$response->getStatusCode()})\n";
  } else {
    // Intermediate chunk: the bytes are in $chunk->getContent()
  }
}

stream() yields chunks in arrival order, not request order, so chunks from different responses interleave. A fast page finishes before a slow one, and you can handle each as soon as its last chunk arrives. Ideal for bulk scraping where you want to start parsing as soon as each page is ready.
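Because chunks from different responses arrive interleaved, you need a way to tell which request each chunk belongs to. One option is the `user_data` request option, which is handed back by getInfo('user_data'). The sketch below collects each page's body under its page number and checks for idle timeouts. It uses MockHttpClient with fake bodies so it runs offline; the example.test base URI and the 5-second timeout are arbitrary choices for illustration.

```php
<?php
require 'vendor/autoload.php';

use Symfony\Component\HttpClient\MockHttpClient;
use Symfony\Component\HttpClient\Response\MockResponse;

// Offline stand-in: every request gets a tiny fake body that echoes its URL.
$client = new MockHttpClient(
    fn (string $method, string $url): MockResponse => new MockResponse("body of $url"),
    'https://example.test'
);

$responses = [];
for ($page = 1; $page <= 3; $page++) {
    $responses[] = $client->request('GET', "/products?page=$page", [
        'user_data' => $page, // carried along, readable via getInfo('user_data')
    ]);
}

$bodies = [];
// The second argument to stream() is the idle timeout in seconds.
foreach ($client->stream($responses, 5.0) as $response => $chunk) {
    if ($chunk->isTimeout()) {
        // Nothing arrived within the timeout; give up on this response.
        $response->cancel();
        continue;
    }
    $page = $response->getInfo('user_data');
    // First and last chunks carry an empty string, so appending is safe.
    $bodies[$page] = ($bodies[$page] ?? '') . $chunk->getContent();
}

ksort($bodies);
print_r($bodies);
```

Checking isTimeout() before any other chunk method matters: on a timeout chunk, the other methods throw a transport exception.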

POST: form, JSON, body

// Form-encoded
$response = $client->request('POST', '/account/login', [
  'body' => ['username' => '...', 'password' => '...'],
]);

// JSON
$response = $client->request('POST', '/api/products', [
  'json' => ['name' => 'New', 'price' => 9.99],
]);

// Raw body
$response = $client->request('POST', '/api/raw', [
  'body'  => $rawBytes,
  'headers' => ['Content-Type' => 'application/octet-stream'],
]);

Same json shortcut as Guzzle: it serializes the payload and sets the Content-Type: application/json header.
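To see what the json option actually sends, you can inspect a request without touching the network. This sketch uses MockHttpClient and MockResponse from the component's testing tools; the canned `{"id": 42}` reply and the example.test base URI are made up for illustration.

```php
<?php
require 'vendor/autoload.php';

use Symfony\Component\HttpClient\MockHttpClient;
use Symfony\Component\HttpClient\Response\MockResponse;

// Canned reply; the mock also records what the client prepared to send.
$mock = new MockResponse('{"id": 42}', [
    'http_code'        => 201,
    'response_headers' => ['Content-Type: application/json'],
]);
$client = new MockHttpClient($mock, 'https://example.test');

$response = $client->request('POST', '/api/products', [
    'json' => ['name' => 'New', 'price' => 9.99],
]);

$created = $response->toArray();        // decodes the JSON response body
$sent    = $mock->getRequestOptions();  // the prepared request options

$sentHeaders = $sent['headers'];                  // list of "Name: value" strings
$sentBody    = json_decode($sent['body'], true);  // the serialized payload

echo $response->getStatusCode(), "\n";
print_r($sentHeaders);
```

toArray() is the response-side counterpart of the json option: it decodes a JSON body into a PHP array and, like getContent(), throws on error statuses by default.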

Query parameters

$response = $client->request('GET', '/products', [
  'query' => ['page' => 2, 'category' => 'kitchen'],
]);

Cookies: not built in (by default)

Symfony HttpClient doesn't ship with a CookieJar, which is surprising for anyone coming from Guzzle. You have two options:

  1. Manage cookies manually by setting the Cookie: header on subsequent requests.
  2. Wrap the client with CookieAwareHttpClient from a third-party package, or use the BrowserKit component (Lesson 1.19), which handles cookies natively.

In practice, for any flow that needs cookies, you often combine HttpClient with BrowserKit (HttpBrowser), which is covered in Lesson 1.19. For pure stateless API scraping, the lack of built-in cookie management is rarely a problem.
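For the manual route, a minimal sketch might look like the following. The helper names `updateCookieJar` and `cookieHeader` are hypothetical, and the logic deliberately ignores cookie attributes (Path, Domain, Expires, Secure), so it only suits simple single-host flows. With a real response, the raw lines would come from `$response->getHeaders(false)['set-cookie'] ?? []`.

```php
<?php
/**
 * Merge raw Set-Cookie header lines into a simple name => value jar.
 * Attributes such as Path, Expires and HttpOnly are ignored in this sketch.
 */
function updateCookieJar(array $jar, array $setCookieLines): array
{
    foreach ($setCookieLines as $line) {
        // "name=value; Path=/; HttpOnly" -> keep only "name=value"
        $pair = trim(explode(';', $line, 2)[0]);
        if (strpos($pair, '=') === false) {
            continue; // malformed line, skip it
        }
        [$name, $value] = explode('=', $pair, 2);
        $jar[trim($name)] = trim($value);
    }
    return $jar;
}

/** Render the jar as the value of a Cookie request header. */
function cookieHeader(array $jar): string
{
    $pairs = [];
    foreach ($jar as $name => $value) {
        $pairs[] = "$name=$value";
    }
    return implode('; ', $pairs);
}

// Sample Set-Cookie lines standing in for a login response.
$jar = updateCookieJar([], [
    'session=abc123; Path=/; HttpOnly',
    'theme=dark; Max-Age=3600',
]);

echo cookieHeader($jar), "\n"; // session=abc123; theme=dark
```

The resulting string can be passed on later requests as `'headers' => ['Cookie' => cookieHeader($jar)]`.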

Error handling

getStatusCode() never throws on an error status; it simply returns the code. getHeaders(), getContent(), and toArray(), however, take a $throw argument that defaults to true, so they raise an exception implementing HttpExceptionInterface when the status is in the 300-599 range (a 3xx counts only if it was not followed as a redirect):

use Symfony\Contracts\HttpClient\Exception\HttpExceptionInterface;
use Symfony\Contracts\HttpClient\Exception\TransportExceptionInterface;

try {
  $content = $response->getContent();  // throws on 3xx/4xx/5xx by default
} catch (HttpExceptionInterface $e) {
  echo "HTTP error: " . $e->getResponse()->getStatusCode();
} catch (TransportExceptionInterface $e) {
  echo "Transport error: " . $e->getMessage();
}

To inspect without throwing:

$content = $response->getContent(false);  // no exception on error statuses (transport errors can still throw)
$code    = $response->getStatusCode();
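In a scraper it often helps to separate errors worth retrying from errors worth skipping. The contracts package provides ClientExceptionInterface (4xx) and ServerExceptionInterface (5xx) as subtypes of HttpExceptionInterface, which makes that split easy to express. The sketch below uses a hypothetical `classifyFetch` helper and the labels 'ok', 'skip', and 'retry'; the mock client returns three canned statuses in order so the example runs offline.

```php
<?php
require 'vendor/autoload.php';

use Symfony\Component\HttpClient\MockHttpClient;
use Symfony\Component\HttpClient\Response\MockResponse;
use Symfony\Contracts\HttpClient\Exception\ClientExceptionInterface;
use Symfony\Contracts\HttpClient\Exception\HttpExceptionInterface;
use Symfony\Contracts\HttpClient\Exception\ServerExceptionInterface;
use Symfony\Contracts\HttpClient\Exception\TransportExceptionInterface;
use Symfony\Contracts\HttpClient\HttpClientInterface;

// Hypothetical helper: decide what to do with a URL based on how the fetch ends.
function classifyFetch(HttpClientInterface $client, string $url): string
{
    try {
        $client->request('GET', $url)->getContent(); // throws on 3xx/4xx/5xx
        return 'ok';
    } catch (ClientExceptionInterface $e) {
        return 'skip';   // 4xx: repeating the same request rarely helps
    } catch (ServerExceptionInterface | TransportExceptionInterface $e) {
        return 'retry';  // 5xx or network failure: may succeed later
    } catch (HttpExceptionInterface $e) {
        return 'skip';   // remaining case: an unfollowed 3xx
    }
}

// Offline stand-in: responses are handed out in order, one per request.
$client = new MockHttpClient([
    new MockResponse('fine'),
    new MockResponse('missing', ['http_code' => 404]),
    new MockResponse('down', ['http_code' => 503]),
], 'https://example.test');

$results = [];
foreach (['/a', '/b', '/c'] as $path) {
    $results[] = classifyFetch($client, $path);
}

print_r($results);
```

The catch order matters: the specific 4xx and 5xx interfaces must come before the general HttpExceptionInterface, or the general branch would swallow them.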

Auth, proxies, TLS

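$client = HttpClient::create([
  // Pick one auth option per client; both set the Authorization header.
  'auth_basic'  => ['student', 'practice123'],  // Basic
  // 'auth_bearer' => $token,                   // Bearer
  'proxy'       => 'http://user:pass@proxy:8000',
  'verify_peer' => true,  // validate the TLS certificate (default)
  'verify_host' => true,  // check the certificate matches the host (default)
]);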

Per-request overrides work too:

$response = $client->request('GET', '/admin', [
  'auth_bearer' => $adminToken,
]);

Guzzle vs Symfony HttpClient: pick one

Both are excellent. Use whichever your team prefers, but the rough distinction:

                  | Guzzle                          | Symfony HttpClient
Default mode      | Synchronous, async via Promises | Async/lazy by default
Streaming         | Manual via PSR-7 stream         | First-class via stream()
Cookies           | Built-in CookieJar              | Not built-in (use BrowserKit)
Middleware        | Rich, well-known                | Decorator pattern (different feel)
HTTP/2            | Possible via cURL               | First-class
Symfony ecosystem | Compatible                      | Native

For Symfony projects, use HttpClient: it integrates with the framework's profiler, logger, and DI container. For pure scripts or non-Symfony codebases, both work well; Guzzle has a longer track record and more answers on Stack Overflow.

Hands-on lab

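The /challenges/static/pagination/cursor endpoint paginates via an opaque cursor in the response body. Fetch the first page, parse out the next-cursor value, request page 2 with that cursor, and repeat until the response indicates there are no more pages. Because each cursor comes from the previous response, you cannot issue all the requests up front. Instead, use the lazy model to overlap work in a streaming-style pipeline: as soon as you have extracted the next cursor, issue the next request, then finish parsing the current page before reading the new response.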

Practice this lesson on Catalog108, our first-party scraping sandbox.

Quiz: check your understanding


When does `$client->request('GET', $url)` actually send the HTTP request in Symfony HttpClient?
