PHP: Symfony HttpClient Async Streaming
Often the easiest way to get PHP concurrency: Symfony HttpClient's `stream()` API. No fibers, no promises, just sync-looking code that multiplexes underneath.
What you’ll learn
- Use HttpClient->stream() to consume responses as they arrive.
- Stream large response bodies chunk-by-chunk.
- Decide when stream() beats ReactPHP/Amp.
For most PHP scraping needs, you don't need ReactPHP or Amp. Symfony HttpClient's stream() API achieves curl-multi concurrency without any async/await syntax. Sync-looking code; concurrent execution underneath.
The pattern
```php
use Symfony\Component\HttpClient\HttpClient;

$client = HttpClient::create([
    'max_host_connections' => 10,
    'timeout' => 15,
    'headers' => ['User-Agent' => 'Scraper/1.0'],
]);

$urls = [];
for ($i = 1; $i <= 20; $i++) {
    $urls[] = "https://practice.scrapingcentral.com/api/products?page=$i";
}

// 1. Fire all requests (non-blocking, returns lazy responses immediately)
$responses = array_map(
    fn (string $url) => $client->request('GET', $url),
    $urls,
);

// 2. Stream chunks as they arrive across ALL responses, in completion order
foreach ($client->stream($responses) as $response => $chunk) {
    if ($chunk->isFirst()) {
        // Headers arrived; status code is now available without blocking
        if ($response->getStatusCode() !== 200) {
            continue; // skips this chunk only; the response keeps streaming
        }
    }
    if ($chunk->isLast()) {
        $data = json_decode($response->getContent(), true);
        echo $response->getInfo('url') . ': ' . count($data) . " items\n";
    }
}
```
`$client->request()` doesn't block: the response is "lazy." Iterating `$client->stream(...)` advances all responses in lockstep, yielding chunks as they arrive on the wire.
Under the hood: curl multi-handle. Single process, multiplexed I/O, no fibers.
When stream() wins
For these patterns, stream() is the simplest tool:
- Batch fetch, gather all responses. Fire N requests, process when complete. One loop.
- Streaming large responses. Get the first chunks of a multi-MB JSON without loading the whole body into memory.
- First-N-complete patterns. Process responses in the order they finish, not the order they were issued.
Stream a single large response
```php
$response = $client->request('GET', $url);

$buffer = '';
foreach ($client->stream($response) as $chunk) {
    $buffer .= $chunk->getContent();
    // Once the buffer contains a complete JSON object boundary, decode and
    // process it (useful for newline-delimited JSON streams)
}
```
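To make the boundary comment above concrete, here is a pure-PHP sketch (our own helper, not a Symfony API) that pulls complete top-level JSON objects out of a growing buffer by tracking brace depth, skipping braces that appear inside string literals:

```php
// Illustrative helper: extract every complete top-level JSON object from
// $buffer, leaving any trailing partial object in place for the next chunk.
function extractJsonObjects(string &$buffer): array
{
    $objects = [];
    $depth = 0; $start = null; $inString = false; $escaped = false;
    for ($i = 0; $i < strlen($buffer); $i++) {
        $c = $buffer[$i];
        if ($inString) {
            if ($escaped)        { $escaped = false; }
            elseif ($c === '\\') { $escaped = true; }
            elseif ($c === '"')  { $inString = false; }
            continue;
        }
        if ($c === '"') { $inString = true; continue; }
        if ($c === '{') { if ($depth === 0) { $start = $i; } $depth++; }
        elseif ($c === '}') {
            $depth--;
            if ($depth === 0 && $start !== null) {
                $objects[] = json_decode(substr($buffer, $start, $i - $start + 1), true);
                $buffer = substr($buffer, $i + 1); // drop the consumed object
                $i = -1; $start = null;            // rescan the shortened buffer
            }
        }
    }
    return $objects;
}
```

Call it after each `$buffer .= $chunk->getContent();` append; whatever remains in `$buffer` afterwards is a partial object waiting for more data.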
For Content-Type: application/x-ndjson or similar streams, you can decode line-by-line as data arrives:
```php
$leftover = '';
foreach ($client->stream($response) as $chunk) {
    $data = $leftover . $chunk->getContent();
    $lines = explode("\n", $data);
    $leftover = array_pop($lines); // keep partial line for next chunk
    foreach ($lines as $line) {
        if ($line === '') continue;
        $obj = json_decode($line, true);
        process($obj);
    }
}
```
Memory stays flat; you process as data arrives. The same pattern works for streaming CSV.
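The same buffering logic can be factored into a reusable generator. A pure-PHP sketch (the function name is ours, not part of Symfony), shown operating on plain string chunks; it also flushes a final record that arrives without a trailing newline:

```php
// Incrementally split a stream of NDJSON chunks into decoded records,
// buffering partial lines between chunks.
function ndjsonRecords(iterable $chunks): iterable
{
    $leftover = '';
    foreach ($chunks as $chunk) {
        $lines = explode("\n", $leftover . $chunk);
        $leftover = array_pop($lines); // partial line waits for the next chunk
        foreach ($lines as $line) {
            if ($line === '') continue;
            yield json_decode($line, true);
        }
    }
    if ($leftover !== '') {
        yield json_decode($leftover, true); // final record without trailing "\n"
    }
}
```

With HttpClient, you would feed it the `$chunk->getContent()` values yielded by `$client->stream($response)`.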
Per-host connection limits
```php
$client = HttpClient::create([
    'max_host_connections' => 8,
]);
```
Caps parallel TCP connections to a single host at 8. Combined with concurrent request() calls, you get parallelism without overwhelming the target.
Retries
RetryableHttpClient (or the retry_failed config in framework.yaml) handles transient failures automatically:
```php
use Symfony\Component\HttpClient\RetryableHttpClient;

$client = new RetryableHttpClient(HttpClient::create(), maxRetries: 3);
```
Retries integrate with stream() transparently: failed responses are retried without manual intervention.
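If the defaults don't fit, Symfony also ships a GenericRetryStrategy you can pass explicitly. A configuration sketch (the status codes and delays here are illustrative, not recommendations):

```php
use Symfony\Component\HttpClient\HttpClient;
use Symfony\Component\HttpClient\Retry\GenericRetryStrategy;
use Symfony\Component\HttpClient\RetryableHttpClient;

$strategy = new GenericRetryStrategy(
    statusCodes: [423, 425, 429, 500, 502, 503, 504], // which responses to retry
    delayMs: 500,       // initial backoff
    multiplier: 2.0,    // exponential backoff factor
    maxDelayMs: 10_000, // cap on the delay between attempts
);

$client = new RetryableHttpClient(HttpClient::create(), $strategy, maxRetries: 3);
```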
Concurrent inserts to a queue
A common pattern: scrape with stream(), push results to a Messenger queue or a database:
```php
foreach ($client->stream($responses) as $response => $chunk) {
    if ($chunk->isLast()) {
        $data = json_decode($response->getContent(), true);
        $bus->dispatch(new StoreScrapedDataMessage($data));
    }
}
```
The scrape is fast (concurrent fetches). The expensive parts (DB writes, enrichment) live in workers consuming the queue. Decoupled, scalable.
stream() vs ReactPHP vs Amp
| Style | Code shape | Best for |
|---|---|---|
| HttpClient + stream() | Sync-looking, no async syntax | "Fire N, gather all," batch scraping, simple flows |
| ReactPHP | Promise chains | Long-lived event-driven processes, JS-familiar teams |
| Amp v3 | Fiber-based, sync-looking | Complex async flows, cancellation, mixed I/O |
For 80% of PHP scraping concurrency needs, stream() is enough. ReactPHP/Amp are reach-for-when-you-need-more.
Limits
- No timer/scheduler primitives. stream() is just HTTP concurrency. If you need event-loop-based scheduling, ReactPHP/Amp.
- No long-lived server pattern. stream() runs to completion. A WebSocket client or pub/sub consumer wants ReactPHP/Amp.
- No fiber-style nested awaits. Sequential dependencies between requests in a complex flow are awkward with stream(); chain them sequentially or move to Amp.
A complete production pattern
```php
use Psr\Log\LoggerInterface;
use Symfony\Contracts\HttpClient\HttpClientInterface;

class ConcurrentScraper
{
    public function __construct(
        private readonly HttpClientInterface $client,
        private readonly LoggerInterface $logger,
    ) {}

    public function scrape(array $urls, int $maxConcurrent = 10): iterable
    {
        $responses = [];
        foreach ($urls as $url) {
            $responses[] = $this->client->request('GET', $url);
            if (count($responses) >= $maxConcurrent) {
                // Window full: drain until we're back under the limit
                yield from $this->drain($responses, $maxConcurrent);
            }
        }
        yield from $this->drain($responses); // final drain: run to completion
    }

    private function drain(array &$responses, ?int $resumeBelow = null): iterable
    {
        foreach ($this->client->stream($responses) as $response => $chunk) {
            if (!$chunk->isLast()) {
                continue;
            }
            try {
                if ($response->getStatusCode() === 200) {
                    yield $response->getInfo('url') => $response->toArray();
                }
            } catch (\Throwable $e) {
                $this->logger->warning('fetch failed', [
                    'url' => $response->getInfo('url'),
                    'error' => $e->getMessage(),
                ]);
            }
            $key = array_search($response, $responses, true);
            if ($key !== false) {
                unset($responses[$key]);
            }
            if ($resumeBelow !== null && count($responses) < $resumeBelow) {
                return; // a slot opened up; go fire the next request
            }
        }
    }
}
```
Bounded concurrency, error handling, lazy result yielding. A few dozen lines, no async/await syntax, fully concurrent.
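The fill/drain control flow is easier to see with the HTTP I/O stripped out. A pure-PHP simulation (the function name is ours, and the callable stands in for a completed fetch):

```php
// Simulates the bounded window in the scraper above: fill the pending set
// up to $maxConcurrent, drain back under the limit, repeat, then drain the
// remainder at the end.
function boundedMap(array $inputs, callable $fetch, int $maxConcurrent = 10): iterable
{
    $pending = [];
    foreach ($inputs as $input) {
        $pending[] = $input;                     // "fire" a request
        while (count($pending) >= $maxConcurrent) {
            yield $fetch(array_shift($pending)); // "complete" one, freeing a slot
        }
    }
    while ($pending !== []) {                    // final drain
        yield $fetch(array_shift($pending));
    }
}
```

Real responses complete out of order, which is why the class removes them with array_search()/unset() rather than array_shift(); the window arithmetic is the same.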
Hands-on lab
Build a concurrent scraper using stream() against /api/products:
- Fetch 50 pages concurrently with max_host_connections=10.
- Process JSON responses as they complete (don't wait for all).
- Write each item to a JSONL file.
Compare runtime to a sequential getContent() loop. The stream() version should be 5–10x faster at typical API latency. It's the simplest PHP concurrency story available.
Quiz: check your understanding
Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.