Symfony Panther in Production
Real browser automation from PHP. When Panther is the right tool, how to run it reliably, and where it falls short of Playwright.
What you’ll learn
- Drive Chrome from a Symfony app using Panther.
- Wait for JS-rendered content before extracting.
- Manage Chrome lifecycle in long-running workers.
Panther is Symfony's WebDriver bridge: it drives real Chrome or Firefox from PHP. Its Client implements the same BrowserKit API as HttpBrowser, so moving a scraper from HttpBrowser to Panther is mostly mechanical.
For scraping, Panther fills the same role as scrapy-playwright: handle the pages that need JavaScript without leaving the framework.
Install
composer require symfony/panther
Panther needs a browser binary and a matching WebDriver binary (ChromeDriver for Chrome). Locally, install Chrome or Chromium. In Docker, install Chromium and its driver package.
A minimal scrape
<?php
// src/Service/LivePriceScraper.php
namespace App\Service;

use Symfony\Component\DomCrawler\Crawler;
use Symfony\Component\Panther\Client;

class LivePriceScraper
{
    public function scrape(string $url): array
    {
        $client = Client::createChromeClient(arguments: [
            '--headless=new',
            '--no-sandbox',
            '--disable-gpu',
            '--window-size=1280,800',
        ]);

        try {
            $client->request('GET', $url);
            // Block until the JS-rendered feed is visible.
            $client->waitForVisibility('.price-feed-item');
            $crawler = $client->getCrawler();

            // each() passes a sub-crawler per match. Iterating the Panther
            // crawler directly yields WebDriver elements, not DOM nodes.
            return $crawler->filter('.price-feed-item')->each(
                fn (Crawler $item): array => [
                    'sku' => $item->filter('[data-sku]')->attr('data-sku'),
                    'price' => $item->filter('.price')->text(),
                ]
            );
        } finally {
            // Always release the Chrome process.
            $client->quit();
        }
    }
}
Key patterns:
- waitForVisibility(selector): blocks until the element is rendered. Use it anywhere you would call wait_for_selector in Playwright.
- getCrawler(): returns a Panther Crawler over the current DOM. It extends DomCrawler's Crawler, so most existing parsing code carries over.
- quit() in a finally block: releases Chrome. Forgetting this leaks browser processes.
Available wait helpers
| Method | Use |
|---|---|
| waitFor(selector) | Wait until the selector exists in the DOM |
| waitForVisibility(selector) | Wait until the selector is visible (not hidden) |
| waitForInvisibility(selector) | Wait until the selector is hidden or removed |
| waitForAttributeToContain(selector, attribute, value) | Wait for an attribute to contain a value |
| wait()->until(condition) | Poll a custom condition (for example, a JavaScript check via executeScript) until it returns truthy |
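Each wait helper takes a timeout in seconds as its second argument (30 by default), for example waitForVisibility('.price-feed-item', 60). To change WebDriver's implicit wait for element lookups, pass the number of seconds as an integer:
$client->getWebDriver()->manage()->timeouts()->implicitlyWait(30);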
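The wait helpers all follow one pattern: re-check a condition at a fixed interval until it holds or a deadline passes. The sketch below shows that pattern in plain PHP for illustration only. waitUntil() is a hypothetical helper, not part of Panther's API, and Panther's own implementation may differ.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of Panther): polls $condition until it
 * returns a truthy value or $timeoutSeconds elapses.
 *
 * @return mixed the first truthy value returned by $condition
 * @throws RuntimeException when the deadline passes first
 */
function waitUntil(callable $condition, float $timeoutSeconds = 30.0, int $intervalMs = 250): mixed
{
    $deadline = microtime(true) + $timeoutSeconds;

    do {
        // The condition is always checked at least once.
        $result = $condition();
        if ($result) {
            return $result;
        }
        usleep($intervalMs * 1000);
    } while (microtime(true) < $deadline);

    throw new RuntimeException(sprintf('Condition not met within %.1f seconds.', $timeoutSeconds));
}
```

With a real client, the condition could be a closure that calls $client->getWebDriver()->executeScript() and returns its result. A short interval reacts faster but sends more round trips to the browser.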
Running Panther in Docker
FROM php:8.3-cli
RUN apt-get update && apt-get install -y \
        chromium chromium-driver \
    && rm -rf /var/lib/apt/lists/*
ENV PANTHER_CHROME_BINARY=/usr/bin/chromium
ENV PANTHER_NO_SANDBOX=1
Two environment variables matter:
- PANTHER_CHROME_BINARY: path to Chromium.
- PANTHER_NO_SANDBOX=1: required in containers (Chrome's sandbox needs privileges that containers usually lack).
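For headful debugging on a Linux machine without a display, drop the --headless argument and wrap the command in xvfb-run, which provides a virtual display: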
xvfb-run php bin/console scrape:live-prices
Driving Panther from Messenger workers
A Panther-based handler for the live deals endpoint:
use App\Service\LivePriceScraper;
use Symfony\Component\Messenger\Attribute\AsMessageHandler;

#[AsMessageHandler]
final class ScrapeLivePricesHandler
{
    public function __construct(private LivePriceScraper $scraper) {}

    public function __invoke(ScrapeLivePricesMessage $msg): void
    {
        $items = $this->scraper->scrape('https://practice.scrapingcentral.com/deals/live');
        foreach ($items as $item) {
            // persist
        }
    }
}
Each invocation spins up Chrome, scrapes, and quits. Chrome is heavy (roughly 200 MB of RAM per instance), so don't run too many workers per machine. Set messenger:consume --limit=N low enough that workers restart frequently, because Chrome leaks tend to accumulate over hundreds of sessions.
Reusing Chrome across messages
For higher throughput, reuse the Client across multiple handler invocations:
use Symfony\Component\Panther\Client;

class LongLivedScraper
{
    private ?Client $client = null;

    // Lazily start Chrome on first use, then hand back the same client.
    public function client(): Client
    {
        if (!$this->client) {
            $this->client = Client::createChromeClient(/* ... */);
        }
        return $this->client;
    }

    public function scrape(string $url): array
    {
        $this->client()->request('GET', $url);
        // ...
    }
}
Register as a service with shared: true (Symfony default). The same Chrome instance serves many handler runs. Add a periodic quit() to avoid memory drift.
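One way to drive the periodic quit() is a usage counter that replaces the browser after a fixed number of checkouts. The sketch below is a minimal illustration under stated assumptions: RecyclingHolder, its maxUses parameter, and the injected factory and closer closures are all hypothetical names, chosen so the logic runs without a browser.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical wrapper: hands out one long-lived resource and replaces it
 * after $maxUses checkouts. The factory and closer are injected so the
 * sketch does not depend on Panther.
 */
final class RecyclingHolder
{
    private mixed $resource = null;
    private int $uses = 0;

    /**
     * @param \Closure $factory creates a fresh resource (e.g. a Chrome client)
     * @param \Closure $closer  releases a resource (e.g. calls quit() on it)
     */
    public function __construct(
        private \Closure $factory,
        private \Closure $closer,
        private int $maxUses = 50,
    ) {
    }

    public function get(): mixed
    {
        // Retire the current resource once it has served its quota.
        if ($this->resource !== null && $this->uses >= $this->maxUses) {
            $this->reset();
        }
        if ($this->resource === null) {
            $this->resource = ($this->factory)();
        }
        $this->uses++;

        return $this->resource;
    }

    public function reset(): void
    {
        if ($this->resource !== null) {
            ($this->closer)($this->resource);
            $this->resource = null;
        }
        $this->uses = 0;
    }
}
```

In the scraper, the factory would call Client::createChromeClient() with your Chrome arguments and the closer would call quit() on the client. Calling reset() from a catch block also discards a browser that has crashed, so the next message starts with a fresh one.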
Where Panther falls behind Playwright
Honest comparison:
| Feature | Panther | Playwright |
|---|---|---|
| Browser support | Chrome, Firefox via WebDriver | Chromium, Firefox, WebKit native |
| Network interception | Limited | Full request/response mocking |
| Multi-tab handling | Possible but awkward | First-class |
| Trace/debug tools | Basic | Playwright Inspector, traces, video |
| Async / parallel pages | Limited | Built-in |
| Anti-detection | Older WebDriver, Cloudflare detects it more often | Newer, less detected (with some hardening) |
For scrapers that need cutting-edge anti-bot evasion, Playwright (via a Python service called from PHP) often wins. For 80% of "wait for JS, extract data" cases, Panther is fine.
A pragmatic hybrid: Symfony does orchestration, persistence, and HTML scraping; a small Python sidecar with Playwright handles the JS-heavy pages. The two communicate via Messenger or a small HTTP API.
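If the sidecar is reached over HTTP, the PHP side only needs to build a request body and unpack the reply. The sketch below assumes a made-up JSON contract: the field names url, wait_for, and html, and both function names, are hypothetical and would have to match whatever the Python service actually exposes.

```php
<?php
declare(strict_types=1);

/**
 * Builds the JSON body for a hypothetical sidecar endpoint.
 * The field names "url" and "wait_for" are assumptions of this sketch.
 */
function encodeRenderRequest(string $url, string $waitForSelector): string
{
    return json_encode(
        ['url' => $url, 'wait_for' => $waitForSelector],
        JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES
    );
}

/**
 * Decodes the hypothetical sidecar reply {"html": "..."} and returns the HTML.
 *
 * @throws JsonException            when the body is not valid JSON
 * @throws UnexpectedValueException when the "html" field is missing
 */
function decodeRenderResponse(string $body): string
{
    $data = json_decode($body, true, 512, JSON_THROW_ON_ERROR);
    if (!is_array($data) || !isset($data['html']) || !is_string($data['html'])) {
        throw new UnexpectedValueException('Sidecar response has no "html" string.');
    }

    return $data['html'];
}
```

The returned HTML can be passed to a DomCrawler Crawler, so the same parsing code runs whether the page came from Panther, HttpBrowser, or the sidecar.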
Selenium under the hood
Panther speaks WebDriver, the same protocol Selenium uses. Behind Panther is the facebook/webdriver (now php-webdriver/webdriver) library. If you outgrow Panther's API surface, you can use the WebDriver library directly:
$driver = $client->getWebDriver();
$driver->executeScript('return navigator.userAgent;');
Hands-on lab
Against /deals/live on Catalog108 (which streams price updates via WebSocket):
- Write a Panther scraper that loads the page, waits for .price-feed-item, and captures the first 20 prices.
- Wrap it in a Messenger handler.
- Schedule it every minute via Symfony Scheduler.
- Monitor memory across 100 runs. Does it stay flat, or does it grow? If it grows, the Chrome reuse pattern is leaking.
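For the last step, one rough way to decide "flat or growing" is to compare the average of the first few readings with the average of the last few. The helper below is a hypothetical sketch, not a standard tool. It assumes you collect one memory reading per run yourself, ideally at the OS or container level, because memory_get_usage() only covers the PHP process and Chrome runs as separate processes.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: compares the mean of the last $window samples with
 * the mean of the first $window samples. A ratio near 1.0 suggests flat
 * memory; a ratio well above 1.0 suggests growth.
 *
 * @param list<int|float> $samples memory readings, one per run, in any consistent unit
 */
function memoryGrowthRatio(array $samples, int $window = 10): float
{
    if ($window < 1 || count($samples) < 2 * $window) {
        throw new InvalidArgumentException('Need at least two full windows of samples.');
    }

    $first = array_slice($samples, 0, $window);
    $last = array_slice($samples, -$window);

    $firstMean = array_sum($first) / $window;
    if ($firstMean <= 0) {
        throw new InvalidArgumentException('Samples must be positive.');
    }

    return (array_sum($last) / $window) / $firstMean;
}
```

Averaging over a window smooths out single noisy readings. The threshold that counts as a leak is a judgment call for your workload.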
Browsers are heavy. Cycle them. Panther in production looks like "use sparingly, restart often."