
4.20 · intermediate · 5 min read

When to Use Which PHP Framework

Symfony, Roach, Goutte/HttpBrowser, Laravel, and raw Guzzle: a decision matrix for which to reach for, with honest trade-offs.

What you’ll learn

  • Map project shape to the right PHP scraping framework.
  • Recognize the anti-patterns of choosing the wrong layer.
  • Combine frameworks when the project is hybrid.

Five candidates, real differences. This is a decision matrix and honest comparison, not a "best framework" article, because there isn't one.

The candidates

Tool | Role | Scope
Raw Guzzle / HttpClient | HTTP client only | Smallest
HttpBrowser (BrowserKit) | HTTP + cookies + forms | Static scraper
Roach PHP | Full scraping framework | Scrapy-style
Symfony (components) | Full framework | Whole application
Laravel + Roach | Full framework | Whole application (Laravel-flavoured)

The decision matrix

Project shape | Right tool
One-off script, fetch + parse, run twice | Raw HttpClient + DomCrawler
Static scraper with login forms + cookies | HttpBrowser
Many spiders, classical crawl shapes, primarily a scraper | Roach
Existing Symfony app, scraping is one feature | Symfony components
Existing Laravel app, scraping is one feature | Roach via roach-php/laravel
Async-heavy, many concurrent fetches, low-level control | ReactPHP or Amp (see §4.23, §4.24)
Must run JS | Panther or Playwright (regardless of framework above)
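To make the "login forms + cookies" row concrete, here is a minimal HttpBrowser sketch. The URL, button label, and field names (`_username`, `_password`) are hypothetical placeholders; adapt them to the target site's form.

```php
<?php

use Symfony\Component\BrowserKit\HttpBrowser;
use Symfony\Component\HttpClient\HttpClient;

$browser = new HttpBrowser(HttpClient::create());

// Fetch the login page and fill in the form by its submit button label.
$crawler = $browser->request('GET', 'https://example.com/login');
$form = $crawler->selectButton('Log in')->form([
    '_username' => 'user@example.com',
    '_password' => 'secret',
]);
$browser->submit($form);

// The session cookie from the login is reused automatically.
$crawler = $browser->request('GET', 'https://example.com/account/orders');
```

That cookie jar and form handling is exactly what raw Guzzle doesn't give you out of the box.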

Anti-patterns

"I'll use Symfony because it's the standard"

If you don't need Symfony's infrastructure (DI, config, events, scheduler, messenger, ORM, web), don't pay its boilerplate cost. A 200-line scraper that runs once a week doesn't need a Symfony app. HttpClient + DomCrawler directly is fine.

"I'll build a custom framework"

Most "I'll just write my own scraper framework" projects end up reinventing 60% of Roach, badly. If you have multiple spiders, sequential parsing, pipelines, and rate limits, use Roach. The time saved by avoiding not-invented-here rewrites is worth more than the autonomy lost.
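For scale, here is roughly what you get for free with Roach: start URLs, request scheduling, and item pipelines. The class name, URL, and selectors below are hypothetical; the `BasicSpider` base class and `Roach::startSpider()` entry point are Roach's own.

```php
<?php

use RoachPHP\Http\Response;
use RoachPHP\Spider\BasicSpider;
use Symfony\Component\DomCrawler\Crawler;

class ProductSpider extends BasicSpider
{
    public array $startUrls = ['https://example.com/products'];

    public function parse(Response $response): \Generator
    {
        // $response->filter() proxies to DomCrawler; each node is a DOMNode.
        foreach ($response->filter('.product-card') as $card) {
            yield $this->item([
                'title' => (new Crawler($card))->filter('h3')->text(''),
            ]);
        }
    }
}

// Run it: Roach::startSpider(ProductSpider::class);
```

Everything the hand-rolled framework would need (dedup, delays, middleware, pipelines) hangs off this shape already.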

"I'll use Roach inside Symfony for the scraping bits"

This creates two parallel architectures. Roach has its own engine, middleware, pipelines. Symfony has Messenger, middleware, services. Mixing them means maintaining two mental models for one codebase. Pick one.

The exception: if the project starts in Symfony and you're adding a single, fully encapsulated scraper component, Roach can be that component, invoked from a Symfony Console command, treated as a library. Just don't try to weave Roach's lifecycle into Messenger.

"I'll use Laravel because we already use it"

Reasonable, but be honest: Laravel's HTTP client is fine, Eloquent is fine, its queues are fine. But for serious scraping infrastructure, Symfony's components are usually better designed (Messenger > Laravel queues for complex routing; Symfony Lock > Laravel's Cache::lock for its transparent choice of stores). Use Laravel for the app; reach for Symfony components or Roach for the scraping core. They coexist via Composer.
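Coexistence is literal: you can pull Symfony Lock into a Laravel command with one Composer require. A sketch, assuming a single-host deployment (the lock name and TTL are illustrative; swap `FlockStore` for a Redis-backed store when you scale out):

```php
<?php

use Symfony\Component\Lock\LockFactory;
use Symfony\Component\Lock\Store\FlockStore;

// Guard the scrape so overlapping cron runs don't double-fetch.
// FlockStore uses local files, so it only protects one host.
$factory = new LockFactory(new FlockStore());
$lock = $factory->createLock('scrape:products', 3600);

if (!$lock->acquire()) {
    return; // another run is already in progress
}

try {
    // ... run the scrape ...
} finally {
    $lock->release();
}
```

Nothing about this conflicts with Laravel's container or scheduler; it is just another library.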

Picking by team

Sometimes the answer is "what does the team know?" That matters more than the technical comparison:

  • A Laravel-fluent team writing one scraper: Roach + Laravel. Productive immediately.
  • A Symfony-fluent team: Symfony components, all the way down.
  • A PHP team with no framework allegiance: Roach for greenfield, Symfony if there's existing infrastructure.
  • A polyglot team: maybe Python + Scrapy is the right answer; this lesson lives in §4.1.

Honest about constraints beats elegant in theory.

Combining frameworks intentionally

Some hybrid combinations work well:

Symfony for orchestration, Roach for scraping

A Symfony app with Messenger, Doctrine, API Platform. One MessageHandler kicks off a Roach spider as a library call. Roach handles the spider lifecycle; Symfony handles everything else.

use App\Message\RunSpiderMessage;
use Doctrine\ORM\EntityManagerInterface;
use RoachPHP\Roach;
use Symfony\Component\Messenger\Attribute\AsMessageHandler;

#[AsMessageHandler]
class RunSpiderHandler
{
    public function __construct(private EntityManagerInterface $em) {}

    public function __invoke(RunSpiderMessage $msg): void
    {
        // Runs the spider synchronously and collects the scraped items.
        $items = Roach::collectSpider($msg->spiderClass);
        foreach ($items as $item) {
            $this->em->persist(new Product(...$item->all()));
        }
        $this->em->flush();
    }
}

The trick: Roach's call is synchronous and atomic within the handler. Don't try to stream Roach output back through Messenger.

Symfony + Panther/Playwright

Pure Symfony for HTML scraping; Panther (or a Python Playwright sidecar) for JS pages. Connect via HTTP or Messenger. Two languages, two tools, one orchestration plane.
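On the Panther side, the handoff can stay small. A sketch, assuming Chrome is installed locally; the URL and selectors are hypothetical:

```php
<?php

use Symfony\Component\Panther\Client;

// Panther drives a real browser, so JS-rendered DOM is visible.
$client = Client::createChromeClient();
$client->request('GET', 'https://example.com/spa-page');

// waitFor() blocks until the selector appears, then returns a fresh crawler.
$crawler = $client->waitFor('.product-card');

$titles = $crawler->filter('.product-card h3')
    ->each(fn ($node) => $node->text(''));
```

Because Panther's client implements the same BrowserKit interface as HttpBrowser, the scraping code downstream of the fetch barely changes.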

Laravel + raw HttpClient

If the scraper is light (one cron, a few requests, simple parsing), don't bring in Roach. Laravel's HTTP facade + a DomCrawler library is enough. Reach for Roach when the project has many spiders or pipeline complexity.
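The light path looks like this in practice. The URL and selector are placeholders; `Http` is Laravel's built-in client facade and DomCrawler comes in via Composer:

```php
<?php

use Illuminate\Support\Facades\Http;
use Symfony\Component\DomCrawler\Crawler;

// One request, one parse: no spider, no pipeline.
$response = Http::withHeaders(['User-Agent' => 'MyScraper/1.0'])
    ->get('https://example.com/products');

$crawler = new Crawler($response->body());
$titles = $crawler->filter('.product-card h3')
    ->each(fn (Crawler $node) => $node->text(''));
```

Drop this in a scheduled Artisan command and you are done; reach for Roach only when this file starts growing spiders.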

The "no framework" path

For genuinely small projects, the no-framework path is:

use Symfony\Component\DomCrawler\Crawler;
use Symfony\Component\HttpClient\HttpClient;

$client = HttpClient::create([
    'headers' => ['User-Agent' => 'MyScraper/1.0'],
    'max_redirects' => 5,
]);
$resp = $client->request('GET', $url);
$crawler = new Crawler($resp->getContent());

$items = [];
foreach ($crawler->filter('.product-card') as $card) {
    $items[] = [
        'title' => (new Crawler($card))->filter('h3')->text(''),
    ];
}
file_put_contents('items.json', json_encode($items));

That's about a dozen lines for a complete scrape. Plenty of real-world projects don't need more.

Migration paths

Projects grow. The migration paths to be aware of:

  • Goutte → HttpBrowser: drop-in, two-hour migration.
  • HttpBrowser → Roach: significant refactor. Convert your linear script into spider + pipeline + middleware shape.
  • Roach → Symfony components: significant refactor, but reverses easily because Roach's primitives map cleanly onto Symfony's.

Plan for growth, but don't over-engineer for it on day one.

Hands-on lab

For a project you're working on (or imagine), answer:

  1. How many distinct sources/spiders do you need?
  2. Does the scraper stand alone or live inside a bigger app?
  3. Is JS rendering required for any source?
  4. What's the team's framework fluency?

Map those answers to the matrix above. Most projects land on one of three: raw HttpClient (small, scripty), Roach (multi-spider, focused), Symfony components (app-integrated). Pick once, commit, ship.

Quiz: check your understanding

Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.


A team needs a single Console command that scrapes one site once a day and persists to a Doctrine entity inside an existing Symfony app. Best choice?
