
4.12 · intermediate · 5 min read

Symfony Scheduler, Cron-Style Scraping Inside Your App

Replace external cron with Symfony Scheduler. Recurring messages, missed-run handling, and the right way to schedule scrapers inside a Symfony app.

What you’ll learn

  • Define recurring schedules using ScheduleProviderInterface.
  • Dispatch scheduled messages through Messenger.
  • Handle missed runs and overlapping schedules.

External cron works, but it is operationally annoying: separate config to deploy, separate logs, no integration with your application's container, and awkward to test. Symfony Scheduler (introduced as experimental in 6.3, stable since 6.4) keeps scheduling inside the app and dispatches through Messenger.

The model

Scheduler is a transport for Messenger. You define schedules; the transport emits messages at the configured times; your handlers process them like any other Messenger message.
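To make "like any other Messenger message" concrete, here is a minimal sketch of one message and its handler. The source names ScrapeProductsMessage but never shows it, so the class shape, the category property, and the handled list are all assumptions for illustration. In a real project the two classes would live in separate files.

```php
<?php
// Hypothetical shapes: the article names ScrapeProductsMessage but does not show it.

namespace App\Message;

// A message is a plain object; the Scheduler transport emits one per fire.
final class ScrapeProductsMessage
{
    public function __construct(
        public readonly string $category = 'all', // assumed optional payload
    ) {
    }
}

namespace App\MessageHandler;

use App\Message\ScrapeProductsMessage;
use Symfony\Component\Messenger\Attribute\AsMessageHandler;

// Registered like any other Messenger handler; nothing here is Scheduler-specific.
#[AsMessageHandler]
final class ScrapeProductsHandler
{
    /** @var list<string> categories handled so far (stand-in for real scraping) */
    public array $handled = [];

    public function __invoke(ScrapeProductsMessage $message): void
    {
        // Real code would call the scraper service here.
        $this->handled[] = $message->category;
    }
}
```

The handler does not know or care that the message came from a schedule, which is what lets you dispatch the same message manually for a one-off run.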

A minimal schedule

<?php
// src/Schedule/MainSchedule.php
namespace App\Schedule;

use App\Message\HealthCheckMessage;
use App\Message\ScrapeProductsMessage;
use App\Message\ScrapeReviewsMessage;
use Symfony\Component\Scheduler\Attribute\AsSchedule;
use Symfony\Component\Scheduler\RecurringMessage;
use Symfony\Component\Scheduler\Schedule;
use Symfony\Component\Scheduler\ScheduleProviderInterface;

// The schedule name determines the transport name: scheduler_default.
#[AsSchedule('default')]
final class MainSchedule implements ScheduleProviderInterface
{
    public function getSchedule(): Schedule
    {
        return (new Schedule())
            ->add(
                // Daily at 02:00.
                RecurringMessage::cron('0 2 * * *', new ScrapeProductsMessage()),
                // Every six hours, on the hour.
                RecurringMessage::cron('0 */6 * * *', new ScrapeReviewsMessage()),
                // Fixed interval.
                RecurringMessage::every('5 minutes', new HealthCheckMessage()),
            );
    }
}

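cron('0 2 * * *', ...) fires daily at 2 AM; cron('0 */6 * * *', ...) fires every six hours on the hour; every('5 minutes', ...) fires on a fixed interval. Cron expressions need the dragonmantank/cron-expression package installed.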

Running the scheduler

php bin/console messenger:consume scheduler_default

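That's it. The transport name is scheduler_ plus the schedule name, so #[AsSchedule('default')] gives scheduler_default. Each additional schedule gets its own transport, which you consume with its own worker or by listing several transports in one messenger:consume call.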

In production, run it as a systemd service:

# /etc/systemd/system/catalog108-scheduler.service
[Unit]
Description=Catalog108 Scheduler

[Service]
ExecStart=/usr/bin/php /var/www/catalog108/bin/console messenger:consume scheduler_default
Restart=always
RestartSec=10s
User=www-data

[Install]
WantedBy=multi-user.target

RecurringMessage trigger types

Trigger | Example
cron(expression, message) | cron('*/10 * * * *', ...): every 10 minutes
every('5 minutes', message) | Simpler interval syntax
Custom TriggerInterface | Arbitrary "next fire time" logic

Custom triggers are useful when "every Monday after a holiday" or "every quarter end" matters more than cron syntax can express.
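As a sketch of the "every quarter end" case, the trigger below returns the last day of the current quarter at 23:00, or the next quarter's end once that moment has passed. QuarterEndTrigger is a hypothetical class, not part of Symfony, and the 23:00 fire time is an arbitrary assumption. It assumes Symfony 6.4 or later, where TriggerInterface also requires __toString().

```php
<?php
// src/Schedule/QuarterEndTrigger.php (hypothetical class, not part of Symfony)
namespace App\Schedule;

use Symfony\Component\Scheduler\Trigger\TriggerInterface;

final class QuarterEndTrigger implements TriggerInterface
{
    // Human-readable description, shown by tooling that lists schedules.
    public function __toString(): string
    {
        return 'last day of each quarter at 23:00';
    }

    // Returns the first quarter-end moment strictly after $run.
    public function getNextRunDate(\DateTimeImmutable $run): ?\DateTimeImmutable
    {
        $candidate = $this->quarterEnd($run);

        if ($candidate <= $run) {
            // Already past this quarter's end: step into the next quarter.
            $candidate = $this->quarterEnd($run->modify('first day of next month'));
        }

        return $candidate;
    }

    // Last day of the quarter containing $date, at 23:00.
    private function quarterEnd(\DateTimeImmutable $date): \DateTimeImmutable
    {
        $endMonth = (int) ceil((int) $date->format('n') / 3) * 3; // 3, 6, 9 or 12

        return $date
            ->setDate((int) $date->format('Y'), $endMonth, 1)
            ->modify('last day of this month')
            ->setTime(23, 0);
    }
}
```

It would be attached with RecurringMessage::trigger(new QuarterEndTrigger(), new ScrapeProductsMessage()). Keeping the date arithmetic in a small private method makes the trigger easy to unit test with fixed dates.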

Missed runs

If the scheduler process is down at the scheduled time, what happens?

By default: the missed run is skipped. When the process comes back, it waits for the next scheduled time.

To recover missed runs, configure the schedule with a checkpoint store:

return (new Schedule())
    ->stateful($this->cache)
    ->add(
        RecurringMessage::cron('0 2 * * *', new ScrapeProductsMessage()),
    );

stateful($cache) persists the last-run timestamp to the cache pool. On startup, if the scheduler sees it missed a run, it fires it immediately. Use a persistent cache pool (Redis, filesystem), not array cache, otherwise restarts lose state.

Be careful: catch-up runs against an hour-old window might be the wrong behavior. Sometimes "skip what we missed" is correct.
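The stateful() snippet assumes a $this->cache property exists. One way to wire it, sketched below, is constructor injection of Symfony's CacheInterface. This assumes autowiring is enabled and that the injected pool (cache.app by default) is backed by persistent storage; the class reuses the MainSchedule name from earlier and keeps a single message for brevity.

```php
<?php
// src/Schedule/MainSchedule.php
namespace App\Schedule;

use App\Message\ScrapeProductsMessage;
use Symfony\Component\Scheduler\Attribute\AsSchedule;
use Symfony\Component\Scheduler\RecurringMessage;
use Symfony\Component\Scheduler\Schedule;
use Symfony\Component\Scheduler\ScheduleProviderInterface;
use Symfony\Contracts\Cache\CacheInterface;

#[AsSchedule('default')]
final class MainSchedule implements ScheduleProviderInterface
{
    public function __construct(
        // Assumption: autowiring injects the default cache.app pool, and that
        // pool is persistent (filesystem or Redis), not the array adapter.
        private readonly CacheInterface $cache,
    ) {
    }

    public function getSchedule(): Schedule
    {
        return (new Schedule())
            ->stateful($this->cache) // checkpoint survives worker restarts
            ->add(
                RecurringMessage::cron('0 2 * * *', new ScrapeProductsMessage()),
            );
    }
}
```

If the scheduler should use its own pool rather than cache.app, a dedicated pool can be defined in the framework cache configuration and injected by name instead.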

Overlapping schedules

If a scrape takes 2 hours but is scheduled every hour, the second instance starts while the first is still running. Three patterns:

  1. Let them overlap. Each gets its own worker. Resource consumption doubles, but the throughput goal is met.

  2. Lock with Symfony Lock. The handler acquires a Lock keyed by message type. Subsequent runs while one holds the lock simply exit. See §4.16.

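public function __invoke(ScrapeProductsMessage $msg): void
{
    // Default lock TTL is 300 seconds; with an expiring store (e.g. Redis),
    // pass a longer TTL or call $lock->refresh() during long scrapes.
    $lock = $this->lockFactory->createLock('scrape-products');

    // acquire() is non-blocking by default and returns false if the lock is held.
    if (!$lock->acquire()) {
        $this->logger->info('previous run still active; skipping');
        return;
    }

    try {
        $this->doScrape();
    } finally {
        $lock->release();
    }
}

  3. Use a single worker process. With one process consuming scheduler_default, only one message runs at a time. Fires that come due while a handler is busy wait until it finishes, so runs execute serially. Simplest pattern for "must not overlap" semantics.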

Testing a schedule

public function testSchedule(): void
{
    $schedule = self::getContainer()->get(MainSchedule::class);
    // getRecurringMessages() already returns an array, so no conversion is needed.
    $messages = $schedule->getSchedule()->getRecurringMessages();
    $this->assertCount(3, $messages);
}

You can also fast-forward the clock in tests using Symfony's clock component:

use Symfony\Component\Clock\MockClock;

$clock = new MockClock('2026-05-12 01:59:00');
// configure scheduler with $clock
$clock->modify('+5 minutes');
// scheduler should have fired daily message
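A lighter-weight alternative to driving a clock is to ask each trigger directly when it would fire next. The sketch below assumes PHPUnit and dragonmantank/cron-expression are installed; the test class name is hypothetical, and \stdClass stands in for a real message so the example does not depend on app classes.

```php
<?php
// tests/Schedule/TriggerTimingTest.php (hypothetical test class)
namespace App\Tests\Schedule;

use PHPUnit\Framework\TestCase;
use Symfony\Component\Scheduler\RecurringMessage;

final class TriggerTimingTest extends TestCase
{
    public function testDailyScrapeFiresAtTwoAm(): void
    {
        // \stdClass stands in for ScrapeProductsMessage; only the trigger matters here.
        $trigger = RecurringMessage::cron('0 2 * * *', new \stdClass())->getTrigger();

        // One minute before the slot: the next run is today at 02:00.
        $next = $trigger->getNextRunDate(new \DateTimeImmutable('2026-05-12 01:59:00'));
        self::assertSame('2026-05-12 02:00', $next?->format('Y-m-d H:i'));

        // Exactly on the slot: the next run is tomorrow, not the current minute.
        $next = $trigger->getNextRunDate(new \DateTimeImmutable('2026-05-12 02:00:00'));
        self::assertSame('2026-05-13 02:00', $next?->format('Y-m-d H:i'));
    }
}
```

This checks the schedule arithmetic only; it does not prove that the worker dispatches the message, which still needs an integration test or the clock-based approach.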

Cron vs Scheduler: when to use which

Use cron when... | Use Symfony Scheduler when...
The scraper is a self-contained script not part of an app | The scraper is part of a Symfony app
The team prefers OS-level scheduling | You want schedules in code, versioned, deployed
There's only one scheduled job | You have many schedules with shared services
You need to run as different OS users per job | All jobs share the app's runtime

Both are valid. Symfony Scheduler is increasingly the default because it keeps the entire system in one place: schedule definitions, handlers, logs, and monitoring all live in the app.

When schedules grow complex

For more than ~10 schedules with dependencies, conditional skips, or retries, look at:

  • Airflow / Prefect / Dagster: proper workflow orchestrators with DAGs.
  • Temporal: durable workflow execution.

For most scraping shops with 5–20 recurring jobs, Symfony Scheduler is plenty.

Hands-on lab

In your Symfony scraping project:

  1. Define a schedule with three recurring messages: an hourly products scrape, a daily review refresh, and an every-15-minutes health check.
  2. Run messenger:consume scheduler_default in one terminal.
  3. Watch the messages dispatch at the right times.
  4. Add stateful($cache) and verify that stopping the worker across a scheduled fire time (for example, 20 minutes, long enough to span a health check) and restarting fires a catch-up run.

Schedule-in-code is one of those quality-of-life upgrades you don't appreciate until you've felt the pain of crontab -e deployments.
