
§4.45 · intermediate · 4 min read

Symfony Messenger Multi-Worker Setup for PHP

Scale Symfony Messenger from one worker to many. Worker management with systemd, supervisor, and Docker.

What you’ll learn

  • Run multiple workers per queue with appropriate sizing.
  • Configure systemd and supervisor for worker management.
  • Avoid the classic multi-worker pitfalls.

You've used Messenger in single-worker mode (§4.11). Scaling out is mostly about process management: one messenger:consume per worker, supervised, with the right transport configuration.

Multiple workers per transport

# 4 workers consuming the same queue
php bin/console messenger:consume scrape --limit=500 --time-limit=1800 &
php bin/console messenger:consume scrape --limit=500 --time-limit=1800 &
php bin/console messenger:consume scrape --limit=500 --time-limit=1800 &
php bin/console messenger:consume scrape --limit=500 --time-limit=1800 &

Each worker independently pulls messages. Total throughput scales linearly until the broker, target, or proxy pool becomes the bottleneck.

systemd unit per worker

Production: use systemd. One unit template instantiates N workers:

# /etc/systemd/system/messenger-scrape@.service
[Unit]
Description=Scrape Worker %i
After=network.target

[Service]
ExecStart=/usr/bin/php /var/www/app/bin/console messenger:consume scrape \
  --limit=500 --time-limit=1800 --memory-limit=256M
Restart=always
RestartSec=10
User=www-data

[Install]
WantedBy=multi-user.target

Then:

sudo systemctl enable --now messenger-scrape@1 messenger-scrape@2 messenger-scrape@3 messenger-scrape@4

Each @N gets its own process. systemd restarts them on exit, on OOM, on system reboot.

Supervisor alternative

If systemd isn't available (e.g. some Docker setups), supervisord works similarly:

# /etc/supervisor/conf.d/messenger.conf
[program:messenger-scrape]
command=php /var/www/app/bin/console messenger:consume scrape --limit=500 --time-limit=1800
user=www-data
numprocs=4
process_name=%(program_name)s_%(process_num)02d
autostart=true
autorestart=true
startsecs=5

numprocs=4 spawns four instances. Supervisor manages start/restart/stop uniformly.

Sizing workers per machine

How many workers per host? Depends on:

  • Memory. Each worker uses ~30–100 MB. Cap to fit available RAM with headroom.
  • CPU. I/O-bound workers (HTTP fetching) can run many per core. CPU-bound (parsing heavy HTML) need 1 per core.
  • DB connections. Each worker holds a Postgres connection. Don't exceed max_connections / replicas.
  • Outbound concurrency. Many workers × few proxies = proxy bottleneck.

Starting point: 4–8 workers per dedicated 4 vCPU machine. Tune up while watching CPU, RAM, and target success rate.
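As a back-of-envelope check, the four constraints above can be combined into a sizing sketch. All names and numbers here are illustrative, not a Messenger API:

```php
<?php
// Illustrative sizing helper: take the smallest cap among the RAM, CPU,
// and DB-connection constraints from the list above.
function suggestedWorkers(int $ramMb, int $workerMb, int $cores, int $workersPerCore, int $dbSlots): int
{
    $ramCap = intdiv((int) ($ramMb * 0.8), $workerMb); // keep ~20% RAM headroom
    $cpuCap = $cores * $workersPerCore;                // I/O-bound: several per core
    return min($ramCap, $cpuCap, $dbSlots);            // DB: one connection per worker
}

// 8 GB RAM, ~100 MB per worker, 4 vCPUs, 2 I/O-bound workers per core,
// 25 Postgres connections reserved for this host:
echo suggestedWorkers(8192, 100, 4, 2, 25), "\n"; // 8 (CPU-capped)
```

Here the CPU cap (4 × 2 = 8) binds first, which matches the 4–8 starting point for a 4 vCPU machine.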

Multiple queues, different worker sizes

# config/packages/messenger.yaml
framework:
  messenger:
    transports:
      scrape_fetch: '%env(MESSENGER_TRANSPORT_DSN)%/fetch'
      scrape_parse: '%env(MESSENGER_TRANSPORT_DSN)%/parse'
      scrape_store: '%env(MESSENGER_TRANSPORT_DSN)%/store'

    routing:
      App\Message\FetchPageMessage: scrape_fetch
      App\Message\ParseHtmlMessage: scrape_parse
      App\Message\StoreItemMessage: scrape_store

Run different worker counts per queue based on the work shape:

# 20 fetch workers (I/O-bound, light)
# 4 parse workers (CPU-bound)
# 2 store workers (DB-bound)
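For context, the messages routed above are plain PHP objects. A minimal sketch (namespaces omitted; the payload fields are assumptions for illustration, not from this lesson):

```php
<?php
// Messenger messages are plain value objects (PHP 8.1+ readonly promotion).
final class FetchPageMessage
{
    public function __construct(public readonly string $url) {}
}

final class ParseHtmlMessage
{
    public function __construct(public readonly string $url, public readonly string $html) {}
}

final class StoreItemMessage
{
    public function __construct(public readonly array $item) {}
}

// Dispatching a FetchPageMessage via the message bus would land it on the
// scrape_fetch transport, per the routing config above.
$msg = new FetchPageMessage('https://example.com/items?page=1');
echo $msg->url, "\n";
```

Because routing is per message class, the fetch/parse/store split above falls out of the class names alone.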

Prefetch and acknowledgment

Each worker can pull multiple messages ahead (prefetch). For Redis/AMQP transports:

transports:
  scrape:
    dsn: '%env(MESSENGER_TRANSPORT_DSN)%'
    options:
      consumer:
        prefetch_count: 10

Higher prefetch means fewer broker round-trips but staler work distribution (one worker hoarding dozens of messages while another sits idle). For variable-duration jobs, keep prefetch low (1–3); for uniform, fast jobs, go higher (10–30).

Graceful shutdown

Workers handle SIGTERM gracefully: they finish the current message, then exit. systemd and supervisor both send SIGTERM on stop, then SIGKILL after a timeout. Standard Linux signal handling.

In production, also disable auto-setup so restarting workers don't race to (re)create queues:

framework:
  messenger:
    transports:
      scrape:
        dsn: '...'
        options:
          auto_setup: false  # don't auto-create queues

For deployments, the recipe is:

  1. Send SIGTERM to all workers.
  2. Wait for the longest-task time (e.g. 60s).
  3. Deploy new code.
  4. Start new workers.

Messages in flight complete; messages queued wait for the new workers.

Common pitfalls

1. Database connection exhaustion

Each worker holds a Postgres connection, so 20 workers = 20 connections. With max_connections=100 and no pgbouncer configured, scaling toward 80+ workers (plus web traffic) means connection errors. Either:

  • Use pgbouncer for connection pooling.
  • Lower worker count.
  • Use ephemeral connections (re-open per job, slower).

2. Doctrine UnitOfWork memory leak

Workers that process 500 messages without calling $em->clear() leak memory linearly:

public function __invoke(ScrapeMessage $msg): void
{
  // ... do work
  $this->em->flush();
  $this->em->clear();  // CRITICAL
}

Without clear(), you OOM by message 200.

3. Lock contention

Multiple workers scraping the same domain through Symfony Lock will all queue behind the one worker holding the lock. Sometimes that's intentional (politeness). If you actually wanted parallelism within a domain, use multiple lock keys or drop the lock for that case.
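One way to get bounded parallelism per domain is to shard the lock key. A hedged plain-PHP sketch; the key format and helper name are assumptions, not Symfony Lock API:

```php
<?php
// Derive one lock key per domain, optionally sharded so a bounded number
// of workers can hold locks for the same domain at once.
function lockKeyFor(string $url, int $shards = 1): string
{
    $host = parse_url($url, PHP_URL_HOST) ?: 'unknown'; // stdlib URL parsing
    $shard = crc32($url) % $shards;                     // stable shard per URL
    return sprintf('scrape.%s.%d', $host, $shard);
}

// One shard: all workers serialize on example.com (politeness).
echo lockKeyFor('https://example.com/a'), "\n"; // scrape.example.com.0

// Four shards: up to four workers can work on example.com concurrently.
echo lockKeyFor('https://example.com/b', 4), "\n";
```

The string returned here is what you would pass to the lock factory as the resource name; the sharding only changes how many distinct locks exist per domain.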

4. Beat-like cron tasks

Don't run scheduled tasks inside Messenger workers if multiple workers consume the schedule transport. Either:

  • Run ONE worker for scheduler_default.
  • Use Symfony Lock inside the scheduled handler.

Multiple workers pulling from the same schedule transport will duplicate dispatches.

Monitoring multi-worker fleets

Track per-worker:

  • Messages handled per minute.
  • Average handler duration.
  • Failure rate.
  • Memory growth.

The Symfony Profiler doesn't help in production (it's disabled there). Push metrics to Prometheus or Datadog instead. Scraper-specific KPIs are covered in §4.58.
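A minimal accumulator for those per-worker numbers might look like this (illustrative only; wiring it into Messenger, e.g. via the worker event listeners, and flushing to Prometheus/Datadog is left out):

```php
<?php
// Per-worker stats sketch: counts handled messages, failures, and total
// handler time, and derives the averages listed above.
final class WorkerStats
{
    private int $handled = 0;
    private int $failed = 0;
    private float $totalSeconds = 0.0;

    public function record(float $seconds, bool $ok): void
    {
        $this->handled++;
        $this->totalSeconds += $seconds;
        if (!$ok) {
            $this->failed++;
        }
    }

    public function avgDuration(): float
    {
        return $this->handled ? $this->totalSeconds / $this->handled : 0.0;
    }

    public function failureRate(): float
    {
        return $this->handled ? $this->failed / $this->handled : 0.0;
    }
}

$stats = new WorkerStats();
$stats->record(0.5, true);
$stats->record(1.5, false);
echo $stats->avgDuration(), ' ', $stats->failureRate(), "\n"; // 1 0.5
```

Memory growth, the last bullet, is easiest to sample separately with memory_get_usage() between messages.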

Hands-on lab

In a Symfony scraping project:

  1. Configure 3 transports (fetch, parse, store) with the same Redis backend but distinct queues.
  2. Create systemd template units for each. Start 4 fetch, 2 parse, 1 store worker.
  3. Push 1000 fetch messages. Watch via systemctl status and redis-cli LLEN.

You're now running a multi-stage distributed scraping pipeline in PHP. Each worker fleet sized to its workload. Real production architecture, in under an hour of setup.

Quiz: check your understanding

Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.

1 / 8

Why use a systemd template unit (messenger-scrape@.service) instead of N hardcoded units?
