Dockerizing PHP / Symfony Scrapers

PHP scrapers ship in Docker too. Symfony-specific patterns, FrankenPHP, opcache, and the differences from Python images.

What you’ll learn

  • Write a production Dockerfile for a Symfony scraping app.
  • Configure opcache and JIT for performance.
  • Decide between php-fpm + nginx, FrankenPHP, and CLI-only containers.

PHP in Docker is a slightly different beast from Python. There's no virtualenv concept; you have extensions, FPM vs CLI, opcache configuration, and the modern alternative, FrankenPHP, to consider.

Image flavor matters

Image | Use case | Notes
php:8.3-cli | CLI scrapers, workers, Symfony Console | Smaller, no web stack
php:8.3-fpm | Behind nginx for HTTP endpoints | Pair with nginx container
php:8.3-apache | Self-contained HTTP server | Convenient for small services
dunglas/frankenphp | Modern alternative; single binary, HTTP/2/3, worker mode | Fast, simpler, recommended for new projects

For pure scraping workers, php:8.3-cli is the right pick because no web server is needed. For Symfony apps that also serve an admin UI or API, FrankenPHP gives you both.

A production CLI Dockerfile

FROM composer:2 AS composer

# Pin the Debian release so the runtime package names below stay valid.
FROM php:8.3-cli-bookworm AS builder
COPY --from=composer /usr/bin/composer /usr/local/bin/composer

RUN apt-get update && apt-get install -y --no-install-recommends \
  git unzip libpq-dev libzip-dev libicu-dev libxml2-dev \
  && rm -rf /var/lib/apt/lists/*

RUN docker-php-ext-install pdo_pgsql zip intl opcache

WORKDIR /app
COPY composer.json composer.lock symfony.lock ./
RUN composer install --no-dev --no-scripts --no-interaction --prefer-dist --no-progress \
  && composer clear-cache

# List vendor/ and var/ in .dockerignore so this COPY cannot overwrite the install above.
COPY . .
RUN composer dump-autoload --optimize --no-dev --classmap-authoritative

# Runtime stage (same base tag as the builder so the compiled extensions match)
FROM php:8.3-cli-bookworm
RUN apt-get update && apt-get install -y --no-install-recommends \
  libpq5 libzip4 libicu72 libxml2 ca-certificates tini \
  && rm -rf /var/lib/apt/lists/*

COPY --from=builder /usr/local/lib/php/extensions /usr/local/lib/php/extensions
COPY --from=builder /usr/local/etc/php/conf.d /usr/local/etc/php/conf.d

# Create the user before copying the app so /app can be owned by it;
# Symfony writes to var/cache and var/log at runtime.
RUN useradd -m -u 1000 scraper
COPY --from=builder --chown=scraper:scraper /app /app
USER scraper
WORKDIR /app

ENV APP_ENV=prod
ENV APP_DEBUG=0

# tini runs as PID 1, forwards signals to PHP, and reaps zombie processes.
ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["php", "bin/console", "app:scrape"]

Key points:

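  • Composer in a separate stage. The runtime image ships without Composer and without build tools such as git and the -dev libraries.
  • docker-php-ext-install for extensions bundled with the PHP source. Reach for PECL only when an extension isn't bundled.
  • Dev dependencies excluded (--no-dev). Symfony Profiler, debug toolbar, and fixtures don't ship.
  • composer dump-autoload --optimize --classmap-authoritative generates a classmap and skips filesystem lookups for classes that aren't in it, which speeds up autoloading at runtime.
  • APP_ENV=prod with APP_DEBUG=0 disables debug and enables caching.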

opcache for performance

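PHP without opcache re-parses and recompiles every file in every process. With opcache, compiled opcodes are cached in shared memory. Under the CLI that cache lives and dies with the process, so short-lived commands gain little unless opcache.file_cache is also set. Long-running workers (Symfony Messenger) compile each file only once anyway; there the benefit comes mainly from opcache's optimizer and from the JIT, which requires opcache. The size of the speedup depends heavily on the workload, so measure your own worker instead of relying on a headline number.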

opcache.ini in the image (copy it into /usr/local/etc/php/conf.d/ so PHP loads it):

opcache.enable=1
opcache.enable_cli=1
opcache.memory_consumption=256
opcache.max_accelerated_files=20000
opcache.validate_timestamps=0
opcache.jit=tracing
opcache.jit_buffer_size=128M
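
  • enable_cli=1 is required for opcache in CLI mode (off by default).
  • validate_timestamps=0 means PHP doesn't recheck files for changes. That is safe in immutable containers, where a redeploy means a new image and a restart.
  • jit=tracing enables PHP 8's tracing JIT (it needs a non-zero jit_buffer_size). Expect gains mainly on CPU-bound PHP code; I/O-bound scraping and parsing done inside C extensions such as DOM benefit much less.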
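
To confirm that the settings above actually reached the image, one option is to ask PHP inside the container. The sketch below is a minimal helper, not part of the original setup: the function name opcache_report is hypothetical, and it assumes the image lets you override the command with php -i (true for the CLI Dockerfile above, where tini is the entrypoint).

```shell
# Hypothetical helper: print the effective opcache settings of an image.
# Assumes `docker` is on PATH and the image accepts `php -i` as its command.
opcache_report() {
  image="$1"
  # `php -i` prints lines such as "opcache.enable_cli => On => On";
  # keep only the directives discussed above.
  docker run --rm "$image" php -i \
    | grep -E '^opcache\.(enable|enable_cli|validate_timestamps|jit|jit_buffer_size) =>'
}
```

Called as opcache_report myreg/scraper:1.2.0, it should list each directive with its local and master value. If enable_cli shows Off, the ini file was probably not copied into conf.d.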

Symfony cache pre-warming

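For prod images, pre-build the Symfony cache in the builder stage (after composer dump-autoload) so the first run doesn't pay the cost. cache:clear already warms the cache unless --no-warmup is passed, so the explicit cache:warmup is a harmless safety net: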

RUN php bin/console cache:clear --env=prod --no-debug
RUN php bin/console cache:warmup --env=prod --no-debug

For Messenger workers running under the CLI, the warmed cache still helps: the kernel boots from the pre-built cache instead of compiling it on first start.

FrankenPHP (modern alternative)

FrankenPHP is a Caddy-based PHP server that ships as a single binary. For a scraping app that also serves a Symfony API or admin UI, it replaces nginx+fpm entirely:

FROM dunglas/frankenphp:1-php8.3 AS base

RUN install-php-extensions pdo_pgsql intl opcache zip

WORKDIR /app
COPY --from=composer:2 /usr/bin/composer /usr/local/bin/composer

COPY composer.json composer.lock symfony.lock ./
RUN composer install --no-dev --no-scripts --prefer-dist

COPY . .
RUN composer dump-autoload --optimize --no-dev --classmap-authoritative

ENV APP_ENV=prod APP_DEBUG=0
ENV FRANKENPHP_CONFIG="worker /app/public/index.php"

# Single process serves HTTP and runs your app
CMD ["frankenphp", "run", "--config", "/etc/caddy/Caddyfile"]

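Worker mode boots the application once and keeps it in memory across requests (similar to RoadRunner), which avoids the per-request bootstrap that php-fpm pays. Symfony needs a worker-aware runtime for this; the FrankenPHP documentation describes the runtime/frankenphp-symfony package and the APP_RUNTIME setting.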

Running Symfony workers in Docker

For Messenger workers, the CMD changes:

CMD ["php", "bin/console", "messenger:consume", "async", "--time-limit=3600", "--memory-limit=256M"]

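--time-limit and --memory-limit make the worker stop gracefully after N seconds or once it exceeds N MB, finishing the current message first. Docker restarts it only if the container has a restart policy (or an orchestrator supervises it), and because the exit is clean, the policy must restart on any exit (always or unless-stopped) rather than only on failure. Each restart starts with fresh memory, so slow leaks never accumulate. This is the standard pattern for long-running PHP workers.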
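
As a rough illustration of the restart side of that pattern, the sketch below starts the worker with a restart policy. The function name run_worker and the container name scraper-worker are hypothetical; the image tag is whatever you built. It assumes the worker exits with status 0 when a limit is reached, which is why it uses unless-stopped and not on-failure.

```shell
# Hypothetical helper: start a Messenger worker that Docker restarts after every exit.
run_worker() {
  image="$1"
  name="${2:-scraper-worker}"   # illustrative container name
  # "unless-stopped" restarts after ANY exit, including a clean exit code 0;
  # "on-failure" would leave the worker stopped once a limit is reached.
  docker run -d --name "$name" --restart unless-stopped "$image" \
    php bin/console messenger:consume async --time-limit=3600 --memory-limit=256M
}
```

After run_worker myreg/scraper:1.2.0, docker inspect -f '{{.RestartCount}}' scraper-worker shows how often Docker has restarted the container.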

Multi-arch builds

docker buildx build --platform linux/amd64,linux/arm64 -t myreg/scraper:1.2.0 --push .

PHP extensions sometimes have arch-specific quirks. Test both architectures if you deploy to ARM (Apple Silicon, AWS Graviton, Hetzner ARM VMs).
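
One way to catch such quirks early is to check, per platform, that the extensions the app needs are actually loaded. The following is a sketch under assumptions: check_extensions is a hypothetical helper, the extension list mirrors the Dockerfile above, and running a foreign architecture locally requires emulation (for example QEMU/binfmt) to be set up on the host.

```shell
# Hypothetical helper: verify that required PHP extensions load on a given platform.
check_extensions() {
  image="$1"
  platform="$2"
  # `php -m` prints one loaded module per line.
  loaded=$(docker run --rm --platform "$platform" "$image" php -m) || return 1
  for ext in pdo_pgsql zip intl "Zend OPcache"; do
    # -x: match the whole line, -i: ignore case, -q: no output
    if ! printf '%s\n' "$loaded" | grep -qix "$ext"; then
      echo "missing on $platform: $ext" >&2
      return 1
    fi
  done
}
```

Typical use is one call per target, such as check_extensions myreg/scraper:1.2.0 linux/amd64 followed by the same call with linux/arm64. docker buildx imagetools inspect on the tag confirms that both platforms are present in the pushed manifest.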

Comparing to the Python image

The PHP scraper image is roughly the same size and shape as Python's. Differences:

  • PHP needs a small fleet of extensions; Python ships most batteries.
  • Composer's autoload optimization matters; pip has no equivalent.
  • opcache + JIT vs CPython's plain bytecode interpreter; PHP often wins on CPU-bound userland code, though results depend on the workload.
  • FrankenPHP bundles the web server and the PHP runtime in one binary; the usual Python setup (gunicorn + nginx) needs two pieces.

What to try

Take a Symfony Messenger worker that scrapes Catalog108. Build the CLI Dockerfile above. Then:

  1. Time php bin/console list (cache warm vs cold).
  2. Toggle opcache on/off, re-time.
  3. Run messenger:consume in the container with --memory-limit=256M. Check that it self-exits when the limit is hit and Docker restarts it.
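
For steps 1 and 2, a small wrapper keeps the timings comparable. This is only a sketch: time_console is a hypothetical name, and it assumes the image contains bash (Debian-based PHP images do) so that the time keyword is available.

```shell
# Hypothetical helper: time `bin/console list` inside a fresh container.
# Extra arguments are passed to php, e.g. -d opcache.enable_cli=0.
time_console() {
  image="$1"
  shift
  # bash's `time` keyword reports real/user/sys on stderr.
  docker run --rm "$image" \
    bash -c 'time php "$@" bin/console list >/dev/null' bash "$@"
}
```

Run time_console myreg/scraper:1.2.0 for the baseline, then time_console myreg/scraper:1.2.0 -d opcache.enable_cli=0 to compare. Repeat each a few times, since single runs are noisy.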
