Dockerizing PHP / Symfony Scrapers
PHP scrapers ship in Docker too. Symfony-specific patterns, FrankenPHP, opcache, and the differences from Python images.
What you’ll learn
- Write a production Dockerfile for a Symfony scraping app.
- Configure opcache and JIT for performance.
- Decide between php-fpm + nginx, FrankenPHP, and CLI-only containers.
PHP in Docker is a slightly different beast from Python. There's no virtualenv concept; you have extensions, FPM vs CLI, opcache configuration, and the modern alternative, FrankenPHP, to consider.
Image flavor matters
| Image | Use case | Notes |
|---|---|---|
| php:8.3-cli | CLI scrapers, workers, Symfony Console | Smaller, no web stack |
| php:8.3-fpm | Behind nginx for HTTP endpoints | Pair with nginx container |
| php:8.3-apache | Self-contained HTTP server | Convenient for small services |
| dunglas/frankenphp | Modern alternative; single binary, HTTP/2/3, worker mode | Fast, simpler, recommended for new projects |
For pure scraping workers, php:8.3-cli is the right pick: no web server is needed. For Symfony apps that also serve an admin UI or API, FrankenPHP gives you both.
A production CLI Dockerfile
FROM composer:2 AS composer
FROM php:8.3-cli AS builder
COPY --from=composer /usr/bin/composer /usr/local/bin/composer
RUN apt-get update && apt-get install -y --no-install-recommends \
git unzip libpq-dev libzip-dev libicu-dev libxml2-dev \
&& rm -rf /var/lib/apt/lists/*
RUN docker-php-ext-install pdo_pgsql zip intl opcache
WORKDIR /app
COPY composer.json composer.lock symfony.lock ./
RUN composer install --no-dev --no-scripts --no-interaction --prefer-dist --no-progress \
&& composer clear-cache
COPY . .
RUN composer dump-autoload --optimize --no-dev --classmap-authoritative
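# Runtime stage. The package names below (libzip4, libicu72) match Debian 12 (bookworm);
# adjust them if your php:8.3-cli tag is built on a different Debian release.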
FROM php:8.3-cli
RUN apt-get update && apt-get install -y --no-install-recommends \
libpq5 libzip4 libicu72 libxml2 ca-certificates tini \
&& rm -rf /var/lib/apt/lists/*
COPY --from=builder /usr/local/lib/php/extensions /usr/local/lib/php/extensions
COPY --from=builder /usr/local/etc/php/conf.d /usr/local/etc/php/conf.d
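RUN useradd -m -u 1000 scraper
# Symfony writes to var/cache (and often var/log) at runtime, so the app tree must belong to the runtime user
COPY --from=builder --chown=scraper:scraper /app /app
USER scraper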
WORKDIR /app
ENV APP_ENV=prod
ENV APP_DEBUG=0
ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["php", "bin/console", "app:scrape"]
Key points:
- Composer in a separate stage. Don't ship Composer in the runtime image.
- docker-php-ext-install for native extensions. Lighter than PECL when possible.
- Dev dependencies excluded (--no-dev). Symfony Profiler, debug toolbar, fixtures: none of that ships.
- composer dump-autoload --optimize generates a classmap, ~50% faster autoloading at runtime.
- APP_ENV=prod disables debug, enables caching.
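One way to sanity-check the key points above is to build the image and look inside it. The following is a minimal sketch, assuming a hypothetical local tag scraper:dev and a Dockerfile in the current directory. The function names and the DOCKER variable are illustrative only; DOCKER exists so the commands can be dry-run with DOCKER=echo.

```shell
#!/bin/sh
# Sketch: build the image and verify what actually shipped.
# IMAGE is a hypothetical tag; DOCKER=echo prints commands instead of running them.
IMAGE="${IMAGE:-scraper:dev}"
DOCKER="${DOCKER:-docker}"

# Build the multi-stage Dockerfile from the current directory.
build_image() {
  "$DOCKER" build -t "$IMAGE" .
}

# Composer should be absent from the runtime stage.
check_no_composer() {
  "$DOCKER" run --rm --entrypoint sh "$IMAGE" -c \
    'command -v composer >/dev/null && echo "composer present" || echo "composer absent"'
}

# List the PHP modules loaded in the runtime image (pdo_pgsql, zip, intl, ...).
list_extensions() {
  "$DOCKER" run --rm "$IMAGE" php -m
}
```

Calling build_image, then check_no_composer and list_extensions, gives a quick confirmation that the runtime stage carries the extensions but not the build tooling.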
opcache for performance
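Without opcache, PHP recompiles every file each time a script starts. With it, compiled opcodes are cached in shared memory. In CLI mode that cache lives only as long as the process (unless opcache.file_cache is configured), so short-lived scripts gain little. Long-running workers (Symfony Messenger) benefit mainly because the JIT depends on opcache; the 2–10x speedups sometimes cited are workload-dependent, so measure on your own scraper.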
opcache.ini in the image (placed in /usr/local/etc/php/conf.d/, which the official PHP images scan for extra .ini files):
opcache.enable=1
opcache.enable_cli=1
opcache.memory_consumption=256
opcache.max_accelerated_files=20000
opcache.validate_timestamps=0
opcache.jit=tracing
opcache.jit_buffer_size=128M
- enable_cli=1 is required for opcache in CLI mode (off by default).
- validate_timestamps=0 means PHP doesn't recheck files for changes (safe in immutable containers; redeploy = new image = restart).
- jit=tracing enables PHP 8's JIT, meaningful gains for CPU-heavy work like HTML parsing.
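It is worth confirming that the settings actually took effect inside the container, since a misplaced .ini file fails silently. Below is a small sketch, assuming a hypothetical image tag scraper:dev; the function name and the DOCKER variable are illustrative.

```shell
#!/bin/sh
# Sketch: print the effective opcache settings from inside the image.
# IMAGE is a hypothetical tag; DOCKER=echo prints the command instead of running it.
IMAGE="${IMAGE:-scraper:dev}"
DOCKER="${DOCKER:-docker}"

show_opcache_settings() {
  # ini_get() returns false for unknown keys, which here means
  # the opcache extension is not loaded at all.
  "$DOCKER" run --rm "$IMAGE" php -r '
    foreach (["opcache.enable_cli", "opcache.validate_timestamps",
              "opcache.jit", "opcache.jit_buffer_size"] as $key) {
        echo $key, " = ", var_export(ini_get($key), true), PHP_EOL;
    }'
}
```

If any line prints false, or the values differ from the .ini file, PHP is probably not reading the file from the directory you expect.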
Symfony cache pre-warming
For prod images, pre-build the Symfony cache at build time (in the builder stage, after the final composer dump-autoload) so the first run doesn't pay the cost:
RUN php bin/console cache:clear --env=prod --no-debug
RUN php bin/console cache:warmup --env=prod --no-debug
For Messenger workers running in CLI mode, cache:warmup still helps: the kernel boots from the pre-warmed cache.
FrankenPHP (modern alternative)
FrankenPHP is a Caddy-based PHP server that ships as a single binary. For a scraping app that also serves a Symfony API or admin UI, it replaces nginx+fpm entirely:
FROM dunglas/frankenphp:1-php8.3 AS base
RUN install-php-extensions pdo_pgsql intl opcache zip
WORKDIR /app
COPY --from=composer:2 /usr/bin/composer /usr/local/bin/composer
COPY composer.json composer.lock symfony.lock ./
RUN composer install --no-dev --no-scripts --prefer-dist
COPY . .
RUN composer dump-autoload --optimize --no-dev --classmap-authoritative
ENV APP_ENV=prod APP_DEBUG=0
ENV FRANKENPHP_CONFIG="worker /app/public/index.php"
# Single process serves HTTP and runs your app
CMD ["frankenphp", "run", "--config", "/etc/caddy/Caddyfile"]
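Worker mode keeps the booted application in memory across requests (similar to RoadRunner), which avoids the per-request bootstrap that php-fpm pays. With Symfony this typically also needs a worker-aware runtime, such as the runtime/frankenphp-symfony package with APP_RUNTIME pointing at it; check the FrankenPHP documentation for your Symfony version.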
Running Symfony workers in Docker
For Messenger workers, the CMD changes:
CMD ["php", "bin/console", "messenger:consume", "async", "--time-limit=3600", "--memory-limit=256M"]
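--time-limit and --memory-limit make the worker exit cleanly after N seconds or once it exceeds N MB of memory. Docker then restarts it with fresh memory, provided the container has a restart policy that covers clean exits (for example --restart unless-stopped) or an orchestrator that does the same. This is the standard pattern for long-running PHP workers.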
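The restart half of this pattern lives outside the image. Here is a minimal sketch with plain docker run, assuming the image tag from the multi-arch example and a hypothetical container name scraper-worker; the function names and the DOCKER variable are illustrative.

```shell
#!/bin/sh
# Sketch: run the Messenger worker under a restart policy.
# IMAGE and the container name are assumptions; DOCKER=echo dry-runs the commands.
IMAGE="${IMAGE:-myreg/scraper:1.2.0}"
DOCKER="${DOCKER:-docker}"

start_worker() {
  # unless-stopped restarts the container after any exit, including the clean
  # exit the worker makes at its limits; on-failure only reacts to non-zero exits.
  "$DOCKER" run -d --name scraper-worker --restart unless-stopped "$IMAGE" \
    php bin/console messenger:consume async --time-limit=3600 --memory-limit=256M
}

# Print how many times Docker has restarted the worker so far.
worker_restart_count() {
  "$DOCKER" inspect --format '{{.RestartCount}}' scraper-worker
}
```

After start_worker has been running for a while, worker_restart_count should grow by one each time a limit is reached.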
Multi-arch builds
docker buildx build --platform linux/amd64,linux/arm64 -t myreg/scraper:1.2.0 --push .
PHP extensions sometimes have arch-specific quirks. Test both architectures if you deploy to ARM (Apple Silicon, AWS Graviton, Hetzner ARM VMs).
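A quick way to test both architectures is to inspect the pushed manifest and load the extensions on each platform. This is a sketch under assumptions: the image tag is the one from the build command, the function names and DOCKER variable are illustrative, and running a non-native platform locally assumes QEMU/binfmt emulation is set up on the host.

```shell
#!/bin/sh
# Sketch: verify a multi-arch image after pushing it.
# DOCKER=echo prints the commands instead of running them.
IMAGE="${IMAGE:-myreg/scraper:1.2.0}"
DOCKER="${DOCKER:-docker}"

# Show which platforms the pushed manifest list contains.
inspect_platforms() {
  "$DOCKER" buildx imagetools inspect "$IMAGE"
}

# List loaded PHP modules on one platform, e.g. modules_on linux/arm64.
# Needs emulation when the platform differs from the host.
modules_on() {
  "$DOCKER" run --rm --platform "$1" "$IMAGE" php -m
}
```

Comparing the output of modules_on linux/amd64 and modules_on linux/arm64 shows whether any extension failed to build or load on one architecture.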
Comparing to the Python image
The PHP scraper image is roughly the same size and shape as Python's. Differences:
- PHP needs a small fleet of extensions; Python ships most batteries.
- Composer's autoload optimization matters; pip has no equivalent.
- opcache + JIT vs CPython's plain bytecode interpreter; PHP often wins on pure CPU work.
- FrankenPHP is a real "PHP application server" with no direct Python analogue, and it is simpler than gunicorn + nginx.
What to try
Take a Symfony Messenger worker that scrapes Catalog108. Build the CLI Dockerfile above. Then:
- Time php bin/console list (cache warm vs cold).
- Toggle opcache on/off, re-time.
- Run messenger:consume in the container with --memory-limit=256M. Check that it self-exits when the limit is hit and Docker restarts it.
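For the timing steps, a single run of a console command is too short to measure reliably, so one option is to time a batch. The helper below is a hypothetical sketch (time_runs is not a Symfony or Docker command) that runs a command N times and prints the elapsed whole seconds.

```shell
#!/bin/sh
# time_runs N CMD...: run CMD N times, discard its output,
# and print the total elapsed time in whole seconds.
time_runs() {
  runs=$1
  shift
  start=$(date +%s)
  i=0
  while [ "$i" -lt "$runs" ]; do
    "$@" >/dev/null 2>&1
    i=$((i + 1))
  done
  end=$(date +%s)
  echo $((end - start))
}
```

Inside the container, time_runs 20 php bin/console list gives the warm-cache figure, and time_runs 20 php -d opcache.enable_cli=0 bin/console list repeats it with opcache off. For the cold-cache case, remove var/cache/prod before a single run.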