
4.17 · intermediate · 4 min read

Symfony Serializer for Multi-Format Output

One serializer, many formats. Turn scraped DTOs into JSON, CSV, XML, YAML, or custom formats, without writing per-format code.

What you’ll learn

  • Serialize scraped DTOs to JSON, CSV, and XML with the same Symfony Serializer.
  • Use serialization groups to expose different views per consumer.
  • Add a custom normalizer for non-trivial fields.

A scraper's output is data. Different consumers want different formats: JSON for APIs, CSV for spreadsheets, XML for legacy systems. Symfony Serializer handles all of these from one DTO definition.

Install

composer require symfony/serializer-pack

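The pack pulls in the Serializer component (normalizers, attribute support, and the JSON, CSV, and XML encoders) together with PropertyAccess and PropertyInfo. The YAML encoder additionally needs symfony/yaml (composer require symfony/yaml).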

A typed DTO

<?php
namespace App\Dto;

use Symfony\Component\Serializer\Attribute\Groups;

final class ProductDto
{
  public function __construct(
    #[Groups(['public', 'internal'])]
    public string $url,

    #[Groups(['public', 'internal'])]
    public string $title,

    #[Groups(['public', 'internal'])]
    public ?float $price = null,

    #[Groups(['public', 'internal'])]
    public ?string $sku = null,

    #[Groups(['internal'])]
    public ?array $rawData = null,

    #[Groups(['public', 'internal'])]
    public \DateTimeImmutable $scrapedAt = new \DateTimeImmutable(),
  ) {}
}

Groups label which properties belong to which view. We'll use them shortly.

Serialize to any format

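use Symfony\Component\Serializer\SerializerInterface;

// ProductExporter is a placeholder name for whichever service owns the export.
final class ProductExporter
{
  public function __construct(
    private readonly SerializerInterface $serializer,
  ) {}

  /** @param ProductDto[] $products */
  public function emit(array $products, string $format): string
  {
    return $this->serializer->serialize(
      $products,
      $format,  // 'json', 'csv', 'xml', or 'yaml' (YAML needs symfony/yaml)
      ['groups' => ['public']],
    );
  }
}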

Same code, four outputs:

$json = $this->emit($products, 'json');
$csv  = $this->emit($products, 'csv');
$xml  = $this->emit($products, 'xml');
$yaml = $this->emit($products, 'yaml');

No format-specific glue. The serializer dispatches to the right encoder.
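Outside the framework, the same dispatch can be reproduced by wiring a Serializer by hand. The following is a minimal sketch, assuming Symfony Serializer 6.4 or newer with symfony/property-access installed through Composer; DemoProduct is a hypothetical, trimmed-down DTO used only for this demo, and groups are left out to keep the wiring short.

```php
<?php
// Standalone sketch: assumes Composer dependencies are installed next to this script.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\Serializer\Encoder\CsvEncoder;
use Symfony\Component\Serializer\Encoder\JsonEncoder;
use Symfony\Component\Serializer\Encoder\XmlEncoder;
use Symfony\Component\Serializer\Normalizer\ObjectNormalizer;
use Symfony\Component\Serializer\Serializer;

// Hypothetical DTO for illustration only (not the article's ProductDto).
final class DemoProduct
{
    public function __construct(
        public string $url,
        public string $title,
        public ?float $price = null,
    ) {}
}

// The normalizer turns objects into arrays; each encoder turns arrays into one text format.
$serializer = new Serializer(
    [new ObjectNormalizer()],
    [new JsonEncoder(), new CsvEncoder(), new XmlEncoder()],
);

$products = [
    new DemoProduct('https://example.com/p/1', 'Keyboard', 49.99),
    new DemoProduct('https://example.com/p/2', 'Mouse', 19.5),
];

// Same call, three formats: only the format string changes.
$outputs = [];
foreach (['json', 'csv', 'xml'] as $format) {
    $outputs[$format] = $serializer->serialize($products, $format);
    echo "--- {$format} ---\n", $outputs[$format], "\n";
}
```

Inside a Symfony application none of this wiring is needed; the injected serializer service already carries the normalizers and encoders.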

Group-based views

['groups' => ['public']] only serializes properties tagged with public. The rawData property, tagged only internal, is omitted. Same DTO, different visibility per consumer:

// Public API response
$serializer->serialize($p, 'json', ['groups' => ['public']]);

// Internal debug dump
$serializer->serialize($p, 'json', ['groups' => ['internal']]);

This avoids the dual-DTO pattern (ProductPublicDto, ProductInternalDto) for what's really one entity viewed differently.
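To see the two views side by side, here is a small standalone sketch, assuming Symfony Serializer 6.4 or newer (where the Groups attribute and AttributeLoader live in these namespaces). GroupedProduct and its sample data are hypothetical. When wiring by hand, the ObjectNormalizer needs a metadata factory, otherwise the groups context is ignored.

```php
<?php
// Standalone sketch: assumes Composer dependencies are installed next to this script.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\Serializer\Attribute\Groups;
use Symfony\Component\Serializer\Encoder\JsonEncoder;
use Symfony\Component\Serializer\Mapping\Factory\ClassMetadataFactory;
use Symfony\Component\Serializer\Mapping\Loader\AttributeLoader;
use Symfony\Component\Serializer\Normalizer\ObjectNormalizer;
use Symfony\Component\Serializer\Serializer;

// Hypothetical two-field DTO: one field visible everywhere, one internal only.
final class GroupedProduct
{
    public function __construct(
        #[Groups(['public', 'internal'])]
        public string $title,

        #[Groups(['internal'])]
        public ?array $rawData = null,
    ) {}
}

// The metadata factory reads the #[Groups] attributes; without it, groups have no effect.
$normalizer = new ObjectNormalizer(new ClassMetadataFactory(new AttributeLoader()));
$serializer = new Serializer([$normalizer], [new JsonEncoder()]);

$product = new GroupedProduct('Keyboard', ['html_length' => 48211]);

$public   = $serializer->serialize($product, 'json', ['groups' => ['public']]);
$internal = $serializer->serialize($product, 'json', ['groups' => ['internal']]);

echo $public, "\n";    // title only
echo $internal, "\n";  // title plus rawData
```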

CSV output details

By default, Symfony's CsvEncoder produces:

url,title,price,sku,scrapedAt
https://...,Keyboard,49.99,SKU-1,2026-05-12T10:00:00+00:00

Customize:

$csv = $this->serializer->serialize(
  $products, 'csv',
  [
    'csv_headers' => ['url', 'title', 'price'],  // column order; unlisted columns are appended (drop columns with groups)
    'csv_delimiter' => ';',  // European style
    'csv_escape_formulas' => true,  // protect against =SUM()-injection
  ],
);

csv_escape_formulas is critical for any CSV that may be opened in Excel. Without it, cells starting with =, +, -, or @ are interpreted as formulas, a known attack surface (CSV injection).
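The two options can be tried in isolation by calling CsvEncoder directly on plain arrays. This is a sketch with made-up rows; it assumes a recent Symfony Serializer, and the exact prefix character used for escaping may differ between versions, so the code only checks that the cell no longer starts with "=". Note that csv_headers controls column order: a column that is not listed (price here) is appended after the listed ones.

```php
<?php
// Standalone sketch: assumes Composer dependencies are installed next to this script.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\Serializer\Encoder\CsvEncoder;

// Made-up rows; the first title mimics a scraped value that looks like a formula.
$rows = [
    ['title' => '=SUM(A1:A2)', 'url' => 'https://example.com/p/1', 'price' => 49.99],
    ['title' => 'Mouse',       'url' => 'https://example.com/p/2', 'price' => 19.5],
];

$csv = (new CsvEncoder())->encode($rows, 'csv', [
    'csv_headers' => ['url', 'title'],   // url and title first; price is appended
    'csv_escape_formulas' => true,       // neutralize cells starting with =, +, -, @
]);

echo $csv;

$lines = explode("\n", trim($csv));
$headerLine = $lines[0];                           // column order actually emitted
$formulaEscaped = !str_contains($csv, ',=SUM');    // true when the cell was prefixed
var_dump($headerLine, $formulaEscaped);
```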

XML output

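By default, a list of products encodes like this (abridged; each list entry becomes an item element that carries its array index in a key attribute):

<response>
  <item key="0">
    <url>https://...</url>
    <title>Keyboard</title>
    <price>49.99</price>
  </item>
</response>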

Customize root element, format, attributes:

$xml = $this->serializer->serialize($products, 'xml', [
  'xml_root_node_name' => 'products',
  'xml_format_output' => true,  // pretty-print
]);

Custom normalizers

Some fields need transformation that property accessors can't express. Implement NormalizerInterface:

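namespace App\Serializer;

use Symfony\Component\Serializer\Normalizer\NormalizerInterface;
// Also import your own Money value object here; its namespace depends on your app.

class MoneyNormalizer implements NormalizerInterface
{
  public function normalize(mixed $object, ?string $format = null, array $context = []): array
  {
    return [
      'amount' => $object->amount(),
      'currency' => $object->currency(),
      'formatted' => $object->formatted(),
    ];
  }

  // Symfony 7's interface declares $context as a third parameter, so accept it here too.
  public function supportsNormalization(mixed $data, ?string $format = null, array $context = []): bool
  {
    return $data instanceof Money;
  }

  // Tells the serializer which types this normalizer handles and that the answer is cacheable.
  public function getSupportedTypes(?string $format): array
  {
    return [Money::class => true];
  }
}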

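Register it as a service and the serializer picks it up automatically: with autoconfiguration enabled (the default in a standard services.yaml), Symfony tags any class implementing NormalizerInterface as a serializer normalizer.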
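For a quick check without the framework, a normalizer like this can be passed to a hand-built Serializer. The sketch below is self-contained under a few assumptions: Symfony Serializer 6.3 or newer, and a hypothetical Money value object that stores amounts in minor units (cents). Neither the Money class nor its formatting rule comes from a real library.

```php
<?php
// Standalone sketch: assumes Composer dependencies are installed next to this script.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\Serializer\Encoder\JsonEncoder;
use Symfony\Component\Serializer\Normalizer\NormalizerInterface;
use Symfony\Component\Serializer\Serializer;

// Hypothetical value object; amount is in minor units (4999 means 49.99).
final class Money
{
    public function __construct(private int $amount, private string $currency) {}

    public function amount(): int { return $this->amount; }
    public function currency(): string { return $this->currency; }

    public function formatted(): string
    {
        return number_format($this->amount / 100, 2, '.', '').' '.$this->currency;
    }
}

final class MoneyNormalizer implements NormalizerInterface
{
    public function normalize(mixed $data, ?string $format = null, array $context = []): array
    {
        return [
            'amount' => $data->amount(),
            'currency' => $data->currency(),
            'formatted' => $data->formatted(),
        ];
    }

    public function supportsNormalization(mixed $data, ?string $format = null, array $context = []): bool
    {
        return $data instanceof Money;
    }

    public function getSupportedTypes(?string $format): array
    {
        return [Money::class => true];
    }
}

// Manual registration: the normalizer is simply listed in the constructor.
$serializer = new Serializer([new MoneyNormalizer()], [new JsonEncoder()]);

$json = $serializer->serialize(['price' => new Money(4999, 'EUR')], 'json');
echo $json, "\n";
```

The serializer walks the outer array itself and hands only the Money instance to the custom normalizer, which is why no ObjectNormalizer is needed in this small case.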

Streaming large outputs

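For 100k products, building one giant string in memory is wasteful. Instead, normalize and write one row at a time:

$fp = fopen('php://output', 'w');  // any writable stream works
foreach ($repository->iterateAll() as $product) {
  // normalize() turns one DTO into a flat array; nothing is buffered across rows
  $row = $this->serializer->normalize($product, 'csv', ['groups' => ['public']]);
  fputcsv($fp, $row);
}
fclose($fp);

For really large exports, encode item-by-item rather than serializing the whole array. The Serializer class exposes normalize() and encode() separately for this; type-hint NormalizerInterface to call normalize(), since SerializerInterface declares only serialize() and deserialize(). Rows written with fputcsv bypass CsvEncoder options such as csv_escape_formulas, and there is no header row unless you write one yourself.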

Deserializing

Reverse direction: parse a JSON/CSV/XML payload back into DTOs.

$dto = $this->serializer->deserialize($jsonString, ProductDto::class, 'json');

Useful when scraping APIs that return structured payloads: instead of decoding manually, the serializer maps fields onto a typed DTO. Follow it with a Symfony Validator call and parsing plus validation takes two lines.
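API payloads are often lists rather than single objects. Appending [] to the class name asks the serializer for an array of DTOs. The following standalone sketch assumes Symfony Serializer with symfony/property-access installed; ApiProduct and the JSON payload are made up for the demo.

```php
<?php
// Standalone sketch: assumes Composer dependencies are installed next to this script.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\Serializer\Encoder\JsonEncoder;
use Symfony\Component\Serializer\Normalizer\ArrayDenormalizer;
use Symfony\Component\Serializer\Normalizer\ObjectNormalizer;
use Symfony\Component\Serializer\Serializer;

// Hypothetical DTO with scalar fields only, to keep the wiring minimal.
final class ApiProduct
{
    public function __construct(
        public string $url,
        public string $title,
        public ?float $price = null,
    ) {}
}

// ArrayDenormalizer handles the "[]" suffix; ObjectNormalizer builds each DTO.
$serializer = new Serializer(
    [new ArrayDenormalizer(), new ObjectNormalizer()],
    [new JsonEncoder()],
);

// Made-up payload; the second item has no price, so the constructor default applies.
$json = '[{"url":"https://example.com/p/1","title":"Keyboard","price":49.99},'
      . '{"url":"https://example.com/p/2","title":"Mouse"}]';

/** @var ApiProduct[] $products */
$products = $serializer->deserialize($json, ApiProduct::class.'[]', 'json');

foreach ($products as $product) {
    echo $product->title, ': ', $product->price ?? 'n/a', "\n";
}
```

Fields with richer types, such as the DateTimeImmutable in ProductDto, additionally need the matching normalizers and type information, which the framework configures for you.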

When the serializer is overkill

For a single-format internal pipeline (always JSON), json_encode is one line. The serializer pays off when you have:

  • Multiple output formats.
  • Multiple views (groups) of the same data.
  • Complex object graphs (entities with relations).
  • DTOs you want to round-trip (serialize then deserialize for replay).

For a one-shot CSV dump of a flat array, fputcsv is fine.

Hands-on lab

Add a scrape:export Console command that:

  1. Loads N products from Postgres.
  2. Accepts --format=json|csv|xml.
  3. Accepts --groups=public|internal.
  4. Outputs to stdout (so you can pipe it).

Test: php bin/console scrape:export --format=csv --groups=public | head -20. The single command serves three formats. That's the leverage Symfony Serializer buys you.
