
Lesson 3.12 · Intermediate · 5 min read

Building a Clean PHP API Client (PSR-18, Guzzle)

The PHP version of the senior client pattern. Class-based, base URI, middleware, JSON helpers, and PSR-18 compatible so you can swap the transport later.

What you’ll learn

  • Structure a PHP API client around a Guzzle Client and helper methods.
  • Centralise auth, base URI, default headers, and timeouts.
  • Use Guzzle middleware for retry, logging, and auth refresh.
  • Understand PSR-18 and why a typed `ClientInterface` is worth it.

This is the PHP analogue of the Python client lesson. The goals are the same (base URL, session, auth, retries, typed methods); only the idioms differ. By the end you'll have a reusable class that any of your scrapers can consume.

The shape

<?php
namespace App\Catalog108;

use GuzzleHttp\Client;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\Cookie\CookieJar;
use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\RequestInterface;

class Catalog108Client {
  public const BASE_URI = 'https://practice.scrapingcentral.com';

  private Client $http;
  private CookieJar $cookies;
  private ?string $token = null;

  public function __construct(float $timeout = 10.0, int $maxRetries = 5) {
    $this->cookies = new CookieJar();

    // Default middleware stack plus retry (backoff and Retry-After).
    $stack = HandlerStack::create();
    $stack->push(Middleware::retry(
      $this->retryDecider($maxRetries),
      $this->retryDelay()
    ));

    $this->http = new Client([
      'base_uri'    => self::BASE_URI,
      'timeout'     => $timeout,
      'handler'     => $stack,
      'cookies'     => $this->cookies,
      'headers'     => ['Accept' => 'application/json'],
      'http_errors' => true,
    ]);
  }

  public function login(string $email, string $password): array {
    $data = $this->request('POST', '/api/auth/login', [
      'json' => compact('email', 'password'),
    ]);
    // Keep the token so request() can attach it to later calls.
    $this->token = $data['access_token'];
    return $data;
  }

  public function me(): array {
    return $this->request('GET', '/api/auth/me');
  }

  public function products(int $page = 1, int $perPage = 12, ?string $category = null): array {
    $query = ['page' => $page, 'per_page' => $perPage];
    if ($category !== null && $category !== '') {
      $query['category'] = $category;
    }
    return $this->request('GET', '/api/products', ['query' => $query]);
  }

  public function product(int $id): array {
    return $this->request('GET', "/api/products/{$id}");
  }

  public function reviews(int $id): array {
    return $this->request('GET', "/api/products/{$id}/reviews");
  }

  public function search(string $q, int $page = 1): array {
    return $this->request('GET', '/api/products', [
      'query' => ['search' => $q, 'page' => $page],
    ]);
  }

  // Single choke point: injects the auth header and decodes JSON.
  private function request(string $method, string $path, array $opts = []): array {
    if ($this->token) {
      $opts['headers']['Authorization'] = "Bearer {$this->token}";
    }
    $res = $this->http->request($method, $path, $opts);
    $body = (string) $res->getBody();
    return $body === '' ? [] : json_decode($body, true, 512, JSON_THROW_ON_ERROR);
  }

  private function retryDecider(int $max): callable {
    return function (
      int $retries,
      RequestInterface $req,
      ?ResponseInterface $res = null,
      ?\Throwable $e = null
    ) use ($max): bool {
      if ($retries >= $max) return false;
      // Transport failure raised by the handler (DNS, timeout, reset).
      if ($e !== null) return true;
      if ($res !== null) {
        $code = $res->getStatusCode();
        return $code === 429 || $code >= 500;
      }
      return false;
    };
  }

  private function retryDelay(): callable {
    // Guzzle expects the delay in milliseconds.
    return function (int $retries, ?ResponseInterface $res = null): int {
      if ($res && $res->hasHeader('Retry-After')) {
        $ra = $res->getHeaderLine('Retry-After');
        if (is_numeric($ra)) return ((int) $ra) * 1000;
      }
      // Exponential base capped at 30 s, scaled by a random factor in [0, 1].
      $base = min(30, 2 ** $retries);
      return (int) ($base * 1000 * mt_rand(0, 1000) / 1000);
    };
  }
}

Why each piece is there

  • BASE_URI constant: single source of truth for the host.
  • Guzzle Client with base_uri: callers pass relative paths such as '/api/products'.
  • CookieJar: preserves cookies from login across subsequent calls.
  • HandlerStack + Middleware::retry: retries on 429, 5xx, and transport errors, with jittered exponential backoff that honours a numeric Retry-After.
  • http_errors => true: 4xx responses throw GuzzleHttp\Exception\ClientException and 5xx throw ServerException (both extend RequestException).
  • request() choke point: centralizes auth header injection and JSON decoding.
  • Typed public methods: products(), product($id), reviews($id) are the surface other code touches.
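
To make the backoff arithmetic concrete, here is a minimal dependency-free sketch of the same delay calculation. The function name backoffDelayMs and its $jitter parameter are hypothetical, introduced only so the result is deterministic; the class itself draws the jitter from mt_rand.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper mirroring retryDelay(): returns the wait in milliseconds.
 * $jitter is a fraction in [0, 1]; 1.0 gives the upper bound for an attempt.
 */
function backoffDelayMs(int $retries, ?string $retryAfter = null, float $jitter = 1.0): int
{
    // A numeric Retry-After header (seconds) wins over the computed backoff.
    if ($retryAfter !== null && is_numeric($retryAfter)) {
        return (int) $retryAfter * 1000;
    }
    // Exponential growth, capped at 30 seconds.
    $baseSeconds = min(30, 2 ** $retries);
    return (int) round($baseSeconds * 1000 * $jitter);
}

foreach ([1, 2, 3, 4, 5] as $attempt) {
    echo $attempt, ': up to ', backoffDelayMs($attempt), " ms\n";
}
```

With the defaults the upper bound doubles per attempt (2000, 4000, 8000, 16000 ms) until the 30-second cap applies from the fifth attempt onward. A Retry-After given as an HTTP date is not numeric, so it falls through to the computed backoff.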

Usage

$api = new Catalog108Client();
$api->login('student@practice.scrapingcentral.com', 'practice123');
print_r($api->me());

$page = $api->products(page: 1, perPage: 50);
foreach ($page['products'] as $p) {
  echo "{$p['id']} {$p['name']} {$p['price']}\n";
}

$reviews = $api->reviews(1);
foreach ($reviews as $r) {
  echo "{$r['rating']}★ {$r['author']}\n";
}

Pagination iterator (generators in PHP)

PHP supports generators with yield, same idea as Python:

public function iterProducts(int $perPage = 50, ?string $category = null): \Generator {
  $page = 1;
  while (true) {
    $data = $this->products($page, $perPage, $category);
    foreach ($data['products'] as $p) {
      yield $p;
    }
    // Stop on an empty page too, so an inconsistent total cannot loop forever.
    if (empty($data['products']) || $page * $perPage >= $data['pagination']['total']) break;
    $page++;
  }
}

// usage
foreach ($api->iterProducts(perPage: 50, category: 'mugs') as $product) {
  echo $product['name'] . "\n";
}
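
The termination logic is easy to check without touching the network. The sketch below is an illustration rather than part of the client: iterItems and the in-memory $fetchPage closure are hypothetical stand-ins for iterProducts() and products(), and the response shape (products, pagination.total) is assumed to match the one used above.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-alone version of the pagination loop. */
function iterItems(callable $fetchPage, int $perPage): Generator
{
    $page = 1;
    while (true) {
        $data = $fetchPage($page, $perPage);
        foreach ($data['products'] as $item) {
            yield $item;
        }
        // Stop when the page is empty or the reported total has been covered.
        if (empty($data['products']) || $page * $perPage >= $data['pagination']['total']) {
            break;
        }
        $page++;
    }
}

// Fake "API": slices an in-memory catalog and counts how often it is called.
$catalog = ['mug', 'cup', 'bowl', 'plate', 'jug'];
$calls = 0;
$fetchPage = function (int $page, int $perPage) use ($catalog, &$calls): array {
    $calls++;
    return [
        'products'   => array_slice($catalog, ($page - 1) * $perPage, $perPage),
        'pagination' => ['total' => count($catalog)],
    ];
};

$names = iterator_to_array(iterItems($fetchPage, 2), false);
echo implode(', ', $names), " ({$calls} requests)\n";
// prints "mug, cup, bowl, plate, jug (3 requests)"
```

Because the generator is lazy, a caller that stops iterating early also stops the requests early.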

PSR-18 and why it matters

PSR-18 is a PHP-FIG standard defining Psr\Http\Client\ClientInterface:

interface ClientInterface {
  public function sendRequest(RequestInterface $request): ResponseInterface;
}

Any compatible client (Guzzle 7+, Symfony HttpClient via adapter, Buzz, etc.) can be passed to anything expecting a PSR-18 client. Your SDK depends on the interface, not on Guzzle specifically.

For a published SDK (lesson 3.14), this matters: consumers might already have Guzzle wired up, or they might be on Symfony HttpClient. PSR-18 lets them inject whichever they have.

Modify the client to accept any PSR-18 client:

use Psr\Http\Client\ClientInterface as Psr18Client;
use Psr\Http\Message\RequestFactoryInterface;
use Psr\Http\Message\StreamFactoryInterface;

class Catalog108Sdk {
  public function __construct(
    private Psr18Client $http,
    private RequestFactoryInterface $requestFactory,
    private StreamFactoryInterface $streamFactory, // for request bodies, e.g. POST JSON
    public string $baseUri = 'https://practice.scrapingcentral.com',
  ) {}

  public function products(int $page = 1): array {
    $request = $this->requestFactory
      ->createRequest('GET', "{$this->baseUri}/api/products?page={$page}")
      ->withHeader('Accept', 'application/json');
    // PSR-18 clients return 4xx/5xx responses rather than throwing.
    $res = $this->http->sendRequest($request);
    return json_decode((string) $res->getBody(), true, 512, JSON_THROW_ON_ERROR);
  }
}

Now consumers wire it however they like:

$sdk = new Catalog108Sdk(
  http: new GuzzleHttp\Client(),
  requestFactory: new GuzzleHttp\Psr7\HttpFactory(),
  streamFactory: new GuzzleHttp\Psr7\HttpFactory(),
);
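
Depending on the interface also pays off in tests: any object implementing sendRequest() can stand in for the network. The following is a sketch under assumptions, not part of the SDK: StubClient and fetchProducts are hypothetical names, and it assumes guzzlehttp/guzzle is installed via Composer so that the PSR interfaces and GuzzleHttp\Psr7 classes are autoloadable.

```php
<?php
declare(strict_types=1);

require __DIR__ . '/vendor/autoload.php'; // assumption: composer require guzzlehttp/guzzle

use GuzzleHttp\Psr7\HttpFactory;
use GuzzleHttp\Psr7\Response;
use Psr\Http\Client\ClientInterface;
use Psr\Http\Message\RequestFactoryInterface;
use Psr\Http\Message\RequestInterface;
use Psr\Http\Message\ResponseInterface;

/** Hypothetical in-memory transport: returns a canned response and records requests. */
final class StubClient implements ClientInterface
{
    /** @var RequestInterface[] */
    public array $sent = [];

    public function __construct(private ResponseInterface $response) {}

    public function sendRequest(RequestInterface $request): ResponseInterface
    {
        $this->sent[] = $request;
        return $this->response;
    }
}

/** Works with any PSR-18 client, real or stubbed. */
function fetchProducts(ClientInterface $http, RequestFactoryInterface $factory, int $page): array
{
    $request = $factory
        ->createRequest('GET', "https://practice.scrapingcentral.com/api/products?page={$page}")
        ->withHeader('Accept', 'application/json');
    $response = $http->sendRequest($request);
    // PSR-18 forbids throwing on 4xx/5xx, so the status check is the caller's job.
    if ($response->getStatusCode() >= 400) {
        throw new RuntimeException("HTTP {$response->getStatusCode()}");
    }
    return json_decode((string) $response->getBody(), true, 512, JSON_THROW_ON_ERROR);
}

$stub = new StubClient(new Response(200, [], '{"products":[{"id":1,"name":"Mug"}]}'));
$data = fetchProducts($stub, new HttpFactory(), 2);

echo $data['products'][0]['name'], "\n"; // prints "Mug"
echo $stub->sent[0]->getUri(), "\n";
```

The stub never opens a socket, so tests built this way stay fast and do not depend on the sandbox being reachable.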

For your own internal scrapers you don't need PSR-18; just use Guzzle directly. For a public SDK, depend on the interface.

Error handling

use GuzzleHttp\Exception\ClientException;
use GuzzleHttp\Exception\ServerException;
use GuzzleHttp\Exception\ConnectException;

try {
  $page = $api->products(99999);
} catch (ClientException $e) {
  // 4xx, your request was wrong (404, 401, 422)
  $code = $e->getResponse()->getStatusCode();
  echo "Client error {$code}\n";
} catch (ServerException $e) {
  // 5xx, the server failed (after retries exhausted)
  echo "Server error\n";
} catch (ConnectException $e) {
  // Network / DNS / timeout
  echo "Network error: " . $e->getMessage() . "\n";
}
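
These branches can be exercised offline with Guzzle's MockHandler, which serves queued responses instead of making requests. A minimal sketch, assuming guzzlehttp/guzzle is installed via Composer; the example.test URL is a placeholder that is never contacted.

```php
<?php
declare(strict_types=1);

require __DIR__ . '/vendor/autoload.php'; // assumption: composer require guzzlehttp/guzzle

use GuzzleHttp\Client;
use GuzzleHttp\Exception\ClientException;
use GuzzleHttp\Exception\RequestException;
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Psr7\Response;

// Queue a single canned 404; no network traffic happens.
$mock = new MockHandler([new Response(404, [], '{"error":"not found"}')]);
$client = new Client(['handler' => HandlerStack::create($mock)]);

$caught = null;
try {
    $client->request('GET', 'https://example.test/api/products/99999');
} catch (ClientException $e) {
    $caught = $e;
}

echo get_class($caught), ' ', $caught->getResponse()->getStatusCode(), "\n";
// ClientException is also a RequestException, so catch order matters.
var_dump($caught instanceof RequestException);
```

Because ClientException and ServerException both extend RequestException, a catch (RequestException $e) placed first would swallow the more specific branches. List the narrow types before any broad one.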

Hands-on lab

Build the Catalog108Client class above in a fresh Composer project. Log in, fetch products, and fetch a specific product's reviews. Add the iterProducts generator and use it to count all products in the catalog. Then confirm that a --max-retries=0 variant (the constructor's $maxRetries set to 0) fails against a temporarily down server while the default --max-retries=5 succeeds: that is the retry middleware doing its job.
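
If no flaky server is at hand, the retry comparison can be rehearsed with a mocked transport first. This is a sketch under assumptions: statusAfterRetries is a hypothetical helper, guzzlehttp/guzzle is assumed to be installed via Composer, the delay is fixed at 0 ms to keep the run instant, and http_errors is disabled so the final status can be inspected without a try/catch.

```php
<?php
declare(strict_types=1);

require __DIR__ . '/vendor/autoload.php'; // assumption: composer require guzzlehttp/guzzle

use GuzzleHttp\Client;
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\Psr7\Response;
use Psr\Http\Message\RequestInterface;
use Psr\Http\Message\ResponseInterface;

/** Hypothetical helper: the server "recovers" on the second attempt. */
function statusAfterRetries(int $maxRetries): int
{
    $mock = new MockHandler([
        new Response(503),                    // first attempt: temporarily down
        new Response(200, [], '{"ok":true}'), // second attempt: healthy again
    ]);
    $stack = HandlerStack::create($mock);
    $stack->push(Middleware::retry(
        function (
            int $retries,
            RequestInterface $req,
            ?ResponseInterface $res = null,
            ?Throwable $e = null
        ) use ($maxRetries): bool {
            if ($retries >= $maxRetries) {
                return false;
            }
            if ($e !== null) {
                return true;
            }
            return $res !== null
                && ($res->getStatusCode() === 429 || $res->getStatusCode() >= 500);
        },
        fn (): int => 0 // no wait between attempts in this demo
    ));
    $client = new Client(['handler' => $stack, 'http_errors' => false]);

    return $client->request('GET', 'https://example.test/api/products')->getStatusCode();
}

echo 'maxRetries=0 -> ', statusAfterRetries(0), "\n"; // 503: gives up immediately
echo 'maxRetries=5 -> ', statusAfterRetries(5), "\n"; // 200: one retry was enough
```

The decider is the same rule the client uses; only the delay and the transport are swapped out.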

Practice this lesson on Catalog108, our first-party scraping sandbox.

Lab target: /api/products
