Building a Clean Python API Client (Class Design)
Stop writing inline requests calls. Wrap each target in a class: base URL, session, auth, retries, typed methods. It's the shape every senior Python scraper uses.
What you’ll learn
- Structure an API client as a small class with a Session, base URL, and typed methods.
- Centralise auth and headers so scraping code stays readable.
- Add timeouts, JSON shortcuts, and basic error handling.
- Recognise when to escalate to httpx, async, or a real SDK.
Inline requests.get(...) calls are fine for a one-off. The moment you have more than two endpoints, you want a client class. It centralizes the base URL, the session, the auth, the headers, and the error handling, so your scraping logic stays focused on data.
The shape
from __future__ import annotations

import requests
from typing import Any


class Catalog108Client:
    BASE_URL = "https://practice.scrapingcentral.com"

    def __init__(self, timeout: float = 10.0):
        self.session = requests.Session()
        self.session.headers.update({"Accept": "application/json"})
        self.timeout = timeout
        self.token: str | None = None

    def _request(self, method: str, path: str, **kwargs) -> Any:
        url = f"{self.BASE_URL}{path}"
        kwargs.setdefault("timeout", self.timeout)
        if self.token:
            kwargs.setdefault("headers", {})
            kwargs["headers"].setdefault("Authorization", f"Bearer {self.token}")
        r = self.session.request(method, url, **kwargs)
        r.raise_for_status()
        return r.json() if r.content else None

    def login(self, email: str, password: str) -> None:
        data = self._request("POST", "/api/auth/login",
                             json={"email": email, "password": password})
        self.token = data["access_token"]

    def me(self) -> dict:
        return self._request("GET", "/api/auth/me")

    def products(self, page: int = 1, per_page: int = 12, category: str | None = None) -> dict:
        params = {"page": page, "per_page": per_page}
        if category:
            params["category"] = category
        return self._request("GET", "/api/products", params=params)

    def product(self, product_id: int) -> dict:
        return self._request("GET", f"/api/products/{product_id}")

    def reviews(self, product_id: int) -> list[dict]:
        return self._request("GET", f"/api/products/{product_id}/reviews")
Why each piece is there
- BASE_URL as a class constant: change it once and the whole client moves.
- requests.Session(): connection reuse (TCP keepalive) and cookie persistence. 5–10x faster than calling requests.get repeatedly.
- session.headers.update(...): defaults that apply to every call. No more remembering to pass Accept.
- self.token: stored after login, automatically added to every authenticated call via _request.
- _request(method, path, **kwargs): a single choke point for cross-cutting concerns (timeout, auth header, error handling). Easy to add retries here later (lesson 3.10); the sketch after this list shows the same idea with logging.
- r.raise_for_status(): turns 4xx/5xx into exceptions. Your calling code stops checking status codes.
- Typed methods (products, product, reviews): readable, discoverable, IDE-completable.
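To make the choke-point idea concrete, here is a minimal sketch (not part of the lesson's client; the subclass name and log format are illustrative) that times and logs every call by overriding only _request:

import logging
import time

logger = logging.getLogger("catalog108")

class LoggingCatalog108Client(Catalog108Client):
    """Hypothetical subclass: every request is timed and logged in one place."""

    def _request(self, method: str, path: str, **kwargs):
        start = time.perf_counter()
        try:
            return super()._request(method, path, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            logger.info("%s %s took %.2fs", method, path, elapsed)

Because every public method funnels through _request, none of them need to change.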
Usage
client = Catalog108Client()
client.login("student@practice.scrapingcentral.com", "practice123")
print(client.me())

page1 = client.products(page=1, per_page=50)
for p in page1["products"]:
    print(p["id"], p["name"], p["price"])

reviews = client.reviews(product_id=1)
for r in reviews:
    print(r["rating"], r["author"], r["text"])
Read that out loud. It's almost prose. Compare to inline:
# old style: brittle, repetitive
r = requests.post("https://practice.scrapingcentral.com/api/auth/login",
                  json={"email": "...", "password": "..."})
token = r.json()["access_token"]
r = requests.get("https://practice.scrapingcentral.com/api/auth/me",
                 headers={"Authorization": f"Bearer {token}"})
me = r.json()
# ... 50 more lines, all repeating the URL and the header
Adding retries, the next layer
The _request choke point makes retries trivial:
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

class Catalog108Client:
    def __init__(self, timeout: float = 10.0):
        self.session = requests.Session()
        retry = Retry(
            total=5,
            backoff_factor=1.0,  # exponential backoff: delays double between attempts
            status_forcelist=[429, 500, 502, 503, 504],
            allowed_methods=["GET", "POST"],
        )
        adapter = HTTPAdapter(max_retries=retry)
        self.session.mount("https://", adapter)
        self.session.mount("http://", adapter)
        self.session.headers.update({"Accept": "application/json"})
        self.timeout = timeout
        self.token = None

    # _request and methods unchanged
Now the client retries 5xx and 429 responses with exponential backoff automatically. Lesson 3.10 dives deeper.
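If you want the retry policy tunable per instance instead of hard-coded, one option (an extension sketch; the retries and backoff parameters are assumptions, not part of the lesson's client) is to accept it in the constructor:

class Catalog108Client:
    BASE_URL = "https://practice.scrapingcentral.com"

    def __init__(self, timeout: float = 10.0, retries: int = 5, backoff: float = 1.0):
        self.session = requests.Session()
        retry = Retry(
            total=retries,  # the caller decides how persistent the client is
            backoff_factor=backoff,
            status_forcelist=[429, 500, 502, 503, 504],
            allowed_methods=["GET", "POST"],
        )
        adapter = HTTPAdapter(max_retries=retry)
        self.session.mount("https://", adapter)
        self.session.mount("http://", adapter)
        self.session.headers.update({"Accept": "application/json"})
        self.timeout = timeout
        self.token = None

A caller can then ask for a more patient client on a flaky network (Catalog108Client(retries=8)) or a stricter one in tests.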
Iterators for pagination
A nice ergonomic touch, yield pages, let callers loop without bookkeeping:
def iter_products(self, per_page: int = 50, category: str | None = None):
    page = 1
    while True:
        data = self.products(page=page, per_page=per_page, category=category)
        for p in data["products"]:
            yield p
        if page * per_page >= data["pagination"]["total"]:
            break
        page += 1

# Usage
for product in client.iter_products(per_page=50, category="mugs"):
    print(product["name"])
The caller writes one for loop. The client does pagination.
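Because iter_products is a plain generator, it also composes with itertools. A usage sketch, assuming the client and login from earlier, that stops after the first 100 items regardless of page size:

from itertools import islice

client = Catalog108Client()
client.login("student@practice.scrapingcentral.com", "practice123")

# islice abandons the generator after 100 items, so no further pages are fetched.
for product in islice(client.iter_products(per_page=50, category="mugs"), 100):
    print(product["id"], product["name"])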
Error handling
Three error tiers worth raising explicitly:
class APIError(Exception):
    pass

class AuthError(APIError):
    pass

class RateLimited(APIError):
    pass


def _request(self, method, path, **kwargs):
    # ... same setup ...
    r = self.session.request(method, url, **kwargs)
    if r.status_code == 401:
        raise AuthError(f"401 at {path}; token may be expired")
    if r.status_code == 429:
        raise RateLimited(f"429 at {path}; retry-after={r.headers.get('Retry-After')}")
    if not r.ok:
        raise APIError(f"{r.status_code} at {path}: {r.text[:200]}")
    return r.json() if r.content else None
Now callers can write try: ... except AuthError: client.login(...) and the recovery logic stays explicit.
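A usage sketch of that recovery path (the re-login-then-retry-once policy is an illustrative choice, not something the lesson prescribes):

try:
    profile = client.me()
except AuthError:
    # Token expired or missing: log in again and retry the call once.
    client.login("student@practice.scrapingcentral.com", "practice123")
    profile = client.me()
except RateLimited as exc:
    # Too many requests: surface it and let a later run pick this up.
    print(f"rate limited: {exc}")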
When to scale up
The hand-rolled class is right for one target, one or two scrapers, and fewer than 50 endpoints.
Outgrow it when:
- You need async → switch to httpx.AsyncClient (lesson 3.11); a sketch follows after this list.
- You have many endpoints (50+) → generate the client from the OpenAPI spec (openapi-python-client).
- You're shipping a distributable package → wrap it in a real SDK (the lesson-3.14/3.15 pattern is shown in PHP, but the Python equivalents are mature).
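For orientation, a minimal sketch of what the same shape looks like on httpx.AsyncClient; it mirrors the class above, and connection limits, retries, and token refresh are left to lesson 3.11:

from __future__ import annotations

import httpx

class AsyncCatalog108Client:
    BASE_URL = "https://practice.scrapingcentral.com"

    def __init__(self, timeout: float = 10.0):
        # base_url lets every call pass a relative path, like _request did with requests.
        self.client = httpx.AsyncClient(
            base_url=self.BASE_URL,
            headers={"Accept": "application/json"},
            timeout=timeout,
        )
        self.token: str | None = None

    async def _request(self, method: str, path: str, **kwargs):
        headers = kwargs.pop("headers", {})
        if self.token:
            headers.setdefault("Authorization", f"Bearer {self.token}")
        r = await self.client.request(method, path, headers=headers, **kwargs)
        r.raise_for_status()
        return r.json() if r.content else None

    async def products(self, page: int = 1, per_page: int = 12) -> dict:
        return await self._request("GET", "/api/products",
                                   params={"page": page, "per_page": per_page})

    async def aclose(self) -> None:
        await self.client.aclose()

Calls run inside an event loop (asyncio.run(...)), and the client should be closed with aclose() when you're done.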
Hands-on lab
Build the Catalog108Client above end-to-end. Add a search(query: str) method that hits /api/products?search=.... Add error handling that distinguishes 401 from 429 from other failures. Drive your existing scrapers through this client instead of inline requests.get. You should immediately notice your scraping code shrinks by half and reads twice as well.
Practice this lesson on Catalog108, our first-party scraping sandbox.