Handling Rate Limiting in API Scraping

Learn how to detect, handle, and work around API rate limits using backoff strategies, concurrent throttling, and proxy rotation.

Rate limiting is how APIs protect themselves from abuse. When you exceed the allowed request frequency, the server returns a 429 Too Many Requests status. Handling this gracefully is essential for reliable scraping.

Detecting Rate Limits

Look for these signals in API responses:

import requests

response = requests.get("https://api.example.com/data", timeout=15)

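# A 429 status means the server is rate limiting this client
if response.status_code == 429:
    # Retry-After is optional and may be a number of seconds or an HTTP date
    retry_after = response.headers.get("Retry-After", "60")
    print(f"Rate limited. Retry-After: {retry_after}")

# Many APIs include rate limit headers (the exact names vary by API)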
print(f"Remaining: {response.headers.get('X-RateLimit-Remaining')}")
print(f"Reset at: {response.headers.get('X-RateLimit-Reset')}")
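
Because `Retry-After` can carry either a number of seconds or an HTTP date, it helps to normalize it before sleeping. The helper below is a minimal sketch, not part of any library: the name `parse_retry_after`, its `default` fallback, and the `now` parameter (included only to make it easy to test) are assumptions made for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def parse_retry_after(value, default=60.0, now=None):
    """Return how many seconds to wait for a Retry-After header value.

    Hypothetical helper: falls back to `default` when the header is
    missing or cannot be parsed.
    """
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        # Form 1: a delay in whole seconds, e.g. "120"
        return float(value)
    try:
        # Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        retry_at = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if retry_at.tzinfo is None:
        # Treat dates without an offset as UTC
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # Never return a negative wait if the date is already in the past
    return max(0.0, (retry_at - now).total_seconds())
```

With `requests`, this could be called as `parse_retry_after(response.headers.get("Retry-After"))`.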

Exponential Backoff

The standard approach is to wait progressively longer after each failure:

import requests
import time

def fetch_with_backoff(url, max_retries=5):
    for attempt in range(max_retries):
        response = requests.get(url, timeout=15)

        if response.status_code == 200:
            return response.json()

        if response.status_code == 429:
            # Default to exponential backoff: 1s, 2s, 4s, 8s, 16s (capped at 60s)
            wait_time = min(2 ** attempt, 60)
            # Prefer the server's Retry-After when it is a number of seconds;
            # it may also be an HTTP date, which int() cannot parse
            retry_after = response.headers.get("Retry-After", "")
            if retry_after.isdigit():
                wait_time = int(retry_after)
            print(f"Rate limited. Waiting {wait_time}s (attempt {attempt + 1})")
            time.sleep(wait_time)
        else:
            # Raise on any other 4xx/5xx response instead of retrying
            response.raise_for_status()

    raise RuntimeError(f"Failed after {max_retries} retries")

data = fetch_with_backoff("https://api.example.com/items")
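
One common refinement is to randomize the wait so that many clients limited at the same moment do not all retry in lockstep. The sketch below shows a "full jitter" variant of the delay calculation. The function name `backoff_delay` and the parameters `base`, `cap`, and `rng` are illustrative assumptions, not something the code above already defines.

```python
import random

def backoff_delay(attempt, base=1.0, cap=60.0, rng=random):
    """Return a randomized wait in seconds for the given attempt (0-based).

    Hypothetical helper: the ceiling doubles on each attempt up to `cap`,
    and the actual wait is drawn uniformly between 0 and that ceiling.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling)

# Example: print one possible delay for each of the first five attempts
for attempt in range(5):
    print(f"attempt {attempt}: wait up to {min(60.0, 2 ** attempt)}s, "
          f"chose {backoff_delay(attempt):.2f}s")
```

In `fetch_with_backoff`, this value could stand in for `min(2 ** attempt, 60)` whenever the server sends no usable `Retry-After`.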

Using the `tenacity` Library

For production scrapers, tenacity provides clean retry decorators:

import requests
from tenacity import retry, wait_exponential, stop_after_attempt, retry_if_result

def is_rate_limited(response):
    return response.status_code == 429

@retry(
    retry=retry_if_result(is_rate_limited),
    wait=wait_exponential(multiplier=1, min=2, max=60),
    stop=stop_after_attempt(6),
)
def fetch(url):
    return requests.get(url, timeout=15)

# If every attempt is rate limited, tenacity raises tenacity.RetryError here
response = fetch("https://api.example.com/data")
print(response.json())

Throttled Concurrent Requests

When scraping in parallel, use a semaphore to cap concurrency:

import asyncio
import httpx

async def fetch_throttled(urls, max_concurrent=5):
    # At most max_concurrent requests are in flight at any moment
    semaphore = asyncio.Semaphore(max_concurrent)

    async def fetch_one(client, url):
        async with semaphore:
            response = await client.get(url, timeout=15)
            await asyncio.sleep(0.5)  # Polite delay before releasing the slot
            return response.json()

    async with httpx.AsyncClient() as client:
        tasks = [fetch_one(client, url) for url in urls]
        # gather returns results in the same order as urls
        return await asyncio.gather(*tasks)

urls = [f"https://api.example.com/item/{i}" for i in range(100)]
data = asyncio.run(fetch_throttled(urls))
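
A semaphore caps how many requests are in flight, but it does not directly cap requests per second. When an API documents a per-second limit, spacing out request starts may map onto it more directly. The class below is a minimal sketch using only the standard library. `IntervalLimiter`, its `rate` parameter, and the `demo` coroutine are hypothetical names introduced for illustration.

```python
import asyncio
import time

class IntervalLimiter:
    """Spaces out callers so that roughly `rate` of them start per second."""

    def __init__(self, rate):
        self._interval = 1.0 / rate
        self._next_start = 0.0  # Monotonic time of the next free start slot

    async def wait(self):
        now = time.monotonic()
        # Reserve the next free slot; there is no await between the read
        # and the write, so concurrent tasks cannot claim the same slot
        start_at = max(now, self._next_start)
        self._next_start = start_at + self._interval
        await asyncio.sleep(start_at - now)

async def demo():
    limiter = IntervalLimiter(rate=20)  # About 20 starts per second
    started = []

    async def job():
        await limiter.wait()
        # A real scraper would send its request here
        started.append(time.monotonic())

    await asyncio.gather(*(job() for _ in range(5)))
    return started

timestamps = asyncio.run(demo())
print(f"5 jobs started over {max(timestamps) - min(timestamps):.2f}s")
```

In `fetch_one`, calling `await limiter.wait()` before `client.get` would combine both controls: the semaphore bounds concurrency and the limiter bounds the start rate.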

Rate Limit Strategies

Strategy	When to Use
Fixed delay (`time.sleep`)	Known rate limit, low volume
Exponential backoff	Unknown or variable limits
Adaptive throttling	API exposes rate-limit headers
Proxy rotation	Need to exceed per-IP limits
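
Adaptive throttling can be sketched as a function that turns rate-limit headers into a pause before the next request. This example assumes `X-RateLimit-Reset` holds a Unix timestamp in seconds; some APIs send seconds remaining or use different header names, so check the API's documentation. The function name `adaptive_delay` and the `min_delay` parameter are hypothetical.

```python
import time

def adaptive_delay(headers, now=None, min_delay=0.0):
    """Return seconds to pause before the next request.

    Assumes X-RateLimit-Remaining is an integer count and
    X-RateLimit-Reset is a Unix timestamp (an assumption; APIs differ).
    """
    now = time.time() if now is None else now
    try:
        remaining = int(headers["X-RateLimit-Remaining"])
        reset_at = float(headers["X-RateLimit-Reset"])
    except (KeyError, ValueError):
        # Headers missing or malformed: fall back to the minimum delay
        return min_delay
    window = max(0.0, reset_at - now)
    if remaining <= 0:
        # Budget spent: wait until the window resets
        return max(min_delay, window)
    # Spread the remaining requests evenly over the rest of the window
    return max(min_delay, window / remaining)
```

After each response, `time.sleep(adaptive_delay(response.headers))` would slow the scraper down as the remaining budget shrinks.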

For scraping at volume beyond a single IP's rate limit, ScraperAPI rotates through millions of proxies, effectively multiplying your allowed request rate.

Next Steps

Scrape JSON API responses efficiently
Use async HTTPX for concurrent scraping
Build a complete API data pipeline with retry logic