Handling Rate Limiting in API Scraping
Learn how to detect, handle, and work around API rate limits using backoff strategies, concurrent throttling, and proxy rotation.
Rate limiting is how APIs protect themselves from abuse. When you exceed the allowed request frequency, the server returns a 429 Too Many Requests status. Handling this gracefully is essential for reliable scraping.
Detecting Rate Limits
Look for these signals in API responses:
import requests

response = requests.get("https://api.example.com/data", timeout=15)

# Check status code
if response.status_code == 429:
    retry_after = response.headers.get("Retry-After", 60)
    print(f"Rate limited. Retry after {retry_after} seconds.")

# Many APIs include rate limit headers
print(f"Remaining: {response.headers.get('X-RateLimit-Remaining')}")
print(f"Reset at: {response.headers.get('X-RateLimit-Reset')}")
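The Retry-After header can carry either a number of seconds or an HTTP date, so treating it as an integer is not always safe. The following is a minimal sketch of a parser that accepts both forms; the helper name `parse_retry_after` and its `default` and `now` parameters are illustrative assumptions, not part of any library.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, default=60.0, now=None):
    """Return the number of seconds to wait for a Retry-After header value."""
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        # Delay-seconds form, e.g. "120"
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        retry_at = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Unparseable value: fall back to the caller's default
        return default
    if retry_at.tzinfo is None:
        # Treat a date without an offset as UTC
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # Never return a negative wait if the date is already in the past
    return max(0.0, (retry_at - now).total_seconds())
```

A call such as `parse_retry_after(response.headers.get("Retry-After"))` could then feed `time.sleep` directly.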
Exponential Backoff
The standard approach is to wait progressively longer after each failure:
import requests
import time

def fetch_with_backoff(url, max_retries=5):
    for attempt in range(max_retries):
        response = requests.get(url, timeout=15)
        if response.status_code == 200:
            return response.json()
        if response.status_code == 429:
            wait_time = min(2 ** attempt, 60)  # 1s, 2s, 4s, 8s, 16s...
            retry_after = response.headers.get("Retry-After")
            # Retry-After may also be an HTTP date; only trust it when it is a number of seconds
            if retry_after and retry_after.isdigit():
                wait_time = int(retry_after)
            print(f"Rate limited. Waiting {wait_time}s (attempt {attempt + 1})")
            time.sleep(wait_time)
        else:
            # Any other error status is raised immediately instead of retried
            response.raise_for_status()
    raise RuntimeError(f"Failed after {max_retries} retries")

data = fetch_with_backoff("https://api.example.com/items")
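When many workers hit the same limit at once, purely deterministic waits can make them all retry at the same moment. A common refinement is to randomize each wait ("jitter"). The sketch below shows one variant, picking a uniform wait between zero and the exponential ceiling; the function name `backoff_delay` and its `base`, `cap`, and `rng` parameters are assumptions made for illustration.

```python
import random


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random):
    """Return a randomized wait (seconds) for a zero-based retry attempt."""
    # Exponential ceiling: base * 2^attempt, capped so waits never grow unbounded
    ceiling = min(cap, base * (2 ** attempt))
    # "Full jitter": pick uniformly between 0 and the ceiling
    return rng.uniform(0, ceiling)


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    for attempt in range(5):
        ceiling = min(60.0, 2 ** attempt)
        wait = backoff_delay(attempt, rng=rng)
        print(f"attempt {attempt}: ceiling {ceiling:.0f}s, chose {wait:.2f}s")
```

The returned value could replace the fixed `2 ** attempt` wait in a retry loop whenever the server sends no usable Retry-After header.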
Using the tenacity Library
For production scrapers, tenacity provides clean retry decorators:
import requests
from tenacity import retry, wait_exponential, stop_after_attempt, retry_if_result

def is_rate_limited(response):
    # Retry only when the server answers 429
    return response.status_code == 429

@retry(
    retry=retry_if_result(is_rate_limited),
    wait=wait_exponential(multiplier=1, min=2, max=60),
    stop=stop_after_attempt(6),
)
def fetch(url):
    return requests.get(url, timeout=15)

# Raises tenacity.RetryError if all 6 attempts are rate limited
response = fetch("https://api.example.com/data")
print(response.json())
Throttled Concurrent Requests
When scraping in parallel, use a semaphore to cap concurrency:
import asyncio
import httpx

async def fetch_throttled(urls, max_concurrent=5):
    semaphore = asyncio.Semaphore(max_concurrent)

    async def fetch_one(client, url):
        async with semaphore:
            response = await client.get(url, timeout=15)
            await asyncio.sleep(0.5)  # Polite delay while still holding the slot
            return response.json()

    async with httpx.AsyncClient() as client:
        tasks = [fetch_one(client, url) for url in urls]
        results = await asyncio.gather(*tasks)
    return results

urls = [f"https://api.example.com/item/{i}" for i in range(100)]
data = asyncio.run(fetch_throttled(urls))
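To see that the semaphore really caps the number of requests in flight, the following sketch replaces the network call with a short sleep and records the peak concurrency. It uses only the standard library; `run_throttled` and `fake_request` are hypothetical names, and the sleep stands in for `client.get`.

```python
import asyncio


async def run_throttled(n_tasks, max_concurrent=5, delay=0.01):
    """Run n_tasks fake requests and report the peak number in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def fake_request(i):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(delay)  # stands in for the network call
            in_flight -= 1
            return i

    # gather returns results in the same order as the tasks were passed in
    results = await asyncio.gather(*(fake_request(i) for i in range(n_tasks)))
    return results, peak


results, peak = asyncio.run(run_throttled(20, max_concurrent=5))
print(f"{len(results)} requests done, peak concurrency {peak}")
```

All 20 tasks are created at once, but no more than `max_concurrent` of them are ever inside the `async with semaphore` block at the same time.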
Rate Limit Strategies
| Strategy | When to Use |
|---|---|
| Fixed delay (time.sleep) | Known rate limit, low volume |
| Exponential backoff | Unknown or variable limits |
| Adaptive throttling | Reading rate-limit headers |
| Proxy rotation | Need to exceed per-IP limits |
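Adaptive throttling, as listed in the table, can be sketched as a function that turns rate-limit headers into a pause before the next request. This example assumes `X-RateLimit-Reset` holds a Unix timestamp; some APIs send seconds until reset instead, so check the API's documentation. The name `delay_from_headers` and its parameters are illustrative.

```python
import time


def delay_from_headers(headers, now=None, min_delay=0.0):
    """Suggest a pause (seconds) before the next request from rate-limit headers."""
    now = time.time() if now is None else now
    try:
        remaining = int(headers["X-RateLimit-Remaining"])
        reset_at = float(headers["X-RateLimit-Reset"])  # assumed Unix timestamp
    except (KeyError, ValueError):
        # Headers missing or in another format: fall back to the minimum delay
        return min_delay
    window = max(0.0, reset_at - now)
    if remaining <= 0:
        # Quota exhausted: wait until the window resets
        return max(min_delay, window)
    # Spread the remaining quota evenly over the rest of the window
    return max(min_delay, window / remaining)
```

With the requests library, `time.sleep(delay_from_headers(response.headers))` after each response would slow the scraper down as the quota runs out.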
To scrape at volumes beyond a single IP's rate limit, ScraperAPI rotates requests through millions of proxies, so each per-IP limit applies to one proxy address rather than to your whole scraper.
Next Steps
- Scrape JSON API responses efficiently
- Use async HTTPX for concurrent scraping
- Build a complete API data pipeline with retry logic