Using ScraperAPI to Bypass Anti-Bot Protection
Complete guide to using ScraperAPI for web scraping with automatic proxy rotation, CAPTCHA solving, and anti-bot bypass.
ScraperAPI is a web scraping API that handles proxy rotation, CAPTCHA solving, and anti-bot bypass automatically. Instead of managing your own proxy infrastructure, you send requests through their API and get back clean HTML.
Getting Started
Sign up at scraperapi.com to get your API key. The free tier includes 5,000 API credits per month.
pip install requests
Method 1: API Endpoint
The simplest way to use ScraperAPI is to pass the target URL as a query parameter:
import requests
API_KEY = "YOUR_SCRAPERAPI_KEY"
response = requests.get(
    "http://api.scraperapi.com",
    params={
        "api_key": API_KEY,
        "url": "https://example.com",
    },
    timeout=60,
)
print(response.status_code)
print(response.text[:500])
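Passing the target through `params` lets `requests` percent-encode it. When the request URL is assembled by hand (for example for a different HTTP client), that encoding has to be done explicitly, otherwise the target's own query string would be read as ScraperAPI parameters. A minimal sketch using only the standard library; the helper name `scraperapi_url` is hypothetical, while the endpoint and parameter names are the ones used above:

```python
from urllib.parse import urlencode

SCRAPERAPI_ENDPOINT = "http://api.scraperapi.com/"


def scraperapi_url(api_key: str, target_url: str) -> str:
    """Return the API endpoint URL that fetches target_url (hypothetical helper)."""
    # urlencode percent-encodes ':', '/', '?', '&' and '=' inside target_url,
    # so the target's own query string is not mistaken for API parameters.
    query = urlencode({"api_key": api_key, "url": target_url})
    return f"{SCRAPERAPI_ENDPOINT}?{query}"
```

Calling `scraperapi_url(API_KEY, "https://example.com/search?q=shoes&page=2")` yields a URL whose `url` parameter carries the whole target, including `q` and `page`.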
Method 2: Proxy Port
Use ScraperAPI as a standard proxy, which works with any HTTP library:
import requests
API_KEY = "YOUR_SCRAPERAPI_KEY"
proxy = f"http://scraperapi:{API_KEY}@proxy-server.scraperapi.com:8001"
response = requests.get(
    "https://example.com",
    proxies={"http": proxy, "https": proxy},
    # Certificate verification is turned off for proxy mode;
    # requests will emit an InsecureRequestWarning for HTTPS targets.
    verify=False,
    timeout=60,
)
print(response.text[:500])
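If several parts of a project use the proxy port, building the `proxies` mapping in one place avoids repeating the credentials string. The sketch below is one way to do that; `scraperapi_proxies` is a hypothetical helper, and the host, port, and `scraperapi` username are taken from the example above:

```python
from urllib.parse import quote

PROXY_HOST = "proxy-server.scraperapi.com"
PROXY_PORT = 8001


def scraperapi_proxies(api_key: str) -> dict[str, str]:
    """Build a proxies mapping for the proxy port method (hypothetical helper)."""
    # Quote the key defensively so characters such as '@' or ':' cannot
    # break the user:password@host structure of the proxy URL.
    proxy = f"http://scraperapi:{quote(api_key, safe='')}@{PROXY_HOST}:{PROXY_PORT}"
    # Both schemes are routed through the same proxy URL.
    return {"http": proxy, "https": proxy}
```

The result can be passed as `proxies=scraperapi_proxies(API_KEY)` to `requests.get`, or applied once to a `requests.Session` with `session.proxies.update(...)`.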
Key Features
JavaScript Rendering
For sites that require a browser to render content:
response = requests.get(
    "http://api.scraperapi.com",
    params={
        "api_key": API_KEY,
        "url": "https://spa-website.com",
        "render": "true",
    },
    timeout=90,
)
Geographic Targeting
Get results as seen from a specific country:
response = requests.get(
    "http://api.scraperapi.com",
    params={
        "api_key": API_KEY,
        "url": "https://example.com",
        "country_code": "us",
    },
    timeout=60,
)
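Rendering and geographic targeting are both plain query parameters, so a small builder can add them only when needed. This is a sketch, not part of any ScraperAPI client: `build_params` is a hypothetical name, it uses only the parameter names shown above, and it assumes the two options can be combined in one request.

```python
from typing import Optional


def build_params(
    api_key: str,
    url: str,
    render: bool = False,
    country_code: Optional[str] = None,
) -> dict[str, str]:
    """Assemble the query parameters for one API request (hypothetical helper)."""
    params = {"api_key": api_key, "url": url}
    if render:
        # The API expects the string "true", as in the rendering example.
        params["render"] = "true"
    if country_code:
        # Country codes are written in lowercase in the example above.
        params["country_code"] = country_code.lower()
    return params
```

The returned dictionary can be passed directly as `params=` to `requests.get("http://api.scraperapi.com", ...)`.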
Session Stickiness
Maintain the same IP across multiple requests using session numbers:
# All requests with the same session number use the same IP
for page in range(1, 6):
    response = requests.get(
        "http://api.scraperapi.com",
        params={
            "api_key": API_KEY,
            "url": f"https://example.com/page/{page}",
            "session_number": "12345",
        },
        timeout=60,
    )
    print(f"Page {page}: {response.status_code}")
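A hard-coded session number means every run, and every crawl running in parallel, shares one session. One possible refinement is to pick a fresh number per crawl and reuse it for all pages of that crawl. The helpers below are hypothetical, and the numeric range is an arbitrary assumption rather than a documented limit:

```python
import random


def new_session_number() -> str:
    """Pick a random session number for one crawl (range is an assumption)."""
    return str(random.randint(1, 1_000_000))


def session_params(api_key: str, urls: list[str], session_number: str) -> list[dict[str, str]]:
    """Return one parameter dict per URL, all sharing the same session number."""
    return [
        {"api_key": api_key, "url": url, "session_number": session_number}
        for url in urls
    ]
```

Each dictionary in the returned list can then be sent with `requests.get("http://api.scraperapi.com", params=..., timeout=60)` in a loop like the one above.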
Using with Scrapy
# settings.py
SCRAPERAPI_KEY = "YOUR_SCRAPERAPI_KEY"
DOWNLOADER_MIDDLEWARES = {
    "scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware": 110,
}
# In your spider module
from urllib.parse import urlencode
import scrapy
# Inside your spider class
def start_requests(self):
    api_key = self.settings.get("SCRAPERAPI_KEY")
    target = "https://example.com"
    # Percent-encode the target so any query string of its own survives intact
    query = urlencode({"api_key": api_key, "url": target})
    url = f"http://api.scraperapi.com/?{query}"
    yield scrapy.Request(url=url, callback=self.parse)
Pricing Considerations
- Free tier: 5,000 credits/month
- JavaScript rendering costs 10 credits per request (vs 1 for standard)
- Residential proxies cost 25 credits per request
- Monitor your usage at the ScraperAPI dashboard
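To see how quickly the more expensive request types use up a plan, the credit costs listed above can be turned into a rough estimate. This is a sketch only: the figures are copied from the list and may not match current pricing, and the names `CREDITS_PER_REQUEST`, `credits_needed`, and `fits_free_tier` are hypothetical.

```python
# Credit costs per request, copied from the list above (may be out of date).
CREDITS_PER_REQUEST = {
    "standard": 1,
    "render": 10,
    "residential": 25,
}
FREE_TIER_CREDITS = 5_000


def credits_needed(request_counts: dict[str, int]) -> int:
    """Total credits for a planned mix of requests, keyed by request type."""
    return sum(CREDITS_PER_REQUEST[kind] * count for kind, count in request_counts.items())


def fits_free_tier(request_counts: dict[str, int]) -> bool:
    """True if the planned requests stay within the monthly free allowance."""
    return credits_needed(request_counts) <= FREE_TIER_CREDITS
```

For example, 2,000 standard requests, 200 rendered requests, and 40 residential requests come to 2,000 + 2,000 + 1,000 = 5,000 credits, which exactly uses up the free tier.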
ScraperAPI is an excellent choice when you want to focus on data extraction rather than anti-bot infrastructure.