Using ScraperAPI to Bypass Anti-Bot Protection

Complete guide to using ScraperAPI for web scraping with automatic proxy rotation, CAPTCHA solving, and anti-bot bypass.

ScraperAPI is a web scraping API that handles proxy rotation, CAPTCHA solving, and anti-bot bypass automatically. Instead of managing your own proxy infrastructure, you send each request through its API and get back the HTML of the target page.

Getting Started

pip install requests

Method 1: API Endpoint

The simplest way to use ScraperAPI is to pass the target URL as a query parameter:

import requests

API_KEY = "YOUR_SCRAPERAPI_KEY"

response = requests.get(
    "http://api.scraperapi.com",
    params={
        "api_key": API_KEY,
        "url": "https://example.com",
    },
    timeout=60,
)

print(response.status_code)
print(response.text[:500])
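In the example above, `requests` percent-encodes the target URL because it is passed through `params`. If you assemble the request URL yourself, for instance for a different HTTP client, the target has to be encoded by hand or its own query string will be mixed into ScraperAPI's. A minimal sketch, where `build_api_url` is a hypothetical helper name and not part of any ScraperAPI SDK:

```python
from urllib.parse import urlencode

API_ENDPOINT = "http://api.scraperapi.com"  # endpoint used in the example above


def build_api_url(api_key: str, target_url: str) -> str:
    """Return the full request URL with the target percent-encoded."""
    query = urlencode({"api_key": api_key, "url": target_url})
    return f"{API_ENDPOINT}?{query}"


# The target has its own query string; "&page=2" must stay part of the
# "url" value instead of becoming a separate ScraperAPI parameter.
print(build_api_url("YOUR_SCRAPERAPI_KEY", "https://example.com/search?q=shoes&page=2"))
```

The printed URL contains `url=https%3A%2F%2Fexample.com%2Fsearch%3Fq%3Dshoes%26page%3D2`, so the whole target, including `page=2`, travels as one parameter.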

Method 2: Proxy Port

Use ScraperAPI as a standard proxy, which works with any HTTP library:

import requests

API_KEY = "YOUR_SCRAPERAPI_KEY"
proxy = f"http://scraperapi:{API_KEY}@proxy-server.scraperapi.com:8001"

response = requests.get(
    "https://example.com",
    proxies={"http": proxy, "https": proxy},
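    # Proxy mode needs certificate verification turned off;
    # requests will emit an InsecureRequestWarning for each call.
    verify=False,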
    timeout=60,
)
print(response.text[:500])
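Because the proxy URL embeds the API key, it can help to build it in one place and keep the key out of source control. The sketch below assumes the key lives in an environment variable named `SCRAPERAPI_KEY`; that name and both helper functions are illustrative choices, not something ScraperAPI prescribes:

```python
import os


def load_api_key(env_var: str = "SCRAPERAPI_KEY") -> str:
    """Read the API key from the environment (variable name is an assumption)."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


def scraperapi_proxies(api_key: str) -> dict:
    """Build the `proxies` mapping for the proxy port shown above."""
    proxy = f"http://scraperapi:{api_key}@proxy-server.scraperapi.com:8001"
    # The same proxy URL handles both plain HTTP and HTTPS targets.
    return {"http": proxy, "https": proxy}
```

The result can be passed straight to `requests.get(..., proxies=scraperapi_proxies(load_api_key()), verify=False, timeout=60)`.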

Key Features

JavaScript Rendering

For sites that require a browser to render content:

response = requests.get(
    "http://api.scraperapi.com",
    params={
        "api_key": API_KEY,
        "url": "https://spa-website.com",
        "render": "true",
    },
    timeout=90,
)

Geographic Targeting

Get results as seen from a specific country:

response = requests.get(
    "http://api.scraperapi.com",
    params={
        "api_key": API_KEY,
        "url": "https://example.com",
        "country_code": "us",
    },
    timeout=60,
)
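The `render` and `country_code` options are ordinary query parameters, so they can be combined in one request. One way to keep that tidy is a small builder for the `params` dict. This is a sketch: `build_params` is a hypothetical helper, and the two-letter check is an assumption based on the `"us"` example above.

```python
from typing import Optional


def build_params(api_key: str, url: str, render: bool = False,
                 country_code: Optional[str] = None) -> dict:
    """Assemble the query parameters for one ScraperAPI request."""
    params = {"api_key": api_key, "url": url}
    if render:
        # The API expects the string "true", as in the rendering example.
        params["render"] = "true"
    if country_code is not None:
        code = country_code.strip().lower()
        # Assumed format: two letters, like "us".
        if len(code) != 2 or not code.isalpha():
            raise ValueError(f"Unexpected country code: {country_code!r}")
        params["country_code"] = code
    return params


print(build_params("YOUR_SCRAPERAPI_KEY", "https://example.com",
                   render=True, country_code="US"))
```

The returned dict goes into `requests.get("http://api.scraperapi.com", params=..., timeout=90)`; a longer timeout is reasonable whenever rendering is enabled.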

Session Stickiness

Maintain the same IP across multiple requests using session numbers:

# All requests with the same session number use the same IP
for page in range(1, 6):
    response = requests.get(
        "http://api.scraperapi.com",
        params={
            "api_key": API_KEY,
            "url": f"https://example.com/page/{page}",
            "session_number": "12345",
        },
        timeout=60,
    )
    print(f"Page {page}: {response.status_code}")
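A hard-coded session number means every run of the script shares one session. If each crawl should get its own sticky IP, one option is to draw a fresh number per crawl and reuse it for all pages of that crawl. The helper names and the numeric range below are assumptions made for illustration:

```python
import random


def new_session_number() -> str:
    """Pick a session number for one crawl (the range is an arbitrary choice)."""
    return str(random.randint(1, 1_000_000))


def session_params(api_key: str, urls: list, session_number: str) -> list:
    """Build one params dict per URL, all sharing the same session number."""
    return [
        {"api_key": api_key, "url": url, "session_number": session_number}
        for url in urls
    ]


pages = [f"https://example.com/page/{n}" for n in range(1, 6)]
batch = session_params("YOUR_SCRAPERAPI_KEY", pages, new_session_number())
# Every request in the batch carries the same session number.
print({p["session_number"] for p in batch})
```

Each dict in `batch` can then be sent with `requests.get("http://api.scraperapi.com", params=p, timeout=60)`.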

Using with Scrapy

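# settings.py
SCRAPERAPI_KEY = "YOUR_SCRAPERAPI_KEY"

# HttpProxyMiddleware is enabled by default in Scrapy; it only matters
# if you route requests through the proxy port instead of the API endpoint.
DOWNLOADER_MIDDLEWARES = {
    "scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware": 110,
}

# In your spider module
from urllib.parse import urlencode

import scrapy

# In your spider class
def start_requests(self):
    api_key = self.settings.get("SCRAPERAPI_KEY")
    target = "https://example.com"
    # Percent-encode the target so its own query string survives intact
    query = urlencode({"api_key": api_key, "url": target})
    yield scrapy.Request(url=f"http://api.scraperapi.com?{query}", callback=self.parse)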

Pricing Considerations

Free tier: 5,000 credits/month
JavaScript rendering costs 10 credits per request (vs 1 for standard)
Residential proxies cost 25 credits per request
Monitor your usage at the ScraperAPI dashboard
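To see how these costs add up, the arithmetic can be sketched in a few lines. The figures are copied from the list above and may not match current pricing; the dictionary keys and function name are illustrative only.

```python
# Credit costs per request, taken from the list above.
CREDITS_PER_REQUEST = {"standard": 1, "render": 10, "residential": 25}
FREE_TIER_CREDITS = 5_000


def estimate_credits(request_counts: dict) -> int:
    """Sum the credits needed for a mix of request types."""
    return sum(CREDITS_PER_REQUEST[kind] * count
               for kind, count in request_counts.items())


plan = {"standard": 2_000, "render": 250}
total = estimate_credits(plan)
print(total)                       # 2000 * 1 + 250 * 10 = 4500
print(total <= FREE_TIER_CREDITS)  # True: this plan fits in the free tier
```

By the same numbers, the free tier covers at most 500 rendered requests or 200 residential requests in a month.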

ScraperAPI is an excellent choice when you want to focus on data extraction rather than anti-bot infrastructure.