Using ScrapingAnt with Python
Integrate ScrapingAnt into your Python scrapers for headless browser rendering, proxy rotation, and anti-bot bypass. Complete tutorial with examples.
ScrapingAnt is a web scraping API that uses real headless browsers to render pages, making it excellent for JavaScript-heavy sites. It handles proxy rotation, browser fingerprinting, and anti-bot bypass out of the box.
Getting Started
- Sign up at scrapingant.com to get your API token
- Install the official Python client:
pip install scrapingant-client
Method 1: Using the Official Client
from scrapingant_client import ScrapingAntClient
client = ScrapingAntClient(token="YOUR_SCRAPINGANT_TOKEN")
result = client.general_request("https://quotes.toscrape.com/")
print(f"Status: {result.status_code}")
print(f"Content length: {len(result.content)}")
# Parse with BeautifulSoup
from bs4 import BeautifulSoup
soup = BeautifulSoup(result.content, "html.parser")
for quote in soup.select("div.quote"):
    text = quote.select_one("span.text").get_text()
    author = quote.select_one("small.author").get_text()
    print(f"{author}: {text[:50]}...")
Method 2: Using the REST API Directly
You can also call ScrapingAnt's API directly with the requests library.
import requests
from bs4 import BeautifulSoup
API_TOKEN = "YOUR_SCRAPINGANT_TOKEN"
response = requests.get(
    "https://api.scrapingant.com/v2/general",
    params={
        "url": "https://quotes.toscrape.com/",
        "x-api-key": API_TOKEN,
    },
)
if response.status_code == 200:
    soup = BeautifulSoup(response.text, "html.parser")
    quotes = soup.select("div.quote")
    print(f"Found {len(quotes)} quotes")
else:
    print(f"Error: {response.status_code} - {response.text}")
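To make the query-string encoding in the request above explicit, here is a minimal standard-library sketch that builds the same URL by hand. The helper name `build_request_url` is hypothetical; the endpoint and the `url` and `x-api-key` parameters are the ones used in the example above.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.scrapingant.com/v2/general"


def build_request_url(target_url: str, api_token: str) -> str:
    """Return the GET URL for a ScrapingAnt request (hypothetical helper).

    The target URL must be percent-encoded so that its own '/', ':' and '?'
    characters are not mistaken for parts of the API URL.
    """
    query = urlencode({"url": target_url, "x-api-key": api_token})
    return f"{API_ENDPOINT}?{query}"


# Usage (no network call is made here, the URL is only constructed):
# build_request_url("https://quotes.toscrape.com/", "YOUR_SCRAPINGANT_TOKEN")
```

Passing `params=` to `requests.get` does this encoding for you, so a helper like this is mainly useful for logging or for debugging a request in a browser or with curl.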
JavaScript Rendering
ScrapingAnt renders JavaScript by default. For SPAs and dynamic sites, you get the fully rendered HTML.
from scrapingant_client import ScrapingAntClient
from bs4 import BeautifulSoup
client = ScrapingAntClient(token="YOUR_SCRAPINGANT_TOKEN")
# ScrapingAnt renders JS automatically, no extra config needed
result = client.general_request(
    "https://example.com/spa-page",
    browser=True,  # Use headless browser
)
soup = BeautifulSoup(result.content, "html.parser")
# Dynamic content is now available in the HTML
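Some SPAs inject content after the initial render, so it can help to wait for a specific element before the HTML is returned. The sketch below assumes the client's `general_request` accepts a `wait_for_selector` argument, as listed in the client's README; check it against the version you have installed. The wrapper name `fetch_rendered` is hypothetical.

```python
def fetch_rendered(client, url, selector):
    """Render `url` in the headless browser and wait for `selector`.

    `client` is expected to be a ScrapingAntClient instance. The
    `wait_for_selector` keyword is an assumption based on the client's
    documented parameters, not something shown elsewhere in this tutorial.
    """
    return client.general_request(
        url,
        browser=True,
        wait_for_selector=selector,
    )


# Usage:
# from scrapingant_client import ScrapingAntClient
# client = ScrapingAntClient(token="YOUR_SCRAPINGANT_TOKEN")
# result = fetch_rendered(client, "https://example.com/spa-page", "div.quote")
```

Taking the client as an argument keeps the wrapper easy to test with a stand-in object and avoids creating a new client on every call.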
Setting Custom Cookies and Headers
from scrapingant_client import ScrapingAntClient, Cookie
client = ScrapingAntClient(token="YOUR_SCRAPINGANT_TOKEN")
result = client.general_request(
    "https://example.com/localized-page",
    # The client expects Cookie objects rather than plain dicts
    cookies=[
        Cookie(name="language", value="en"),
        Cookie(name="region", value="US"),
    ],
    headers={
        "Accept-Language": "en-US,en;q=0.9",
    },
)
print(result.content[:500])
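If you call the REST API directly instead of using the client, cookies have to be serialized yourself. The sketch below assumes the endpoint takes a single `cookies` query parameter in `name=value;name=value` form; treat both the parameter name and the format as assumptions to verify against the current API documentation. The helper name `cookies_to_param` is hypothetical.

```python
def cookies_to_param(cookies: dict) -> str:
    """Join a {name: value} mapping into 'name=value;name=value'.

    Assumed format for the REST API's `cookies` parameter. Values are not
    escaped here, so they must not contain ';' or '='.
    """
    return ";".join(f"{name}={value}" for name, value in cookies.items())


# Usage with the plain-requests approach from Method 2:
# params = {
#     "url": "https://example.com/localized-page",
#     "x-api-key": API_TOKEN,
#     "cookies": cookies_to_param({"language": "en", "region": "US"}),
# }
```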
Scraping Multiple Pages
from scrapingant_client import ScrapingAntClient
from bs4 import BeautifulSoup
import time
client = ScrapingAntClient(token="YOUR_SCRAPINGANT_TOKEN")
all_quotes = []
for page in range(1, 6):
    url = f"https://quotes.toscrape.com/page/{page}/"
    try:
        result = client.general_request(url)
        soup = BeautifulSoup(result.content, "html.parser")
        for quote in soup.select("div.quote"):
            all_quotes.append({
                "text": quote.select_one("span.text").get_text(),
                "author": quote.select_one("small.author").get_text(),
            })
        print(f"Page {page}: scraped {len(soup.select('div.quote'))} quotes")
    except Exception as e:
        print(f"Page {page} failed: {e}")
    time.sleep(1)
print(f"Total: {len(all_quotes)} quotes")
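The loop above logs a failed page and moves on. If you would rather retry transient failures, one option is a small backoff wrapper around the fetch call. This is a generic sketch, not part of the ScrapingAnt client; `fetch_with_retry` and its parameters are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[str], T],
    url: str,
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(url), retrying on any exception with exponential backoff.

    Waits base_delay, then 2x, then 4x, ... between attempts. The last
    failure is re-raised so the caller can decide how to handle it.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(retries):
        try:
            return fetch(url)
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


# Usage inside the page loop:
# result = fetch_with_retry(client.general_request, url)
```

Catching every `Exception` keeps the sketch short; in a real scraper you would likely narrow this to the errors that are worth retrying, since each retry may consume API credits.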
ScrapingAnt Features
| Feature | Description |
|---|---|
| JS rendering | Real headless Chrome browser |
| Proxy rotation | Automatic IP rotation across requests |
| Anti-bot bypass | Handles CAPTCHAs and bot detection |
| Custom cookies | Set cookies for localization and sessions |
| Screenshot API | Capture page screenshots |
| Geo-targeting | Route requests through specific countries |
Tips
- ScrapingAnt renders JavaScript by default, which is ideal for modern SPAs.
- Use the official scrapingant-client package for cleaner code and automatic error handling.
- The free tier gives you 10,000 API credits per month, enough for testing and small projects.
- For static sites that do not need JavaScript rendering, pass browser=False to skip the headless browser, which saves credits and speeds up requests.
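When you are unsure whether a page needs rendering, one approach is to try a browserless request first and fall back to the headless browser only if the expected content is missing. The sketch below assumes `result.content` is a string of HTML, as in the examples above; `fetch_cheapest` and `marker` are hypothetical names.

```python
def fetch_cheapest(client, url, marker):
    """Try a browserless request first; fall back to the headless browser.

    `marker` is a substring expected in the final HTML, for example
    'class="quote"'. If the browserless response already contains it, the
    page did not need JavaScript rendering and that response is returned.
    """
    result = client.general_request(url, browser=False)
    if marker in result.content:
        return result
    # Marker missing: the content is probably injected by JavaScript.
    return client.general_request(url, browser=True)


# Usage:
# result = fetch_cheapest(client, "https://quotes.toscrape.com/", 'class="quote"')
```

The fallback path costs two requests, so this pays off mainly when most of the pages you scrape turn out to be static.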
Next Steps
- Learn lxml and XPath for high-performance HTML parsing
- Build a complete price monitoring scraper using ScrapingAnt