How to Scrape Amazon Product Data with Python
A step-by-step guide to scraping Amazon product data, including prices, reviews, and ratings, using Python and ScraperAPI.
Amazon is one of the most commonly scraped websites for product data, pricing intelligence, and review analysis. However, it is also one of the most heavily protected. This guide shows you how to scrape Amazon reliably.
Why Scraping Amazon Is Hard
Amazon employs aggressive anti-bot measures including:
- CAPTCHA challenges on repeated requests
- IP-based rate limiting and blocking
- Dynamic HTML that changes frequently
- Bot detection based on browser fingerprinting
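A scraper benefits from recognizing these defenses when they trigger, so that a block page is not parsed as if it were a product page. The following is a minimal sketch: the function name `looks_blocked`, the status codes, and the marker strings are assumptions based on commonly observed block pages, not a documented Amazon contract.

```python
# Marker strings are assumptions drawn from commonly observed block pages;
# adjust them to whatever your own responses actually contain.
BLOCK_MARKERS = (
    "enter the characters you see below",
    "api-services-support@amazon.com",
    "to discuss automated access to amazon data",
)


def looks_blocked(status_code: int, html: str) -> bool:
    """Return True if a response resembles a CAPTCHA or rate-limit page."""
    # Status codes often associated with blocking or throttling (assumption).
    if status_code in (403, 429, 503):
        return True
    lowered = html.lower()
    return any(marker in lowered for marker in BLOCK_MARKERS)
```

Calling `looks_blocked(response.status_code, response.text)` before parsing lets you retry or back off instead of silently collecting empty results.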
The Recommended Approach: ScraperAPI
ScraperAPI provides a dedicated Amazon structured data endpoint that handles these anti-bot measures for you and returns clean, parsed JSON. For standard product fields, it is the most reliable method.
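import requests

API_KEY = "YOUR_SCRAPERAPI_KEY"

# Method 1: Structured data endpoint (recommended)
response = requests.get(
    "https://api.scraperapi.com/structured/amazon/product",
    params={
        "api_key": API_KEY,
        "asin": "B09V3KXJPB",
        "country": "us",
    },
    timeout=70,  # Proxied requests can be slow; avoid hanging indefinitely
)
response.raise_for_status()  # Stop on an HTTP error instead of parsing an error body
product = response.json()

# .get() avoids a KeyError if a field is missing from the response
print(f"Name: {product.get('name', 'N/A')}")
print(f"Price: {product.get('pricing', 'N/A')}")
print(f"Rating: {product.get('rating', 'N/A')}")
print(f"Reviews: {product.get('total_reviews', 'N/A')}")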
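Even through a proxy API, individual requests can fail transiently, so it may help to retry with exponential backoff. The sketch below is one way to do that. The helper name `get_with_retries` and its parameters are hypothetical, and `fetch` stands for any zero-argument callable that returns an object with a `status_code` attribute.

```python
import time


def get_with_retries(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns status 200 or the attempts run out.

    fetch is any zero-argument callable returning an object with a
    .status_code attribute, e.g. lambda: requests.get(url, params=params).
    The last response is returned even if it was not successful.
    """
    last = None
    for attempt in range(attempts):
        last = fetch()
        if last.status_code == 200:
            return last
        if attempt < attempts - 1:
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    return last
```

The `sleep` parameter is injectable only so the backoff schedule can be checked without waiting; in normal use the default `time.sleep` applies.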
Method 2: Raw HTML Scraping
If you need custom data extraction, you can scrape the raw HTML through ScraperAPI or ScrapingAnt:
import requests
from bs4 import BeautifulSoup

# Using ScraperAPI for proxy and anti-bot handling
response = requests.get(
    "https://api.scraperapi.com",
    params={
        "api_key": "YOUR_SCRAPERAPI_KEY",
        "url": "https://www.amazon.com/dp/B09V3KXJPB",
        "render": "true",
    },
    timeout=70,  # Rendered requests can be slow; avoid hanging indefinitely
)
response.raise_for_status()
soup = BeautifulSoup(response.text, "html.parser")

# These selectors depend on Amazon's current markup and may need updating
title = soup.select_one("#productTitle")
price = soup.select_one(".a-price .a-offscreen")
rating = soup.select_one("#acrPopover")

# Any element may be missing, so check before reading it
if title:
    print(f"Title: {title.text.strip()}")
if price:
    print(f"Price: {price.text.strip()}")
if rating:
    print(f"Rating: {rating.get('title', 'N/A')}")
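The values extracted from raw HTML are display strings, so you will usually want to convert them before comparing or storing them. The helpers below are a small sketch of that step. The names `parse_price` and `parse_rating` are hypothetical, and the input formats (a US-style price such as `$1,299.99` and a rating such as `4.5 out of 5 stars`) are assumptions about what the page returns.

```python
import re
from decimal import Decimal
from typing import Optional


def parse_price(text: str) -> Optional[Decimal]:
    """Convert a US-formatted price such as '$1,299.99' to a Decimal."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if not match:
        return None
    # Drop thousands separators before converting.
    return Decimal(match.group().replace(",", ""))


def parse_rating(text: str) -> Optional[float]:
    """Extract the numeric score from a string like '4.5 out of 5 stars'."""
    match = re.search(r"(\d+(?:\.\d+)?)\s+out of\s+5", text)
    return float(match.group(1)) if match else None
```

`Decimal` is used for prices to avoid binary floating-point rounding. Both helpers return `None` when the text does not match, which keeps a missing or reformatted field from raising an exception.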
Scraping Multiple Products
For scraping product listings from search results or category pages:
import requests
from urllib.parse import quote_plus
from bs4 import BeautifulSoup

API_KEY = "YOUR_SCRAPERAPI_KEY"

def scrape_amazon_search(keyword, pages=3):
    """Collect product names and prices from the first `pages` result pages."""
    all_products = []
    for page in range(1, pages + 1):
        # URL-encode the keyword so spaces and symbols survive in the query string
        url = f"https://www.amazon.com/s?k={quote_plus(keyword)}&page={page}"
        response = requests.get(
            "https://api.scraperapi.com",
            params={"api_key": API_KEY, "url": url, "render": "true"},
            timeout=70,
        )
        if response.status_code != 200:
            continue  # Skip failed pages rather than parsing an error body
        soup = BeautifulSoup(response.text, "html.parser")
        results = soup.select('[data-component-type="s-search-result"]')
        for result in results:
            name_el = result.select_one("h2 a span")
            price_el = result.select_one(".a-price .a-offscreen")
            if name_el:
                all_products.append({
                    "name": name_el.text.strip(),
                    "price": price_el.text.strip() if price_el else "N/A",
                })
    return all_products

products = scrape_amazon_search("wireless headphones")
for p in products:
    print(f"{p['name']} - {p['price']}")
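Once the listings are collected, you will likely want to store them rather than print them. Here is a minimal sketch that serializes the `name` and `price` dictionaries to CSV text. The function name `products_to_csv` is hypothetical, and dropping repeated rows assumes that the same listing may appear on more than one results page.

```python
import csv
import io


def products_to_csv(products):
    """Serialize a list of {'name', 'price'} dicts to CSV text, skipping repeats."""
    seen = set()
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "price"], lineterminator="\n")
    writer.writeheader()
    for product in products:
        key = (product["name"], product["price"])
        if key in seen:
            continue  # Same name and price already written
        seen.add(key)
        writer.writerow({"name": product["name"], "price": product["price"]})
    return buffer.getvalue()
```

Writing the returned string to a file opened with `newline=""` and an explicit encoding such as `utf-8` keeps the line endings and any non-ASCII product names intact.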
Legal Considerations
Amazon's Terms of Service restrict automated data collection. Use scraped data responsibly, respect rate limits, and consider whether Amazon's official Product Advertising API meets your needs before scraping.
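One practical way to respect rate limits is to enforce a minimum gap between consecutive requests. The class below is a sketch under that assumption. The name `Throttle` and the interval you choose are illustrative; neither Amazon nor ScraperAPI prescribes a specific value here.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # Time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Called too soon: pause for the rest of the interval.
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Create one instance, for example `throttle = Throttle(2.0)`, and call `throttle.wait()` immediately before each request in a loop. The `clock` and `sleep` parameters exist only so the timing can be verified without real delays.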
Verdict
ScraperAPI's structured data endpoint is the easiest and most reliable way to scrape Amazon. ScrapingAnt is a solid alternative for raw HTML scraping with headless Chrome rendering. Avoid scraping Amazon without proxy protection; you will be blocked quickly.