How to Scrape Amazon Product Data with Python

A step-by-step guide to scraping Amazon product data including prices, reviews, and ratings using Python and ScraperAPI.

Amazon is one of the most commonly scraped websites for product data, pricing intelligence, and review analysis. However, it is also one of the most heavily protected. This guide shows you how to scrape Amazon reliably.

Why Scraping Amazon Is Hard

Amazon employs aggressive anti-bot measures including:

CAPTCHA challenges on repeated requests
IP-based rate limiting and blocking
Dynamic HTML that changes frequently
Bot detection based on browser fingerprinting
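
Because of these measures, an HTTP 200 response does not always contain a product page. The following is a minimal sketch of a check you might run before parsing. The function name `looks_blocked`, the marker strings, and the status codes are illustrative assumptions, not an official list, so adjust them to the block pages you actually receive.

```python
# Hypothetical marker strings: phrases that may appear on CAPTCHA or block pages.
# They are assumptions for illustration; verify them against real responses.
BLOCK_MARKERS = (
    "enter the characters you see below",
    "to discuss automated access to amazon data",
    "/errors/validatecaptcha",
)

# Status codes treated here as a likely block or rate limit (also an assumption).
BLOCK_STATUS_CODES = (403, 429, 503)


def looks_blocked(status_code: int, html: str) -> bool:
    """Return True when a response resembles a CAPTCHA or block page."""
    if status_code in BLOCK_STATUS_CODES:
        return True
    lowered = html.lower()
    return any(marker in lowered for marker in BLOCK_MARKERS)


print(looks_blocked(200, "<p>Enter the characters you see below</p>"))  # True
print(looks_blocked(200, "<span id='productTitle'>Example</span>"))     # False
```

A check like this lets a scraper retry or back off instead of silently recording empty fields.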

The Recommended Approach: ScraperAPI

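ScraperAPI has a dedicated Amazon structured data endpoint that handles proxy rotation and anti-bot measures for you and returns parsed JSON. For standard product fields this is usually the most reliable method, because you do not have to maintain HTML selectors yourself.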

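import requests

API_KEY = "YOUR_SCRAPERAPI_KEY"

# Method 1: Structured data endpoint (recommended)
response = requests.get(
    "https://api.scraperapi.com/structured/amazon/product",
    params={
        "api_key": API_KEY,
        "asin": "B09V3KXJPB",
        "country": "us",
    },
    timeout=70,  # proxied requests can be slow; adjust as needed
)
response.raise_for_status()  # fail on HTTP errors instead of parsing an error body

product = response.json()
# .get() prints None for a missing field rather than raising KeyError
print(f"Name: {product.get('name')}")
print(f"Price: {product.get('pricing')}")
print(f"Rating: {product.get('rating')}")
print(f"Reviews: {product.get('total_reviews')}")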
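
Proxied requests can fail intermittently, so it may help to wrap the call in a retry loop. The sketch below is one possible approach, not part of ScraperAPI's client. `with_retries` and its parameters are hypothetical names introduced here, and the exponential backoff schedule is an assumption.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    fetch: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ... by default
    raise ValueError("attempts must be at least 1")


# Demo with a stand-in fetch that fails once, then succeeds (no real waiting).
state = {"calls": 0}

def flaky_fetch() -> dict:
    state["calls"] += 1
    if state["calls"] < 2:
        raise RuntimeError("temporary failure")
    return {"name": "Example product"}

print(with_retries(flaky_fetch, sleep=lambda seconds: None))
```

In practice, `fetch` would be a small function that performs the `requests.get(...)` call shown above, calls `raise_for_status()`, and returns `response.json()`.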

Method 2: Raw HTML Scraping

If you need fields the structured endpoint does not return, you can fetch the raw HTML through a proxy service such as ScraperAPI or ScrapingAnt and parse it yourself. This example uses ScraperAPI:

import requests
from bs4 import BeautifulSoup

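# Using ScraperAPI for proxy and anti-bot handling
response = requests.get("https://api.scraperapi.com", params={
    "api_key": "YOUR_SCRAPERAPI_KEY",
    "url": "https://www.amazon.com/dp/B09V3KXJPB",
    "render": "true"  # ask the service to render JavaScript before returning HTML
}, timeout=70)  # proxied requests can be slow; adjust as needed
response.raise_for_status()  # fail on HTTP errors instead of parsing an error page

soup = BeautifulSoup(response.text, "html.parser")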

title = soup.select_one("#productTitle")
price = soup.select_one(".a-price .a-offscreen")
rating = soup.select_one("#acrPopover")

if title:
    print(f"Title: {title.text.strip()}")
if price:
    print(f"Price: {price.text.strip()}")
if rating:
    print(f"Rating: {rating.get('title', 'N/A')}")
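
The scraped price and rating arrive as display strings, which are awkward to sort or compare. Here is a small sketch of converting them to numbers. The helper names `parse_price` and `parse_rating` are hypothetical, and the code assumes US-style formatting such as "$1,299.99" and "4.5 out of 5 stars"; other locales would need different handling.

```python
import re
from decimal import Decimal
from typing import Optional


def parse_price(text: str) -> Optional[Decimal]:
    """Extract the first number from a price string such as '$1,299.99'."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    # Drop thousands separators before converting (US-style format assumed).
    return Decimal(match.group().replace(",", ""))


def parse_rating(text: str) -> Optional[float]:
    """Extract the score from text such as '4.5 out of 5 stars'."""
    match = re.search(r"(\d+(?:\.\d+)?)\s+out of\s+5", text)
    return float(match.group(1)) if match else None


print(parse_price("$1,299.99"))            # 1299.99
print(parse_rating("4.5 out of 5 stars"))  # 4.5
print(parse_price("N/A"))                  # None
```

`Decimal` is used for prices to avoid binary floating-point rounding when values are later summed or compared.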

Scraping Multiple Products

For scraping product listings from search results or category pages:

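import requests
from bs4 import BeautifulSoup
from urllib.parse import quote_plus

API_KEY = "YOUR_SCRAPERAPI_KEY"

def scrape_amazon_search(keyword, pages=3):
    """Collect the name and price of each result on the first `pages` search pages."""
    all_products = []

    for page in range(1, pages + 1):
        # URL-encode the keyword so spaces and symbols survive in the query string
        url = f"https://www.amazon.com/s?k={quote_plus(keyword)}&page={page}"
        response = requests.get("https://api.scraperapi.com", params={
            "api_key": API_KEY,
            "url": url,
            "render": "true"
        }, timeout=70)  # proxied requests can be slow; adjust as needed

        if not response.ok:
            # Skip a failed page rather than parse an error body
            print(f"Page {page} failed with status {response.status_code}")
            continue

        soup = BeautifulSoup(response.text, "html.parser")
        results = soup.select('[data-component-type="s-search-result"]')

        for result in results:
            # These selectors follow Amazon's markup and may need updating when it changes
            name_el = result.select_one("h2 a span")
            price_el = result.select_one(".a-price .a-offscreen")

            if name_el:
                all_products.append({
                    "name": name_el.text.strip(),
                    "price": price_el.text.strip() if price_el else "N/A"
                })

    return all_products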

products = scrape_amazon_search("wireless headphones")
for p in products:
    print(f"{p['name']} - {p['price']}")
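
Printing is fine for a quick check, but you will usually want to keep the results. The sketch below writes the list of `name`/`price` dictionaries to CSV with the standard library. The function name `write_products_csv` is a hypothetical helper, and the sample rows are made-up placeholders rather than real scraped data.

```python
import csv
import io
from typing import Iterable, Mapping, TextIO


def write_products_csv(products: Iterable[Mapping[str, str]], out: TextIO) -> int:
    """Write products as CSV rows with a header; return the number of rows written."""
    writer = csv.DictWriter(out, fieldnames=["name", "price"], extrasaction="ignore")
    writer.writeheader()
    count = 0
    for product in products:
        writer.writerow(product)  # values containing commas are quoted automatically
        count += 1
    return count


# Placeholder rows in the same shape scrape_amazon_search() returns.
sample = [
    {"name": "Example Headphones, Black", "price": "$59.99"},
    {"name": "Example Earbuds", "price": "N/A"},
]
buffer = io.StringIO()
print(write_products_csv(sample, buffer))
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open("products.csv", "w", newline="", encoding="utf-8")` in place of the `StringIO` buffer.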

Legal Considerations

Amazon's Terms of Service restrict automated data collection. Use scraped data responsibly, respect rate limits, and consider whether Amazon's official Product Advertising API meets your needs before scraping.
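
One simple way to respect rate limits is to enforce a minimum gap between requests. The following is a minimal sketch under that assumption. The `Throttle` class is a hypothetical helper, and the two-second interval is an arbitrary example value, not a limit published by Amazon or ScraperAPI.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum interval between consecutive calls to wait()."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous call, if any

    def wait(self) -> None:
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Usage sketch: call wait() before each request inside a scraping loop.
throttle = Throttle(min_interval=2.0)  # arbitrary example interval
throttle.wait()  # the first call returns immediately
```

Calling `throttle.wait()` at the top of each loop iteration in a function like `scrape_amazon_search` spaces requests out without changing the rest of the code.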

Verdict

For Amazon scraping, ScraperAPI's structured data endpoint is the easiest and most reliable option. ScrapingAnt is a solid alternative for raw HTML scraping with headless Chrome rendering. Avoid scraping Amazon without proxy protection, because you will be blocked quickly.