Complete Guide to E-Commerce Scraping

Everything you need to know about scraping e-commerce websites. Covers Amazon, Shopify, eBay, and other platforms with practical techniques.

E-commerce scraping is one of the most common commercial uses of web scraping. This guide covers techniques for extracting product data from most online stores.

What E-Commerce Data to Scrape

| Data Point          | Business Value       |
|---------------------|----------------------|
| Product names       | Catalog comparison   |
| Prices              | Competitive pricing  |
| Reviews and ratings | Quality analysis     |
| Product images      | Visual comparison    |
| Availability        | Inventory monitoring |
| Descriptions        | Content analysis     |
| Categories          | Taxonomy mapping     |
| Seller info         | Marketplace analysis |
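
One way to keep these data points consistent across stores is to normalize every scraped product into a single record type. The sketch below is only an illustration: the class name `ProductRecord` and its field names are assumptions made for this example, not a standard schema.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class ProductRecord:
    """One scraped product. Field names are illustrative, not a standard."""
    name: str
    url: str
    price: Optional[float] = None
    currency: Optional[str] = None
    rating: Optional[float] = None
    review_count: Optional[int] = None
    in_stock: Optional[bool] = None
    description: str = ""
    categories: list[str] = field(default_factory=list)
    image_urls: list[str] = field(default_factory=list)
    seller: Optional[str] = None


# Fields the scraper could not find simply stay at their defaults
record = ProductRecord(
    name="Example Widget",
    url="https://store.example/widget",
    price=19.99,
    currency="USD",
)
row = asdict(record)  # plain dict, ready for JSON or CSV output
```

Optional fields default to `None` rather than an empty string or zero, so a missing price can be told apart from a price of 0.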

Platform-Specific Approaches

Amazon

Amazon is among the hardest e-commerce sites to scrape. The simplest route is ScraperAPI with its built-in Amazon parser, which returns structured JSON instead of raw HTML.

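import requests

API_KEY = "YOUR_SCRAPERAPI_KEY"
asin = "B0XXXXXXXX"  # Placeholder; real ASINs are 10 characters long
url = f"https://www.amazon.com/dp/{asin}"

# Pass parameters via `params` so the target URL is URL-encoded correctly
resp = requests.get(
    "https://api.scraperapi.com/",
    params={"api_key": API_KEY, "url": url, "autoparse": "true"},
    timeout=60,
)
resp.raise_for_status()  # Fail loudly on blocked or failed requests
product = resp.json()  # Structured product data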

Shopify Stores

Shopify stores are the easiest targets: use the built-in JSON API, which serves the catalog at /products.json.

import requests
resp = requests.get("https://store.com/products.json?limit=250", timeout=30)
resp.raise_for_status()
products = resp.json()["products"]  # Up to 250 products per request
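
A single request returns at most one page of products, so larger catalogs need pagination. The sketch below assumes the store accepts a `page` query parameter on `/products.json` and returns an empty `products` list once the pages run out; that is common but worth verifying per store. The helper names `fetch_page` and `iter_products` are made up for this example, and it uses only the standard library so it runs without extra dependencies.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_page(store_url, page, limit=250):
    """Fetch one page of a store's public products.json (network call)."""
    query = urlencode({"limit": limit, "page": page})
    endpoint = f"{store_url.rstrip('/')}/products.json?{query}"
    with urlopen(endpoint, timeout=30) as resp:
        return json.load(resp).get("products", [])


def iter_products(store_url, fetch=fetch_page, max_pages=100):
    """Yield products page by page until an empty page comes back."""
    for page in range(1, max_pages + 1):
        products = fetch(store_url, page)
        if not products:
            break
        yield from products
```

Usage would look like `for product in iter_products("https://store.com"): ...`. The `max_pages` cap is a safety net against stores that never return an empty page, and the `fetch` argument lets you swap in a proxy-backed or rate-limited fetcher.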

eBay

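eBay data is accessible, but search results are paginated, so a scraper has to walk through the result pages. Use completed listings for pricing intelligence: they show what items actually sold for rather than what sellers are asking.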
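
Once sold prices from completed listings have been collected, a small summary is usually more useful than the raw list. The following is a minimal sketch that assumes the prices are already parsed into floats in one currency; the function name `summarize_sold_prices` is hypothetical.

```python
from statistics import median


def summarize_sold_prices(prices):
    """Summarize sold prices, ignoring missing or non-positive values."""
    clean = sorted(p for p in prices if p is not None and p > 0)
    if not clean:
        return None
    return {
        "count": len(clean),
        "low": clean[0],
        "median": median(clean),
        "high": clean[-1],
    }


summary = summarize_sold_prices([10.0, 30.0, None, 20.0])
```

The median is used instead of the mean because a few unusually high or low sales would otherwise skew the estimate.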

Walmart

Walmart requires JavaScript rendering and anti-bot bypass. ScraperAPI can handle both.

Common E-Commerce Scraping Challenges

  1. Anti-bot protection: Most major platforms actively block scrapers
  2. Dynamic pricing: Prices change based on location, time, and user
  3. Infinite scroll: Product listings load as you scroll
  4. A/B testing: Different users see different layouts
  5. Rate limiting: Aggressive request limits
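
For the rate-limiting item in particular, a common mitigation is to retry with exponential backoff instead of hammering the site. The sketch below is one possible shape for that. `RetryableError` and `fetch_with_backoff` are names invented for this example, and the `fetch` callable is assumed to raise `RetryableError` when it sees a response worth retrying, such as HTTP 429 or 503.

```python
import random
import time


class RetryableError(Exception):
    """Raised by the fetch callable for responses worth retrying."""


def fetch_with_backoff(fetch, url, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying with exponential backoff on RetryableError."""
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller
            # Delay doubles each attempt (base, 2x, 4x, ...) plus random
            # jitter of up to base_delay so parallel workers do not sync up
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. In production you would leave it at the default `time.sleep`.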

Building a Price Monitor

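import json
import re
from datetime import datetime, timezone

import requests

API_KEY = "YOUR_SCRAPERAPI_KEY"

def extract_price(html):
    # Placeholder extraction: takes the first "price" value found in embedded
    # JSON (for example JSON-LD). Replace with a selector for your target site.
    match = re.search(r'"price"\s*:\s*"?(\d+(?:\.\d+)?)', html)
    return float(match.group(1)) if match else None

def track_price(product_url):
    # Pass parameters via `params` so the product URL is URL-encoded correctly
    resp = requests.get(
        "https://api.scraperapi.com/",
        params={"api_key": API_KEY, "url": product_url, "render": "true"},
        timeout=60,
    )
    resp.raise_for_status()

    # Store the extracted price with a UTC timestamp
    record = {
        "url": product_url,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "price": extract_price(resp.text),
    }

    # Append one JSON object per line so the history can be read incrementally
    with open("price_history.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")
    return record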
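
A price history is only useful once something reads it back. The sketch below loads `price_history.jsonl` and reports each point where a product's price differs from its previous reading. It assumes the record layout written above (`url`, `timestamp`, `price`) and that all timestamps share one ISO format, so sorting them as strings gives chronological order. The function names are illustrative.

```python
import json


def load_history(path="price_history.jsonl"):
    """Read one JSON record per non-empty line."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]


def price_changes(records):
    """Return (url, old_price, new_price, timestamp) for each price change."""
    last_seen = {}
    changes = []
    for rec in sorted(records, key=lambda r: r["timestamp"]):
        price = rec.get("price")
        if price is None:
            continue  # Extraction failed for this reading; skip it
        url = rec["url"]
        previous = last_seen.get(url)
        if previous is not None and price != previous:
            changes.append((url, previous, price, rec["timestamp"]))
        last_seen[url] = price
    return changes
```

Calling `price_changes(load_history())` after a few runs of the tracker would list every increase or drop, which can then feed an alert or a report.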

Tools for E-Commerce Scraping

| Tool        | Best For        | Pricing      |
|-------------|-----------------|--------------|
| ScraperAPI  | All platforms   | From $49/mo  |
| ScrapingAnt | Budget scraping | From $19/mo  |
| Scrapy      | Custom crawlers | Free         |
| Playwright  | JS-heavy stores | Free         |

Best Practices

  1. Use structured data: JSON-LD, microdata, and product APIs are more reliable than HTML parsing
  2. Monitor for layout changes: E-commerce sites redesign frequently
  3. Track prices over time: Single snapshots are less valuable than trends
  4. Deduplicate products: The same product appears on multiple pages
  5. Handle variants: Products have sizes, colors, and other options
  6. Respect the platform: Scrape responsibly and do not overload servers
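
To illustrate the first practice, the sketch below pulls schema.org `Product` objects out of a page's JSON-LD blocks using only the standard library. It assumes the store embeds its data in `<script type="application/ld+json">` tags, which many do but not all. `JsonLdParser` and `find_products` are names chosen for this example.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collect the parsed contents of every JSON-LD script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the page


def is_product(node):
    """True if a JSON-LD node declares the Product type."""
    types = node.get("@type")
    return types == "Product" or (isinstance(types, list) and "Product" in types)


def find_products(html):
    """Return every JSON-LD node of type Product found in the HTML."""
    parser = JsonLdParser()
    parser.feed(html)
    products = []
    for item in parser.items:
        # A block may hold one object, a list, or an "@graph" wrapper
        if isinstance(item, dict):
            nodes = item.get("@graph", [item])
        elif isinstance(item, list):
            nodes = item
        else:
            nodes = []
        if not isinstance(nodes, list):
            nodes = []
        products.extend(n for n in nodes if isinstance(n, dict) and is_product(n))
    return products
```

Given a page's HTML, `find_products(html)` returns a list of dictionaries whose `name` and `offers` keys, where present, carry the product name and price without any CSS selectors that could break on a redesign.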