How to Scrape Trustpilot Reviews

A complete guide to scraping Trustpilot reviews and ratings using Python. Extract business reviews for sentiment analysis and reputation monitoring.

Trustpilot hosts millions of business reviews, making it a key source for reputation monitoring and competitive analysis. Here is how to scrape it.

What Data to Extract

  • Star ratings: overall and per-review scores (1-5)
  • Review text: full written reviews
  • Reviewer info: name, location, review count
  • Review date: when the review was posted
  • Company response: whether the business replied
  • Verification status: whether the review is verified
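It can help to fix the shape of a scraped record before writing any parsing code. The following is a minimal sketch of one possible record type for the fields listed above; the class name and field names are illustrative assumptions, not identifiers used by Trustpilot.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Review:
    """One scraped review. Field names are hypothetical, chosen for this guide."""
    rating: int                                   # per-review score, 1-5
    text: str                                     # full written review
    reviewer_name: str
    reviewer_location: Optional[str] = None
    reviewer_review_count: Optional[int] = None
    date: Optional[str] = None                    # ISO 8601 string, e.g. "2024-03-01"
    company_replied: bool = False
    verified: bool = False

    def __post_init__(self) -> None:
        # Reject scores outside the 1-5 star scale as early as possible.
        if not 1 <= self.rating <= 5:
            raise ValueError(f"rating must be 1-5, got {self.rating}")


example = Review(rating=4, text="Fast delivery.", reviewer_name="A. Reader", verified=True)
print(asdict(example))
```

Optional fields default to None or False because not every review exposes a location, a reply, or a verification badge.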

Basic Scraping Approach

Trustpilot review pages are fairly well structured, and much of the data is present in the initial HTML, so a plain HTTP request and an HTML parser are enough to get started.

import requests
from bs4 import BeautifulSoup

url = "https://www.trustpilot.com/review/example.com"
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}

# Fetch the page; fail fast on a 4xx/5xx response instead of parsing an error page
resp = requests.get(url, headers=headers, timeout=30)
resp.raise_for_status()
soup = BeautifulSoup(resp.text, "html.parser")

# Each review card is an <article> element carrying this data attribute
reviews = soup.select("article[data-service-review-card-paper]")
for review in reviews:
    rating = review.select_one("div[data-service-review-rating]")
    text = review.select_one("p[data-service-review-text-typography]")
    # Either element can be missing, so fall back to "N/A"
    print(rating.get("data-service-review-rating") if rating else "N/A")
    print(text.get_text(strip=True) if text else "N/A")

Scaling with ScraperAPI

When scraping multiple businesses or thousands of reviews, route requests through ScraperAPI to reduce the chance of being blocked.

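API_KEY = "YOUR_SCRAPERAPI_KEY"

for page in range(1, 11):
    url = f"https://www.trustpilot.com/review/example.com?page={page}"
    # Pass the target as a parameter so requests percent-encodes it
    resp = requests.get(
        "http://api.scraperapi.com",
        params={"api_key": API_KEY, "url": url},
        timeout=60,
    )
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    # Parse reviews from each page, as in the basic example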
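The target URL carries its own query string (?page=...), so it should be percent-encoded before it is passed as the url parameter; otherwise any extra & in the target would be read as a parameter of the API call itself. Below is a minimal sketch of that encoding using only the standard library. The helper name build_proxy_url is hypothetical, and the endpoint is the one used in the snippet above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

API_ENDPOINT = "http://api.scraperapi.com"  # endpoint as used in this guide


def build_proxy_url(api_key: str, target_url: str) -> str:
    """Return the API URL with the target percent-encoded as a single parameter."""
    query = urlencode({"api_key": api_key, "url": target_url})
    return f"{API_ENDPOINT}?{query}"


target = "https://www.trustpilot.com/review/example.com?page=2"
proxied = build_proxy_url("YOUR_SCRAPERAPI_KEY", target)
print(proxied)

# Decoding the query string recovers the original target unchanged.
print(parse_qs(urlsplit(proxied).query)["url"][0] == target)
```

Passing a params dictionary to requests.get performs the same encoding automatically.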

Handling Pagination

Trustpilot uses simple page-number pagination, with 20 reviews per page. To walk through the reviews, increment the page query parameter (?page=2, ?page=3, and so on).
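With a fixed page size, the number of pages follows from the total review count. The sketch below assumes the total is already known (for example, from the business's summary data) and that the 20-per-page figure above holds; the function name page_urls is hypothetical.

```python
import math
from typing import Iterator

REVIEWS_PER_PAGE = 20  # page size stated in this guide


def page_urls(domain: str, total_reviews: int) -> Iterator[str]:
    """Yield one review-page URL per page needed to cover total_reviews."""
    pages = math.ceil(total_reviews / REVIEWS_PER_PAGE)
    for page in range(1, pages + 1):
        yield f"https://www.trustpilot.com/review/{domain}?page={page}"


urls = list(page_urls("example.com", total_reviews=45))
print(len(urls))  # 45 reviews at 20 per page need 3 pages
print(urls[-1])
```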

JSON-LD Data

Trustpilot embeds structured data in JSON-LD format, which is easier to parse than HTML.

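import json

# A page can carry several JSON-LD blocks, so check each one
for script in soup.find_all("script", type="application/ld+json"):
    try:
        data = json.loads(script.string or "")
    except json.JSONDecodeError:
        continue
    # The top level may be a single object or a list of objects
    for item in data if isinstance(data, list) else [data]:
        if isinstance(item, dict) and "aggregateRating" in item:
            print(item["aggregateRating"])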
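JSON-LD is not always a single flat object: a page can contain several script tags, and the rating object may be nested inside other nodes. The following sketch, using only the standard library, collects every JSON-LD block and searches it recursively for AggregateRating nodes. The sample HTML, including its @graph wrapper, is made up for illustration and may not match Trustpilot's actual markup.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def iter_nodes(value):
    """Yield every dict nested anywhere inside a parsed JSON value."""
    if isinstance(value, dict):
        yield value
        for child in value.values():
            yield from iter_nodes(child)
    elif isinstance(value, list):
        for item in value:
            yield from iter_nodes(item)


def find_aggregate_ratings(html: str) -> list:
    """Return all nodes typed AggregateRating found in the page's JSON-LD."""
    collector = JsonLdCollector()
    collector.feed(html)
    ratings = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip a malformed block rather than abort the page
        for node in iter_nodes(data):
            if node.get("@type") == "AggregateRating":
                ratings.append(node)
    return ratings


# Hypothetical sample page, not real Trustpilot markup.
sample_html = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@graph": [
  {"@type": "LocalBusiness", "name": "Example",
   "aggregateRating": {"@type": "AggregateRating", "ratingValue": "4.3", "reviewCount": "120"}}
]}
</script>
</head></html>
"""
print(find_aggregate_ratings(sample_html))
```

Searching by @type rather than by a fixed path keeps the lookup working if the nesting of the structured data changes.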

Best Practices

  1. Use JSON-LD first: it contains clean, structured review data
  2. Paginate through all reviews: do not stop at the first page
  3. Track review dates: they are useful for sentiment trends over time
  4. Use ScraperAPI for volume: residential proxies help avoid detection
  5. Respect rate limits: add 2-3 second delays between requests
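The delay recommendation can be sketched as a small wrapper that pauses a random 2-3 seconds between consecutive requests. The function name fetch_politely and its parameters are hypothetical; the fetch and sleep callables are injected so the demo runs instantly with stand-ins, and in real use fetch would wrap a requests.get call.

```python
import random
import time
from typing import Callable, Iterable, List


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    min_delay: float = 2.0,
    max_delay: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Fetch each URL in order, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Randomized pause so requests do not arrive at a fixed rhythm.
            sleep(random.uniform(min_delay, max_delay))
        results.append(fetch(url))
    return results


# Demo with a stand-in fetcher and a recording sleep, so nothing real is called.
pauses = []
pages = fetch_politely(
    ["page-1", "page-2", "page-3"],
    fetch=lambda url: f"<html>{url}</html>",
    sleep=pauses.append,
)
print(len(pages), len(pauses))
```

No pause is taken before the first request, so three pages produce two delays.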