How to Scrape Yelp Reviews

A step-by-step guide to scraping Yelp business reviews, ratings, and business data using Python and web scraping APIs.

Yelp contains millions of business reviews useful for sentiment analysis, competitive intelligence, and market research. Here is how to scrape Yelp data effectively.

Challenges

Yelp has moderate-to-strong anti-bot protections:

  • CAPTCHAs for suspicious traffic patterns
  • Rate limiting by IP address
  • Dynamic content loaded with JavaScript
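  • Review filtering: Yelp's recommendation software moves some reviews into a separate "not currently recommended" section, so they do not appear in the main review list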
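
Rate limiting is the easiest of these to handle in code. The sketch below retries a throttled request with exponential backoff and jitter. It assumes throttling shows up as HTTP 429 or 503, which is a common convention rather than documented Yelp behavior. The names `backoff_delays` and `fetch_with_retry` are hypothetical helpers introduced for illustration.

```python
import random
import time


def backoff_delays(retries=4, base=1.0, cap=30.0):
    """Return exponential delays in seconds (base, 2*base, 4*base, ...), capped at `cap`."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]


def fetch_with_retry(fetch, url, retries=4, base=1.0, sleep=time.sleep):
    """Call fetch(url), retrying while the response looks throttled.

    `fetch` is any callable that returns an object with a `status_code`
    attribute, for example functools.partial(requests.get, timeout=30).
    The 429/503 check is an assumption about how throttling is signalled.
    """
    response = fetch(url)
    for delay in backoff_delays(retries, base):
        if response.status_code not in (429, 503):
            break
        # Jitter keeps parallel workers from retrying in lockstep.
        sleep(delay + random.uniform(0, delay / 2))
        response = fetch(url)
    return response
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.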

Yelp Fusion API (Official)

Yelp's Fusion API provides business search and business details. Its reviews endpoint returns only a few truncated excerpts per business, not full review text:

import requests

API_KEY = "YOUR_YELP_API_KEY"

headers = {"Authorization": f"Bearer {API_KEY}"}

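# Search for businesses
response = requests.get(
    "https://api.yelp.com/v3/businesses/search",
    params={
        "term": "pizza",
        "location": "New York City",
        "limit": 10,
        "sort_by": "rating",
    },
    headers=headers,
    timeout=30,
)
response.raise_for_status()  # Stop early on an HTTP error such as an invalid API key

for biz in response.json().get("businesses", []):
    # display_address is a list of address lines, so join it for printing
    address = ", ".join(biz["location"]["display_address"])
    print(f"{biz['name']} - {biz['rating']} stars ({biz['review_count']} reviews)")
    print(f"  {address}")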
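
Search results come back one page at a time. The sketch below builds the request parameters for each page, assuming the search endpoint accepts `limit` and `offset` and allows at most 50 results per request. `search_params` is a hypothetical helper, not part of any Yelp client library.

```python
def search_params(term, location, total, page_size=50):
    """Yield one params dict per request needed to cover `total` results.

    page_size=50 is assumed to be the largest `limit` the endpoint accepts.
    """
    for offset in range(0, total, page_size):
        yield {
            "term": term,
            "location": location,
            "limit": min(page_size, total - offset),
            "offset": offset,
        }


pages = list(search_params("pizza", "New York City", total=120))
print([(p["offset"], p["limit"]) for p in pages])
# -> [(0, 50), (50, 50), (100, 20)]
```

Each dict can be passed as `params=` to the same `requests.get` call used for the search request.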

Scraping Full Reviews with ScraperAPI

To get full review text, you need to scrape the review pages:

import requests
from bs4 import BeautifulSoup
import time

API_KEY = "YOUR_SCRAPERAPI_KEY"

def scrape_yelp_reviews(business_url, pages=3):
    all_reviews = []

    for page in range(pages):
        start = page * 10
        url = f"{business_url}?start={start}"

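        response = requests.get("https://api.scraperapi.com", params={
            "api_key": API_KEY,
            "url": url,
            "render": "true"
        }, timeout=70)  # Rendered requests can be slow, so allow a generous timeout
        response.raise_for_status()

        soup = BeautifulSoup(response.text, "html.parser")
        # These selectors depend on Yelp's current markup and may need updating
        reviews = soup.select('[class*="review__"]')
        if not reviews:
            break  # No reviews on this page: stop instead of spending more API credits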

        for review in reviews:
            rating_el = review.select_one('[aria-label*="star rating"]')
            text_el = review.select_one("p[lang]")
            date_el = review.select_one("span[class*='date']")

            all_reviews.append({
                "rating": rating_el.get("aria-label", "") if rating_el else "N/A",
                "text": text_el.text.strip() if text_el else "N/A",
                "date": date_el.text.strip() if date_el else "N/A"
            })

        time.sleep(2)

    return all_reviews

reviews = scrape_yelp_reviews("https://www.yelp.com/biz/example-restaurant-new-york")
for r in reviews:
    print(f"{r['rating']} - {r['text'][:100]}...")
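
The scraper above stores each rating as the raw `aria-label` string. The sketch below converts that string to a number and maps the "N/A" placeholders to `None`. It assumes labels look like "5 star rating", which the selector suggests but the page may not guarantee. `parse_star_rating` and `normalize_review` are hypothetical helper names.

```python
import re


def parse_star_rating(label):
    """Extract the numeric rating from a label such as '4 star rating'.

    Returns None when the label is missing or has no '<number> star' pattern.
    """
    match = re.search(r"(\d+(?:\.\d+)?)\s*star", label or "")
    return float(match.group(1)) if match else None


def normalize_review(review):
    """Return a copy of a scraped review dict with a numeric rating and None for missing fields."""
    def clean(value):
        return None if value in (None, "N/A") else value

    return {
        "rating": parse_star_rating(review.get("rating")),
        "text": clean(review.get("text")),
        "date": clean(review.get("date")),
    }
```

Numeric ratings make it straightforward to average or filter reviews before running sentiment analysis.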

Using ScrapingAnt

import requests
from bs4 import BeautifulSoup

response = requests.get("https://api.scrapingant.com/v2/general", params={
    "x-api-key": "YOUR_SCRAPINGANT_KEY",
    "url": "https://www.yelp.com/biz/example-restaurant-new-york",
    "browser": "true",
    "proxy_type": "residential"
}, timeout=70)
response.raise_for_status()

# The v2 "general" endpoint returns the page HTML directly rather than a JSON wrapper
html = response.text
soup = BeautifulSoup(html, "html.parser")

# Parse reviews from the rendered HTML
business_name = soup.select_one("h1")
if business_name:
    print(f"Business: {business_name.text.strip()}")

Data Points Available

  • Business name, address, phone number
  • Overall rating and review count
  • Individual review text, rating, and date
  • Reviewer name and review count
  • Business hours, price range, categories
  • Photos (URLs)
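
A fixed record type helps keep these fields consistent across scraping runs. The sketch below shows one possible shape for a review record and a CSV export. The `Review` class, its field names, and `reviews_to_csv` are assumptions made for illustration, not a schema defined by Yelp.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class Review:
    business_name: str
    rating: Optional[float]  # None when the rating could not be parsed
    text: str
    date: str                # Kept as the scraped string; parse it downstream if needed
    reviewer_name: str = ""


def reviews_to_csv(reviews):
    """Serialize Review records to a CSV string with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(Review)])
    writer.writeheader()
    for review in reviews:
        writer.writerow(asdict(review))
    return buffer.getvalue()
```

The same record could carry business hours, price range, or photo URLs by adding fields with defaults.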

Best Practices

  1. Start with Yelp's Fusion API for business search data
  2. Use ScraperAPI or ScrapingAnt for full review text
  3. Respect rate limits with delays between requests
  4. Cache results to avoid redundant scraping
  5. Be mindful of Yelp's Terms of Service
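
The caching advice can be sketched as a small on-disk cache keyed by URL. This is a minimal illustration under stated assumptions: `PageCache` and `cached_fetch` are hypothetical names, and the one-day expiry is an arbitrary default.

```python
import hashlib
import json
import time
from pathlib import Path


class PageCache:
    """Store fetched HTML on disk, one JSON file per URL."""

    def __init__(self, directory, max_age_seconds=86400):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.max_age_seconds = max_age_seconds

    def _path(self, url):
        # Hash the URL so it is safe to use as a file name.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.directory / f"{digest}.json"

    def get(self, url):
        """Return cached HTML, or None if it is missing or older than max_age_seconds."""
        path = self._path(url)
        if not path.exists():
            return None
        entry = json.loads(path.read_text(encoding="utf-8"))
        if time.time() - entry["fetched_at"] > self.max_age_seconds:
            return None
        return entry["html"]

    def put(self, url, html):
        entry = {"url": url, "fetched_at": time.time(), "html": html}
        self._path(url).write_text(json.dumps(entry), encoding="utf-8")


def cached_fetch(url, fetch, cache):
    """Return HTML from the cache, calling fetch(url) only on a miss."""
    html = cache.get(url)
    if html is None:
        html = fetch(url)
        cache.put(url, html)
    return html
```

Here `fetch` would be a function that calls the scraping API and returns the response text, so repeated runs over the same business pages do not spend extra API credits.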

Verdict

Yelp's Fusion API works well for business data but does not return full review text. For complete reviews, ScraperAPI with JavaScript rendering is the most reliable option covered here, and ScrapingAnt is a solid alternative. For the broadest coverage, use the official API for business data and scraping for the review text.