How to Scrape Yelp Reviews

A step-by-step guide to scraping Yelp business reviews, ratings, and business data using Python and web scraping APIs.

Yelp contains millions of business reviews useful for sentiment analysis, competitive intelligence, and market research. Here is how to scrape Yelp data effectively.

Challenges

Yelp has moderate-to-strong anti-bot protections, plus content behavior that complicates scraping:

CAPTCHAs for suspicious traffic patterns
Rate limiting by IP address
Dynamic content loaded with JavaScript
Review filtering that hides some content
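
Rate limiting is usually the first of these obstacles a scraper runs into. A minimal sketch of retrying with exponential backoff follows. The name `fetch_with_backoff` and its parameters are hypothetical, and treating HTTP 429 and 503 as throttling signals is an assumption rather than documented Yelp behavior.

```python
import time

THROTTLE_STATUSES = {429, 503}  # assumed "slow down" signals, not confirmed for Yelp


def fetch_with_backoff(fetch, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a non-throttled status or attempts run out.

    fetch is any zero-argument callable returning (status_code, body).
    """
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = fetch()
        if status not in THROTTLE_STATUSES:
            break
        if attempt < max_attempts - 1:
            # Wait base_delay, then 2x, then 4x, ... before the next attempt
            sleep(base_delay * 2 ** attempt)
    return status, body
```

In practice `fetch` would wrap a `requests.get` call and return its status code and text. Passing `sleep` as a parameter keeps the function easy to exercise without real waiting.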

Yelp Fusion API (Official)

Yelp's Fusion API covers business search and business details. It does not return full review text; its reviews endpoint provides only a few truncated excerpts per business:

import requests

API_KEY = "YOUR_YELP_API_KEY"

headers = {"Authorization": f"Bearer {API_KEY}"}

# Search for businesses
response = requests.get("https://api.yelp.com/v3/businesses/search", params={
    "term": "pizza",
    "location": "New York City",
    "limit": 10,
    "sort_by": "rating"
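}, headers=headers, timeout=30)
response.raise_for_status()  # fail loudly on a bad key or a rate-limit response

for biz in response.json()["businesses"]:
    print(f"{biz['name']} - {biz['rating']} stars ({biz['review_count']} reviews)")
    # display_address is a list of address lines, so join it for printing
    print(f"  {', '.join(biz['location']['display_address'])}")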
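
A single search call returns a limited number of businesses, so larger result sets need paging with the `limit` and `offset` parameters. The helper below is a sketch that only builds the parameter sets. The name `search_param_pages` is hypothetical, and the default page size of 50 is the assumed per-request maximum, which should be checked against the current Fusion documentation.

```python
def search_param_pages(term, location, total, page_size=50):
    """Yield one params dict per search request needed to cover `total` results."""
    for offset in range(0, total, page_size):
        yield {
            "term": term,
            "location": location,
            # The last page may be smaller than page_size
            "limit": min(page_size, total - offset),
            "offset": offset,
        }
```

Each yielded dict can be passed as `params` to the same `requests.get` call used for the search endpoint.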

Scraping Full Reviews with ScraperAPI

To get full review text, you need to scrape the review pages:

import requests
from bs4 import BeautifulSoup
import time

API_KEY = "YOUR_SCRAPERAPI_KEY"

def scrape_yelp_reviews(business_url, pages=3):
    all_reviews = []

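    for page in range(pages):
        # Review pages are requested 10 at a time via the start offset
        start = page * 10
        url = f"{business_url}?start={start}"

        response = requests.get("https://api.scraperapi.com", params={
            "api_key": API_KEY,
            "url": url,
            "render": "true"
        }, timeout=90)  # rendered pages can take a while to come back
        response.raise_for_status()

        soup = BeautifulSoup(response.text, "html.parser")
        # Yelp's class names change often; treat these selectors as a starting point
        reviews = soup.select('[class*="review__"]')
        if not reviews:
            break  # no more reviews (or the markup changed), so stop paging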

        for review in reviews:
            rating_el = review.select_one('[aria-label*="star rating"]')
            text_el = review.select_one("p[lang]")
            date_el = review.select_one("span[class*='date']")

            all_reviews.append({
                "rating": rating_el.get("aria-label", "") if rating_el else "N/A",
                "text": text_el.text.strip() if text_el else "N/A",
                "date": date_el.text.strip() if date_el else "N/A"
            })

        time.sleep(2)

    return all_reviews

reviews = scrape_yelp_reviews("https://www.yelp.com/biz/example-restaurant-new-york")
for r in reviews:
    print(f"{r['rating']} - {r['text'][:100]}...")
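
The scraper above stores the rating as the raw `aria-label` string, or "N/A" when the element is missing. For analysis it is usually more convenient as a number. A small sketch of that conversion follows; `parse_rating` is a hypothetical helper, and the label format ("5 star rating") is assumed from the selector used above.

```python
import re

# Matches the leading number in labels such as "4 star rating" or "4.5 star rating"
_RATING_RE = re.compile(r"(\d+(?:\.\d+)?)\s*star")


def parse_rating(label):
    """Return the numeric rating from an aria-label string, or None if absent."""
    match = _RATING_RE.search(label or "")
    return float(match.group(1)) if match else None
```

Returning `None` instead of "N/A" keeps missing ratings out of averages and other numeric summaries.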

Using ScrapingAnt

import requests
from bs4 import BeautifulSoup

response = requests.get("https://api.scrapingant.com/v2/general", params={
    "x-api-key": "YOUR_SCRAPINGANT_KEY",
    "url": "https://www.yelp.com/biz/example-restaurant-new-york",
    "browser": "true",
    "proxy_type": "residential"
}, timeout=90)
response.raise_for_status()

# The v2 general endpoint returns the rendered page as the response body,
# so read it as text rather than as JSON
html = response.text
soup = BeautifulSoup(html, "html.parser")

# Extract the business name from the rendered HTML; reviews can be parsed
# with the same selectors used in the ScraperAPI example
business_name = soup.select_one("h1")
if business_name:
    print(f"Business: {business_name.text.strip()}")

Data Points Available

Business name, address, phone number
Overall rating and review count
Individual review text, rating, and date
Reviewer name and review count
Business hours, price range, categories
Photos (URLs)
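
One way to keep these data points consistent across scraping runs is to define a fixed record type and export it to CSV. The sketch below covers only the per-review fields. The `Review` class, its field names, and `reviews_to_csv` are illustrative assumptions, not a schema defined by Yelp.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields
from typing import Optional


@dataclass
class Review:
    business_name: str
    reviewer_name: str
    rating: Optional[float]  # None when the rating could not be parsed
    date: str
    text: str


def reviews_to_csv(reviews):
    """Serialize Review records to a CSV string with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(Review)])
    writer.writeheader()
    for review in reviews:
        writer.writerow(asdict(review))
    return buffer.getvalue()
```

The `csv` module handles quoting, so review text containing commas or line breaks survives a round trip.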

Best Practices

Start with Yelp's Fusion API for business search data
Use ScraperAPI or ScrapingAnt for full review text
Respect rate limits with delays between requests
Cache results to avoid redundant scraping
Be mindful of Yelp's Terms of Service
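
Caching is straightforward to add around any fetch function. The following is a minimal file-based sketch. `cached_fetch`, the one-file-per-URL layout, and the 24-hour default expiry are all assumptions chosen for illustration.

```python
import hashlib
import json
import time
from pathlib import Path


def cached_fetch(url, fetch, cache_dir, max_age_seconds=24 * 3600):
    """Return the page for `url`, calling fetch(url) only on a miss or stale entry."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so it is safe to use as a file name
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.json"

    if path.exists():
        entry = json.loads(path.read_text(encoding="utf-8"))
        if time.time() - entry["fetched_at"] < max_age_seconds:
            return entry["body"]

    body = fetch(url)
    entry = {"url": url, "fetched_at": time.time(), "body": body}
    path.write_text(json.dumps(entry), encoding="utf-8")
    return body
```

Here `fetch` is any callable that takes a URL and returns the page HTML as a string, for example a thin wrapper around one of the scraping API calls shown earlier.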

Verdict

Yelp's Fusion API is a good source of business data but does not provide full review text. For complete reviews, a scraping API with JavaScript rendering is needed: ScraperAPI is the primary option shown in this guide, and ScrapingAnt is a solid alternative. Combining the official API for business data with scraping for review text gives the broadest coverage.