How to Scrape Glassdoor Reviews and Salaries

Learn how to scrape Glassdoor company reviews, salary data, and interview questions using Python and scraping APIs.

Glassdoor is a valuable source of company reviews, salary benchmarks, and interview data. Scraping it effectively requires dealing with login walls and dynamic content.

Useful Data on Glassdoor

  • Company reviews: employee ratings and written reviews
  • Salary reports: pay ranges by role and location
  • Interview questions: common questions and difficulty ratings
  • Company info: size, revenue, industry, headquarters
  • Job listings: open positions and requirements

The Challenge

Glassdoor is one of the harder sites to scrape:

  • Login wall: most content requires a signed-in session
  • Heavy JavaScript rendering: data loads dynamically
  • Aggressive bot detection: CAPTCHAs and IP blocking
  • Rate limiting: strict request limits

Recommended Approach: ScraperAPI

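Given these challenges, a hosted scraping API such as ScraperAPI is often the most practical approach. It handles JavaScript rendering, proxy rotation, and anti-bot measures, so your own code only has to request a page and parse the returned HTML.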

import requests
from bs4 import BeautifulSoup

API_KEY = "YOUR_SCRAPERAPI_KEY"
company_url = "https://www.glassdoor.com/Reviews/Google-Reviews-E9079.htm"

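# Pass the target URL through params so requests URL-encodes it;
# pasting it into an f-string breaks on URLs that contain "&" or "?".
resp = requests.get(
    "https://api.scraperapi.com/",
    params={"api_key": API_KEY, "url": company_url, "render": "true"},
    timeout=70,  # rendered requests can take a while
)
resp.raise_for_status()  # stop early on a failed or blocked request
soup = BeautifulSoup(resp.text, "html.parser")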

Extracting Review Data

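Each review typically exposes a rating, a title, and separate pros and cons sections:

# Example structure (actual selectors may vary)
def text_of(node):
    # select_one returns None when the selector matches nothing
    return node.get_text(strip=True) if node is not None else None

reviews = []
for review in soup.select("[data-test='reviewsList'] li"):
    reviews.append({
        "rating": text_of(review.select_one(".ratingNumber")),
        "title": text_of(review.select_one(".reviewLink")),
        "pros": text_of(review.select_one("[data-test='pros']")),
        "cons": text_of(review.select_one("[data-test='cons']")),
    })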

Salary Data Extraction

Glassdoor salary pages show pay ranges by role. The data is often embedded in JSON within the page source.
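One way to pull embedded JSON out of a page is to collect the bodies of script tags that declare a JSON type and parse each one. The sketch below uses only the standard library and runs against a made-up sample page: the script id and the field names (jobTitle, basePay) are hypothetical placeholders, so inspect the real page source to find where the salary payload actually lives.

```python
import json
from html.parser import HTMLParser

# Hypothetical page fragment; real Glassdoor markup and field names will differ.
SAMPLE_HTML = """
<html><body>
<script type="application/json" id="salary-data">
{"jobTitle": "Software Engineer",
 "basePay": {"low": 120000, "median": 150000, "high": 190000}}
</script>
<script>var unrelated = 1;</script>
</body></html>
"""


class ScriptJSONCollector(HTMLParser):
    """Collects and parses the bodies of <script> tags that declare a JSON type."""

    JSON_TYPES = {"application/json", "application/ld+json"}

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self._buffer = []
        self.payloads = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") in self.JSON_TYPES:
            self._in_json_script = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_script:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_script:
            self._in_json_script = False
            raw = "".join(self._buffer).strip()
            try:
                self.payloads.append(json.loads(raw))
            except json.JSONDecodeError:
                pass  # skip script bodies that are not valid JSON


def extract_embedded_json(html):
    """Return every JSON payload found in JSON-typed <script> tags."""
    collector = ScriptJSONCollector()
    collector.feed(html)
    collector.close()
    return collector.payloads


payloads = extract_embedded_json(SAMPLE_HTML)
salary = payloads[0]
print(salary["jobTitle"], salary["basePay"]["median"])
```

Parsing the JSON directly tends to be more stable than CSS selectors, because class names change more often than the underlying data shape. If the page instead assigns the data to a JavaScript variable, the payload has to be isolated from the script text before it can be passed to json.loads.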

Data Point           | Notes
---------------------|----------------------------
Base pay range       | Median, low, high
Total compensation   | Including bonuses, stock
Pay by experience    | Entry, mid, senior
Location adjustment  | Pay varies by city

Alternative: ScrapingAnt

ScrapingAnt is a comparable option, with built-in JavaScript rendering and residential proxy support.
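A request to ScrapingAnt can be assembled the same way as the ScraperAPI call. The sketch below only builds the request URL and does not send it. The endpoint path and the parameter names (x-api-key, browser, proxy_type) are written from memory of ScrapingAnt's public API and should be treated as assumptions to confirm against the current documentation; build_scrapingant_url is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint; confirm against ScrapingAnt's current API documentation.
SCRAPINGANT_ENDPOINT = "https://api.scrapingant.com/v2/general"


def build_scrapingant_url(target_url, api_key, residential=True):
    """Build a request URL that asks for a browser-rendered copy of target_url."""
    params = {
        "url": target_url,      # urlencode escapes the nested URL safely
        "x-api-key": api_key,   # assumed parameter name for the API key
        "browser": "true",      # assumed flag that enables JavaScript rendering
    }
    if residential:
        params["proxy_type"] = "residential"  # assumed value for residential proxies
    return f"{SCRAPINGANT_ENDPOINT}?{urlencode(params)}"


request_url = build_scrapingant_url(
    "https://www.glassdoor.com/Reviews/Google-Reviews-E9079.htm",
    "YOUR_SCRAPINGANT_KEY",
)

# Round-trip check: the target URL survives encoding intact.
decoded = parse_qs(urlsplit(request_url).query)
print(decoded["url"][0])
```

The resulting URL can be passed to requests.get with a generous timeout, and the response body parsed with BeautifulSoup exactly as in the ScraperAPI example.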

Best Practices

  1. Use rendered scraping: Glassdoor content is JavaScript-heavy
  2. Handle pagination: reviews span many pages
  3. Aggregate data: individual reviews are noisy, so look for patterns across many reviews
  4. Cache responses: avoid re-scraping the same pages
  5. Stay ethical: do not use scraped data to identify individual reviewers
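Pagination, caching, and aggregation from the list above can be sketched together as follows. Everything here is illustrative: page_url assumes later pages are addressed with a _P<n> suffix before .htm (verify this against the site's real pagination links), cached_fetch and summarize_ratings are hypothetical helper names, and fake_fetch stands in for the real API call so the example runs offline.

```python
import hashlib
import statistics
import tempfile
from pathlib import Path


def page_url(base_url, page):
    """Return the URL of a results page.

    Assumes pages after the first use a `_P<n>` suffix before the extension.
    """
    if page <= 1:
        return base_url
    stem, dot, ext = base_url.rpartition(".")
    return f"{stem}_P{page}{dot}{ext}"


def cached_fetch(url, fetch, cache_dir):
    """Return the body for url, calling fetch(url) only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.html"
    if path.exists():
        return path.read_text(encoding="utf-8")
    body = fetch(url)
    path.write_text(body, encoding="utf-8")
    return body


def summarize_ratings(ratings):
    """Aggregate ratings across many reviews, ignoring missing values."""
    values = [r for r in ratings if r is not None]
    if not values:
        return None
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


calls = []


def fake_fetch(url):
    # Stand-in for the real scraping API request; records each network call.
    calls.append(url)
    return f"<html>{url}</html>"


base = "https://www.glassdoor.com/Reviews/Google-Reviews-E9079.htm"
with tempfile.TemporaryDirectory() as tmp:
    for _ in range(2):  # the second pass is served entirely from the cache
        pages = [cached_fetch(page_url(base, n), fake_fetch, tmp) for n in (1, 2, 3)]

summary = summarize_ratings([4.0, 2.0, None, 5.0])
print(len(calls), summary)
```

Keying the cache on a hash of the URL keeps file names safe regardless of what characters the URL contains. In real use, replace fake_fetch with a function that calls the scraping API, and point cache_dir at a persistent directory so repeated runs do not spend credits on pages already fetched.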