How to Scrape Indeed Job Listings

A practical guide to scraping Indeed job listings with Python, covering techniques for extracting job titles, salaries, descriptions, and company data.

Indeed is one of the largest job search engines, making it a valuable data source for recruitment analytics, salary research, and market intelligence. Here is how to scrape it effectively.

Challenges

Indeed has moderate anti-bot protections:

  • Rate limiting on repeated requests
  • CAPTCHA challenges for suspicious traffic
  • Dynamic JavaScript rendering for some content
  • IP-based blocking for heavy scrapers
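
Rate limiting and CAPTCHA challenges usually call for retrying with increasing delays rather than hammering the site. The following is a minimal sketch of exponential backoff, not a definitive implementation. The names `BlockedError`, `looks_blocked`, and `fetch_with_backoff` are hypothetical, and the block-detection heuristic (HTTP 403/429 or the word "captcha" in the page) is an assumption rather than documented Indeed behavior.

```python
import random
import time


class BlockedError(Exception):
    """Raised by a fetch function when a response looks rate-limited or blocked."""


def looks_blocked(status_code, html):
    # Assumed heuristic: typical throttling/forbidden codes, or a CAPTCHA page
    return status_code in (403, 429) or "captcha" in html.lower()


def fetch_with_backoff(fetch, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the wait after each blocked attempt."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except BlockedError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: let the caller decide what to do
            # Waits of roughly 2s, 4s, 8s, plus up to 1s of jitter
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
```

Here `fetch` is any zero-argument function that returns the page HTML and raises `BlockedError` when `looks_blocked` is true for the response. The `sleep` parameter exists only so the waiting can be replaced in tests.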

Setting Up

pip install requests beautifulsoup4

Scraping Indeed with ScraperAPI

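import time
from urllib.parse import urlencode

import requests
from bs4 import BeautifulSoup

API_KEY = "YOUR_SCRAPERAPI_KEY"

def scrape_indeed(query, location, pages=3):
    all_jobs = []

    # Indeed paginates in steps of 10 results via the "start" parameter
    for start in range(0, pages * 10, 10):
        # URL-encode the search terms so spaces and commas survive intact
        search = urlencode({"q": query, "l": location, "start": start})
        url = f"https://www.indeed.com/jobs?{search}"

        response = requests.get("https://api.scraperapi.com", params={
            "api_key": API_KEY,
            "url": url,
            "render": "true"
        }, timeout=70)  # Rendered requests can be slow, so allow a generous timeout
        response.raise_for_status()  # Surface blocked or failed requests early

        soup = BeautifulSoup(response.text, "html.parser")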
        job_cards = soup.select(".job_seen_beacon")

        for card in job_cards:
            title_el = card.select_one("h2.jobTitle a span")
            company_el = card.select_one("[data-testid='company-name']")
            location_el = card.select_one("[data-testid='text-location']")
            salary_el = card.select_one(".salary-snippet-container")

            job = {
                "title": title_el.text.strip() if title_el else "N/A",
                "company": company_el.text.strip() if company_el else "N/A",
                "location": location_el.text.strip() if location_el else "N/A",
                "salary": salary_el.text.strip() if salary_el else "Not listed",
            }
            all_jobs.append(job)

        time.sleep(2)  # Respectful delay between pages

    return all_jobs

jobs = scrape_indeed("python developer", "New York, NY")
for job in jobs:
    print(f"{job['title']} at {job['company']} - {job['location']}")
    print(f"  Salary: {job['salary']}")
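
For analytics or salary research you will usually want to keep the results rather than just print them. The sketch below shows one way to de-duplicate and persist the job dictionaries as JSON. The helper names `dedupe_jobs` and `save_jobs` and the default file name are hypothetical, and the de-duplication key assumes that two listings with the same title, company, and location are the same job.

```python
import json
from pathlib import Path


def dedupe_jobs(jobs):
    """Drop repeated listings, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for job in jobs:
        # Assumed identity of a listing: same title, company, and location
        key = (job["title"], job["company"], job["location"])
        if key not in seen:
            seen.add(key)
            unique.append(job)
    return unique


def save_jobs(jobs, path="indeed_jobs.json"):
    """Write de-duplicated jobs to a JSON file and return how many were saved."""
    unique = dedupe_jobs(jobs)
    text = json.dumps(unique, indent=2, ensure_ascii=False)
    Path(path).write_text(text, encoding="utf-8")
    return len(unique)
```

Calling `save_jobs(jobs)` with the list returned by `scrape_indeed` would write the listings to `indeed_jobs.json` in the working directory.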

Scraping Job Descriptions

To get full job descriptions, you need to visit each individual job page:

def get_job_description(job_url):
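    response = requests.get("https://api.scraperapi.com", params={
        "api_key": API_KEY,
        "url": job_url,
        "render": "true"
    }, timeout=70)  # Rendered requests can be slow, so allow a generous timeout
    response.raise_for_status()  # Surface blocked or failed requests early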

    soup = BeautifulSoup(response.text, "html.parser")
    description = soup.select_one("#jobDescriptionText")

    if description:
        return description.text.strip()
    return "Description not available"
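
The function above needs an absolute job URL, while the link inside each job card (the `h2.jobTitle a` element) is typically a relative path. The sketch below shows one way to turn such a link into a URL you can fetch. The helper names are hypothetical, and it assumes the link carries the job key in a `jk` query parameter and that `/viewjob?jk=...` resolves to the job page, so verify both against the live markup.

```python
from urllib.parse import parse_qs, urljoin, urlparse

INDEED_BASE = "https://www.indeed.com"


def job_key_from_href(href):
    """Extract the job key from a card link, or None if it has no "jk" parameter."""
    keys = parse_qs(urlparse(href).query).get("jk")
    return keys[0] if keys else None


def job_url_from_href(href):
    """Build an absolute job URL from a (possibly relative) card link."""
    key = job_key_from_href(href)
    if key:
        # Assumed canonical form of a job page URL
        return f"{INDEED_BASE}/viewjob?jk={key}"
    # Fall back to resolving the link against the site root
    return urljoin(INDEED_BASE, href)
```

Inside the card loop you could read the link with `card.select_one("h2.jobTitle a")`, pass its `href` attribute to `job_url_from_href`, and hand the result to `get_job_description`.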

Using ScrapingAnt as an Alternative

import requests
from bs4 import BeautifulSoup

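response = requests.get("https://api.scrapingant.com/v2/general", params={
    "x-api-key": "YOUR_SCRAPINGANT_KEY",
    "url": "https://www.indeed.com/jobs?q=data+engineer&l=Remote",
    "browser": "true"
}, timeout=70)
response.raise_for_status()  # Surface blocked or failed requests early

# The v2 "general" endpoint returns the page HTML as the response body
html = response.text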
soup = BeautifulSoup(html, "html.parser")
# Parse as shown above

Best Practices

  1. Add delays between requests (2-5 seconds minimum)
  2. Use a scraping API for proxy rotation and anti-bot bypass
  3. Cache results to minimize redundant requests
  4. Respect Indeed's robots.txt directives
  5. Consider Indeed's Publisher API for legitimate affiliate use cases
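
Two of the practices above, caching and checking robots.txt, can be sketched with the standard library alone. This is an illustration under stated assumptions: `cached_fetch` and `is_allowed` are hypothetical helper names, the one-hour cache lifetime is arbitrary, and the robots.txt rules are passed in as lines of text rather than downloaded.

```python
import hashlib
import time
from pathlib import Path
from urllib.robotparser import RobotFileParser


def cached_fetch(url, fetch, cache_dir="cache", max_age=3600):
    """Return cached HTML for url if it is fresh, else call fetch(url) and store it."""
    cache_path = Path(cache_dir)
    cache_path.mkdir(parents=True, exist_ok=True)
    # Hash the URL so it is safe to use as a file name
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    entry = cache_path / f"{key}.html"

    if entry.exists() and time.time() - entry.stat().st_mtime < max_age:
        return entry.read_text(encoding="utf-8")

    html = fetch(url)
    entry.write_text(html, encoding="utf-8")
    return html


def is_allowed(robots_lines, url, user_agent="*"):
    """Check a URL against robots.txt rules supplied as a list of lines."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)
```

Here `fetch` is any function that takes a URL and returns HTML, such as a thin wrapper around the scraping API request. To check the live rules, download Indeed's robots.txt once and pass its lines to `is_allowed`.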

Verdict

Scraping Indeed is straightforward with the right tools. ScraperAPI with JavaScript rendering handles the rate limiting, CAPTCHAs, and IP blocking described above, and ScrapingAnt follows a near-identical request pattern if you want an alternative. Always scrape responsibly, and consider official APIs first.