Selenium: Handling JavaScript-Rendered Pages

Learn how to scrape JavaScript-rendered pages with Selenium. Handle dynamic content, AJAX calls, and single-page applications.

Browser Automation · #5 · intermediate · 2 min read

Many modern websites load their content dynamically using JavaScript. When you fetch these pages with a plain HTTP request, you get the initial HTML only: an empty shell without the data you want. Selenium solves this by driving a real browser that executes the JavaScript, so your script sees the same rendered page a human visitor would.

The Problem

import requests
from bs4 import BeautifulSoup

# The raw HTML has no .quote elements because they are built by JS in the browser
resp = requests.get("https://quotes.toscrape.com/js/")
soup = BeautifulSoup(resp.text, "html.parser")
quotes = soup.select(".quote")
print(len(quotes))  # 0, no quotes found!
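
To see why the count is zero, it can help to look at what a static parser actually receives. The sketch below uses a made-up HTML string (an assumption for illustration, not the real response from the site) in which the data lives only inside a script tag. It counts elements by class with Python's standard-library parser so it runs offline; the ClassCounter name is hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for a JS-rendered page: the data exists only as
# a JavaScript variable, and the markup contains no .quote elements.
STATIC_HTML = """
<html><body>
  <div id="quotes"></div>
  <script>
    var data = [{"text": "Example quote", "author": "Someone"}];
    // A browser would run code here to build the .quote elements.
  </script>
</body></html>
"""


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute includes a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


counter = ClassCounter("quote")
counter.feed(STATIC_HTML)
print(counter.count)  # 0
```

The parser sees the script tag as opaque text, so nothing inside it ever becomes an element. Only a JavaScript engine can turn that data into markup.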

The Selenium Solution

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.add_argument("--headless")

driver = webdriver.Chrome(options=options)
driver.get("https://quotes.toscrape.com/js/")

# Wait for JS to render the quotes
wait = WebDriverWait(driver, 10)
wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".quote")))

quotes = driver.find_elements(By.CSS_SELECTOR, ".quote")
print(len(quotes))  # 10, all quotes found!

for quote in quotes:
    text = quote.find_element(By.CSS_SELECTOR, ".text").text
    author = quote.find_element(By.CSS_SELECTOR, ".author").text
    print(f"{text}, {author}")

driver.quit()
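
One caveat with the script above: if the wait times out or an element lookup raises, driver.quit() is never reached and the browser process may linger. A minimal sketch of one way to guard against that is a small context manager. The names managed_driver and make_headless_chrome are hypothetical helpers, not part of Selenium.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver via factory() and always quit it on exit."""
    driver = factory()
    try:
        yield driver
    finally:
        # Runs whether the body succeeded or raised (e.g. a wait timeout).
        driver.quit()


def make_headless_chrome():
    """Build the same headless Chrome driver used in the example above."""
    # Imported lazily so managed_driver itself has no Selenium dependency.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


# Usage sketch:
# with managed_driver(make_headless_chrome) as driver:
#     driver.get("https://quotes.toscrape.com/js/")
#     ...  # wait and extract as shown above
```

A plain try/finally around the scraping code achieves the same thing; the context manager just makes the pattern reusable.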

Waiting for AJAX Requests

Some pages load data via AJAX after the initial page load. You can wait for specific conditions:

# Wait until a loading spinner disappears
wait.until(EC.invisibility_of_element_located(
    (By.CSS_SELECTOR, ".loading")
))

# Wait until a specific number of elements appear
wait.until(lambda d: len(d.find_elements(By.CSS_SELECTOR, ".item")) >= 20)

# Wait for text to appear in an element
wait.until(EC.text_to_be_present_in_element(
    (By.CSS_SELECTOR, "#status"), "Complete"
))
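
When you do not know in advance how many items a page will load, a fixed threshold like the one above does not help. One possible approach, sketched here as an assumption rather than a built-in Selenium feature, is a custom condition that becomes truthy once the number of matches stops changing. The class name count_is_stable and its rounds parameter are hypothetical.

```python
class count_is_stable:
    """Custom wait condition: truthy once the match count stops changing.

    Hypothetical helper, not part of Selenium. `rounds` is how many
    consecutive polls must report the same non-zero count.
    """

    def __init__(self, locator, rounds=3):
        self.locator = locator  # e.g. (By.CSS_SELECTOR, ".item")
        self.rounds = rounds
        self._last = None
        self._streak = 0

    def __call__(self, driver):
        count = len(driver.find_elements(*self.locator))
        if count > 0 and count == self._last:
            self._streak += 1
        else:
            # A new non-zero count starts a fresh streak; zero resets it.
            self._streak = 1 if count > 0 else 0
        self._last = count
        # WebDriverWait.until keeps polling while the result is falsy.
        return count if self._streak >= self.rounds else False


# Usage sketch, reusing the `wait` object from earlier:
# total = wait.until(count_is_stable((By.CSS_SELECTOR, ".item")))
```

WebDriverWait polls every 0.5 seconds by default, so with rounds=3 the count has to hold steady for roughly a second. A slow backend can still deliver more items after that, so treat the result as a heuristic.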

Executing Custom JavaScript

Sometimes you need to run JavaScript directly to trigger content loading or extract data from JS variables:

# Scroll to bottom to trigger lazy loading
driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")

# Extract data from a JavaScript variable
data = driver.execute_script("return window.__INITIAL_DATA__")

# Get computed styles or hidden attributes
color = driver.execute_script(
    "return getComputedStyle(arguments[0]).color",
    driver.find_element(By.CSS_SELECTOR, ".price")
)
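
A single scroll often loads only one more batch of lazy content. The sketch below repeats the scroll until document.body.scrollHeight stops growing. The function name scroll_until_stable and the pause and max_rounds parameters are hypothetical, and the fixed sleep is a crude stand-in for a proper explicit wait.

```python
import time


def scroll_until_stable(driver, pause=1.0, max_rounds=20):
    """Scroll to the bottom repeatedly until the page height stops growing.

    Returns the number of scrolls performed (at most max_rounds).
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for round_number in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        time.sleep(pause)  # give lazy-loaded content time to arrive
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            # Nothing new was appended, so assume we reached the end.
            return round_number
        last_height = new_height
    return max_rounds


# Usage sketch:
# scroll_until_stable(driver, pause=1.5)
# items = driver.find_elements(By.CSS_SELECTOR, ".item")
```

The max_rounds cap matters on feeds that never end; without it the loop would scroll forever.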

Getting the Rendered Page Source

After JavaScript has executed, you can get the fully rendered HTML:

rendered_html = driver.page_source

# Now parse with BeautifulSoup if you prefer
from bs4 import BeautifulSoup
soup = BeautifulSoup(rendered_html, "html.parser")

Easier Alternative

If you need rendered HTML without managing browsers, ScrapingAnt and ScraperAPI both offer JavaScript rendering as a service. You send them a URL and get back the fully rendered page source through an API call, with no browser to install or maintain.

Next Steps

  • Learn to take screenshots and PDFs with Playwright
  • Handle infinite scroll pages
  • Set up Selenium with proxies