Managing Browser Contexts and Sessions
Learn to manage browser contexts, sessions, cookies, and local storage in Playwright and Selenium for stateful web scraping.
Many scraping tasks require maintaining state across pages. You might need to stay logged in, preserve cookies, or simulate multiple users. Playwright's browser contexts and Selenium's cookie management give you fine-grained control over session state.
What Is a Browser Context?
A browser context in Playwright is like an incognito window. Each context has its own cookies, local storage, and session state. Multiple contexts can run within a single browser instance, isolated from each other.
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)

    # Create two isolated contexts
    context_a = browser.new_context()
    context_b = browser.new_context()
    page_a = context_a.new_page()
    page_b = context_b.new_page()

    # Log in as user A
    page_a.goto("https://example.com/login")
    page_a.fill("#email", "user_a@example.com")
    page_a.fill("#password", "password_a")
    page_a.click("button[type='submit']")

    # Log in as user B (completely isolated session)
    page_b.goto("https://example.com/login")
    page_b.fill("#email", "user_b@example.com")
    page_b.fill("#password", "password_b")
    page_b.click("button[type='submit']")

    # Both users are logged in simultaneously
    browser.close()
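One way to confirm the isolation is to set a cookie in one context and check that the other context cannot see it. The following is a minimal sketch, assuming a `browser` launched with the sync API as above; the helper name `contexts_are_isolated` and the cookie name `probe` are made up for illustration.

```python
def contexts_are_isolated(browser, url="https://example.com"):
    """Return True if a cookie set in one context is invisible to another.

    `browser` is expected to be a Playwright Browser (sync API).
    The cookie name "probe" is an arbitrary placeholder.
    """
    context_a = browser.new_context()
    context_b = browser.new_context()
    try:
        # Passing "url" lets Playwright derive the cookie's domain and path
        context_a.add_cookies([{"name": "probe", "value": "1", "url": url}])
        names_a = {cookie["name"] for cookie in context_a.cookies()}
        names_b = {cookie["name"] for cookie in context_b.cookies()}
        return "probe" in names_a and "probe" not in names_b
    finally:
        # Close both contexts so they do not linger in the browser
        context_a.close()
        context_b.close()
```

Calling `contexts_are_isolated(browser)` inside the `with sync_playwright() as p:` block should return `True` for regular contexts.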
Saving and Restoring Session State
Save cookies and storage state to disk so you can resume sessions later without logging in again:
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    context = browser.new_context()
    page = context.new_page()

    # Log in
    page.goto("https://example.com/login")
    page.fill("#email", "user@example.com")
    page.fill("#password", "password")
    page.click("button[type='submit']")
    # Wait for the post-login requests to finish so the session cookies are set
    page.wait_for_load_state("networkidle")

    # Save session state (cookies + local storage)
    context.storage_state(path="session.json")
    browser.close()
Restore the session later:
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)

    # Create context with saved state
    context = browser.new_context(storage_state="session.json")
    page = context.new_page()

    # Navigate directly, already logged in
    page.goto("https://example.com/dashboard")
    print(page.title())  # Shows dashboard, not login page
    browser.close()
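Before reusing a saved file, it can help to check that it exists and that the cookie you depend on has not expired. The sketch below reads the JSON that `storage_state` writes, which holds a `cookies` list whose entries carry an `expires` Unix timestamp (`-1` for session cookies). The helper name `saved_session_is_usable` and the cookie name passed to it are assumptions for illustration.

```python
import json
import time
from pathlib import Path


def saved_session_is_usable(path, cookie_name, now=None):
    """Return True if the storage-state file holds an unexpired `cookie_name`."""
    state_file = Path(path)
    if not state_file.exists():
        return False
    state = json.loads(state_file.read_text())
    now = time.time() if now is None else now
    for cookie in state.get("cookies", []):
        if cookie.get("name") != cookie_name:
            continue
        expires = cookie.get("expires", -1)
        # -1 marks a session cookie, which carries no expiry time
        if expires == -1 or expires > now:
            return True
    return False
```

A scraper might log in again when this returns `False` and otherwise pass the file to `browser.new_context(storage_state=...)`. The check only reads expiry times; a server can still invalidate a session earlier.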
Cookie Management in Playwright
# `context` is a BrowserContext created with browser.new_context()

# Get all cookies
cookies = context.cookies()
for cookie in cookies:
    print(f"{cookie['name']}: {cookie['value']}")

# Add a cookie (each cookie needs either a url or both a domain and a path)
context.add_cookies([{
    "name": "session_token",
    "value": "abc123",
    "domain": "example.com",
    "path": "/"
}])

# Clear cookies
context.clear_cookies()
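Playwright rejects a cookie passed to `add_cookies` unless it carries either a `url` or both a `domain` and a `path`. A small builder can catch that mistake before the call. This is a sketch only; `make_cookie` is a hypothetical helper, not part of Playwright.

```python
def make_cookie(name, value, *, url=None, domain=None, path=None):
    """Build a cookie dict for context.add_cookies(), validating its scope."""
    if url is not None and (domain is not None or path is not None):
        raise ValueError("pass either url, or domain and path, not both")
    if url is None and (domain is None or path is None):
        raise ValueError("pass either url, or both domain and path")
    cookie = {"name": name, "value": value}
    if url is not None:
        cookie["url"] = url
    else:
        cookie["domain"] = domain
        cookie["path"] = path
    return cookie
```

With a context in hand, usage would look like `context.add_cookies([make_cookie("session_token", "abc123", domain="example.com", path="/")])`.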
Cookie Management in Selenium
import json

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://example.com")

# Get all cookies
cookies = driver.get_cookies()
for cookie in cookies:
    print(f"{cookie['name']}: {cookie['value']}")

# Add a cookie (the browser must already be on a page of the cookie's domain)
driver.add_cookie({
    "name": "session_token",
    "value": "abc123",
    "domain": "example.com"
})

# Save cookies for later
with open("cookies.json", "w") as f:
    json.dump(driver.get_cookies(), f)
driver.quit()

# Restore cookies in a new session
driver = webdriver.Chrome()
driver.get("https://example.com")  # Load the domain before adding its cookies
with open("cookies.json") as f:
    cookies = json.load(f)
for cookie in cookies:
    driver.add_cookie(cookie)
driver.refresh()  # Reload so the page is requested with the restored cookies
driver.quit()
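Saved cookies do not always load back cleanly. Some driver versions have been reported to reject an `expiry` that is not an integer, and extra keys can also cause errors. The sketch below shows one possible clean-up step before `add_cookie`; the names `clean_cookie` and `load_cookies` are hypothetical, and the key list is an assumption based on the fields `add_cookie` documents.

```python
import json

# Keys that Selenium's add_cookie() documents; anything else is dropped
ALLOWED_KEYS = {
    "name", "value", "path", "domain",
    "secure", "httpOnly", "expiry", "sameSite",
}


def clean_cookie(cookie):
    """Return a copy of `cookie` limited to known keys, with an integer expiry."""
    cleaned = {key: val for key, val in cookie.items() if key in ALLOWED_KEYS}
    if "expiry" in cleaned:
        cleaned["expiry"] = int(cleaned["expiry"])
    return cleaned


def load_cookies(driver, path):
    """Add every cookie stored in the JSON file at `path`; return the count.

    The driver must already be on a page of the cookies' domain.
    """
    with open(path) as f:
        cookies = json.load(f)
    for cookie in cookies:
        driver.add_cookie(clean_cookie(cookie))
    return len(cookies)
```

After `driver.get("https://example.com")`, a call such as `load_cookies(driver, "cookies.json")` followed by `driver.refresh()` would replace the manual restore loop.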
Context with Custom Configuration
context = browser.new_context(
    viewport={"width": 1920, "height": 1080},
    user_agent="Custom User Agent String",
    locale="en-US",
    timezone_id="America/New_York",
    geolocation={"latitude": 40.7128, "longitude": -74.0060},
    permissions=["geolocation"],
    extra_http_headers={"Accept-Language": "en-US,en;q=0.9"}
)
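When several contexts need different but internally consistent settings, the keyword arguments can be built by a function and unpacked into `new_context`. This is a minimal sketch that uses only options shown above; `context_options` is a hypothetical helper, and deriving the `Accept-Language` header from the locale is an assumption about what you want to keep in sync.

```python
def context_options(locale, timezone_id, width=1920, height=1080):
    """Return keyword arguments for browser.new_context() for one locale."""
    primary_language = locale.split("-")[0]
    return {
        "viewport": {"width": width, "height": height},
        "locale": locale,
        "timezone_id": timezone_id,
        # Keep the header consistent with the locale set above
        "extra_http_headers": {
            "Accept-Language": f"{locale},{primary_language};q=0.9"
        },
    }
```

For example, `browser.new_context(**context_options("en-US", "America/New_York"))` would create a context with a matching locale, timezone, and language header.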
When to Use Contexts vs Separate Browsers
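Use contexts when you need multiple isolated sessions that share one browser process; they start faster and use less memory than separate browsers. Use separate browsers when you need different browser types (for example Chromium and Firefox) or complete process isolation. Different proxy configurations are another common reason for separate browsers, although Playwright's new_context() also accepts a proxy option for a single context.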
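The context-based approach can be sketched as a helper that opens one context per saved session inside a single browser and closes them all afterwards. It assumes a Playwright `browser` and storage-state files saved earlier; the name `isolated_contexts` is made up for illustration.

```python
from contextlib import contextmanager


@contextmanager
def isolated_contexts(browser, state_files):
    """Yield one context per storage-state file, all in the same browser."""
    contexts = []
    try:
        for state_file in state_files:
            contexts.append(browser.new_context(storage_state=state_file))
        yield contexts
    finally:
        # Close every context that was opened, even if an error occurred
        for context in contexts:
            context.close()
```

Usage might look like `with isolated_contexts(browser, ["user_a.json", "user_b.json"]) as (ctx_a, ctx_b):`, where the two file names are placeholders for your own saved sessions.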
Next Steps
- Learn parallel browser scraping
- Explore Playwright in Python for large-scale scraping
- Set up Selenium Grid for distributed scraping