
Lesson 2.5 · Intermediate · 5 min read

Browser, Context, Page: The Mental Model

Three nested objects define every Playwright script. Get the relationship right and concurrency, isolation, and sessions all become obvious.

What you’ll learn

  • Define Browser, BrowserContext, and Page and the relationship between them.
  • Choose the right level of object reuse for a given scraper workload.
  • Configure user agent, viewport, locale, and timezone via BrowserContext options.
  • Avoid the two most common isolation mistakes: cross-context cookies and shared state.

Every Playwright script involves three nested objects: Browser → BrowserContext → Page. Until you internalise the relationship between them, you'll either leak state across requests or burn resources spinning up a fresh browser for every URL. Two failure modes, one mental model that fixes both.

The three objects

| Object | Real-world analogue | Cost to create | Isolation |
|---|---|---|---|
| Browser | The Chromium process itself | Slow (~1 s, several hundred MB) | One per OS process |
| BrowserContext | An incognito profile inside the browser | Fast (~10 ms) | Fully isolated cookies, storage, cache |
| Page | A single tab inside that profile | Fast (~10 ms) | Shares state with its context |

The relationship is one-to-many at every level:

Browser (1)
├── Context A (cookies, localStorage, cache for site 1)
│   ├── Page 1
│   └── Page 2
└── Context B (cookies, localStorage, cache for site 2)
    ├── Page 3
    └── Page 4

Two pages inside the same context share cookies, like two tabs in normal browsing. Two pages in different contexts are isolated, like two separate incognito windows.

When to share what

The right choice depends on your workload:

| Scenario | Reuse what |
|---|---|
| Scraping 1000 URLs from one site | One browser, one context, many pages (or one page reused) |
| Scraping 10 sites in parallel | One browser, one context per site, many pages per context |
| Multi-account scraping | One context per account (so cookies stay separate) |
| Throwaway test of a single page | Whole new browser each time is fine |
| Concurrent crawl with proxy rotation | One context per proxy (proxy is set at context level) |

The pattern people get wrong most often: launching a new browser per page. That's a 1-second penalty per URL, gone for no reason. Reuse the browser; create cheap contexts/pages as needed.

Creating contexts explicitly

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()

    # Explicit context with options
    context = browser.new_context(
        user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 ...",
        viewport={"width": 1280, "height": 800},
        locale="en-US",
        timezone_id="America/New_York",
    )

    page = context.new_page()
    page.goto("https://practice.scrapingcentral.com/products")
    print(page.title())

    context.close()
    browser.close()

browser.new_context() is where you configure session-level options. The defaults are reasonable but rarely right for scraping: you almost always want to override at least the user agent.

When you skip new_context() and call browser.new_page() directly, Playwright creates an implicit default context. That works for one-off scripts but hides the isolation boundary you'll need later.

Context-level options that matter for scraping

| Option | What it does |
|---|---|
| user_agent | The UA string. Playwright's default exposes "HeadlessChrome"; change it. |
| viewport | Window size in pixels. Affects responsive layouts and screenshots. |
| locale | Browser language (Accept-Language header). |
| timezone_id | IANA timezone (affects Date() and timezone-aware UI). |
| geolocation | Mock GPS coords. Used with permissions=["geolocation"]. |
| proxy | Per-context proxy (overrides browser-level). |
| storage_state | Pre-load cookies + localStorage from a previous session (Lesson 2.25). |
| bypass_csp | Disable Content Security Policy enforcement. Useful when injecting scripts. |
| extra_http_headers | Custom headers on every request from this context. |

The killer feature is storage_state: log in once, save the cookies, reuse them in every subsequent context. That's Lesson 2.25's whole topic.

Multiple pages, one context

You can open several pages in one context. They share cookies, localStorage, and cache, like opening tabs in normal browsing:

context = browser.new_context()
page_listings = context.new_page()
page_listings.goto("https://practice.scrapingcentral.com/products")

# Click into a product but keep the listings page available
page_detail = context.new_page()
page_detail.goto("https://practice.scrapingcentral.com/products/1-white-wooden-vase")

When does this help? When you need to keep one page warm (logged in, on a search results page) while spawning detail pages off it. Or when an interaction on one tab triggers a popup that opens in a new tab; you handle that via context.on("page", handler).

Multiple contexts, one browser

The pattern for scraping multiple sites or multiple accounts simultaneously:

with sync_playwright() as p:
    browser = p.chromium.launch()

    contexts = [
        browser.new_context(proxy={"server": "http://us-proxy:8080"}),
        browser.new_context(proxy={"server": "http://uk-proxy:8080"}),
        browser.new_context(proxy={"server": "http://de-proxy:8080"}),
    ]

    for ctx in contexts:
        page = ctx.new_page()
        page.goto("https://practice.scrapingcentral.com/")
        print(ctx, page.locator("h1").first.inner_text())
        ctx.close()

    browser.close()

Three isolated sessions through three proxies, sharing one browser process. Because the proxy option is set per context, there is no need to launch a separate browser per proxy.

The two isolation mistakes

Mistake 1: cookies leaking across "different" scrapes. You scrape Site A, then scrape Site B from the same context. The Set-Cookie headers from A are sent to B, polluting the session and sometimes triggering anti-bot rules. Fix: one context per site.

Mistake 2: tracking state across contexts. You try to share a logged-in session by passing cookies between contexts manually. Fix: use storage_state to serialize/deserialize cleanly (Lesson 2.25).

Concurrency pattern preview

Lesson 2.26 covers browser pools in depth, but the shape is:

async with async_playwright() as p:
    browser = await p.chromium.launch()
    tasks = []
    for url in urls:
        ctx = await browser.new_context()
        tasks.append(scrape(ctx, url))
    await asyncio.gather(*tasks)
    await browser.close()

One browser, N contexts, each context's lifetime spans one URL. Cheaper than N browsers; isolation cleaner than N pages on one context.

Hands-on lab

Practice this lesson on Catalog108, our first-party scraping sandbox. Open /products and /products?page=2 from the same context, then again from two different contexts. Inspect context.cookies() after both runs. You should see that same-context cookies persist while cross-context cookies are isolated. That's the entire mental model in one experiment.
