Browser, Context, Page: The Mental Model
Three nested objects define every Playwright script. Get the relationship right and concurrency, isolation, and sessions all become obvious.
What you’ll learn
- Define Browser, BrowserContext, and Page and the relationship between them.
- Choose the right level of object reuse for a given scraper workload.
- Configure user agent, viewport, locale, and timezone via BrowserContext options.
- Avoid the two most common isolation mistakes: cross-context cookies and shared state.
Every Playwright script involves three nested objects: Browser → BrowserContext → Page. Until you internalise the relationship between them, you'll either leak state across requests or burn resources spinning up a fresh browser for every URL. Two failure modes, one mental model that fixes both.
The three objects
| Object | Real-world analogue | Cost to create | Isolation |
|---|---|---|---|
| Browser | The Chromium process itself | Slow (~1 s, several hundred MB) | One per OS process |
| BrowserContext | An incognito profile inside the browser | Fast (~10 ms) | Fully isolated cookies, storage, cache |
| Page | A single tab inside that profile | Fast (~10 ms) | Shares state with its context |
The relationship is one-to-many at every level:
```
Browser (1)
├── Context A  (cookies, localStorage, cache for site 1)
│   ├── Page 1
│   └── Page 2
└── Context B  (cookies, localStorage, cache for site 2)
    ├── Page 3
    └── Page 4
```
Two pages inside the same context share cookies, like two tabs in normal browsing. Two pages in different contexts are isolated, like two separate incognito windows.
When to share what
The right choice depends on your workload:
| Scenario | Reuse what |
|---|---|
| Scraping 1000 URLs from one site | One browser, one context, many pages (or one page reused) |
| Scraping 10 sites in parallel | One browser, one context *per site*, many pages per context |
| Multi-account scraping | One context per account (so cookies stay separate) |
| Throwaway test of a single page | Whole new browser each time is fine |
| Concurrent crawl with proxy rotation | One context per proxy (proxy is set at context level) |
The pattern people get wrong most often: launching a new browser per page. That's a 1-second penalty per URL, gone for no reason. Reuse the browser; create cheap contexts/pages as needed.
Creating contexts explicitly
```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()

    # Explicit context with options
    context = browser.new_context(
        user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 ...",
        viewport={"width": 1280, "height": 800},
        locale="en-US",
        timezone_id="America/New_York",
    )

    page = context.new_page()
    page.goto("https://practice.scrapingcentral.com/products")
    print(page.title())

    context.close()
    browser.close()
```
`browser.new_context()` is where you configure session-level options. The defaults are reasonable but rarely right for scraping: you almost always want to override at least the user agent.
When you skip `new_context()` and call `browser.new_page()` directly, Playwright creates an implicit default context. That works for one-off scripts but hides the isolation boundary you'll need later.
Context-level options that matter for scraping
| Option | What it does |
|---|---|
| `user_agent` | The UA string. Playwright's default exposes "HeadlessChrome"; change it. |
| `viewport` | Window size in pixels. Affects responsive layouts and screenshots. |
| `locale` | Browser language (`Accept-Language` header). |
| `timezone_id` | IANA timezone (affects `Date()` and timezone-aware UI). |
| `geolocation` | Mock GPS coords. Used with `permissions=["geolocation"]`. |
| `proxy` | Per-context proxy (overrides browser-level). |
| `storage_state` | Pre-load cookies + localStorage from a previous session (Lesson 2.25). |
| `bypass_csp` | Disable Content Security Policy enforcement. Useful when injecting scripts. |
| `extra_http_headers` | Custom headers on every request from this context. |
The killer feature is `storage_state`: log in once, save the cookies, reuse them in every subsequent context. That's Lesson 2.25's whole topic.
Multiple pages, one context
You can open several pages in one context. They share cookies, localStorage, and cache, like opening tabs in normal browsing:
```python
context = browser.new_context()

page_listings = context.new_page()
page_listings.goto("https://practice.scrapingcentral.com/products")

# Click into a product but keep the listings page available
page_detail = context.new_page()
page_detail.goto("https://practice.scrapingcentral.com/products/1-white-wooden-vase")
```
When does this help? When you need to keep one page warm (logged in, sitting on a search results page) while spawning detail pages off it. It also matters when an interaction on one tab triggers a popup that opens in a new tab; you handle that via `context.on("page", handler)`.
Multiple contexts, one browser
The pattern for scraping multiple sites or multiple accounts simultaneously:
```python
with sync_playwright() as p:
    browser = p.chromium.launch()

    contexts = [
        browser.new_context(proxy={"server": "http://us-proxy:8080"}),
        browser.new_context(proxy={"server": "http://uk-proxy:8080"}),
        browser.new_context(proxy={"server": "http://de-proxy:8080"}),
    ]

    for ctx in contexts:
        page = ctx.new_page()
        page.goto("https://practice.scrapingcentral.com/")
        print(ctx, page.locator("h1").first.inner_text())
        ctx.close()

    browser.close()
```
Three isolated sessions through three proxies, sharing one browser process. Early Playwright releases made per-context proxies awkward (Chromium required a browser-level proxy placeholder); today you just pass `proxy` to `new_context()`.
The two isolation mistakes
Mistake 1: state leaking across "different" scrapes. You scrape Site A, then scrape Site B from the same context. A's cookies, cache, and any third-party tracker cookies persist in the shared jar and carry into the B session, polluting it and sometimes triggering anti-bot rules. Fix: one context per site.
Mistake 2: shuttling state across contexts by hand. You try to share a logged-in session by copying cookies between contexts manually, and miss localStorage or mangle the domain/path fields. Fix: use `storage_state` to serialize/deserialize the whole session cleanly (Lesson 2.25).
Concurrency pattern preview
Lesson 2.26 covers browser pools in depth, but the shape is:
```python
import asyncio
from playwright.async_api import async_playwright

async def scrape(ctx, url):
    page = await ctx.new_page()
    await page.goto(url)
    # ... extract data ...
    await ctx.close()

async def main(urls):
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        tasks = [scrape(await browser.new_context(), url) for url in urls]
        await asyncio.gather(*tasks)
        await browser.close()
```
One browser, N contexts, each context's lifetime spans one URL. Cheaper than N browsers; isolation cleaner than N pages on one context.
Hands-on lab
Open /products and /products?page=2 from the same context, then again from two different contexts. Inspect context.cookies() after both runs. You should see that same-context cookies persist while cross-context cookies are isolated. That's the entire mental model in one experiment.
Practice this lesson on Catalog108, our first-party scraping sandbox.
Open lab target → /products

Quiz: check your understanding
Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.