

Waiting Strategies (The Make-or-Break Skill)

Time-based sleeps are the #1 cause of flaky scrapers. Replace them with the five deterministic wait primitives Playwright provides.

What you’ll learn

  • List the five Playwright wait primitives and when each is appropriate.
  • Never write `time.sleep()` in a scraper again.
  • Synchronise on the right signal: DOM state, network response, or custom function.
  • Tune timeouts to fail fast and retry, instead of waiting 30 seconds for nothing.

If you only learn one skill from this sub-path, learn this: never use time.sleep() in a scraper. Sleeps are guesses, and guesses are the root cause of every flaky scraper in production. Playwright gives you five deterministic wait primitives; once you know which to reach for, your scrapers stop breaking on slow networks.

Why time.sleep is wrong

page.click("button.load-more")
time.sleep(2)  # hope it's loaded
data = page.locator(".item").all()

Three problems:

  1. Too short. On a slow network, 2 seconds isn't enough. You get partial data and don't know.
  2. Too long. On a fast network, you waste 1.9 seconds per click. Over thousands of pages, that's hours.
  3. Wrong signal. Time doesn't tell you whether the content arrived. You're synchronising on the wall clock, not on the actual event.
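
The same interaction, rewritten without the sleep. A sketch only; the selectors and the target count are illustrative:

```python
def load_more_and_collect(page, expected_count=20):
    """Click 'load more', then synchronise on the DOM instead of the clock."""
    page.click("button.load-more")
    # Returns the moment the nth item is attached -- no guessed sleep,
    # no wasted time on fast networks, no partial data on slow ones.
    page.wait_for_selector(f".item:nth-of-type({expected_count})")
    return page.locator(".item").all()
```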

Every Playwright wait method waits for something specific and returns as soon as it happens. That's the whole game.

The five wait primitives

| Primitive | Synchronises on | Use when |
| --- | --- | --- |
| Auto-wait inside actions | Element actionability | Almost always; built into `click`, `fill`, etc. |
| `wait_for_selector` | DOM state of a selector | Element should appear/disappear/change state |
| `wait_for_load_state` | Page lifecycle event | Cross-cutting page-level event |
| `expect_response` / `expect_request` | Network event matching a URL | Data arrives via XHR/Fetch |
| `wait_for_function` | Arbitrary JS predicate | Custom condition: count of items, value of a variable |

Together they cover every wait you'll need.

wait_for_selector

The most common after auto-wait. Wait for an element to reach a specific state:

page.wait_for_selector(".product-card", state="visible", timeout=10000)
page.wait_for_selector(".spinner", state="hidden")  # wait until spinner is gone
page.wait_for_selector(".error-banner", state="attached")  # exists in DOM, may not be visible
page.wait_for_selector(".tooltip", state="detached")  # removed from DOM

States:

  • attached, in the DOM.
  • detached, not in the DOM.
  • visible, in the DOM and visible (the default).
  • hidden, either not in the DOM, or in the DOM but not visible.

state="hidden" is gold for spinner-watching: wait until the loader is gone, then read.
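
A spinner-watching helper, sketched under the assumption that the page shows a `.spinner` element while loading (both selector names are illustrative):

```python
def read_after_spinner(page, spinner=".spinner", content=".product-card"):
    # Wait for the loader to disappear, then for the content to be visible,
    # then read -- two cheap DOM waits instead of one guessed sleep.
    page.wait_for_selector(spinner, state="hidden")
    page.wait_for_selector(content, state="visible")
    return page.locator(content).all_inner_texts()
```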

wait_for_load_state

Page-lifecycle synchronisation:

page.wait_for_load_state("domcontentloaded")
page.wait_for_load_state("load")
page.wait_for_load_state("networkidle")

  • domcontentloaded, HTML parsed; subresources may still be loading.
  • load, load event fired; main subresources loaded.
  • networkidle, no network activity for at least 500 ms.

networkidle is convenient but flaky on sites with analytics beacons, long polling, WebSockets, or other live-update streams: the network may never go idle. Prefer domcontentloaded plus a specific selector wait.
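
That preferred pattern, as a sketch (the ready selector is whatever element proves your data actually arrived; `.product-card` is illustrative):

```python
def goto_and_settle(page, url, ready_selector=".product-card"):
    # "domcontentloaded" fires as soon as the HTML is parsed; the selector
    # wait then pins the one element that proves the data arrived.
    page.goto(url, wait_until="domcontentloaded")
    page.wait_for_selector(ready_selector, state="visible")
```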

expect_response

The most powerful wait when data arrives via XHR/Fetch:

with page.expect_response("**/api/products*") as resp_info:
  page.click("text=Load more")
response = resp_info.value
data = response.json()

You declare what you're waiting for (a response URL pattern) and Playwright captures it. The with block runs your trigger action; the response is available after exit.

Variants:

# Match by predicate function
with page.expect_response(lambda r: r.url.endswith("/products") and r.status == 200):
  page.click("text=Load more")

# Wait for a request even if no response yet
with page.expect_request("**/api/checkout"):
  page.click("text=Buy")

Use this whenever a UI action triggers a known API call. It's deterministic: the wait returns the instant the response arrives, with the response payload in hand.
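
Put together, a pagination loop driven by expect_response might look like this sketch. The `/api/products` URL and the `products`/`has_next` response shape are assumptions about the target site:

```python
def scrape_all_pages(page, max_pages=50):
    items = []
    for _ in range(max_pages):
        # Declare the wait first, then trigger it -- the response can't be missed.
        with page.expect_response(lambda r: "/api/products" in r.url) as resp_info:
            page.click("text=Load more")
        payload = resp_info.value.json()
        items.extend(payload["products"])
        if not payload.get("has_next"):
            break
    return items
```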

wait_for_function

Custom predicates that run in the browser:

page.wait_for_function("() => document.querySelectorAll('.product-card').length >= 24")
page.wait_for_function("() => window.__APP_READY__ === true")
page.wait_for_function("(target) => document.title === target", arg="Catalog108 – Products")

wait_for_function polls the predicate inside the browser context. Returns when truthy. Useful for:

  • Waiting on a global state flag the app sets.
  • Waiting for N items to load (not just one).
  • Waiting on conditions that don't map cleanly to a single selector.

Race patterns

Sometimes you don't know which signal will fire first: success or error.

import asyncio

async def main():
    page = ...
    await page.click("button.submit")
    # ".success, .error" matches whichever element appears first
    result = await page.wait_for_selector(".success, .error")
    if "success" in (await result.get_attribute("class") or ""):
        print("ok")
    else:
        print("failed")

A comma in a CSS selector is an OR: wait_for_selector returns when either element appears. Cheaper than racing two wait_for calls in parallel.
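
If you prefer locators over raw selectors, Playwright's Locator.or_() expresses the same race. A sketch; the selectors are illustrative:

```python
def wait_for_outcome(page, timeout=8000):
    # or_() builds a locator matching whichever of the two appears first.
    outcome = page.locator(".success").or_(page.locator(".error"))
    outcome.wait_for(state="visible", timeout=timeout)
    return "ok" if page.locator(".success").is_visible() else "failed"
```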

Tuning timeouts: fail fast

Default timeout is 30 seconds. For most scraping, that's too long:

context.set_default_timeout(8000)  # 8s for everything
context.set_default_navigation_timeout(15000)

A 30-second wait usually means the scrape is broken; slow networks rarely take that long. Set 5–10s, fail fast, retry the whole page. Better than hanging on a 28-second timeout.
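
A minimal retry wrapper around that idea. In real code you'd catch playwright.sync_api.TimeoutError specifically; a plain Exception is used here to keep the sketch dependency-free:

```python
def scrape_with_retries(scrape_once, attempts=3):
    """Fail fast and retry the whole page: 3 x 8s beats one 28s hang."""
    last_error = None
    for _ in range(attempts):
        try:
            return scrape_once()
        except Exception as exc:  # real code: playwright's TimeoutError
            last_error = exc
    raise last_error
```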

Per-call overrides for legitimate slow cases:

page.wait_for_selector(".big-export", timeout=60000)  # this one is genuinely slow

The pattern: act → wait → assert

Every interaction follows this shape:

page.click("button.load-more")
page.wait_for_selector(".product-card:nth-of-type(48)")
count = page.locator(".product-card").count()
assert count == 48

  1. Act. Click, fill, navigate.
  2. Wait. Synchronise on the specific change you expect.
  3. Assert. Verify the change actually happened.

The assert is non-optional. Otherwise a silent failure (wrong selector, network timeout swallowed somewhere) goes unnoticed and your scrape ships bad data.

A small but important warning

page.wait_for_timeout(ms) does exist in the API. It is a literal setTimeout. Don't use it. It's there for debugging only. If you find yourself reaching for it, ask which actual signal you should be waiting for instead.

Hands-on lab

Open /challenges/dynamic/auto-typed/animated. The page types text character-by-character via JS. Write three versions of a scraper that captures the final text: (1) with time.sleep(5), (2) with wait_for_function checking the text length, (3) with wait_for_selector watching for a "done" indicator. Time all three. The function-based version should win on both speed and reliability.

Practice this lesson on Catalog108, our first-party scraping sandbox: /challenges/dynamic/auto-typed/animated

Quiz: check your understanding

Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.

Question 1 of 8

What is the main reason `time.sleep(2)` should NEVER appear in a production scraper?
