Locators: The Auto-Waiting Magic
Locators are Playwright's most important abstraction. They auto-wait, auto-retry, and eliminate most of the timing bugs that plague Selenium.
What you’ll learn
- Explain what a locator is and how it differs from a one-shot element handle.
- List the conditions a locator auto-waits for before acting (attached, visible, enabled, stable).
- Use chained locators to scope queries reliably.
- Recognise when auto-waiting is NOT enough, and which explicit waits to add.
The single biggest reason Playwright eats Selenium for lunch is the locator. A locator isn't a reference to a DOM element; it's a lazy query that re-runs every time you act on it, with automatic waiting baked in. Once you internalise this, most of your timing bugs disappear.
Locator vs ElementHandle
Selenium has find_element(). It returns a live reference to a DOM node as it existed at that moment. If the React framework re-renders five milliseconds later, your reference is stale and throws StaleElementReferenceException. You spend half your Selenium career writing retry loops around this.
Playwright's locator(selector) returns no element. It returns a description of how to find an element. Every action on a locator re-queries the DOM, so re-renders don't break you:
button = page.locator("button.submit")
button.click() # queries the DOM right now, finds the button, clicks
# ...React tears down and rebuilds the button...
button.click() # re-queries from scratch, finds the new button, clicks
There's still an ElementHandle API (page.query_selector(...)) for when you need a frozen reference, for example, to use the same node across multiple frames of an animation. But for 95% of scraping work, locators are what you want.
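To make the distinction concrete, here is a toy model in plain Python. It is not how Playwright is implemented; `FakeDom`, `FrozenHandle`, and `LazyLocator` are hypothetical names invented for this illustration. The only point is that a stored node goes stale after a re-render, while a stored query does not.

```python
class FakeDom:
    """A stand-in for a page: maps selectors to node objects."""

    def __init__(self):
        self.nodes = {}

    def render(self, selector):
        # Every render creates a brand-new node object, like a framework
        # tearing down and rebuilding a component.
        self.nodes[selector] = object()

    def query(self, selector):
        return self.nodes[selector]


class FrozenHandle:
    """Handle-style: resolve once and keep the node."""

    def __init__(self, dom, selector):
        self.dom = dom
        self.selector = selector
        self.node = dom.query(selector)

    def is_stale(self):
        # Stale means the DOM no longer holds the node we captured.
        return self.dom.query(self.selector) is not self.node


class LazyLocator:
    """Locator-style: keep only the query and resolve on every use."""

    def __init__(self, dom, selector):
        self.dom = dom
        self.selector = selector

    def resolve(self):
        return self.dom.query(self.selector)


dom = FakeDom()
dom.render("button.submit")
handle = FrozenHandle(dom, "button.submit")
locator = LazyLocator(dom, "button.submit")

dom.render("button.submit")  # simulate a re-render

handle_is_stale = handle.is_stale()
locator_sees_current = locator.resolve() is dom.query("button.submit")
print(handle_is_stale, locator_sees_current)
```

The frozen handle still points at the old node, while the locator finds whatever node the query matches at the moment it is used.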
What auto-waiting actually waits for
When you call locator.click(), Playwright waits, by default up to 30 seconds, for the element to satisfy actionability checks:
| Check | Meaning |
|---|---|
| Attached | The node exists in the DOM. |
| Visible | Non-empty bounding box, not display:none, not visibility:hidden. |
| Stable | Hasn't moved or resized in the last two animation frames. |
| Receives events | No other element on top intercepting clicks. |
| Enabled | Not disabled (for form controls). |
Only after all five are true does the click actually fire. This eliminates 90% of timing bugs without any explicit wait. No more time.sleep(2). No more "click failed because the button was still animating."
Different actions wait for different subsets:
- click() / dblclick(): all five checks.
- fill(): attached + visible + enabled + editable.
- inner_text() / text_content(): attached only (no visibility check).
- is_visible(): no waiting at all; it returns the current state immediately.
- is_enabled(): attached only; returns the current state.
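Conceptually, auto-waiting is a poll-until-deadline loop over a set of checks. The sketch below is a simplified model in plain Python, not Playwright's actual implementation; `wait_until_actionable`, the check names, and `fake_element_state` are all hypothetical.

```python
import time


def wait_until_actionable(get_state, required, timeout=30.0, interval=0.05):
    """Poll get_state() until every check named in `required` is true.

    get_state returns a dict such as {"attached": True, "visible": False}.
    Raises TimeoutError naming the checks that were still failing.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        failing = [name for name in required if not state.get(name, False)]
        if not failing:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still failing after {timeout}s: {failing}")
        time.sleep(interval)


# Different actions require different subsets of checks.
CLICK_CHECKS = ("attached", "visible", "stable", "receives_events", "enabled")
FILL_CHECKS = ("attached", "visible", "enabled", "editable")

# Simulated element that only becomes clickable on the third poll.
polls = {"count": 0}


def fake_element_state():
    polls["count"] += 1
    ready = polls["count"] >= 3
    return {
        "attached": True,
        "visible": ready,
        "stable": ready,
        "receives_events": ready,
        "enabled": True,
        "editable": True,
    }


wait_until_actionable(fake_element_state, CLICK_CHECKS, timeout=2.0, interval=0.01)
polls_needed = polls["count"]
```

The action only proceeds once the list of failing checks is empty; a shorter `required` tuple simply means fewer things to wait for.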
Chained locators
Locators chain. Each call narrows the scope:
# Find the .product-card containing the text "Mug", then click its .add-to-cart button
page.locator(".product-card").filter(has_text="Mug").locator(".add-to-cart").click()
This reads like English. You can build deeply scoped selectors without writing fragile compound CSS.
Useful chaining methods:
| Method | What it does |
|---|---|
| .locator(selector) | Scope further: find selector within the current match. |
| .filter(has_text="...") | Keep only matches whose text content contains the string. |
| .filter(has=other_locator) | Keep only matches that contain other_locator. |
| .first / .last / .nth(i) | Pick one of N matches. |
| .all() | Materialise the matches into a list of locators. |
filter is especially useful. The pattern "the row where the name column says X, click the delete button in that row" is two locators and a .filter().
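The row pattern might be written as below. The table markup (`tr` rows, a `td.name` cell, a `button.delete`) is an assumption made for illustration, and `delete_row` is a hypothetical helper; adapt the selectors to the page you are scraping.

```python
def delete_row(page, name):
    """Click the delete button in the table row whose name cell contains `name`.

    `page` is a Playwright Page (sync API). The selectors are illustrative.
    """
    # Locator 1: the name cell. has_text is a substring match.
    name_cell = page.locator("td.name", has_text=name)
    # The filter: keep only rows that contain such a cell.
    row = page.locator("tr").filter(has=name_cell)
    # Locator 2: the delete button, scoped to the surviving row.
    row.locator("button.delete").click()
```

Because the button locator is scoped to the filtered row, delete buttons in other rows are never candidates, so the click cannot land on the wrong record.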
Built-in selectors
Beyond raw CSS and XPath, Playwright understands a handful of high-level matchers:
page.get_by_role("button", name="Submit") # ARIA role + accessible name
page.get_by_text("Add to cart") # visible text content
page.get_by_label("Email") # form field by its label
page.get_by_placeholder("Search...") # input placeholder
page.get_by_alt_text("Logo") # image alt text
page.get_by_title("Help") # element title attribute
page.get_by_test_id("submit-btn") # data-testid attribute
These are the recommended default selectors for stability. They survive cosmetic redesigns: a class name might change, but the ARIA role of a "Submit" button doesn't. We dig into the full strategy in Lesson 2.7.
Auto-waiting is NOT enough
Don't rely on auto-wait as your only synchronization. Three cases need explicit waits:
1. Waiting for content that fills in after the element already exists.
page.goto("https://practice.scrapingcentral.com/challenges/dynamic/lazy-images")
page.wait_for_selector(".product-card img[src]:not([src^='data:'])")
The <img> tags exist immediately (auto-wait would succeed), but their src is a placeholder until the lazy-load fires. You need to wait for the attribute condition, not the element.
2. Waiting for a network response.
with page.expect_response("**/api/products*") as resp:
    page.click("text=Load more")
response = resp.value
data = response.json()
Auto-wait doesn't know about your custom API call; expressing the wait in terms of the network is more reliable than guessing about DOM updates.
3. Waiting for a function to return true.
page.wait_for_function("() => document.querySelectorAll('.product-card').length >= 20")
The DOM-side equivalent of a custom condition. Useful for "wait until at least 20 products have loaded."
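The network wait and the function wait can be combined. The following is a sketch rather than a drop-in solution: `load_until` is a hypothetical helper, and it assumes the page has a "Load more" control that calls an endpoint matching `**/api/products*` and appends `.product-card` elements.

```python
def load_until(page, minimum, max_clicks=10):
    """Click "Load more" until at least `minimum` product cards exist.

    `page` is a Playwright Page (sync API). Returns the final card count.
    Stops after `max_clicks` clicks even if the minimum was not reached.
    """
    cards = page.locator(".product-card")
    for _ in range(max_clicks):
        before = cards.count()  # count() reads the current state, no waiting
        if before >= minimum:
            break
        # Network wait: the block exits once a matching response has arrived.
        with page.expect_response("**/api/products*"):
            page.get_by_text("Load more").click()
        # DOM wait: the response arriving does not mean the cards are rendered yet.
        # This times out if the response added no new cards.
        page.wait_for_function(
            "n => document.querySelectorAll('.product-card').length > n",
            arg=before,
        )
    return cards.count()
```

Waiting on the response first and the DOM second keeps each wait tied to the event it actually depends on.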
Timeouts: be specific
Default timeout is 30 seconds. Override per-call:
page.locator(".product-card").first.click(timeout=5000)
page.wait_for_selector(".loaded", timeout=10000, state="visible")
Or per-context:
context.set_default_timeout(10000) # 10s for everything in this context
Production scrapers should set this lower than 30s. A 30-second wait usually means something is broken, and you want to fail fast and retry cleanly.
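One way to "retry cleanly" is a small wrapper around the action. This is a minimal sketch: `with_retries` is a hypothetical helper, not part of Playwright, and the demo uses Python's built-in `TimeoutError` so it runs without a browser. With Playwright you would pass its own `TimeoutError` class, as shown in the trailing comment.

```python
import time


def with_retries(action, retry_on, attempts=3, backoff=1.0):
    """Run action(); if it raises `retry_on`, wait and try again.

    Re-raises the last exception once `attempts` tries are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                raise
            time.sleep(backoff * attempt)  # linear backoff between tries


# Demo with a simulated flaky action that succeeds on the third call.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


result = with_retries(flaky, retry_on=TimeoutError, attempts=3, backoff=0.01)

# With Playwright (requires the playwright package and an open page):
#   from playwright.sync_api import TimeoutError as PlaywrightTimeoutError
#   with_retries(
#       lambda: page.locator(".product-card").first.click(timeout=5000),
#       retry_on=PlaywrightTimeoutError,
#   )
```

A short per-call timeout combined with a bounded number of retries caps the worst-case wait while still absorbing a transient slow response.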
The "strict mode" violation
Playwright errors loudly if a locator matches multiple elements when you expected one:
Error: locator.click: strict mode violation:
page.locator(".btn") resolved to 3 elements
This is a feature, not a bug. It catches selectors that are accidentally too broad. Fix it by adding more specificity (.first, a filter, a closer container), never by suppressing the error.
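The three fixes could look like the sketch below. The `.checkout-form` container and the "Pay now" button text are assumptions invented for illustration, as are the helper names; each function returns a locator for a Playwright Page (sync API) without acting on it.

```python
def narrow_by_container(page):
    # Scope to a closer container so only that form's buttons are candidates.
    return page.locator(".checkout-form").locator(".btn")


def narrow_by_filter(page):
    # Keep only the buttons whose text contains "Pay now".
    return page.locator(".btn").filter(has_text="Pay now")


def narrow_by_position(page):
    # Last resort: pick by position. This breaks if the order of buttons changes.
    return page.locator(".btn").first
```

Container scoping and filtering describe which element you mean, so they tend to survive layout changes better than picking by position.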
Hands-on lab
Open /challenges/dynamic/lazy-images. Write a script that opens the page and counts how many product images have a non-placeholder src. Compare two approaches: (a) call count() immediately after goto(), and (b) use wait_for_selector(".product-card img:not([src^='data:'])") first. You should see (a) return zero or a small number and (b) return all of them. That's auto-waiting hitting its limit: the image element exists immediately, but its real src isn't set until the lazy-load fires.
Practice this lesson on Catalog108, our first-party scraping sandbox.