Sub-path 3 of 6
Dynamic Web & Browser Automation
When static fails, drive a real browser.
Covers JS-rendered sites, SPAs, infinite scroll, modals, iframes, and Shadow DOM. Playwright is the main tool, with Selenium and Puppeteer covered for completeness. Each lesson runs against the dynamic challenges at Catalog108.
~4 weeks part-time · 30 lessons
Lessons
- 2.1 intermediate
Detecting JS-Rendered Content (Three-Test Diagnostic)
Before reaching for a headless browser, run three fast tests to confirm the page actually needs one. Most pages don't.
Lab:
/challenges/dynamic/spa-pure
- 2.2 intermediate
Client-Side Rendering vs SSR vs Hybrid
The architectural difference between rendering modes determines whether you scrape with curl, parse a hydration payload, or drive a browser.
Lab:
/products
- 2.3 intermediate
When Browser Automation Is the Right Tool (And When It Isn't)
A decision framework for choosing between HTTP scraping, API hunting, and headless browsers, with the honest trade-offs.
- 2.4 intermediate
Playwright Install + First Script (Python)
Install Playwright, drive a real browser, screenshot a page, and extract text: the minimum viable browser-automation pipeline.
Lab:
/
- 2.5 intermediate
Browser, Context, Page: The Mental Model
Three nested objects define every Playwright script. Get the relationship right and concurrency, isolation, and sessions all become obvious.
Lab:
/products
- 2.6 intermediate
Locators: The Auto-Waiting Magic
Locators are Playwright's most important abstraction. They auto-wait, auto-retry, and eliminate the entire category of timing bugs that plague Selenium.
Lab:
/challenges/dynamic/lazy-images
- 2.7 intermediate
Locator Strategies: CSS, XPath, Role, Text, Test-ID
Choosing the right selector type is the single biggest factor in scraper stability. A clear hierarchy of which to prefer, when, and why.
Lab:
/products/1-white-wooden-vase
- 2.8 intermediate
Actions: click, fill, hover, type, drag
The verbs of browser automation. Each action has subtle options that change behaviour; knowing them is the difference between flaky and rock-solid scrapers.
Lab:
/challenges/dynamic/click-required/reveal
- 2.9 intermediate
Waiting Strategies (The Make-or-Break Skill)
Time-based sleeps are the #1 cause of flaky scrapers. Replace them with the four deterministic wait primitives Playwright provides.
Lab:
/challenges/dynamic/auto-typed/animated
- 2.10 intermediate
Playwright in Node: Why You'd Choose It
Playwright's Node API is the original, the fastest-evolving, and the natural fit when the target site is itself JavaScript-heavy. Same concepts, async-first.
Lab:
/challenges/dynamic/spa-routed
- 2.11 intermediate
Async/Await Patterns in Node Scrapers
Async is the model, but the wrong patterns leak browsers, deadlock on shared state, and silently swallow errors. Four idioms to internalise.
Lab:
/challenges/dynamic/heavy-dom/10k-items
- 2.12 intermediate
Symfony Panther: Playwright/ChromeDriver for PHP
PHP's first-class browser automation library. A workflow similar to Playwright's, built on the WebDriver protocol (ChromeDriver or geckodriver), with friendly Symfony integration and real browser control.
Lab:
/challenges/dynamic/spa-pure
- 2.13 intermediate
Building a Headless Scraper as a Symfony Console Command
Wrap Panther in a Symfony Console command for cron-friendly, configurable, observable PHP scrapers.
Lab:
/challenges/dynamic/infinite-scroll/button-jsappend
- 2.14 intermediate
Selenium in Python (Legacy but Still Common)
Selenium predates Playwright by more than a decade and still dominates legacy codebases. Know it well enough to read and port it, and to inherit a Selenium scraper without fear.
Lab:
/challenges/dynamic/date-picker/custom
- 2.15 intermediate
Selenium in PHP via `php-webdriver/webdriver`
Selenium for PHP. Maintained, W3C-compliant, the right tool when Panther doesn't fit or you need raw WebDriver control.
Lab:
/challenges/dynamic/date-picker/custom
- 2.16 intermediate
Puppeteer in Node.js
Google's own browser-automation library and the Chromium-focused predecessor of Playwright. Smaller, simpler, and still excellent for Chrome-specific scrapes.
Lab:
/challenges/dynamic/drag-drop/list-reorder
- 2.17 intermediate
Choosing Between Playwright, Selenium, and Puppeteer
A working framework for picking the right browser automation tool for your project, and when the answer is 'don't use any of them.'
- 2.18 intermediate
Infinite Scroll: Five Implementation Patterns
Every infinite scroll falls into one of five patterns. Identify the pattern first, then pick the right scraping technique, or find the underlying API and skip browser automation entirely.
Lab:
/challenges/dynamic/infinite-scroll/intersection
- 2.19 intermediate
Lazy-Loaded Images and Skeleton Loaders
Images that appear blank, skeleton placeholders that fool naive scrapers, and the right way to wait for actual content.
Lab:
/challenges/dynamic/lazy-images
- 2.20 intermediate
Modals, Popups, Cookie Banners: Auto-Dismissing
Every modern site throws three to five overlays at your scraper before you reach the content. Recognise them, then dismiss or ignore them without breaking the scrape.
Lab:
/challenges/dynamic/modals/cookie-banner
- 2.21 intermediate
iframes and Shadow DOM: Piercing Nested Contexts
Two ways content can hide from a flat `document.querySelectorAll`. Pierce them correctly and you can scrape anything; pierce them incorrectly and you'll wonder why your selectors return nothing.
Lab:
/challenges/dynamic/iframe/same-origin
- 2.22 intermediate
Drag-and-Drop, Date Pickers, Complex Form Controls
The form controls that look custom because they are. Three patterns (drag-and-drop, custom date pickers, and rich select widgets) and how to drive them reliably.
Lab:
/events
- 2.23 intermediate
Capturing XHR / Fetch Calls the Page Makes
The defining browser-automation pattern: drive the page just enough to discover the underlying API, then bypass the browser entirely. This is how production scrapers get fast.
Lab:
/locations
- 2.24 intermediate
Blocking Resources for 3–5x Speedup
Most page weight is images, fonts, ads, and analytics. Blocking them at the browser level slashes scrape time without losing the data you actually want.
Lab:
/products
- 2.25 intermediate
Persistent Contexts and Browser Profiles
Save a logged-in session once, replay it forever. The pattern that turns five-minute auth flows into 50-millisecond cookie injections.
Lab:
/account/dashboard
- 2.26 advanced
Browser Pool Patterns for Concurrency
Running one browser at a time is wasteful. Running 50 is a memory disaster. The right pattern: a bounded pool of contexts under a shared browser process.
Lab:
/products
- 2.27 advanced
How Sites Detect Headless Browsers
Forty signals that distinguish a Playwright/Selenium browser from a real one. Knowing them is the prerequisite to evading them.
Lab:
/challenges/antibot/webdriver-detected
- 2.28 advanced
`playwright-stealth` and `undetected-chromedriver`
Two community-maintained toolkits that automate the dozens of fingerprint patches a stealth scraper needs. Install, configure, verify, then stop reinventing.
Lab:
/challenges/antibot/canvas-fingerprint
- 2.29 advanced
Camoufox and Other Patched Browsers
When JS-level stealth isn't enough, the next step is a browser whose binary itself has been patched to forge canvas, WebGL, and font fingerprints.
Lab:
/challenges/antibot/canvas-fingerprint
- 2.30 advanced
Mobile Emulation and Geolocation Spoofing
Many sites serve different content to mobile users, geo-target by IP and JS, and gate features by user-agent. Emulating these correctly opens scrapes you'd otherwise be locked out of.
Lab:
/locations
Nearly every lesson has a hands-on lab target on Catalog108, our first-party practice scraping sandbox. Each lab page has a /grade endpoint that returns pass/fail on your scraper output.