
Lesson 2.1 · Intermediate · 4 min read

Detecting JS-Rendered Content (Three-Test Diagnostic)

Before reaching for a headless browser, run three fast tests to confirm the page actually needs one. Most pages don't.

What you’ll learn

  • Apply the three-test diagnostic: view-source vs DOM, JS-disabled reload, and curl-vs-browser comparison.
  • Recognise the four ways modern sites hydrate content: SSR, CSR, hybrid, and progressive enhancement.
  • Identify when static scraping will work even on a page that 'looks' dynamic.
  • Decide between browser automation and a JSON-endpoint hunt before writing any code.

The most expensive mistake in scraping is reaching for a headless browser when you didn't need one. Browser automation is ten to fifty times slower than HTTP scraping, uses ten times the memory, and breaks more often. Before you spin up Playwright, prove the page actually needs it.

Three tests, two minutes, no code.

Test 1: view-source vs Elements

In your browser, open the target page. Right-click and pick View Page Source (Ctrl+U / Cmd+Opt+U). This is the raw HTML the server sent, exactly what curl would get. Now open DevTools → Elements. This is the live DOM after JavaScript has run.

Search both for a string you want to scrape (a product name, a price, a date). Four possible outcomes:

view-source                 Elements       Verdict
Present                     Present        Server-rendered. Use requests / Guzzle.
Absent                      Present        Client-rendered. You need a browser or the underlying API.
Present in a JSON <script>  Present        Hydration payload. Parse the JSON directly; fastest path.
Absent                      Absent (yet)   Lazy-loaded on scroll/click. Browser or API.

The hydration-payload case is gold. Frameworks like Next.js, Nuxt, and SvelteKit embed the initial data in a script tag; Next.js, for example, uses <script id="__NEXT_DATA__">. You can extract it with requests plus a JSON parse; no browser needed. Lesson 1.18 covers the technique in detail.
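If Test 1 lands you in that hydration row, the extraction can be as light as the sketch below. The URL is a placeholder and the payload keys vary by site; this assumes requests and BeautifulSoup are installed.

import json

import requests
from bs4 import BeautifulSoup

URL = "https://example.com/products"  # placeholder; use your target page

html = requests.get(URL, headers={"User-Agent": "Mozilla/5.0"}, timeout=10).text
soup = BeautifulSoup(html, "html.parser")

# Next.js embeds its initial data in <script id="__NEXT_DATA__">.
tag = soup.find("script", id="__NEXT_DATA__")
if tag and tag.string:
    payload = json.loads(tag.string)
    # The keys below are illustrative -- inspect the payload to locate your data.
    print(list(payload.get("props", {}).keys()))
else:
    print("No hydration payload found; run the other tests.")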

Test 2: disable JavaScript and reload

In Chrome DevTools, open the command palette (Ctrl+Shift+P / Cmd+Shift+P), type "Disable JavaScript", and reload. Three possible outcomes:

  • The page looks identical. Server-rendered. Static scraping will work.
  • A skeleton or spinner shows forever. Pure SPA. You need a browser or the API.
  • The shell loads but the data is missing. Hybrid. The shell is server-rendered; the data is fetched via XHR. Often you can hit the XHR endpoint directly, as sketched below.

Re-enable JavaScript when you're done; DevTools doesn't always undo it cleanly on close.
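If Test 2 reveals the hybrid case, open the Network tab, filter by Fetch/XHR, and find the request that carries the data. It is usually a plain JSON endpoint you can call directly. A minimal sketch, assuming a hypothetical /api/products endpoint discovered that way:

import requests

# Hypothetical endpoint copied from the Network tab -- yours will differ.
API_URL = "https://example.com/api/products?page=1"

resp = requests.get(
    API_URL,
    headers={
        "User-Agent": "Mozilla/5.0",
        "Accept": "application/json",
        # Some APIs also expect a Referer matching the page that calls them.
        "Referer": "https://example.com/products",
    },
    timeout=10,
)
resp.raise_for_status()
for item in resp.json().get("items", []):  # "items" is an assumed key
    print(item)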

Test 3: curl vs browser

The definitive test. Reproduce what your scraper would actually see:

curl -s -A "Mozilla/5.0" https://practice.scrapingcentral.com/challenges/dynamic/spa-pure \
  | grep -i "product"

If the data you want appears in the output, static scraping works. If you see only a <div id="root"></div> and a bundle of JS files, you've confirmed a client-rendered SPA.

Add -D headers.txt to capture response headers; sometimes the server returns SSR for browser User-Agents and a stripped shell for curl's default User-Agent (curl/8.0). The -A flag handles the obvious case; if results still differ, you're looking at fingerprinting (Sub-Path 5).
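If you prefer to run Test 3 from the client your scraper will actually use, here is a minimal Python sketch with requests; the needle is whatever string you expect to find in the data.

import requests

URL = "https://practice.scrapingcentral.com/challenges/dynamic/spa-pure"
NEEDLE = "product"  # a string you expect to see in the data

raw = requests.get(URL, headers={"User-Agent": "Mozilla/5.0"}, timeout=10).text

if NEEDLE.lower() in raw.lower():
    print("Found in raw HTML: static scraping should work.")
elif 'id="root"' in raw or 'id="app"' in raw:
    print("Empty shell: likely a client-rendered SPA.")
else:
    print("Inconclusive: compare headers and the browser response.")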

The four rendering shapes

Run all three tests on a few hundred sites and you'll find that every page falls into one of four buckets:

  1. Pure SSR. Server returns full HTML with data embedded. curl is enough. Examples: Wikipedia, most government sites, classic e-commerce.
  2. Pure CSR. Server returns a minimal shell; React/Vue/Svelte renders everything client-side. Examples: most dashboards, modern Twitter, the /spa-pure lab.
  3. Hybrid (SSR + hydration). Server returns rendered HTML plus the data as JSON in a <script> tag. The framework "hydrates" the static markup into an interactive app. Examples: most Next.js / Nuxt sites.
  4. Progressive enhancement. Server returns working HTML; JS adds nice-to-haves like sorting and filters. Static scraping works for the data; only interactions need a browser.

Knowing which bucket you're in determines the tool. Don't skip this step.

Decision tree

┌─ Data in view-source? ─── YES ──► Use requests/Guzzle. Done.
│  │
│  NO
│  ▼
├─ Data in a hydration script tag (__NEXT_DATA__, __NUXT__)? ── YES ──► Parse the JSON.
│  │
│  NO
│  ▼
├─ Data visible in DevTools Network as an XHR/Fetch response? ── YES ──► Hit that API.
│  │
│  NO
│  ▼
└─ Use a headless browser (Playwright/Selenium).

In practice you fall through to the bottom of this tree maybe twenty percent of the time. The other eighty percent is solvable without a browser; you just need to look first.
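The first two branches of the tree are easy to script. A rough sketch, checking only the two hydration markers named above (not an exhaustive list):

import requests

# The markers named in the tree; other frameworks use different ones.
HYDRATION_MARKERS = ("__NEXT_DATA__", "__NUXT__")

def diagnose(url: str, needle: str) -> str:
    """Walk the first two branches of the decision tree for one page."""
    html = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=10).text
    if needle.lower() in html.lower():
        return "Data in raw HTML: use requests/Guzzle."
    if any(marker in html for marker in HYDRATION_MARKERS):
        return "Hydration payload present: parse the embedded JSON."
    return "Check the Network tab for an XHR/Fetch API; otherwise use a headless browser."

print(diagnose("https://practice.scrapingcentral.com/challenges/dynamic/spa-pure", "product"))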

Why scrapers reach for browsers too quickly

Two reasons, both psychological:

  • The browser "just works". Throw Playwright at any page and it returns the rendered DOM. No reasoning required, no curl-debugging. The cost (slowness, resource use, fragility) shows up later, in production.
  • The page "looks dynamic". A spinner, a lazy-load fade, an animated counter, none of these necessarily mean the data is client-rendered. They might be cosmetic. Run the tests.

The instinct to default to a browser is the single most common reason scrapers are slow, expensive to run, and unstable. Diagnose first.

Hands-on lab

Open /challenges/dynamic/spa-pure in your browser. Run all three tests. Then open /products (a hybrid page) and run the same three tests. Compare what you see in view-source and in Network. You should be able to articulate, in one sentence, why one needs a browser and the other doesn't.


Quiz: check your understanding

Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.

Question 1 of 8

You search for a product name in View Page Source and find it. You also see it in the Elements panel. What kind of rendering is the page using?
