
Lesson 2.10 · Intermediate · 5 min read

Playwright in Node, Why You'd Choose It

Playwright's Node API is the original, the fastest-evolving, and the natural fit when the target site is itself JavaScript-heavy. Same concepts, async-first.

What you’ll learn

  • Install and run Playwright in a Node project.
  • Translate any Python Playwright script to its Node equivalent.
  • Recognise the four specific cases where Node Playwright is the right call.
  • Use ESM imports vs CommonJS require, and avoid the version pitfall.

Playwright was born in Node. The Python and PHP bindings came later and are essentially clients to the same underlying protocol. Everything you've learned applies, only the syntax changes. This lesson is about when Node is the better tool, not how to translate code.

Install

npm init -y
npm install playwright
npx playwright install chromium

playwright installs the library; npx playwright install chromium downloads the browser binary. Same two-step pattern as Python.

For projects that only ever target Chromium, you can use the lighter playwright-core package and bring your own browser binary, or playwright-chromium to skip the multi-browser install.

Your first scraper, side by side

Python:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://practice.scrapingcentral.com/")
    print(page.title())
    browser.close()

Node (async only):

const { chromium } = require("playwright");

(async () => {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto("https://practice.scrapingcentral.com/");
  console.log(await page.title());
  await browser.close();
})();

Differences are mechanical: import shape, newPage instead of new_page, and every Playwright call is async. Node has no sync API; there is no equivalent of Python's sync_playwright(). Embrace await from day one.

ESM vs CommonJS

// CommonJS
const { chromium } = require("playwright");

// ESM (in a .mjs file or "type": "module" in package.json)
import { chromium } from "playwright";

Both work. ESM is the modern default and plays nicer with TypeScript and modern bundlers; CommonJS works everywhere with no configuration. For a one-file scraper, CommonJS is the simplest start.
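If you opt into ESM, the switch is one field in package.json. A minimal sketch (the "name", "scripts", and version values here are illustrative):

```json
{
  "name": "my-scraper",
  "type": "module",
  "scripts": {
    "scrape": "node scrape.js"
  },
  "dependencies": {
    "playwright": "^1.40.0"
  }
}
```

With "type": "module" set, every .js file in the project uses import/export; rename any file that still needs require to .cjs.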

Four cases where Node Playwright is the right call

You can scrape anything in any language, but four situations favour Node:

1. The target site IS a Node app you can extend

Your team built the Next.js site. You want a scraper that imports the same React components, validates with the same Zod schemas, and writes to the same TypeScript types. Node Playwright means one type system across both ends.

2. Heavy use of page.evaluate

page.evaluate runs JS inside the browser. When you write a lot of in-browser logic (DOM walking, custom collectors), Node lets you use the same language inside and outside the browser:

const products = await page.evaluate(() => {
  return Array.from(document.querySelectorAll(".product-card")).map(card => ({
    name: card.querySelector("h2")?.innerText,
    price: card.querySelector(".price")?.innerText,
    url: card.querySelector("a")?.href,
  }));
});

In Python you'd write that same JS as a string. In Node it's a real function: syntax-checked, linted, refactorable.
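That "real function" property buys more than linting: you can factor the collector out and unit-test it in plain Node against a stub DOM, no browser launched. A sketch under that idea (collectProducts and the stub shapes are names invented here; the stub implements only the two methods the collector calls):

```javascript
// The collector reads the DOM only through the global `document`,
// so the identical function works inside page.evaluate and in a test.
function collectProducts() {
  return Array.from(document.querySelectorAll(".product-card")).map(card => ({
    name: card.querySelector("h2")?.innerText,
    price: card.querySelector(".price")?.innerText,
  }));
}

// In the scraper you would ship it to the browser unchanged:
//   const products = await page.evaluate(collectProducts);

// In a unit test, a minimal stub stands in for the real document.
const fakeCard = {
  querySelector: sel =>
    sel === "h2" ? { innerText: "Widget" } : { innerText: "$9.99" },
};
globalThis.document = { querySelectorAll: () => [fakeCard] };

console.log(collectProducts());
// [ { name: 'Widget', price: '$9.99' } ]
```

Try doing that with a JS-in-a-string collector: you'd be eval-ing the string just to test it.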

3. You're already in a Node toolchain

Playwright integrates with Jest, Vitest, and Mocha. CI pipelines for a Node project usually have npm install and npx playwright wired up already. Adding a scraper to an existing repo is friction-free.

4. Real-time / streaming scraping

Node's event-driven model fits streaming well. For WebSocket-heavy targets, Server-Sent Events, and real-time price feeds, Node's EventEmitter and async iterator patterns map naturally onto the problem. Python can do it too, but Node feels native.

The async-await pattern

Every Playwright Node call returns a Promise. The shape that works:

const { chromium } = require("playwright");

async function scrape(url) {
  const browser = await chromium.launch();
  try {
    const context = await browser.newContext();
    const page = await context.newPage();
    await page.goto(url, { waitUntil: "domcontentloaded" });
    return await page.title();
  } finally {
    await browser.close();
  }
}

(async () => {
  const title = await scrape("https://practice.scrapingcentral.com/");
  console.log(title);
})();

try/finally is the equivalent of Python's with block. It guarantees the browser closes even if goto throws.
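You can watch the guarantee work without launching anything. In this sketch (fakeBrowser and scrapeWithCleanup are illustrative names) the "browser" is a plain object with a closed flag, and the throw stands in for a failing goto:

```javascript
// A stand-in resource: records whether close() was ever called.
function fakeBrowser() {
  return { closed: false, close() { this.closed = true; } };
}

async function scrapeWithCleanup(browser, shouldFail) {
  try {
    if (shouldFail) throw new Error("net::ERR_CONNECTION_REFUSED");
    return "ok";
  } finally {
    browser.close(); // runs on success AND on throw, like Python's with
  }
}

async function main() {
  const browser = fakeBrowser();
  try {
    await scrapeWithCleanup(browser, true);
  } catch (err) {
    console.log("goto failed:", err.message);
  }
  console.log("browser closed?", browser.closed); // true
}
main();
```

Skip the try/finally and a single network hiccup leaks a headless Chromium process; on a long-running scraper those pile up fast.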

Locators in Node

Same API, camelCase:

await page.locator(".product-card").click();
await page.getByRole("button", { name: "Add to cart" }).click();
await page.getByText("Sign in").click();
await page.getByTestId("submit").click();

const count = await page.locator(".product-card").count();
const text = await page.locator("h1").first().innerText();

Note first() is a method in Node, not a property like Python's .first. Same for last(), nth(i).

Concurrency without asyncio

Promises handle concurrency natively:

const urls = ["/products?page=1", "/products?page=2", "/products?page=3"];

const browser = await chromium.launch();
const context = await browser.newContext();

const results = await Promise.all(
  urls.map(async (path) => {
    const page = await context.newPage();
    await page.goto(`https://practice.scrapingcentral.com${path}`);
    const count = await page.locator(".product-card").count();
    await page.close();
    return { path, count };
  })
);

await browser.close();
console.log(results);

Promise.all runs the three scrapes in parallel inside one browser, one context, three pages. No external library, no asyncio plumbing. This is where Node Playwright shines.
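One caveat: Promise.all fires every task at once, which is fine for three pages but will hammer the target (and your memory) at fifty. A minimal batching sketch, still plain Node with no external library (inBatches is a name invented here; the dummy worker stands in for the newPage/goto/count/close round trip above):

```javascript
// Run `worker` over `items`, at most `size` at a time, preserving order.
async function inBatches(items, size, worker) {
  const results = [];
  for (let i = 0; i < items.length; i += size) {
    const batch = items.slice(i, i + size);
    results.push(...(await Promise.all(batch.map(worker))));
  }
  return results;
}

async function main() {
  const paths = [1, 2, 3, 4, 5].map(n => `/products?page=${n}`);
  // Dummy worker; in the scraper it would open a page, goto, count, close.
  const worker = async path => ({ path, ok: true });
  console.log(await inBatches(paths, 2, worker));
}
main();
```

Each batch waits for the previous one to finish, so at most `size` pages are open at any moment; swap the dummy worker for the page-per-URL body from the Promise.all example.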

TypeScript

If your project uses TypeScript, Playwright ships its types. No @types/playwright needed:

import { chromium, Page } from "playwright";

async function scrapeProduct(page: Page, slug: string): Promise<{ name: string; price: string }> {
  await page.goto(`https://practice.scrapingcentral.com/products/${slug}`);
  return {
    name: await page.locator("h1").innerText(),
    price: await page.locator(".price").innerText(),
  };
}

You get autocomplete on page, type-checked return values, and refactor safety. For non-trivial scrapers, TypeScript pays for itself within an afternoon.

When Python is still the better choice

Don't switch to Node just for Playwright. Python wins for scrapers that:

  • Live inside a data-science toolchain (pandas, scikit, Airflow).
  • Need easy interop with PHP backends via Symfony Panther (Lesson 2.12) or async tasks via Celery.
  • Are part of a Scrapy pipeline (Sub-Path 5).

Choose the language that fits your project's data flow. Playwright is excellent in both.

Hands-on lab

Open /challenges/dynamic/spa-routed. Write a Node Playwright script that navigates the SPA, clicks through three internal routes, and collects the heading text on each. Run it. Then try the same with Promise.all opening three pages concurrently. Observe the speedup; that concurrency story is Node Playwright's defining strength.


Quiz, check your understanding

Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.

Question 1 of 8

Why is there no `sync` API in Node Playwright like there is in Python?
