Decision Framework: Browser vs API vs SERP-API
Three tools, three cost models, three failure modes. Picking the wrong one is the single most expensive mistake in scraping. Here's the framework.
What you’ll learn
- Articulate when to use a browser, a direct API call, or a SERP-API provider.
- Estimate the dollar cost per record for each approach.
- Identify the failure modes that push you from one to another.
- Apply the framework to three concrete target types.
You have three primary tools for any scraping task:
- Direct API/HTTP: `requests`, `httpx`, Guzzle. Cheapest, fastest, hardest to set up.
- Headless browser: Playwright, Selenium, Puppeteer. Slowest, most expensive, easiest to set up.
- SERP-scraping API: SerpApi, Bright Data, ScraperAPI, ScrapingBee, etc. Most expensive per call, lowest engineering effort.
Most beginners use a browser for everything because it works. Most pros use direct API calls for everything they can, browser only when forced, and SERP-API for the narrow set of targets where it makes sense. This lesson is about how to pick.
The cost models, side by side
| Tool | Cost per 1k requests | Setup time | Typical use |
|---|---|---|---|
| Direct API (`requests`) | $0–$0.20 (proxies) | High (auth, headers) | Anything API-backed |
| Headless browser | $0.50–$5 (compute + proxies) | Low | JS-heavy, anti-bot, complex flows |
| SERP-API provider | $1–$10 (per the provider) | Very low | Google/Bing/Amazon results, geo-located |
That's a 10–50x cost spread. At 1 million records:
- Direct: ~$0–$200
- Browser: ~$500–$5,000
- SERP-API: ~$1,000–$10,000
Pick wrong and you've burned $10,000 you didn't need to spend.
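To make that spread concrete, here's a minimal back-of-envelope calculator. The per-1k rates are illustrative midpoints of the ranges in the table above, not quotes from any provider:

```python
# Rough cost calculator using midpoints of the per-1k-request
# ranges from the table above (illustrative, not provider quotes).
RATES_PER_1K = {
    "direct": 0.10,    # requests/httpx + cheap proxies
    "browser": 2.75,   # compute + residential proxies
    "serp_api": 5.50,  # managed provider, mid-tier plan
}

def job_cost(records: int, tool: str) -> float:
    """Total dollar cost for `records` requests with the given tool."""
    return records / 1000 * RATES_PER_1K[tool]

for tool in RATES_PER_1K:
    print(f"{tool:>8}: ${job_cost(1_000_000, tool):,.0f} per 1M records")
```

Run it before any large job: the difference between tools is rarely a rounding error at scale.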
The decision tree
```
Can you find the JSON XHR?
             │
      ┌──────┴───────┐
      │ YES          │ NO
      ▼              ▼
Can you replicate    Is the page server-
the auth + headers?  rendered with no JS?
      │                    │
  ┌───┴────┐           ┌───┴────┐
  │ YES    │ NO        │ YES    │ NO
  ▼        ▼           ▼        ▼
Direct    Browser     Direct   Browser
API       + capture   HTTP
          + reverse-  + parse
            engineer    HTML
```

Special case: if the target is Google/Bing/Amazon search results → SERP-API, regardless of the above.
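The same tree can be written as a plain function, useful as a checklist when triaging a new target. The boolean names are my shorthand for the questions in the tree, not terms from any tool:

```python
def pick_tool(
    is_search_engine: bool,   # Google/Bing/Amazon SERPs?
    has_json_xhr: bool,       # discoverable JSON XHR layer?
    can_replicate_auth: bool, # cookies/JWT/key replayable outside a browser?
    server_rendered: bool,    # HTML complete without JS execution?
) -> str:
    """Encode the decision tree above as straight-line logic."""
    if is_search_engine:
        return "serp-api"     # special case wins regardless of the rest
    if has_json_xhr:
        return "direct-api" if can_replicate_auth else "browser"
    return "direct-http" if server_rendered else "browser"

# e.g. an API-backed catalog with public endpoints:
print(pick_tool(False, True, True, False))  # direct-api
```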
When to use a direct HTTP/API client
Use it when:
- The site has a discoverable XHR layer (use DevTools → Fetch/XHR).
- Auth is replicable: cookies, JWT, API key, or basic auth.
- The target isn't shielded by an active anti-bot (Cloudflare Turnstile, PerimeterX, DataDome) that interrogates TLS or JS execution.
That covers ~80% of scraping tasks in 2026. It's the default and the right default.
```python
import requests

r = requests.get("https://practice.scrapingcentral.com/api/products")
data = r.json()
```
Three lines. Pennies per thousand requests.
When to use a headless browser
Use it when:
- The data is generated by client-side JS after a chain of interactions you can't easily replicate.
- The site uses an anti-bot that requires JS execution to set a cookie (`__cf_bm`, `_pxhd`).
- TLS / HTTP/2 / JA3 fingerprinting is enforced and you don't want to deal with `curl-cffi`.
- You need to render PDF/image canvases or extract from `<canvas>`/`<video>`.
```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com/dashboard")
    page.wait_for_selector(".price")  # wait for the JS-rendered data
    print(page.content())
    browser.close()
```
Slower (2–10s per page instead of ~50ms) and much more expensive (you're paying for a full Chromium process), but it gets the job done when direct HTTP fails.
When to use a SERP-API provider
This is the most misunderstood category. SERP-APIs aren't for general scraping; they exist for one specific job: returning structured results from search engines and a few major marketplaces.
Use a SERP-API when:
- Your target is Google, Bing, DuckDuckGo, Yandex, Baidu, Naver, Brave Search, or YouTube.
- Or the major marketplace SERPs: Amazon, Walmart, eBay, Tripadvisor, Yelp, App Store.
- You need accurate geo-located results (specific city, specific language).
- You don't want to maintain the parsing, proxy rotation, CAPTCHA handling, and SERP feature drift yourself.
Don't use a SERP-API when:
- Your target is a regular e-commerce or content site. That's a $1–$10/1k cost when a $0.20/1k direct call would do.
- You're just curious about a few queries. Manual is fine.
- You need data the SERP-API doesn't structure (sometimes you have to scrape Google yourself for niche features).
Lessons 3.22–3.40 cover this category in depth, including evaluation frameworks for picking among providers.
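Most providers in this category expose roughly the same shape: a GET with query/geo parameters returning structured JSON. The endpoint parameters and field names below are hypothetical placeholders, not any real provider's API; check the provider's documentation for the actual contract:

```python
import json

# Hypothetical request parameters -- every real provider names these
# differently, but query + geo + language + device is the common core.
params = {
    "q": "standing desk",
    "location": "Austin,Texas,United States",
    "hl": "en",          # interface language
    "device": "desktop",
    "api_key": "YOUR_KEY",
}

# A trimmed-down example of the kind of JSON such providers return.
sample_response = json.loads("""
{"organic_results": [
  {"position": 1, "title": "Best Standing Desks", "link": "https://example.com/a"},
  {"position": 2, "title": "Standing Desk Reviews", "link": "https://example.com/b"}
]}
""")

# Extract (rank, url) pairs -- the parsing you'd otherwise maintain
# yourself against ever-shifting search-result HTML.
ranks = [(r["position"], r["link"]) for r in sample_response["organic_results"]]
print(ranks)
```

The value you're buying is everything between `params` and that clean JSON: proxies, CAPTCHAs, and parser maintenance.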
Three concrete examples
Example 1: scraping a competitor's product catalog (10k SKUs).
- Layer check: Network shows `/api/products?page=N` returning JSON.
- Auth: public, no key needed.
- Anti-bot: none active.
- Decision: direct HTTP. Cost: ~$0 (no proxies needed). Time: 1 day.
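The production loop for a target like this is short. A sketch with the fetcher injected so the paging logic can be exercised offline; `fetch_page` stands in for a real `requests.get(...).json()` call against the `/api/products?page=N` endpoint:

```python
from typing import Callable

def scrape_catalog(fetch_page: Callable[[int], list]) -> list:
    """Walk /api/products?page=N until a page comes back empty."""
    products, page = [], 1
    while True:
        batch = fetch_page(page)
        if not batch:          # empty page => past the last SKU
            return products
        products.extend(batch)
        page += 1

# Offline stub standing in for the real HTTP call:
fake_pages = {1: [{"sku": "A"}, {"sku": "B"}], 2: [{"sku": "C"}]}
catalog = scrape_catalog(lambda n: fake_pages.get(n, []))
print(len(catalog))  # 3
```

In production you'd add a polite delay and retry-on-429 handling, but the core is this loop.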
Example 2: scraping LinkedIn job postings.
- Layer check: JSON exists but auth is signed; anti-bot is heavy.
- Anti-bot: yes, LinkedIn fingerprints aggressively.
- Decision: either headless browser with residential proxies (~$3/1k), OR a specialized LinkedIn API provider. Probably the API provider if budget allows.
Example 3: tracking Google rankings for 10k keywords daily.
- Target is Google itself.
- Decision: SERP-API. No serious engineer scrapes Google directly in 2026; the cost of maintaining a Google scraper (proxies, residential IPs, CAPTCHA solvers, parser drift) exceeds the cost of a SERP-API subscription within weeks.
The progression in practice
Senior scrapers often combine all three:
- Discovery phase: open the target in a headless browser, capture all XHRs (Sub-Path 3, lesson 47-style mitmproxy or just DevTools).
- Production phase: direct HTTP against the discovered XHR, with the captured auth replicated.
- Cover the gaps: use a SERP-API for the search-engine-shaped slice of the workload, browser fallback for the 5% of targets that won't budge.
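The discovery phase boils down to filtering captured responses for the JSON endpoints worth replaying. A minimal filter over (URL, Content-Type) pairs, whatever captured them (a DevTools export, mitmproxy, or Playwright's `page.on("response", ...)` handler):

```python
def json_xhr_candidates(responses: list) -> list:
    """Keep URLs whose responses were JSON -- the direct-HTTP targets."""
    return [
        url for url, content_type in responses
        if "application/json" in content_type
    ]

# Example capture (the URLs are placeholders):
captured = [
    ("https://example.com/app.js", "text/javascript"),
    ("https://example.com/api/products?page=1", "application/json; charset=utf-8"),
    ("https://example.com/logo.png", "image/png"),
]
print(json_xhr_candidates(captured))
```

Whatever survives this filter is your shortlist for the production phase: replay each candidate with the captured auth and see which ones return the data you need.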
That's the framework. The next 47 lessons fill in the how.
Hands-on lab
No lab for this conceptual lesson. Instead, pick a real site you've wanted to scrape and walk through the decision tree on paper. Open it, check Network → Fetch/XHR, and estimate cost-per-record for each of the three approaches. Write down which you'd pick and why. Then compare your reasoning to the framework above; that gap, when it exists, is the gap the rest of this sub-path is closing.
Quiz: check your understanding
Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.