Decision Framework: Browser vs API vs SERP-API
Three tools, three cost models, three failure modes. Picking the wrong one is the single most expensive mistake in scraping. Here's the framework.
What you’ll learn
- Articulate when to use a browser, a direct API call, or a SERP-API provider.
- Estimate the dollar cost per record for each approach.
- Identify the failure modes that push you from one to another.
- Apply the framework to three concrete target types.
You have three primary tools for any scraping task:
- Direct API/HTTP: `requests`, `httpx`, Guzzle. Cheapest, fastest, hardest to set up.
- Headless browser: Playwright, Selenium, Puppeteer. Slowest, most expensive, easiest to set up.
- SERP-scraping API: SerpApi, Bright Data, ScraperAPI, ScrapingBee, etc. Most expensive per call, lowest engineering effort.
Most beginners use a browser for everything because it works. Most pros use direct API calls for everything they can, browser only when forced, and SERP-API for the narrow set of targets where it makes sense. This lesson is about how to pick.
The cost models, side by side
| Tool | Cost per 1k requests | Setup time | Typical use |
|---|---|---|---|
| Direct API (`requests`) | $0–$0.20 (proxies) | High (auth, headers) | Anything API-backed |
| Headless browser | $0.50–$5 (compute + proxies) | Low | JS-heavy, anti-bot, complex flows |
| SERP-API provider | $1–$10 (per the provider) | Very low | Google/Bing/Amazon results, geo-located |
That's a 10–50x cost spread. At 1 million records:
- Direct: ~$0–$200
- Browser: ~$500–$5,000
- SERP-API: ~$1,000–$10,000
Pick wrong and you've burned $10,000 you didn't need to spend.
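To make that spread concrete, here's a minimal back-of-envelope calculator. The per-1k rates are illustrative midpoints of the ranges in the table above, not quotes from any provider:

```python
# Rough cost calculator using midpoints of the per-1k-request
# ranges from the table above (illustrative, not provider quotes).
RATES_PER_1K = {
    "direct": 0.10,    # requests/httpx + cheap proxies
    "browser": 2.75,   # compute + residential proxies
    "serp_api": 5.50,  # managed provider, mid-tier plan
}

def job_cost(records: int, tool: str) -> float:
    """Total dollar cost for `records` requests with the given tool."""
    return records / 1000 * RATES_PER_1K[tool]

for tool in RATES_PER_1K:
    print(f"{tool:>8}: ${job_cost(1_000_000, tool):,.0f} per 1M records")
```

Run it before any large job: the difference between tools is rarely a rounding error at scale.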
The decision tree
```
Can you find the JSON XHR?
             │
      ┌──────┴───────┐
      │ YES          │ NO
      ▼              ▼
Can you replicate    Is the page server-
the auth + headers?  rendered with no JS?
      │                    │
  ┌───┴────┐           ┌───┴────┐
  │ YES    │ NO        │ YES    │ NO
  ▼        ▼           ▼        ▼
Direct    Browser     Direct   Browser
API       + capture   HTTP
          + reverse-  + parse
            engineer    HTML
```

Special case: if the target is Google/Bing/Amazon search results → SERP-API, regardless of the above.
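The same tree can be written as a plain function, useful as a checklist when triaging a new target. The boolean names are my shorthand for the questions in the tree, not terms from any tool:

```python
def pick_tool(
    is_search_engine: bool,   # Google/Bing/Amazon SERPs?
    has_json_xhr: bool,       # discoverable JSON XHR layer?
    can_replicate_auth: bool, # cookies/JWT/key replayable outside a browser?
    server_rendered: bool,    # HTML complete without JS execution?
) -> str:
    """Encode the decision tree above as straight-line logic."""
    if is_search_engine:
        return "serp-api"     # special case wins regardless of the rest
    if has_json_xhr:
        return "direct-api" if can_replicate_auth else "browser"
    return "direct-http" if server_rendered else "browser"

# e.g. an API-backed catalog with public endpoints:
print(pick_tool(False, True, True, False))  # direct-api
```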
When to use a direct HTTP/API client
Use it when:
- The site has a discoverable XHR layer (use DevTools → Fetch/XHR).
- Auth is replicable: cookies, JWT, API key, or basic auth.
- The target isn't shielded by an active anti-bot (Cloudflare Turnstile, PerimeterX, DataDome) that interrogates TLS or JS execution.
That covers ~80% of scraping tasks in 2026. It's the default and the right default.
```python
import requests

r = requests.get("https://practice.scrapingcentral.com/api/products")
data = r.json()
```
Three lines. Pennies per thousand requests.
When to use a headless browser
Use it when:
- The data is generated by client-side JS after a chain of interactions you can't easily replicate.
- The site uses an anti-bot that requires JS execution to set a cookie (`__cf_bm`, `_pxhd`).
- TLS / HTTP/2 / JA3 fingerprinting is enforced and you don't want to deal with `curl-cffi`.
- You need to render PDF/image canvases or extract from `<canvas>`/`<video>`.
```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com/dashboard")
    page.wait_for_selector(".price")  # wait for the JS-rendered data
    print(page.content())
    browser.close()
```
Slower (2–10s per page instead of ~50ms) and much more expensive (you're paying for a full Chromium process), but it gets the job done when direct HTTP fails.
When to use a SERP-API provider
This is the most misunderstood category. SERP-APIs aren't for general scraping; they exist for one specific job: returning structured results from search engines and a few major marketplaces.
Use a SERP-API when:
- Your target is Google, Bing, DuckDuckGo, Yandex, Baidu, Naver, Brave Search, or YouTube.
- Or the major marketplace SERPs: Amazon, Walmart, eBay, Tripadvisor, Yelp, App Store.
- You need accurate geo-located results (specific city, specific language).
- You don't want to maintain the parsing, proxy rotation, CAPTCHA handling, and SERP feature drift yourself.
Don't use a SERP-API when:
- Your target is a regular e-commerce or content site. That's a $1–$10/1k cost when a $0.20/1k direct call would do.
- You're just curious about a few queries. Manual is fine.
- You need data the SERP-API doesn't structure (sometimes you have to scrape Google yourself for niche features).
Lessons 3.22–3.40 cover this category in depth, including evaluation frameworks for picking among providers.
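Most providers in this category expose roughly the same shape: a GET with query/geo parameters returning structured JSON. The endpoint parameters and field names below are hypothetical placeholders, not any real provider's API; check the provider's documentation for the actual contract:

```python
import json

# Hypothetical request parameters -- every real provider names these
# differently, but query + geo + language + device is the common core.
params = {
    "q": "standing desk",
    "location": "Austin,Texas,United States",
    "hl": "en",          # interface language
    "device": "desktop",
    "api_key": "YOUR_KEY",
}

# A trimmed-down example of the kind of JSON such providers return.
sample_response = json.loads("""
{"organic_results": [
  {"position": 1, "title": "Best Standing Desks", "link": "https://example.com/a"},
  {"position": 2, "title": "Standing Desk Reviews", "link": "https://example.com/b"}
]}
""")

# Extract (rank, url) pairs -- the parsing you'd otherwise maintain
# yourself against ever-shifting search-result HTML.
ranks = [(r["position"], r["link"]) for r in sample_response["organic_results"]]
print(ranks)
```

The value you're buying is everything between `params` and that clean JSON: proxies, CAPTCHAs, and parser maintenance.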
Three concrete examples
Example 1: scraping a competitor's product catalog (10k SKUs).
- Layer check: Network shows `/api/products?page=N` returning JSON.
- Auth: public, no key needed.
- Anti-bot: none active.
- Decision: direct HTTP. Cost: ~$0 (no proxies needed). Time: 1 day.
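The production loop for a target like this is short. A sketch with the fetcher injected so the paging logic can be exercised offline; `fetch_page` stands in for a real `requests.get(...).json()` call against the `/api/products?page=N` endpoint:

```python
from typing import Callable

def scrape_catalog(fetch_page: Callable[[int], list]) -> list:
    """Walk /api/products?page=N until a page comes back empty."""
    products, page = [], 1
    while True:
        batch = fetch_page(page)
        if not batch:          # empty page => past the last SKU
            return products
        products.extend(batch)
        page += 1

# Offline stub standing in for the real HTTP call:
fake_pages = {1: [{"sku": "A"}, {"sku": "B"}], 2: [{"sku": "C"}]}
catalog = scrape_catalog(lambda n: fake_pages.get(n, []))
print(len(catalog))  # 3
```

In production you'd add a polite delay and retry-on-429 handling, but the core is this loop.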
Example 2: scraping LinkedIn job postings.
- Layer check: JSON exists but auth is signed; anti-bot is heavy.
- Anti-bot: yes, LinkedIn fingerprints aggressively.
- Decision: either headless browser with residential proxies (~$3/1k), OR a specialized LinkedIn API provider. Probably the API provider if budget allows.
Example 3: tracking Google rankings for 10k keywords daily.
- Target is Google itself.
- Decision: SERP-API. No serious engineer scrapes Google directly in 2026; the cost of maintaining a Google scraper (proxies, residential IPs, CAPTCHA solvers, parser drift) exceeds the cost of a SERP-API subscription within weeks.
The progression in practice
Senior scrapers often combine all three:
- Discovery phase: open the target in a headless browser, capture all XHRs (Sub-Path 3, lesson 47-style mitmproxy or just DevTools).
- Production phase: direct HTTP against the discovered XHR, with the captured auth replicated.
- Cover the gaps: use a SERP-API for the search-engine-shaped slice of the workload, browser fallback for the 5% of targets that won't budge.
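The discovery phase boils down to filtering captured responses for the JSON endpoints worth replaying. A minimal filter over (URL, Content-Type) pairs, whatever captured them (a DevTools export, mitmproxy, or Playwright's `page.on("response", ...)` handler):

```python
def json_xhr_candidates(responses: list) -> list:
    """Keep URLs whose responses were JSON -- the direct-HTTP targets."""
    return [
        url for url, content_type in responses
        if "application/json" in content_type
    ]

# Example capture (the URLs are placeholders):
captured = [
    ("https://example.com/app.js", "text/javascript"),
    ("https://example.com/api/products?page=1", "application/json; charset=utf-8"),
    ("https://example.com/logo.png", "image/png"),
]
print(json_xhr_candidates(captured))
```

Whatever survives this filter is your shortlist for the production phase: replay each candidate with the captured auth and see which ones return the data you need.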
That's the framework. The next 47 lessons fill in the how.
Hands-on lab
No lab for this conceptual lesson. Instead, pick a real site you've wanted to scrape and walk through the decision tree on paper. Open it, check Network → Fetch/XHR, and estimate cost-per-record for each of the three approaches. Write down which you'd pick and why. Then compare your reasoning to the framework above; that gap, when it exists, is the gap the rest of this sub-path is closing.
Quiz: check your understanding
Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.