
3.38 · intermediate · 5 min read

SERP-API-Specific Features: Async Searches, Search Archives, Location Lookups

Beyond the basic search call, providers ship features that change what's possible. Async batching, history archives, location helpers, and more.

What you’ll learn

  • Use async batch submission for high-volume workloads.
  • Query a provider's search archive for historical data.
  • Look up valid location strings via a location-discovery endpoint.
  • Take advantage of screenshots, HTML capture, and other provider extras.

The basic "submit a query, get JSON" is table stakes. Modern SERP-APIs ship additional features that change what kinds of scrapers you can build. This lesson tours the most useful ones.

Specifics vary by provider; the concepts are universal.

Async batch searches

Synchronous calls block until results arrive, typically 2–10 seconds per call. For 10k searches, that's roughly 6–28 hours of sequential wall-clock time. Async batching works differently:

  • Submit many queries at once via a /batch endpoint.
  • Provider returns an immediate batch_id.
  • Provider runs queries in parallel internally.
  • You poll /batch/{id}/status or get a webhook callback when done.
  • Download results via /batch/{id}/results.

import time

import requests

API_URL = "https://api.example-serp.com"  # your provider's base URL
API_KEY = "YOUR_API_KEY"

def submit_batch(queries: list[dict]) -> str:
    """Submit a list of search definitions and return the provider's batch ID."""
    r = requests.post(f"{API_URL}/batch", json={
        "searches": queries,
        "api_key": API_KEY,
    })
    return r.json()["batch_id"]

def wait_for_batch(batch_id: str) -> list[dict]:
    """Poll the status endpoint until the batch completes, then download results."""
    while True:
        status = requests.get(f"{API_URL}/batch/{batch_id}/status",
                              params={"api_key": API_KEY}).json()
        if status["state"] == "completed":
            break
        time.sleep(5)
    return requests.get(f"{API_URL}/batch/{batch_id}/results",
                        params={"api_key": API_KEY}).json()["searches"]

batch_id = submit_batch([
  {"q": "iphone 15", "gl": "us"},
  {"q": "samsung galaxy", "gl": "us"},
  # ... 10k more
])
results = wait_for_batch(batch_id)

Batches typically complete 10–100x faster than sequential calls. They're sometimes priced slightly differently; confirm with your provider.

Search archives

Some providers persist every result indefinitely (or for N days/months) and let you re-fetch:

# Re-fetch a previous result by search ID
def get_archived(search_id: str) -> dict:
    return requests.get(f"{API_URL}/searches/{search_id}",
                        params={"api_key": API_KEY}).json()

Why it matters:

  • Replay analyses. If you discover a new field you should have captured, re-read past results.
  • Audit / compliance. Prove your scraper saw X on date Y.
  • Cost optimization. Re-fetching from the archive is sometimes free, or cheaper than re-running the search.
  • Comparison studies. "What did this query look like 6 months ago?"

Support varies by provider: some retain results for only 14 days, others keep them indefinitely (with archive-specific pricing).

Location lookup endpoints

Guessing valid location= strings by trial and error is painful. Many providers expose a lookup endpoint:

def lookup_locations(query: str) -> list[dict]:
    return requests.get(f"{API_URL}/locations", params={
        "q": query,
        "api_key": API_KEY,
    }).json()

print(lookup_locations("Chicago"))
# → [{"name": "Chicago, IL, United States", "canonical_name": "...",
#  "google_id": "...", "country_code": "US"...}]

Use the canonical_name directly in subsequent search calls; this avoids typos and ensures the provider's internal location resolution succeeds.
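
For example, a small sketch chaining the lookup into a search, reusing API_URL/API_KEY from earlier. The location parameter name follows common provider conventions and may differ for yours:

locations = lookup_locations("Chicago")
canonical = locations[0]["canonical_name"]

results = requests.get(API_URL, params={
    "q": "coffee shops",
    "engine": "google",
    "location": canonical,   # exact string the provider resolves internally
    "api_key": API_KEY,
}).json()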

Screenshots and HTML capture

Beyond JSON, some providers offer:

  • Screenshots of the SERP (PNG/JPEG), useful for audits or reports.
  • Full HTML: the raw page (sometimes JS-rendered), useful when the JSON misses a niche feature.

r = requests.get(API_URL, params={
  "q": "iphone 15",
  "engine": "google",
  "api_key": API_KEY,
  "screenshot": "true",
  "html": "true",
})
data = r.json()
# data["screenshot_url"] = "https://...png"
# data["html"] = "<html>...</html>"

These are premium features and usually cost extra.
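
If you want to keep the screenshot for an audit trail, a minimal sketch that downloads the screenshot_url returned above (the file name is arbitrary):

# Fetch the rendered screenshot referenced in the JSON response and save it locally.
img = requests.get(data["screenshot_url"], timeout=30)
img.raise_for_status()
with open("serp_iphone15.png", "wb") as f:
    f.write(img.content)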

Schema inspection

Some providers offer a self-documenting JSON schema endpoint:

schema = requests.get(f"{API_URL}/schema/google").json()
# Returns the expected shape, field types, descriptions

Useful for generating typed clients (Pydantic, attrs) or documentation.
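
As a rough sketch, assuming the schema maps field names to simple type strings (the exact schema shape is provider-specific), you could generate a Pydantic model like this:

from pydantic import create_model

# Illustrative type mapping; real schemas may use richer type descriptors.
TYPE_MAP = {"str": str, "int": int, "float": float, "bool": bool}

def model_from_schema(name: str, fields: dict) -> type:
    definitions = {
        field: (TYPE_MAP.get(type_name, str), ...)  # required field of mapped type
        for field, type_name in fields.items()
    }
    return create_model(name, **definitions)

# e.g. OrganicResult = model_from_schema("OrganicResult", schema["organic_results"])
# (the "organic_results" key is an assumption about the schema layout)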

Bulk export and webhooks

For pipelines that need ongoing data flow:

  • Webhooks: the provider POSTs to your endpoint when batches complete.
  • S3 / GCS export: large batches are dropped into your storage bucket.
  • Streaming endpoints: some providers offer SSE or similar for ongoing search streams.

These are typically enterprise-tier features, less common on starter plans.
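
A minimal receiver for the webhook case might look like the sketch below. Flask is used purely for illustration, the payload shape is an assumption, and fetch_and_process_batch is a hypothetical handler you would write yourself; check your provider's webhook documentation for the real payload:

from flask import Flask, request

app = Flask(__name__)

@app.route("/serp-webhook", methods=["POST"])
def serp_webhook():
    payload = request.get_json()
    # Assumed payload shape: {"batch_id": "...", "state": "completed"}
    if payload.get("state") == "completed":
        fetch_and_process_batch(payload["batch_id"])  # hypothetical handler you define
    return "", 204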

Combined-engine searches

A few providers let you submit "search across multiple engines in one call":

data = requests.get(API_URL, params={
  "q": "best vpn",
  "engines": "google,bing,duckduckgo",  # combined
  "api_key": API_KEY,
}).json()

# data['searches'] = [{'engine': 'google', 'organic_results': ...}...]

This saves orchestration overhead, but you're billed per engine.
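
Processing the combined response is then just a loop over the per-engine entries. Field names follow the comment above; the nested result fields (like link) are assumptions, so check your provider's response shape:

# Split the combined response back out by engine.
for search in data["searches"]:
    engine = search["engine"]
    for item in search.get("organic_results", []):
        print(engine, item.get("link"))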

Other useful extras

  • Cached results: some providers cache for N minutes, so cheap re-fetches return the same data.
  • include_html=true: returns the SERP HTML if you want to do your own parsing.
  • safe_search parameter: filter explicit content.
  • device_user_agent: supply your own UA for ultra-specific device emulation.
  • uule precision: pass an exact location string.
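
A single call combining several of these extras might look like the sketch below, reusing API_URL/API_KEY from earlier. Parameter names mirror the list above, the safe_search value is an assumption, and the uule string is built elsewhere; availability and exact naming vary by provider:

r = requests.get(API_URL, params={
    "q": "running shoes",
    "engine": "google",
    "api_key": API_KEY,
    "include_html": "true",     # raw SERP HTML for your own parsing
    "safe_search": "active",    # value name is an assumption; check your docs
    "device_user_agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 17_5 like Mac OS X)",
    "uule": uule_string,        # exact encoded location string, built elsewhere
})
data = r.json()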

Read your provider's docs end-to-end at least once. The features you don't know about are the ones you can't use.

When to use which feature

  • 10k+ queries per run: async batches
  • Re-analyze old data: search archive
  • Hyper-precise location targeting: lat-lng or the location lookup endpoint
  • Audit screenshots: screenshot capture
  • Custom parsing: HTML capture
  • Continuous pipeline: webhooks + S3 export

A combined example

A nightly batch of 5k keywords with archive-backed re-analysis:

import time

import requests

# API_URL and API_KEY are the same as earlier; load_keywords(), persist(),
# stored_search_ids and update_with_new_field() are placeholders for your own
# pipeline code.

# 1. Submit batch
batch_id = requests.post(f"{API_URL}/batch", json={
    "searches": [{"q": kw, "gl": "us", "hl": "en"} for kw in load_keywords()],
    "api_key": API_KEY,
}).json()["batch_id"]

# 2. Wait for completion
while True:
    s = requests.get(f"{API_URL}/batch/{batch_id}/status",
                     params={"api_key": API_KEY}).json()
    if s["state"] == "completed":
        break
    time.sleep(15)

# 3. Download and process results
results = requests.get(f"{API_URL}/batch/{batch_id}/results",
                       params={"api_key": API_KEY}).json()["searches"]
for r in results:
    persist(r)

# 4. Later, re-fetch from archive to capture a new field
for search_id in stored_search_ids:
    data = requests.get(f"{API_URL}/searches/{search_id}",
                        params={"api_key": API_KEY}).json()
    update_with_new_field(data)

Hands-on lab

This is a conceptual lesson; feature availability depends on your provider. Action: read your provider's API reference end-to-end, note three features you haven't used, and try one of them on a small batch. Most teams use only 10–20% of what their provider offers; expanding even slightly can unlock significant capability.

Quiz: check your understanding

Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.

Question 1 of 8

Why use async batch submission instead of sequential calls for 10,000 queries?
