Cost Optimization: Caching, Result Reuse, Selective Field Extraction
At $1–$5 per 1k calls, every redundant search is real money. These are the mechanical patterns that cut a SERP-API bill in half.
What you’ll learn
- Add a TTL-aware cache layer in front of the SERP API.
- Pick the right cache key (query + locale + device).
- Use search archives to avoid re-runs.
- Trim the search payload by requesting only the fields you use.
A 10k-keyword daily SEO platform tracking 5 locales pays for 50k searches/day, ~1.5M/month, which at $1–$5 per 1k calls is roughly $1.5k–$7.5k/month depending on provider. With sensible cost optimization, that's typically halved.
This lesson is the menu of optimizations.
Optimization 1, TTL cache
Don't re-fetch the same query within its useful freshness window. SEO rankings typically shift over hours to days, not minute to minute.
import redis, json, hashlib

r = redis.Redis()

def cache_key(q, gl, hl, device):
    # Hash the full parameter set so keys stay short and collision-safe.
    s = f"{q}|{gl}|{hl}|{device}"
    return f"serp:{hashlib.sha1(s.encode()).hexdigest()}"

def cached_search(q, gl="us", hl="en", device="desktop", ttl_seconds=3600):
    key = cache_key(q, gl, hl, device)
    if cached := r.get(key):  # cache hit: no API call, no cost
        return json.loads(cached)
    data = serp_api.search(q, gl=gl, hl=hl, device=device)  # cache miss: pay once
    r.setex(key, ttl_seconds, json.dumps(data))  # expire after the freshness window
    return data
Cache TTL by use case:
- Real-time dashboards → 5-15 min cache.
- Daily rank tracking → 12-24 hour cache (or no cache; you fetch once anyway).
- Hourly monitoring → 30-60 min cache.
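A minimal sketch of wiring those windows into cached_search; the use-case labels and TTL values here are illustrative, not provider defaults:

# Illustrative TTLs per use case; tune to your own freshness requirements.
TTL_BY_USE_CASE = {
    "realtime_dashboard": 15 * 60,        # 15 minutes
    "hourly_monitoring": 45 * 60,         # 45 minutes
    "daily_rank_tracking": 24 * 60 * 60,  # 24 hours
}

def search_for(use_case, q, **kw):
    # Unknown use cases fall back to a conservative 1-hour TTL.
    return cached_search(q, ttl_seconds=TTL_BY_USE_CASE.get(use_case, 3600), **kw)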
The cache key MUST include every parameter that changes results: query, gl, hl, device, and location if you set one. Omit one and two different searches silently share a cache entry, so you serve the wrong market's results.
Optimization 2, De-duplication of nearby queries
Sometimes the same data answers slightly different queries:
- "plumber in Chicago" vs "plumber Chicago IL"
- "iPhone 15 reviews" vs "reviews iPhone 15"
If your scraper generates such pairs from user input, normalize before scraping:
def canonicalize_query(q: str) -> str:
    # Lowercase and collapse whitespace; a stricter version might also sort
    # tokens or strip stopwords to catch pairs like the examples above.
    return " ".join(q.lower().strip().split())

def search_with_dedup(raw_q, **kw):
    q = canonicalize_query(raw_q)
    return cached_search(q, **kw)
You lose some intent fidelity (a real SEO might track "plumber chicago il" vs "plumber in chicago" as separate intents). For most use cases, de-dup wins.
Optimization 3, Archive re-fetches
If your provider offers an archive, re-fetch from there for repeat reads:
- A user wants to compare January rank against June. You scraped January already; pull it from the archive instead of paying for a new search (a fresh search couldn't reproduce January anyway).
- A new normalizer added a field. Re-run normalization over the archive, not against the API.
Archives are often free or much cheaper than new searches. Confirm pricing with your provider.
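The archive interface is provider-specific. The sketch below assumes a hypothetical get_archived(q, for_date) helper wrapping that endpoint (or your own snapshot store) and falls back to a paid search only when nothing is archived:

from datetime import date

def get_results(q, for_date=None, **kw):
    # get_archived() is a hypothetical helper; adapt it to your provider's
    # archive endpoint or your own stored snapshots.
    archived = get_archived(q, for_date or date.today())
    if archived is not None:
        return archived            # repeat read: no new billable search
    return cached_search(q, **kw)  # nothing archived: pay for a fresh search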
Optimization 4, Selective field extraction
When the provider supports it, request only fields you use:
import requests

# include_fields is provider-specific; check your provider's docs for the exact
# parameter name and path syntax.
data = requests.get(API_URL, params={
    "q": q,
    "api_key": API_KEY,
    "include_fields": "organic_results.position,organic_results.link,organic_results.title",
}).json()
Two benefits:
- Smaller responses → faster network → lower latency.
- Some providers price by response size or feature inclusion; trimming the field list saves on premium-feature billing.
Don't trim too aggressively: dropping position is usually wrong even if it looks redundant.
Optimization 5, Skip the search entirely with snapshotting
For monitoring KPIs (rank, share-of-voice), you don't always need fresh data:
- Daily rank tracking → fetch once at night, serve dashboards from the database all day.
- Weekly trend reports → cache the previous week's snapshot; refresh on schedule.
A real-time dashboard that hits the API on every page load burns money. Snapshot to your database; serve dashboards from the snapshot.
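A minimal sketch of the snapshot pattern, using SQLite as a stand-in for whatever database backs your dashboards (table and function names are illustrative):

import json
import sqlite3
from datetime import date

db = sqlite3.connect("snapshots.db")
db.execute("""CREATE TABLE IF NOT EXISTS serp_snapshots (
    keyword TEXT,
    snapshot_date TEXT,
    payload TEXT,
    PRIMARY KEY (keyword, snapshot_date)
)""")

def nightly_snapshot(keywords):
    # One paid fetch per keyword per day; dashboards never touch the API directly.
    for kw in keywords:
        data = cached_search(kw, ttl_seconds=24 * 3600)
        db.execute("INSERT OR REPLACE INTO serp_snapshots VALUES (?, ?, ?)",
                   (kw, date.today().isoformat(), json.dumps(data)))
    db.commit()

def dashboard_read(keyword):
    # Serve the latest snapshot: zero API cost per page load.
    row = db.execute("SELECT payload FROM serp_snapshots WHERE keyword = ? "
                     "ORDER BY snapshot_date DESC LIMIT 1", (keyword,)).fetchone()
    return json.loads(row[0]) if row else None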
Optimization 6, Right-size locale × device matrix
Each (gl, hl, device) tuple multiplies cost. Audit your matrix:
- Do you really track Spanish results for an English-only audience?
- Do you need tablet AND mobile AND desktop?
- Do you need both gl=us and gl=ca for English content?
Trim the matrix to what actually drives decisions. A common audit cuts the multiplier by 30-50%.
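A quick back-of-the-envelope audit before trimming; the locale list, device list, and price per 1k calls below are placeholders, so plug in your own numbers:

from itertools import product

keywords = 10_000
locales = [("us", "en"), ("ca", "en"), ("mx", "es")]  # (gl, hl) pairs you track today
devices = ["desktop", "mobile", "tablet"]
price_per_1k = 3.00                                   # assumed rate; use your contract

tuples = list(product(locales, devices))
daily_calls = keywords * len(tuples)
monthly_cost = daily_calls * 30 / 1000 * price_per_1k
print(f"{len(tuples)} tuples -> {daily_calls:,} calls/day, ~${monthly_cost:,.0f}/month")
# Dropping tablet and gl=ca cuts the tuple count from 9 to 4 (roughly 55% off this line item).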
Optimization 7, Polled freshness windowing
For monitoring rank drops, not every keyword needs the same check frequency. Use tiered polling:
- Critical keywords (top revenue): every 6 hours.
- Important keywords: daily.
- Long-tail / discovery: weekly.
Triage by value. Same total budget; more responsive on what matters.
def schedule_for_keyword(kw):
    # CRITICAL_KEYWORDS / IMPORTANT_KEYWORDS are sets you maintain, e.g. by revenue.
    if kw in CRITICAL_KEYWORDS: return "every 6 hours"
    if kw in IMPORTANT_KEYWORDS: return "daily"
    return "weekly"
Optimization 8, Compress and store efficiently
The raw JSON for one SERP response can be 50-200 KB. Multiplied by millions of historical records, that's serious storage.
- gzip compress at rest.
- Or store only the normalized form (10-20% of the size) and use the archive for raw.
- For analytics, project into columnar storage (Parquet, ClickHouse) for cheap aggregation.
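A minimal sketch of gzip-at-rest, assuming you keep the parsed response as a Python dict:

import gzip
import json

def pack(response: dict) -> bytes:
    # Store this blob instead of the raw JSON text.
    return gzip.compress(json.dumps(response).encode("utf-8"))

def unpack(blob: bytes) -> dict:
    return json.loads(gzip.decompress(blob))

# Usage: blob = pack(cached_search("best running shoes")); later, unpack(blob).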
Optimization 9, Concurrency without overshooting
Async + batches speed up wall-clock time but don't directly save money. They DO save if:
- Your free-tier window is short and you can fit work into a discounted off-peak slot.
- Your provider offers volume tiers, and finishing a run faster lets you land more calls inside the current tier before it resets.
Don't sprint just to sprint. Sprint when speed unlocks cost.
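If you do parallelize, cap in-flight requests so you don't blow through rate limits or a tier boundary. A sketch with asyncio, assuming a hypothetical async_search coroutine that wraps your client:

import asyncio

async def bounded_scrape(keywords, max_in_flight=10):
    # The semaphore caps concurrent API calls; size it to your rate limit.
    sem = asyncio.Semaphore(max_in_flight)

    async def one(kw):
        async with sem:
            return await async_search(kw)  # async_search: hypothetical async client wrapper

    return await asyncio.gather(*(one(kw) for kw in keywords))

# results = asyncio.run(bounded_scrape(["plumber chicago", "iphone 15 reviews"]))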
Measuring the impact
Track per-month:
- Total searches issued.
- Total cache hits.
- Total archive re-fetches.
- Total API spend.
- Per-keyword cost.
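One way to collect these numbers is to instrument cached_search with monthly Redis counters; the key names below are illustrative:

from datetime import date

def bump(metric, n=1):
    # One counter set per calendar month, e.g. serp:stats:2025-06:cache_hits.
    r.incrby(f"serp:stats:{date.today():%Y-%m}:{metric}", n)

# Inside cached_search: bump("cache_hits") on a hit and bump("searches_issued") on a miss;
# call bump("archive_refetches") wherever you serve a repeat read from the archive.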
A simple dashboard query gives you optimization opportunities:
SELECT
    keyword,
    COUNT(*) AS calls,
    COUNT(DISTINCT DATE(collected_at)) AS days_tracked,
    COUNT(*) * 1.0 / COUNT(DISTINCT DATE(collected_at)) AS calls_per_day
FROM serp_calls
WHERE collected_at > NOW() - INTERVAL '30 days'
GROUP BY keyword
ORDER BY calls DESC
LIMIT 50;
The top-N keywords by call count are your optimization targets.
A target spending profile
For the 10k-keyword, 5-locale daily SEO platform from the opening example:
- Pre-optimization: ~$3k/month.
- After TTL caching, locale-matrix trimming, polled freshness, archive re-fetches: ~$1.2-1.5k/month.
- 50-60% savings, no feature loss.
The cuts come from operational discipline, not magic.
Hands-on lab
Pick your current scraper. Add the Redis-backed TTL cache. Track the cache hit rate over a few days; 30-70% is typical. Then audit your (gl, hl, device) matrix, find one tuple you don't actually use, and drop it. Finally, compare last month's API spend with this month's. The savings are tangible.
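To compute the hit rate for the lab, you can read back the counters from the instrumentation sketch above (same assumed key names):

from datetime import date

def cache_hit_rate():
    month = f"{date.today():%Y-%m}"
    hits = int(r.get(f"serp:stats:{month}:cache_hits") or 0)
    misses = int(r.get(f"serp:stats:{month}:searches_issued") or 0)
    return hits / (hits + misses) if (hits + misses) else 0.0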
Quiz, check your understanding
Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.