Project E, SERP Rank-Tracking SaaS
Build a small SaaS that tracks keyword rankings across Google, Bing, and Brave, including AI Overviews and competitor analysis. The most commercial-feeling capstone.
What you’ll learn
- Build a multi-engine SERP rank tracker via SERP APIs.
- Capture AI Overview presence and the cited sources within them.
- Detect competitor movement on shared keyword sets.
- Ship a usable SaaS: auth, a billing-ready UI, and a multi-tenant data model, even if you don't actually charge.
What you're building
A small SaaS that tracks where keywords rank for a user across multiple search engines, daily. Bonus: track AI Overview presence and the URLs cited in them, a 2025–2026 differentiator most established rank trackers haven't caught up to yet.
User flow:
1. Sign up (email / password).
2. Add a project (a domain to track).
3. Add 20–100 keywords.
4. Add 3 competitor domains.
5. (Optional) pick a target country / language per project.
6. Daily: SERP fetches across engines, position computed, AI Overview captured.
7. Dashboard shows movement, competitor positions, AI Overview citation share.
↓ behind the scenes
┌─ SERP API (Google, Bing, DuckDuckGo, Brave)
├─ AI Overview parser
├─ Postgres (users, projects, keywords, observations)
└─ Dashboard (Next.js / Filament / handrolled)
This is the capstone whose codebase is closest to what you'd actually sell. Even if you never charge, you learn the SaaS shape: auth, multi-tenancy, dashboards.
Required features
- Multi-engine SERP capture, at least Google + Bing + one other (DuckDuckGo / Brave / Yandex).
- AI Overview detection + URL extraction, when the SERP contains an AI Overview, capture which URLs it cites.
- Per-keyword historical position stored daily.
- Competitor tracking, when configured, capture competitor positions on the same keyword set.
- Auth, at minimum email/password. JWT for API access if you go that far.
- Multi-tenant data isolation, user A's projects must be invisible to user B.
- One locale per project, most users want `gl=in` or `gl=us`; document which country/language pair yours targets.
- Python + PHP, fine to split as "Python = SERP API client + analyzer; PHP/Symfony = web app + dashboard." Or the other way around.
- Browser automation for at least one operation (e.g. screenshot the SERP).
- SERP API integration is the core of the project (obviously).
- Public GitHub + deployed instance + blog post.
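One subtle detail in multi-engine capture and competitor tracking is matching a domain against result URLs. A naive substring check (`"example.com" in url`) also matches `notexample.com`. A sketch using proper URL parsing, with the `position`/`link` field names assumed from a SerpApi-style response:

```python
from urllib.parse import urlparse

def domain_matches(result_url: str, tracked_domain: str) -> bool:
    """True if result_url belongs to tracked_domain, including
    subdomains. Compares hostnames, not raw substrings."""
    host = urlparse(result_url).netloc.lower().removeprefix("www.")
    tracked = tracked_domain.lower().removeprefix("www.")
    return host == tracked or host.endswith("." + tracked)

def positions_for_domains(organic_results, domains):
    """Map each domain to its best (lowest) organic position, or None
    if it never appears in the results."""
    best = {d: None for d in domains}
    for r in organic_results:
        for d in domains:
            if domain_matches(r.get("link", ""), d):
                if best[d] is None or r["position"] < best[d]:
                    best[d] = r["position"]
    return best
```

The same helper serves both the project domain and the competitor set, so one fetch per (keyword, engine) covers every tracked domain.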
Stretch features
- Email digest, weekly position summary per project.
- Slack alert, keyword X dropped >5 positions overnight.
- AI Overview citation share dashboard, "Your domain appeared in X of Y AI Overviews across these N keywords this month."
- Stripe integration, actually charge $10/mo for the SaaS. The capstone graduate who ships paid customers wins the job market.
- API endpoint, let users pull their own position history programmatically.
Suggested SERP API
Per Sub-Path 3 (/learn/api-scraping/comparing-major-providers):
| Provider | Why for this project |
|---|---|
| SerpApi | Excellent Google + AI Overview parsing; free tier 100 searches/mo |
| ScraperAPI | Cheap, all-engine coverage; free trial credits |
| Bright Data SERP | Premium; pricier but most complete |
| ZenRows | Good developer experience |
| Apify | Actor-based; flexible |
For the capstone, start with the free tier of whichever provider has the cleanest AI Overview output. SerpApi has a dedicated ai_overview field in their JSON response, which is the easiest path.
Budget: ~$20–40/mo if you go past the free tier. The math adds up fast: 3 keywords × 3 engines × 30 days = 270 searches/month, which already exceeds a 100-searches/month free tier. For the capstone demo, cap keywords and fetch frequency to fit your provider's allowance (e.g. 3 keywords × 1 engine, daily, is 90 searches/month).
Schema
CREATE TABLE users (
id BIGSERIAL PRIMARY KEY,
email CITEXT UNIQUE NOT NULL,
password_hash TEXT NOT NULL,
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE projects (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES users(id),
name TEXT NOT NULL,
domain TEXT NOT NULL,
competitor_domains TEXT[],
target_country CHAR(2),
target_language CHAR(2),
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE keywords (
id BIGSERIAL PRIMARY KEY,
project_id BIGINT NOT NULL REFERENCES projects(id),
keyword TEXT NOT NULL,
UNIQUE (project_id, keyword)
);
CREATE TABLE serp_observations (
id BIGSERIAL PRIMARY KEY,
keyword_id BIGINT NOT NULL REFERENCES keywords(id),
engine TEXT NOT NULL, -- 'google', 'bing', 'brave'...
captured_at TIMESTAMPTZ NOT NULL DEFAULT now(),
project_domain_position INTEGER, -- NULL if not in top 100
competitor_positions JSONB, -- {"competitor.com": 5...}
ai_overview_present BOOLEAN,
ai_overview_cited_urls TEXT[],
raw_json JSONB -- full SERP API response, for replay
);
CREATE INDEX ON serp_observations (keyword_id, engine, captured_at DESC);
The raw_json field is your safety net. SERP APIs change response shapes; storing the raw response lets you re-parse historical data when you upgrade your parser.
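The replay pattern is simple: pull `(id, raw_json)` rows, run them through the new parser, and write the derived columns back. A sketch, assuming a SerpApi-style `ai_overview`/`references` shape in the stored response (adjust the keys for your provider):

```python
import json

def reparse_ai_overview(raw_json: dict) -> tuple[bool, list[str]]:
    """Updated parser: re-derive the AI Overview fields from a stored
    raw SERP response."""
    ai = raw_json.get("ai_overview") or {}
    urls = [ref["link"] for ref in ai.get("references", []) if "link" in ref]
    return bool(ai), urls

def replay(rows):
    """rows: iterable of (observation_id, raw_json_text) pulled from
    serp_observations. Yields (id, present, urls) tuples ready to feed
    an UPDATE on the derived columns."""
    for obs_id, raw in rows:
        present, urls = reparse_ai_overview(json.loads(raw))
        yield obs_id, present, urls
```

Because the function only touches the raw payload, historical rows from before your parser existed get the new fields for free.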
Multi-tenancy
The single most common bug in tiny SaaS apps: User A can see User B's data because the WHERE clause on user_id was forgotten.
Two defences:
- All queries via a `for_user(user_id)` helper that always adds the user_id filter. Don't write raw SQL in controllers.
- Postgres Row-Level Security (RLS) policies. Set `app.current_user_id` per request; RLS rejects cross-user reads at the database level. Belt and braces.
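The first defence can be a tiny wrapper. A hypothetical sketch (not a library API), assuming every tenant-scoped query exposes a `user_id` column:

```python
def for_user(base_sql: str, user_id: int, params: tuple = ()) -> tuple[str, tuple]:
    """Wrap a tenant-scoped query so the user_id filter can't be
    forgotten. Returns (sql, params) for your DB driver; uses %s
    placeholders in the psycopg style."""
    wrapped = f"SELECT * FROM ({base_sql}) AS scoped WHERE scoped.user_id = %s"
    return wrapped, params + (user_id,)
```

Routing every controller query through one such helper turns "forgot the WHERE clause" from a silent leak into a code-review smell.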
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON projects
USING (user_id = current_setting('app.current_user_id')::bigint);
For a portfolio capstone, RLS is overkill but earns serious credibility. Document it; reviewers notice.
AI Overview handling
The 2025-2026 differentiator. Most legacy rank trackers (still built around the assumption that "position 1" is a stable concept) don't surface AI Overview citation properly. You can.
Three things to capture per (keyword, engine, day):
- Presence, boolean: did the SERP show an AI Overview?
- Position, does the AI Overview appear above or below the organic results? (Most SERP APIs expose this.)
- Cited URLs, the URLs the AI Overview footnoted as sources.
If a user's domain is cited in N AI Overviews this month out of M AI-Overview-enabled keywords, that's a far more useful signal than "you're at position 3 on a SERP no one clicks anymore."
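The citation-share metric above is a small aggregation over stored observations. A sketch, assuming dicts mirroring the `serp_observations` columns:

```python
def ai_citation_share(observations, domain):
    """observations: one dict per (keyword, engine, day) with keys
    'ai_overview_present' and 'ai_overview_cited_urls', mirroring the
    serp_observations columns. Returns (cited, total_with_overview):
    how many AI Overviews cited the domain, out of how many appeared."""
    with_overview = [o for o in observations if o["ai_overview_present"]]
    cited = sum(
        1 for o in with_overview
        if any(domain in url for url in o["ai_overview_cited_urls"])
    )
    return cited, len(with_overview)
```

The "X of Y AI Overviews" stretch-feature dashboard is this pair of numbers grouped by week.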
Dashboard
Five views, no more:
- Project overview, current positions across all keywords, big-number summary, last-update timestamp.
- Keyword detail, sparkline of position over time across all engines; competitors overlaid.
- Movement digest, keywords that moved >3 positions in either direction in the last 7 days.
- AI Overview view, count of AI Overviews citing your domain, by week.
- Competitor matrix, for each competitor, in how many of your tracked keywords are they ahead of you.
Resist the urge to add more. Five is plenty.
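The movement digest is the only view that needs real logic; the rest are straight queries. A sketch of the digest computation, with the in-memory `history` shape assumed for illustration:

```python
def movement_digest(history, threshold=3):
    """history: {keyword: [(date, position), ...]} sorted oldest-first
    over the window (e.g. last 7 days); position is None when the domain
    was off the SERP. Returns keywords whose position changed by more
    than `threshold`, mapped to the delta (positive = improved, since
    a smaller position number is better)."""
    moved = {}
    for kw, series in history.items():
        seen = [p for _, p in series if p is not None]
        if len(seen) >= 2 and abs(seen[-1] - seen[0]) > threshold:
            moved[kw] = seen[0] - seen[-1]
    return moved
```

In production you'd compute this in SQL over `serp_observations`, but the edge cases (None gaps, sign of the delta) are easier to pin down in a unit-testable function first.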
Common pitfalls
- Position-1 obsession. Modern SERPs are heterogeneous. A user's #1 on a keyword whose SERP is dominated by an AI Overview and a knowledge panel may get 5% of the traffic they'd have gotten in 2018. Surface that context.
- Forgetting `gl` and `hl`. A keyword tracked from your scraper's IP without explicit `gl=in&hl=en` returns whatever Google thinks your location is. Always set them explicitly.
- SERP volatility. Position 4 today can be position 7 tomorrow without anything changing. Use 7-day moving averages on the trend charts.
- Cross-tenant data leak. Test it. Spin up two test accounts, verify each is invisible to the other.
- SERP API rate limits. Schedule your daily fetches across hours, not all at midnight UTC.
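The moving-average smoothing from the volatility pitfall is a few lines; the only wrinkle is handling None (off-SERP) days. A sketch:

```python
def moving_average(positions, window=7):
    """Trailing moving average over daily positions. None entries
    (domain not in the top results that day) are skipped within each
    window; a window with no data yields None."""
    out = []
    for i in range(len(positions)):
        vals = [p for p in positions[max(0, i - window + 1): i + 1]
                if p is not None]
        out.append(sum(vals) / len(vals) if vals else None)
    return out
```

Feed the trend charts the smoothed series and keep the raw daily positions available on hover, so users see both the trend and the noise.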
Deployment
A $5–10/mo VPS, Docker, Postgres, your web app, daily cron. Cost-conscious version: GitHub Actions cron + Neon Postgres + Vercel/Netlify free tier for the frontend.
If you actually take payments, you'll need a stable host. Vercel / Fly.io / Hetzner Cloud all work well.
What "done" looks like
- Two demo accounts visible publicly (read-only).
- 30 days of position data captured for at least one project.
- Multi-tenant isolation verified, explicit "can't see other accounts' data" demo in the README.
- AI Overview tracking actually works on at least one keyword that triggers AI Overviews regularly.
- Blog post: SERP API choice, AI Overview parser, multi-tenancy approach, three failures, cost.
- Bonus credit if at least one real person (not a demo account) signs up and uses it.
Hands-on lab
Catalog108 has /search with a SERP-like layout (organic, knowledge panel for ?q=catalog108, local pack for ?q=stores+near+me). Build a Python parser that handles those three result types from Catalog108's HTML, that's your warm-up. Then swap the source from Catalog108's HTML to a real SERP API's JSON for actual Google queries. Same parser shape; different input.
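The "same parser shape" the lab is after is a dispatch on result type that survives the swap from Catalog108's HTML to a SERP API's JSON. A sketch of that shape, operating on already-extracted blocks; the `type` values and field names are assumptions for illustration, not Catalog108's actual markup:

```python
def parse_result(block: dict) -> dict:
    """Normalize one SERP block into a common record, dispatching on
    its type. Adding a new result type means adding one handler; the
    loop that feeds blocks in never changes."""
    handlers = {
        "organic": lambda b: {"title": b["title"], "url": b["link"]},
        "knowledge_panel": lambda b: {"title": b["title"],
                                      "facts": b.get("facts", {})},
        "local_pack": lambda b: {"places": [p["name"]
                                            for p in b.get("places", [])]},
    }
    handler = handlers.get(block.get("type"))
    if handler is None:
        return {"type": "unknown"}
    record = handler(block)
    record["type"] = block["type"]
    return record
```

For the warm-up, the blocks come from parsing Catalog108's HTML; for the real thing, they come straight out of the SERP API's JSON. Only the extraction layer changes.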