Free curriculum

Web Scraping: Complete Learning Path

A structured, hands-on curriculum that takes you from “what is HTTP” to running production scrapers at scale. Every lesson comes with a quiz and a real lab target on Catalog108, our first-party practice sandbox.

100% free · Python + PHP · Hands-on labs · Auto-graded

The path

Five sub-paths plus a final mastery project. Each sub-path is shippable on its own: you can stop after Static Scraping and already be employable.

  1. Foundations

    ~4 weeks part-time · 20 lessons

    Prerequisites before any sub-path

    How the web actually works under the hood: HTTP, HTML, CSS, XPath, DevTools, plus the Python and PHP setup that the rest of the curriculum builds on. Skip nothing here; every later lesson assumes this material.

    20 lessons published →

  2. Static Scraping

    ~6 weeks part-time · 34 lessons

    HTTP + HTML. Fast, lightweight. Python and PHP.

    Send requests, parse HTML, follow pagination, submit forms, store results. Taught in Python (requests + BeautifulSoup + lxml) and PHP (Guzzle + DomCrawler), equally first-class. Every lesson lands on a stable lab target at Catalog108.

    34 lessons published →

  3. Dynamic Web & Browser Automation

    ~4 weeks part-time · 30 lessons

    When static fails, drive a real browser.

    For JS-rendered sites, SPAs, infinite scroll, modals, iframes, Shadow DOM. Playwright is the main tool, with Selenium and Puppeteer for completeness. Each lesson runs against the dynamic challenges at Catalog108.

    30 lessons published →

  4. APIs, SERPs & Reverse Engineering

    ~8 weeks part-time · 50 lessons

    Skip the HTML. Hit JSON directly.

    The pro path: REST, GraphQL, auth flows (cookie, JWT, OAuth, HMAC), reverse-engineering minified JS, and a complete tour of SERP-scraping APIs. The deepest, highest-leverage sub-path.

    50 lessons published →

  5. Production, Scale & Career

    ~10 weeks part-time · 85 lessons

    Run everything at scale, reliably.

    Scrapy and Symfony for production scrapers. Async, proxies, fingerprinting, CAPTCHAs, distributed crawling, monitoring, deployment, and the legal/career framing that turns this into a livelihood.

    85 lessons published →

  6. Final Mastery Project

    ~4 weeks part-time · 1 project

    Ship the one project that proves it.

    Pick a multi-source data product, build it end-to-end, deploy it, document it. Five suggested capstones: price intelligence, jobs analytics, real estate, public data, and a SERP rank tracker.

    7 lessons published →
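
As a taste of the Static Scraping sub-path, the "parse HTML, follow pagination" loop can be sketched roughly as below. This is a minimal illustration, not lesson code: it swaps in Python's standard-library html.parser for BeautifulSoup/lxml so it runs with no installs, and the example.test URLs, the h2 class="title" markup, and the rel="next" link are all hypothetical, not Catalog108's actual pages.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ProductPageParser(HTMLParser):
    """Collects product titles and the next-page link from one HTML page."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self.next_href = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "h2" and "title" in classes:
            self._in_title = True
        elif tag == "a" and attrs.get("rel") == "next":
            self.next_href = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.titles.append(data.strip())


def scrape_all(fetch, start_url, max_pages=50):
    """Follow rel="next" links from start_url, returning every title found."""
    titles, url, seen = [], start_url, set()
    # Stop when there is no next link, a URL repeats, or the page cap is hit.
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        parser = ProductPageParser()
        parser.feed(fetch(url))
        titles.extend(parser.titles)
        # Resolve relative links such as "?page=2" against the current URL.
        url = urljoin(url, parser.next_href) if parser.next_href else None
    return titles


# Canned pages (hypothetical markup) stand in for the network.
PAGES = {
    "https://example.test/products?page=1":
        '<h2 class="title">Lamp</h2><h2 class="title">Desk</h2>'
        '<a rel="next" href="?page=2">Next</a>',
    "https://example.test/products?page=2":
        '<h2 class="title">Chair</h2>',
}

print(scrape_all(PAGES.__getitem__, "https://example.test/products?page=1"))
```

Passing the fetch function in keeps the parsing and pagination logic testable offline. Against a live site, something like `lambda u: requests.get(u, timeout=10).text` could take its place.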

Why this curriculum exists

  • Two languages, equally first-class. Most courses pick Python or PHP and ignore the other.
  • First-party labs at Catalog108. No dependency on external sandboxes that disappear or rate-limit you.
  • Auto-graded labs. Submit your scraper’s output and get a pass/fail result, instead of just reading along.
  • Reverse engineering, taught explicitly. Almost no other free course covers this.
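
To give a flavor of what that reverse-engineering material involves, here is a minimal sketch of HMAC request signing, one of the auth flows listed under sub-path 4. The canonical string layout (method, path, timestamp, body hash) and the function names are assumptions for illustration. Every real API defines its own scheme, and working out that scheme from a site's JavaScript is the actual exercise.

```python
import hashlib
import hmac


def sign_request(secret: bytes, method: str, path: str,
                 timestamp: str, body: bytes = b"") -> str:
    """Return a hex HMAC-SHA256 signature over a canonical request string.

    The canonical layout here is hypothetical; real services differ.
    """
    body_hash = hashlib.sha256(body).hexdigest()
    canonical = "\n".join([method.upper(), path, timestamp, body_hash])
    return hmac.new(secret, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(secret: bytes, method: str, path: str,
           timestamp: str, body: bytes, signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_request(secret, method, path, timestamp, body)
    return hmac.compare_digest(expected, signature)


# Hypothetical secret and request values, for demonstration only.
secret = b"demo-secret"
sig = sign_request(secret, "post", "/api/items", "1700000000", b'{"q": "lamp"}')
print(verify(secret, "POST", "/api/items", "1700000000", b'{"q": "lamp"}', sig))
print(verify(secret, "POST", "/api/items", "1700000000", b'{"q": "desk"}', sig))
```

The first check succeeds because the method is normalized to upper case before signing. The second fails because changing the body changes the body hash, and with it the signature.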