Sub-path 1 of 6

Foundations

Prerequisites before any sub-path

How the web actually works under the hood: HTTP, HTML, CSS, XPath, DevTools, plus the Python and PHP setup that the rest of the curriculum builds on. Skip nothing here; every later lesson assumes this.

~4 weeks part-time · 20 lessons

Lessons

  1. F1

    Client, Server, DNS, IP, the Architecture

    The four moving parts behind every web request, what each does, and why scrapers need to understand all of them.

    Lab: /

    beginner
  2. F2

    HTTP Protocol: Methods, Status Codes, Headers

    The actual on-the-wire format your scraper speaks. Methods you'll use, status codes you'll interpret, headers that change behaviour.

    Lab: /

    beginner
  3. F3

    HTTPS, TLS, and Why It Matters for Scraping

    What TLS actually does, the certificate verification you'll be tempted to disable but shouldn't, and the fingerprint that gets your scraper blocked before you ever send a request.

    Lab: /

    beginner
  4. F4

    Cookies, Sessions, and Authentication Basics

    How servers remember who you are between requests, and how scrapers persist that state correctly without re-logging-in every time.

    Lab: /challenges/static/cookies/set-on-visit

    beginner
  5. F5

    HTML Structure and the DOM

    What HTML actually is, what a parser turns it into, and the tree-shaped mental model your scraper needs to navigate.

    Lab: /products

    beginner
  6. F6

    CSS Selectors, Complete Reference

    Every CSS selector you need to know, organised by what you'll actually use them for in scrapers.

    Lab: /challenges/static/lists/cards

    beginner
  7. F7

    XPath, Complete Reference

    The more powerful, less familiar query language for DOM nodes. When CSS runs out, XPath keeps going.

    Lab: /challenges/static/tables/nested

    beginner
  8. F8

    Choosing Between CSS Selectors and XPath

    When to use which. A short decision framework with concrete examples, not 'XPath is more powerful so always use XPath.'

    Lab: /products/1-yellow-ceramic-mug

    beginner
  9. F9

    Elements Panel, Inspect, Edit, Copy Selectors

    Your eyes on a live page. How to use the Elements panel to plan a scrape before writing a single line of code.

    Lab: /products

    beginner
  10. F10

    Network Panel, The Scraper's Most Important Tool

    Where the data actually lives. Use Network to find the JSON endpoint the page is hitting and skip HTML scraping entirely.

    Lab: /products/1-yellow-ceramic-mug

    beginner
  11. F11

    Console, Sources, Application Tabs

    The three other DevTools panels every scraper should know, for interactive prototyping, reading minified JS, and inspecting client-side storage.

    Lab: /account/login

    beginner
  12. F12

    Copy as cURL and Why It's a Superpower

    Right-click → Copy → Copy as cURL. The single most time-saving habit in web scraping. Here's how to use it, what to strip, and how to translate to Python or PHP.

    Lab: /products

    beginner
  13. F13

    Python, pip, venv, uv, Modern Toolchain

    Install Python correctly, isolate every project, and meet the tooling that actually makes Python pleasant in 2026.

    beginner
  14. F14

    Python Crash-Course for Scrapers

    The 5% of Python you'll use 95% of the time when writing scrapers. Strings, lists, dicts, comprehensions, file I/O, error handling, and the f-string.

    Lab: /

    beginner
  15. F15

    JSON, CSV, and Regex Essentials in Python

    The three data-handling skills you'll use in every scraper: parsing JSON responses, writing structured output, and reaching for regex without abusing it.

    Lab: /api/products

    beginner
  16. F16

    PHP, Composer, and a Modern Dev Environment

    PHP 8.x is a serious language. Here's how to set it up cleanly and use Composer the way modern PHP projects expect.

    beginner
  17. F17

    PHP Crash-Course for Scrapers

    The PHP you'll actually use to write scrapers. Arrays, strings, file I/O, error handling, JSON, and the modern PHP 8 features that make it pleasant.

    Lab: /

    beginner
  18. F18

    Why PHP Is a Legitimate Scraping Language (and When to Pick It)

    PHP is not just for WordPress. It has a serious scraping ecosystem and three concrete advantages over Python in specific situations. Here's when to reach for which.

    beginner
  19. F19

    Git and GitHub for Scraper Projects

    The minimum Git you need to keep scraper projects sane, collaborate, and ship to production. Plus the patterns specific to scraper repos.

    beginner
  20. F20

    Legal & Ethical Scraping, Your Compass

    robots.txt, Terms of Service, GDPR, the CFAA, and the hiQ vs LinkedIn ruling: the legal and ethical scaffolding every scraping project should consider before any code is written.

    Lab: /robots.txt

    beginner

Most lessons have a hands-on lab target on Catalog108, our first-party practice scraping sandbox (the lab path is listed under each lesson above). Each lab page has a /grade endpoint that returns pass/fail on your scraper output.
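
As a rough sketch of how a scraper might address a lab's /grade endpoint, the Python below derives the grade URL from a lab path and prepares a request carrying scraper output. Everything beyond the lab paths and the /grade suffix is an assumption made for illustration: the host `catalog108.example`, the use of POST, the JSON body shape, and the helper names `grade_url` and `build_grade_request` are all hypothetical. Check the lab page itself for the real submission format.

```python
import json
import urllib.request

# Hypothetical host; substitute the real Catalog108 base URL.
BASE_URL = "https://catalog108.example"


def grade_url(lab_path: str) -> str:
    """Append /grade to a lab path such as "/products" or "/"."""
    # Strip the trailing slash so "/" becomes "/grade" rather than "//grade".
    return BASE_URL + lab_path.rstrip("/") + "/grade"


def build_grade_request(lab_path: str, items: list) -> urllib.request.Request:
    """Prepare (but do not send) a request carrying scraper output.

    Assumption: the grader accepts a JSON body via POST. The real
    endpoint may expect a different method or payload.
    """
    body = json.dumps(items).encode("utf-8")
    return urllib.request.Request(
        grade_url(lab_path),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Example payload is made up; a real scraper would supply its own output.
    req = build_grade_request("/products", [{"name": "example item"}])
    print(req.get_method(), req.full_url)
    # To actually submit: urllib.request.urlopen(req) and inspect the response.
```

The request is only constructed here, not sent, so the sketch can be run without network access and adapted once the actual grading contract is known.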