
Beginner · 3 min read

Client, Server, DNS, IP, the Architecture

The four moving parts behind every web request, what each does, and why scrapers need to understand all of them.

What you’ll learn

  • Name the four actors in a web request: client, DNS, IP routing, server.
  • Trace what happens between typing a URL and seeing a page render.
  • Explain why DNS resolution affects scrapers (rate limits, geographic targeting, caching).
  • Recognise the difference between a hostname, an IP, and a port.

Before you write a single scraper, you need a clear mental model of what happens when a browser loads a page. Every scraping technique you'll learn is a variation on this one sequence.

The four actors

When you type https://practice.scrapingcentral.com/ into a browser, four things participate:

  1. The client: your browser (or your scraper). It initiates the request.
  2. DNS: the global phonebook that turns the human-readable hostname (practice.scrapingcentral.com) into a numeric IP address (e.g. 185.93.228.150).
  3. The network: the Internet routers that ferry packets between your computer and the destination IP.
  4. The server: the machine at that IP, listening on a port (443 for HTTPS). It accepts your request, generates a response, and sends it back.

The browser then renders that response. Your scraper, instead, parses it.

The full sequence

You type URL  ─►  Browser asks DNS: "what's the IP for practice.scrapingcentral.com?"
  │
  ▼
  DNS responds: "185.93.228.150"
  │
  ▼
  Browser opens TCP connection to 185.93.228.150:443
  │
  ▼
  TLS handshake: encrypted channel established
  │
  ▼
  Browser sends HTTP request: GET / HTTP/1.1, Host: practice.scrapingcentral.com
  │
  ▼
  Server matches "Host: ..." → routes to the right site → generates HTML
  │
  ▼
  Server responds: HTTP/1.1 200 OK + headers + HTML body
  │
  ▼
  Browser parses HTML → fetches CSS, JS, images → renders

Every step is observable. You can watch DNS resolution with dig, the TCP/TLS handshake with openssl s_client, and the HTTP exchange with curl -v. This is the entire stack a scraper operates inside.

Why a scraper cares about DNS

Three reasons DNS isn't just "browser magic" you can ignore:

  • Caching. Your OS caches DNS responses. If you switch proxy regions mid-scrape, stale DNS can route traffic to the wrong egress.
  • Geographic targeting. Many sites return different content based on where the request appears to originate. DNS-level geo-steering (different IPs for different regions) is common.
  • Rate limits. Anti-bot systems often rate-limit by IP, not by hostname. If multiple hosts share an IP (cloud load balancers), tripping one rate-limit can shut you out of the whole cluster.
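You can observe the multi-IP situation directly: socket.getaddrinfo returns every address record your resolver hands back for a hostname. A quick sketch, again with example.com as a stand-in:

```python
import socket

# One hostname can map to several IPs (load balancing, geo-steering),
# and several hostnames can share one IP. getaddrinfo lists all the
# records your resolver returns right now.
host = "example.com"
infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
ips = sorted({info[4][0] for info in infos})
for ip in ips:
    print(ip)
```

Which of these addresses you actually connect to decides which rate-limit bucket your requests land in.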

Hostname vs IP vs port

These three are easy to confuse and matter constantly when debugging:

Term         What it is                                           Example
Hostname     Human-readable label, resolved by DNS                practice.scrapingcentral.com
IP address   Numeric network address                              185.93.228.150
Port         Which service on the host (HTTP = 80, HTTPS = 443)   :443

You can scrape via raw IP (curl https://185.93.228.150/), but you'll usually get a default site or a TLS error, because the server doesn't know which virtual host you wanted. That's why every HTTP request includes a Host: header: it tells the server which site you mean.
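Pulling the three apart programmatically is a one-liner with Python's urllib.parse. Note that the port is implied by the scheme when it isn't written in the URL:

```python
from urllib.parse import urlsplit

url = "https://practice.scrapingcentral.com/"
parts = urlsplit(url)

print(parts.hostname)  # practice.scrapingcentral.com
print(parts.port)      # None: no explicit port in the URL

# The effective port comes from the scheme: https -> 443, http -> 80.
default_ports = {"http": 80, "https": 443}
port = parts.port or default_ports[parts.scheme]
print(port)            # 443
```

The hostname still has to go through DNS before anything can connect; urlsplit only parses the string.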

A minimal demo

In your terminal:

# 1. Resolve the hostname
dig +short practice.scrapingcentral.com
# → 185.93.228.150  (or similar)

# 2. Send a raw HTTP request, see everything the server says
curl -v https://practice.scrapingcentral.com/

# 3. See the response headers without the body
curl -I https://practice.scrapingcentral.com/

These three commands cover 80% of what scrapers do at the lowest level. Every higher-level library (Python requests, PHP Guzzle, Playwright) is a wrapper around exactly this sequence.
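As a sketch of that wrapping, here is the whole diagram collapsed into one call using Python's standard-library urllib (requests and Guzzle do the same behind friendlier APIs); example.com stands in for the sandbox domain:

```python
from urllib.request import urlopen

# DNS resolution, TCP connect, TLS handshake, HTTP request and
# response parsing, all hidden inside one function call.
with urlopen("https://example.com/") as resp:
    status = resp.status
    content_type = resp.headers["Content-Type"]

print(status)        # the status code, as in the first line of curl -v
print(content_type)  # one of the headers curl -I shows
```

When a high-level call fails mysteriously, dropping down to dig and curl tells you which layer broke.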

Hands-on lab

The lab target for this lesson is the Catalog108 homepage itself: the simplest possible target. Resolve it, hit it with curl, and look at the response headers. The goal isn't to extract anything yet; it's to see the raw machinery you'll be working with for the rest of the curriculum.

Practice this lesson on Catalog108, our first-party scraping sandbox.

Quiz: check your understanding

Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.

Question 1 of 8

When you type 'https://practice.scrapingcentral.com/' into a browser, what is the FIRST thing the client must do before any HTTP request is sent?
