
Beginner · 3 min read

Client, Server, DNS, IP, the Architecture

The four moving parts behind every web request, what each does, and why scrapers need to understand all of them.

What you’ll learn

  • Name the four actors in a web request: client, DNS, IP routing, server.
  • Trace what happens between typing a URL and seeing a page render.
  • Explain why DNS resolution affects scrapers (rate limits, geographic targeting, caching).
  • Recognise the difference between a hostname, an IP, and a port.

Before you write a single scraper, you need a clear mental model of what happens when a browser loads a page. Every scraping technique you'll learn is a variation on this one sequence.

The four actors

When you type https://practice.scrapingcentral.com/ into a browser, four things participate:

  1. The client: your browser (or your scraper). It initiates the request.
  2. DNS: the global phonebook that turns the human-readable hostname (practice.scrapingcentral.com) into a numeric IP address (e.g. 185.93.228.150).
  3. The network: the Internet routers that ferry packets between your computer and the destination IP.
  4. The server: the machine at that IP, listening on a port (443 for HTTPS). It accepts your request, generates a response, and sends it back.

The browser then renders that response. Your scraper, instead, parses it.

The full sequence

You type URL  ─►  Browser asks DNS: "what's the IP for practice.scrapingcentral.com?"
  │
  ▼
  DNS responds: "185.93.228.150"
  │
  ▼
  Browser opens TCP connection to 185.93.228.150:443
  │
  ▼
  TLS handshake: encrypted channel established
  │
  ▼
  Browser sends HTTP request: GET / HTTP/1.1, Host: practice.scrapingcentral.com
  │
  ▼
  Server matches "Host: ..." → routes to the right site → generates HTML
  │
  ▼
  Server responds: HTTP/1.1 200 OK + headers + HTML body
  │
  ▼
  Browser parses HTML → fetches CSS, JS, images → renders

Every step is observable. You can watch DNS resolution with dig, the TCP/TLS handshake with openssl s_client, and the HTTP exchange with curl -v. This is the entire stack a scraper operates inside.

Why a scraper cares about DNS

Three reasons DNS isn't just "browser magic" you can ignore:

  • Caching. Your OS caches DNS responses. If you switch proxy regions mid-scrape, stale DNS can route traffic to the wrong egress.
  • Geographic targeting. Many sites return different content based on where the request appears to originate. DNS-level geo-steering (different IPs for different regions) is common.
  • Rate limits. Anti-bot systems often rate-limit by IP, not by hostname. If multiple hosts share an IP (cloud load balancers), tripping one rate-limit can shut you out of the whole cluster.
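You can observe the multi-IP situation directly: socket.getaddrinfo returns every address record your resolver hands back for a hostname. A quick sketch, again with example.com as a stand-in:

```python
import socket

# One hostname can map to several IPs (load balancing, geo-steering),
# and several hostnames can share one IP. getaddrinfo lists all the
# records your resolver returns right now.
host = "example.com"
infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
ips = sorted({info[4][0] for info in infos})
for ip in ips:
    print(ip)
```

Which of these addresses you actually connect to decides which rate-limit bucket your requests land in.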

Hostname vs IP vs port

These three are easy to confuse and matter constantly when debugging:

Term         What it is                                           Example
Hostname     Human-readable label, resolved by DNS                practice.scrapingcentral.com
IP address   Numeric network address                              185.93.228.150
Port         Which service on the host (HTTP = 80, HTTPS = 443)   :443

You can scrape via raw IP (curl https://185.93.228.150/), but you'll usually get a default site or a TLS error, because the server doesn't know which virtual host you wanted. That's why every HTTP request includes a Host: header: it tells the server which site you mean.
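Pulling the three apart programmatically is a one-liner with Python's urllib.parse. Note that the port is implied by the scheme when it isn't written in the URL:

```python
from urllib.parse import urlsplit

url = "https://practice.scrapingcentral.com/"
parts = urlsplit(url)

print(parts.hostname)  # practice.scrapingcentral.com
print(parts.port)      # None: no explicit port in the URL

# The effective port comes from the scheme: https -> 443, http -> 80.
default_ports = {"http": 80, "https": 443}
port = parts.port or default_ports[parts.scheme]
print(port)            # 443
```

The hostname still has to go through DNS before anything can connect; urlsplit only parses the string.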

A minimal demo

In your terminal:

# 1. Resolve the hostname
dig +short practice.scrapingcentral.com
# → 185.93.228.150  (or similar)

# 2. Send a raw HTTP request, see everything the server says
curl -v https://practice.scrapingcentral.com/

# 3. See the response headers without the body
curl -I https://practice.scrapingcentral.com/

These three commands cover 80% of what scrapers do at the lowest level. Every higher-level library (Python requests, PHP Guzzle, Playwright) is a wrapper around exactly this sequence.
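As a sketch of that wrapping, here is the whole diagram collapsed into one call using Python's standard-library urllib (requests and Guzzle do the same behind friendlier APIs); example.com stands in for the sandbox domain:

```python
from urllib.request import urlopen

# DNS resolution, TCP connect, TLS handshake, HTTP request and
# response parsing, all hidden inside one function call.
with urlopen("https://example.com/") as resp:
    status = resp.status
    content_type = resp.headers["Content-Type"]

print(status)        # the status code, as in the first line of curl -v
print(content_type)  # one of the headers curl -I shows
```

When a high-level call fails mysteriously, dropping down to dig and curl tells you which layer broke.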

Hands-on lab

The lab target for this lesson is the Catalog108 homepage itself: the simplest possible target. Resolve it, hit it with curl, and look at the response headers. The goal isn't to extract anything yet; it's to see the raw machinery you'll be working with for the rest of the curriculum.

Practice this lesson on Catalog108, our first-party scraping sandbox.

Quiz: check your understanding

Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.

Question 1 of 8

When you type 'https://practice.scrapingcentral.com/' into a browser, what is the FIRST thing the client must do before any HTTP request is sent?
