POST Requests: Form Data and JSON Payloads, Static Scraping

When scraping requires sending data (search forms, login forms, JSON APIs), you need POST. Master the three body formats and when each applies.

GET fetches. POST submits. When a scraper has to submit a search form, log in, or send data to a JSON API, the request is usually a POST. The mechanics are simple once you understand the three body formats and when each is used.

The three POST body formats

Format	Content-Type	When you see it
Form-encoded	`application/x-www-form-urlencoded`	HTML `<form method="post">` without `enctype`
Multipart	`multipart/form-data`	HTML forms with file uploads, `enctype="multipart/form-data"`
JSON	`application/json`	Modern web apps, REST APIs

The Content-Type header tells the server how to parse the body. Your scraper must send the format the server expects, or you'll typically get an unhelpful 400 (or 415 Unsupported Media Type) error.
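One way to see the table in action without sending anything is to build each request with `requests.Request(...).prepare()` and read back the header that requests chose. This is a minimal sketch: the URL is a placeholder and `content_type_for` is a hypothetical helper name, not part of requests.

```python
import requests

URL = "https://example.com/submit"  # placeholder; nothing is sent


def content_type_for(**kwargs):
    """Prepare (but do not send) a POST and return its Content-Type header."""
    prepared = requests.Request("POST", URL, **kwargs).prepare()
    return prepared.headers.get("Content-Type")


form_ct = content_type_for(data={"name": "thing"})
json_ct = content_type_for(json={"name": "thing"})
# (None, value) is a plain multipart field with no filename.
multipart_ct = content_type_for(files={"name": (None, "thing")})

print(form_ct)       # application/x-www-form-urlencoded
print(json_ct)       # application/json
print(multipart_ct)  # multipart/form-data; boundary=<generated value>
```

The same fields produce three different headers (and three different bodies) depending only on which keyword argument carries them.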

How to figure out what a real form sends

Open browser DevTools, switch to the Network tab, and submit the form manually once. Click the request and look at:

Request URL: where to POST.
Request Method: confirm it's POST.
Request Headers → Content-Type: tells you the body format.
Form Data or Request Payload: the actual fields and values.

This three-minute reconnaissance saves an hour of guessing.
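The checklist above maps directly onto a `requests.post` call. As a rough sketch, the hypothetical helper `post_kwargs` below (not part of requests) turns the Content-Type you read in DevTools into the matching keyword argument; the field values are made up for illustration.

```python
def post_kwargs(content_type, fields):
    """Pick the requests.post keyword that matches a Content-Type seen in DevTools."""
    # Ignore parameters such as "; charset=UTF-8" or "; boundary=...".
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/x-www-form-urlencoded":
        return {"data": fields}
    if media_type == "application/json":
        return {"json": fields}
    if media_type == "multipart/form-data":
        # (None, value) sends a plain field with no filename.
        return {"files": {k: (None, str(v)) for k, v in fields.items()}}
    raise ValueError(f"unhandled Content-Type: {content_type!r}")


kwargs = post_kwargs("application/json; charset=UTF-8", {"q": "mug"})
print(kwargs)  # {'json': {'q': 'mug'}}
```

You would then call `requests.post(request_url, **kwargs)`, where `request_url` is the Request URL copied from the Network tab.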

Form-encoded POST (`data=`)

The classic HTML form. Pass a dict to data=:

import requests

r = requests.post(
  "https://practice.scrapingcentral.com/challenges/static/forms/post",
  data={"username": "alice", "color": "blue"},
)
print(r.status_code, r.text[:200])

requests sets Content-Type: application/x-www-form-urlencoded automatically and serializes the dict as username=alice&color=blue. This is what most legacy HTML forms expect.
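To preview what a form-encoded body looks like on the wire, the standard library's `urlencode` produces the same format. The sketch below assumes made-up field values; a body built by requests should match in ordinary cases like these.

```python
from urllib.parse import urlencode

fields = {"username": "alice", "color": "blue"}
print(urlencode(fields))  # username=alice&color=blue

# Spaces and reserved characters are escaped; a list value repeats the key.
tricky = {"q": "coffee mug", "tag": ["a&b", "c"]}
encoded = urlencode(tricky, doseq=True)
print(encoded)  # q=coffee+mug&tag=a%26b&tag=c
```

Seeing the escaping spelled out helps when a field value copied from DevTools contains `+`, `%26`, or a repeated key.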

Multipart POST (`files=`)

Used when you upload files, or when the form was declared enctype="multipart/form-data":

import requests

url = "https://example.com/upload"  # placeholder: use the form's action URL

# Open the file in a with-block so the handle is closed after the upload.
with open("photo.jpg", "rb") as fh:
  files = {"avatar": ("photo.jpg", fh, "image/jpeg")}
  data = {"username": "alice"}
  r = requests.post(url, files=files, data=data)

Note: when you pass files=, requests sets the Content-Type to multipart/form-data with a generated boundary. You can include regular fields via data= at the same time; they ride along inside the same multipart body.

You can also use multipart without any file at all (some APIs require it):

files = {"field": (None, "value")}
r = requests.post(url, files=files)
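One pitfall worth checking is setting Content-Type by hand on a multipart request. The generated boundary lives in that header, so a hand-written value leaves the server unable to split the body into parts. The sketch below prepares both variants without sending them; the URL is a placeholder.

```python
import requests

URL = "https://example.com/upload"  # placeholder; nothing is sent
files = {"field": (None, "value")}

# Let requests build the header: it includes the boundary used in the body.
auto = requests.Request("POST", URL, files=files).prepare()
print(auto.headers["Content-Type"])  # multipart/form-data; boundary=<generated value>

# Hand-written header: requests keeps it as-is, so the boundary is missing.
manual = requests.Request(
    "POST", URL, files=files, headers={"Content-Type": "multipart/form-data"}
).prepare()
print(manual.headers["Content-Type"])  # multipart/form-data
```

In short, when using `files=`, leave Content-Type alone and let requests generate it.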

JSON POST (`json=`)

Modern APIs almost always want JSON:

r = requests.post(
  "https://practice.scrapingcentral.com/api/products",
  json={"name": "New product", "price": 9.99},
)

json= does two things at once: it serializes the dict to JSON and sets Content-Type: application/json. Do NOT pass data=json.dumps(...) on its own: that puts JSON in the body, but requests sets no Content-Type for a string body, so many servers will reject it or misparse it. If you do need to send a pre-serialized string, set the Content-Type header yourself.
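The difference between `json=` and `data=json.dumps(...)` is easy to verify offline by preparing the requests and reading the headers back. This is a sketch with a placeholder URL; nothing is sent.

```python
import json

import requests

URL = "https://example.com/api"  # placeholder; nothing is sent
payload = {"name": "New product", "price": 9.99}

with_json = requests.Request("POST", URL, json=payload).prepare()
with_dumps = requests.Request("POST", URL, data=json.dumps(payload)).prepare()
with_header = requests.Request(
    "POST",
    URL,
    data=json.dumps(payload),
    headers={"Content-Type": "application/json"},
).prepare()

print(with_json.headers.get("Content-Type"))    # application/json
print(with_dumps.headers.get("Content-Type"))   # None
print(with_header.headers.get("Content-Type"))  # application/json
```

All three carry the same JSON in the body; only the header differs, and the header is what the server uses to decide how to parse it.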

Common mistake: mixing them up

# Wrong, server expects JSON
requests.post(url, data={"name": "thing"})
# Sends: Content-Type: application/x-www-form-urlencoded
#  Body: name=thing
# Server tries to JSON-parse it → 400 Bad Request

# Right
requests.post(url, json={"name": "thing"})
# Sends: Content-Type: application/json
#  Body: {"name": "thing"}

When a POST mysteriously fails with 400, check Content-Type first. A mismatched body format is one of the most common causes.

Inspecting what you sent

r = requests.post(url, json={"k": "v"})
print(r.request.body)  # b'{"k": "v"}'
print(r.request.headers["Content-Type"])  # application/json

r.request.body is the body exactly as requests sent it (bytes for json= in recent versions, a str for form-encoded data=). If the server is unhappy, compare it side by side with what the browser sends in DevTools.
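For form-encoded bodies, the side-by-side comparison can be automated. The sketch below uses a hypothetical helper, `diff_form_bodies`, and made-up bodies; it collapses repeated keys to their last value, which is an assumption that may not suit every form.

```python
from urllib.parse import parse_qsl


def diff_form_bodies(sent, expected):
    """Return fields that differ between two form-encoded bodies.

    Both arguments are strings like "a=1&b=2". The result maps each
    differing field name to a (sent_value, expected_value) pair, with
    None standing in for a missing field.
    """
    sent_fields = dict(parse_qsl(sent, keep_blank_values=True))
    expected_fields = dict(parse_qsl(expected, keep_blank_values=True))
    names = sent_fields.keys() | expected_fields.keys()
    return {
        name: (sent_fields.get(name), expected_fields.get(name))
        for name in sorted(names)
        if sent_fields.get(name) != expected_fields.get(name)
    }


# Made-up bodies: the first pasted from DevTools, the second from r.request.body.
browser_body = "username=alice&color=blue&csrf_token=abc123"
scraper_body = "username=alice&color=blue"
print(diff_form_bodies(scraper_body, browser_body))
# {'csrf_token': (None, 'abc123')}
```

A missing hidden field, such as the token in this example, is a typical reason a replicated POST fails even though the visible fields match. If `r.request.body` is bytes, decode it before passing it in.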

Reading the response

POST responses can be HTML, JSON, redirects, or anything else:

r = requests.post(url, data=payload)
if r.headers.get("Content-Type", "").startswith("application/json"):
  data = r.json()
else:
  # HTML, parse with BeautifulSoup
  ...

Many forms POST and then redirect to a success page. requests follows redirects by default, so r is the final response: r.url tells you where you landed, and r.history holds the intermediate redirect responses in order.
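A small helper can make the redirect chain easy to read. The sketch below assumes only that the object has `status_code`, `url`, and `history` attributes, as a `requests.Response` does; `redirect_chain` is a hypothetical name, and the demo uses stand-in objects so it runs offline.

```python
from types import SimpleNamespace


def redirect_chain(response):
    """List (status_code, url) for each hop, ending with the final response."""
    hops = [(hop.status_code, hop.url) for hop in response.history]
    hops.append((response.status_code, response.url))
    return hops


# Stand-ins for requests.Response objects (URLs are placeholders).
first = SimpleNamespace(status_code=302, url="https://example.com/forms/post")
final = SimpleNamespace(
    status_code=200, url="https://example.com/forms/thanks", history=[first]
)
chain = redirect_chain(final)
for status, url in chain:
    print(status, url)
```

With a real response you would call `redirect_chain(r)` after `r = requests.post(...)`. Passing `allow_redirects=False` to `requests.post` stops at the first response instead, which is useful when you need a header or cookie set by the redirect itself.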

Idempotency

GET requests are idempotent: repeating them is safe. POST requests usually aren't: a second POST to a "create order" endpoint creates a second order. Be careful in retry logic (Lesson 1.6), because retrying a POST that appeared to fail might double-submit. When in doubt, GET the resource list afterward to confirm whether the first POST actually succeeded.
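The "confirm before retrying" advice can be sketched as a small wrapper. Everything here is hypothetical: `submit_with_check` is not a requests feature, and the two callables stand in for your own POST and your own "does it exist yet?" check (for example, a GET on the resource list).

```python
def submit_with_check(submit, already_created, attempts=3):
    """Retry a non-idempotent submit, checking for success before each retry.

    submit:          callable that performs the POST; may raise on failure.
    already_created: callable returning True if the resource now exists.
    Returns submit()'s result, or None if the check shows an earlier
    attempt succeeded despite the error.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return submit()
        except Exception as exc:  # narrow to requests.RequestException in real code
            last_error = exc
            # The POST may have reached the server even though we saw an error.
            if already_created():
                return None
    raise last_error


# Offline demo: the first attempt "times out" after the server created the order.
orders = []


def flaky_submit():
    orders.append("order")
    raise TimeoutError("response lost")


result = submit_with_check(flaky_submit, lambda: len(orders) > 0)
print(result, len(orders))  # None 1
```

Without the check, a naive retry loop would have created three orders in this demo. The check is still only as reliable as the endpoint you verify against, so treat it as a starting point.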

A complete worked example

import requests

s = requests.Session()
s.headers["User-Agent"] = "Mozilla/5.0 (compatible; learning-scraper)"

# Submit a search form
r = s.post(
  "https://practice.scrapingcentral.com/challenges/static/forms/post",
  data={"q": "mug", "page": 1},
  timeout=10,
)
r.raise_for_status()
print("Final URL:", r.url)
print("Body preview:", r.text[:300])

We used a Session here. Sessions are how cookies persist across requests, which is the subject of the next lesson.

Hands-on lab

Open /challenges/static/forms/post in your browser. Use DevTools to inspect the <form> tag: what are its action, method, and enctype? Submit it once manually, look at the Network tab to confirm the body format, then replicate the POST from Python. Compare your r.request.body to what the browser sent. They should match field for field.


Quiz: check your understanding

Which `requests` argument sends a body as JSON and sets `Content-Type: application/json` automatically?
