Lesson 1.3 · beginner · 4 min read

POST Requests: Form Data and JSON Payloads

When scraping requires sending data (search forms, login forms, JSON APIs), you need POST. Master the three body formats and when each applies.

What you’ll learn

  • Identify the three common POST body formats: form-encoded, multipart, JSON.
  • Send each from Python `requests` using `data=`, `files=`, and `json=`.
  • Read the request's `Content-Type` to confirm what you sent.
  • Match the body format the server expects by inspecting browser DevTools.

GET fetches. POST submits. When a scraper has to submit a form, log in, or send data to a JSON API, it is usually making a POST. The mechanics are simple once you understand the three body formats and when each is used.

The three POST body formats

| Format | Content-Type | When you see it |
|---|---|---|
| Form-encoded | application/x-www-form-urlencoded | HTML <form method="post"> without enctype |
| Multipart | multipart/form-data | HTML forms with file uploads, enctype="multipart/form-data" |
| JSON | application/json | Modern web apps, REST APIs |

The Content-Type header declares the body format, and the body has to actually be in that format. Your scraper must match what the server expects, or you'll typically get an unhelpful 400 response (some servers answer 415 or 422 instead).
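One way to see the three formats side by side is to build each request and stop before sending it. The sketch below assumes `requests` is installed; the URL is a placeholder and the helper name `prepared_content_type` is made up for illustration.

```python
import requests

URL = "https://example.com/submit"  # placeholder; prepare() never contacts it


def prepared_content_type(**kwargs):
    """Build a POST request without sending it and return its Content-Type."""
    prepared = requests.Request("POST", URL, **kwargs).prepare()
    return prepared.headers.get("Content-Type")


form_type = prepared_content_type(data={"username": "alice"})
multipart_type = prepared_content_type(files={"username": (None, "alice")})
json_type = prepared_content_type(json={"username": "alice"})

print(form_type)       # application/x-www-form-urlencoded
print(multipart_type)  # multipart/form-data; boundary=<generated value>
print(json_type)       # application/json
```

The same payload produces three different headers depending only on which argument carries it.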

How to figure out what a real form sends

Open browser DevTools, switch to the Network tab, and submit the form manually once. Click the request and look at:

  • Request URL: where to POST.
  • Request Method: confirm it's POST.
  • Request Headers → Content-Type: tells you the body format.
  • Form Data or Request Payload: the actual fields and values.

This three-minute reconnaissance saves an hour of guessing.
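Once you have the Content-Type from DevTools, choosing the `requests` argument is mechanical. A minimal sketch of that mapping follows; the function name `requests_kwarg_for` is hypothetical, and it only covers the three formats in this lesson.

```python
def requests_kwarg_for(content_type):
    """Map a Content-Type seen in DevTools to the matching requests argument."""
    # Drop parameters such as "; charset=UTF-8" or "; boundary=..."
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/x-www-form-urlencoded":
        return "data"
    if media_type == "multipart/form-data":
        return "files"
    if media_type == "application/json":
        return "json"
    raise ValueError(f"no obvious requests argument for {media_type!r}")


print(requests_kwarg_for("application/json; charset=UTF-8"))        # json
print(requests_kwarg_for("multipart/form-data; boundary=----abc"))  # files
print(requests_kwarg_for("application/x-www-form-urlencoded"))      # data
```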

Form-encoded POST (data=)

The classic HTML form. Pass a dict to data=:

import requests

r = requests.post(
  "https://practice.scrapingcentral.com/challenges/static/forms/post",
  data={"username": "alice", "color": "blue"},
)
print(r.status_code, r.text[:200])

requests sets Content-Type: application/x-www-form-urlencoded automatically and serializes the dict as username=alice&color=blue. This is the format most traditional HTML forms expect.
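Values with spaces, symbols, or repeated fields are where form encoding gets less obvious. The standard library's `urlencode` illustrates the rules; a dict passed to `data=` should end up encoded the same way. The field names and values here are made up.

```python
from urllib.parse import urlencode

# A list value stands for a field that appears more than once in the form.
fields = {"q": "blue mug", "price": "<10", "tag": ["kitchen", "sale"]}

body = urlencode(fields, doseq=True)
print(body)  # q=blue+mug&price=%3C10&tag=kitchen&tag=sale
```

Spaces become `+`, reserved characters are percent-encoded, and each list item becomes its own `name=value` pair.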

Multipart POST (files=)

Used when you upload files, or when the form was declared enctype="multipart/form-data":

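import requests

# url is a placeholder for the upload endpoint you found in DevTools
with open("photo.jpg", "rb") as f:  # the with-block closes the file after the upload
  # Each entry is (filename, file object, MIME type)
  files = {"avatar": ("photo.jpg", f, "image/jpeg")}
  data = {"username": "alice"}
  r = requests.post(url, files=files, data=data)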

Note: when you pass files=, requests automatically switches the Content-Type to multipart/form-data with a generated boundary. You can include regular fields via data= at the same time; they ride along inside the same multipart body.

You can also use multipart without any file at all (some APIs require it):

files = {"field": (None, "value")}
r = requests.post(url, files=files)
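To see what a file-less multipart body actually looks like, you can prepare the request without sending it. This is a sketch assuming `requests` is installed; the URL is a placeholder and the field names are the ones used above.

```python
import requests

# Placeholder URL; the request is prepared but never sent.
req = requests.Request(
    "POST",
    "https://example.com/upload",
    files={"field": (None, "value")},  # filename None means a plain field
    data={"username": "alice"},
)
prepared = req.prepare()

content_type = prepared.headers["Content-Type"]
boundary = content_type.split("boundary=", 1)[1]
body = prepared.body  # bytes: one part per field, separated by the boundary

print(content_type)
print(body.decode("utf-8"))
```

Each part carries its own Content-Disposition line with the field name, and the boundary string from the header separates the parts.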

JSON POST (json=)

Modern APIs almost always want JSON:

r = requests.post(
  "https://practice.scrapingcentral.com/api/products",
  json={"name": "New product", "price": 9.99},
)

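json= does two things at once: it serializes the dict to JSON (as json.dumps() would) and sets Content-Type: application/json. Do not pass data=json.dumps(...) on its own. That sends JSON in the body, but requests does not add a Content-Type header for a string body, so a server that checks the header will likely reject the request. If you have to build the JSON string yourself, also pass headers={"Content-Type": "application/json"}.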
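A quick way to check what each variant sends is to prepare the requests without sending them. This sketch assumes `requests` is installed; the URL is a placeholder and `prepare` is a local helper, not part of the library.

```python
import json

import requests

URL = "https://example.com/api"  # placeholder; nothing is sent
payload = {"name": "New product", "price": 9.99}


def prepare(**kwargs):
    """Build a POST request and return it without sending."""
    return requests.Request("POST", URL, **kwargs).prepare()


with_json = prepare(json=payload)
raw_string = prepare(data=json.dumps(payload))
raw_with_header = prepare(
    data=json.dumps(payload),
    headers={"Content-Type": "application/json"},
)

print(with_json.headers.get("Content-Type"))        # application/json
print(raw_string.headers.get("Content-Type"))       # None
print(raw_with_header.headers.get("Content-Type"))  # application/json
```

All three bodies decode to the same JSON; only the header differs, which is usually what the server checks first.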

Common mistake: mixing them up

# Wrong, server expects JSON
requests.post(url, data={"name": "thing"})
# Sends: Content-Type: application/x-www-form-urlencoded
#  Body: name=thing
# Server tries to JSON-parse it → 400 Bad Request

# Right
requests.post(url, json={"name": "thing"})
# Sends: Content-Type: application/json
#  Body: {"name": "thing"}

When a POST mysteriously fails with 400, check Content-Type first. It's the #1 bug.

Inspecting what you sent

r = requests.post(url, json={"k": "v"})
print(r.request.body)  # b'{"k": "v"}'
print(r.request.headers["Content-Type"])  # application/json

r.request.body is the body exactly as requests prepared it: bytes for json= and files=, a string for a dict passed to data=. If the server is unhappy, compare it side by side with what the browser sends in DevTools.
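For form-encoded bodies, field order can differ between your scraper and the browser without it mattering, so a plain string comparison may mislead. A minimal sketch of an order-insensitive comparison follows; the helper name `same_form_fields` and the sample bodies are made up.

```python
from urllib.parse import parse_qs


def same_form_fields(sent_body, browser_body):
    """Compare two form-encoded bodies, ignoring field order."""
    if isinstance(sent_body, bytes):
        sent_body = sent_body.decode("utf-8")
    return parse_qs(sent_body, keep_blank_values=True) == parse_qs(
        browser_body, keep_blank_values=True
    )


# Body as copied from the DevTools payload view (illustrative values).
browser_body = "color=blue&username=alice"

print(same_form_fields("username=alice&color=blue", browser_body))  # True
print(same_form_fields("username=alice", browser_body))             # False
```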

Reading the response

POST responses can be HTML, JSON, redirects, or anything else:

r = requests.post(url, data=payload)
if r.headers.get("Content-Type", "").startswith("application/json"):
  data = r.json()
else:
  # HTML, parse with BeautifulSoup
  ...

Many forms POST and then redirect to a success page. r.url after a redirect tells you where you landed; r.history shows the redirect chain.
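The redirect chain can be flattened into a short trail for logging. In this sketch, `redirect_trail` is a hypothetical helper, and a stand-in object replaces a real response so the example runs without a network call; a real `requests` response exposes the same `status_code`, `url`, and `history` attributes.

```python
from types import SimpleNamespace


def redirect_trail(response):
    """Return (status_code, url) for each hop, ending with the final response."""
    hops = [(hop.status_code, hop.url) for hop in response.history]
    hops.append((response.status_code, response.url))
    return hops


# Stand-in for a response that was redirected once. URLs are made up.
fake = SimpleNamespace(
    status_code=200,
    url="https://example.com/success",
    history=[SimpleNamespace(status_code=302, url="https://example.com/submit")],
)

for status, url in redirect_trail(fake):
    print(status, url)
```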

Idempotency

GET requests are idempotent: repeating them is safe. POST requests usually aren't: a second POST to a "create order" endpoint creates a second order. Be careful in retry logic (Lesson 1.6), because repeating a failed POST might double-submit. When in doubt, GET the resource list afterward to confirm whether the first POST actually succeeded.
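The "check before you resend" idea can be sketched as a small wrapper. Everything here is an assumption for illustration: `submit_once`, `send`, and `already_created` are hypothetical names, and the built-in `ConnectionError` stands in for whatever your HTTP client raises (with `requests` you would pass `requests.exceptions.RequestException` as `retry_on`).

```python
def submit_once(send, already_created, attempts=3, retry_on=(ConnectionError,)):
    """Retry a non-idempotent action, but verify before each resend."""
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except retry_on:
            # The request may have reached the server even though we saw an error.
            if already_created():
                return None  # created by an earlier attempt; do not resend
            if attempt == attempts:
                raise


# Simulated server: the order is created, but the reply is lost.
calls = []
server_orders = []


def flaky_send():
    calls.append(1)
    server_orders.append("order")
    raise ConnectionError("reply lost")


result = submit_once(flaky_send, lambda: len(server_orders) > 0)
print(len(calls), len(server_orders))  # 1 1
```

Without the check, three attempts would have created three orders; with it, the wrapper stops after the first one.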

A complete worked example

import requests

s = requests.Session()
s.headers["User-Agent"] = "Mozilla/5.0 (compatible; learning-scraper)"

# Submit a search form
r = s.post(
  "https://practice.scrapingcentral.com/challenges/static/forms/post",
  data={"q": "mug", "page": 1},
  timeout=10,
)
r.raise_for_status()
print("Final URL:", r.url)
print("Body preview:", r.text[:300])

We used a Session here. Sessions are how cookies persist across requests, which is the topic of the next lesson.

Hands-on lab

Open /challenges/static/forms/post in your browser. Use DevTools to inspect the form's <form> tag: what's its action, method, enctype? Submit it once manually, look at the Network tab to confirm the body format, then replicate the POST from Python. Compare your r.request.body to what the browser sent. They should be identical.
