HTTP in Go: net/http Without a Framework
The standard library is the framework. Build, send, and customise HTTP requests in Go with net/http: headers, cookies, timeouts, proxies, and TLS settings.
What you’ll learn
- Construct GET, POST, and custom-header requests using net/http.
- Configure a Client with timeouts, transports, and proxy URLs.
- Persist cookies across requests with `cookiejar.Jar`.
- Read and parse JSON and HTML response bodies.
Go's standard library is the framework. net/http is mature, fast, and the base that most Go scraping tools build on or plug into (utls and tls-client included). No third-party HTTP client is needed for the vast majority of work; the niche where you reach for tls-client is specifically TLS fingerprinting, covered in GO5.
The quickest possible GET
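package main
import (
	"fmt"
	"io"
	"net/http"
)
func main() {
	resp, err := http.Get("https://practice.scrapingcentral.com/")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	// Print at most the first 200 bytes; body[:200] panics on a shorter body.
	if len(body) > 200 {
		body = body[:200]
	}
	fmt.Println(resp.StatusCode, string(body))
}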
Three things to internalise:
- Always `defer resp.Body.Close()` right after the error check. Skipping this leaks connections and file descriptors. It's the Go equivalent of forgetting `with` in Python.
- `io.ReadAll` is how you consume the whole body. For huge bodies, stream with `io.Copy` instead.
- `http.Get` uses `http.DefaultClient` under the hood. For real scrapers you'll build your own client (see below).
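The difference between buffering and streaming a body can be sketched as follows. This is a minimal illustration, not a prescribed pattern: `fetchPreview`, `countBytes`, and `newTestServer` are hypothetical helper names, and the local `httptest` server merely stands in for a real site so the example runs offline.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// fetchPreview reads at most limit bytes of the body, so a short body never
// causes an out-of-range slice and a huge one is never fully buffered.
func fetchPreview(client *http.Client, target string, limit int64) (int, string, error) {
	resp, err := client.Get(target)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()

	preview, err := io.ReadAll(io.LimitReader(resp.Body, limit))
	if err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, string(preview), nil
}

// countBytes streams the whole body through io.Copy without holding it in
// memory. io.Discard is the destination here; a file would work the same way.
func countBytes(client *http.Client, target string) (int64, error) {
	resp, err := client.Get(target)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return io.Copy(io.Discard, resp.Body)
}

// newTestServer starts a local server that returns a 1000-byte body.
// It is an assumption of this sketch, standing in for a real site.
func newTestServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, strings.Repeat("x", 1000))
	}))
}

func main() {
	srv := newTestServer()
	defer srv.Close()

	status, preview, err := fetchPreview(srv.Client(), srv.URL, 200)
	if err != nil {
		panic(err)
	}
	fmt.Println(status, len(preview))

	n, err := countBytes(srv.Client(), srv.URL)
	if err != nil {
		panic(err)
	}
	fmt.Println("streamed bytes:", n)
}
```

Both helpers close the body with `defer` immediately after the error check, which is the habit the first bullet describes.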
Why you should always build your own Client
http.DefaultClient has no timeout. If a server hangs, your scraper hangs forever. Build a Client with sensible defaults:
import (
	"net/http"
	"time"
)
client := &http.Client{
	Timeout: 30 * time.Second,
}
resp, err := client.Get(url)
The Timeout covers the entire request: connection + TLS + headers + body read. If you want finer control, use Transport (below) and context.Context for per-request deadlines.
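To see what a timeout looks like from the caller's side, here is a small sketch. It assumes a local `httptest` server that deliberately stalls; `isTimeout`, `newSlowServer`, and `getWithTimeout` are hypothetical helper names. The errors returned by `Client` methods are `*url.Error` values, whose `Timeout` method reports whether a deadline expired.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// isTimeout reports whether err was caused by a deadline or timeout expiring.
func isTimeout(err error) bool {
	var netErr net.Error
	return errors.As(err, &netErr) && netErr.Timeout()
}

// newSlowServer starts a local test server whose handler waits for delay
// before answering. It stands in for a hanging site.
func newSlowServer(delay time.Duration) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(delay):
			fmt.Fprint(w, "ok")
		case <-r.Context().Done(): // the client gave up; stop waiting
		}
	}))
}

// getWithTimeout issues a GET through a client whose Timeout is limit.
// A fresh client per call keeps the demo short; real code would share one.
func getWithTimeout(target string, limit time.Duration) error {
	client := &http.Client{Timeout: limit}
	resp, err := client.Get(target)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

func main() {
	srv := newSlowServer(500 * time.Millisecond)
	defer srv.Close()

	err := getWithTimeout(srv.URL, 50*time.Millisecond)
	fmt.Println("timed out:", isTimeout(err))

	err = getWithTimeout(srv.URL, 5*time.Second)
	fmt.Println("error with a generous timeout:", err)
}
```

Checking `Timeout()` rather than matching error strings lets a scraper decide whether a retry is worthwhile.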
Adding headers
http.Get doesn't let you set headers. Build a Request and call client.Do(req):
req, err := http.NewRequest("GET", url, nil)
if err != nil {
	return err
}
req.Header.Set("User-Agent", "Mozilla/5.0 (compatible; my-scraper/1.0)")
req.Header.Set("Accept", "text/html,application/xhtml+xml")
req.Header.Set("Accept-Language", "en-US,en;q=0.9")
resp, err := client.Do(req)
This is the form you'll use most often. Note that the default User-Agent Go sets is Go-http-client/1.1 (Go-http-client/2.0 over HTTP/2), instantly identifiable as a bot. Set a realistic browser UA.
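One way to check what the server actually receives is to echo the header back. The sketch below assumes a local `httptest` server; `newEchoServer` and `fetchUA` are hypothetical names, and the browser string is only an example value.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newEchoServer starts a local test server that replies with the
// User-Agent header it received.
func newEchoServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, r.UserAgent())
	}))
}

// fetchUA sends a GET and returns the User-Agent the server saw.
// An empty ua leaves Go's default header in place.
func fetchUA(client *http.Client, target, ua string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return "", err
	}
	if ua != "" {
		req.Header.Set("User-Agent", ua)
	}
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	srv := newEchoServer()
	defer srv.Close()

	def, err := fetchUA(srv.Client(), srv.URL, "")
	if err != nil {
		panic(err)
	}
	fmt.Println("default:", def)

	custom, err := fetchUA(srv.Client(), srv.URL, "Mozilla/5.0 (X11; Linux x86_64)")
	if err != nil {
		panic(err)
	}
	fmt.Println("custom: ", custom)
}
```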
POST with a body
For form POSTs:
import (
	"net/url"
	"strings"
)
form := url.Values{}
form.Set("username", "scraper")
form.Set("password", "letmein")
req, err := http.NewRequest("POST", loginURL, strings.NewReader(form.Encode()))
if err != nil {
	return err
}
req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
resp, err := client.Do(req)
if err != nil {
	return err
}
defer resp.Body.Close()
For JSON POSTs:
import (
	"bytes"
	"encoding/json"
)
payload, err := json.Marshal(map[string]string{
	"query": "laptops",
	"page":  "2",
})
if err != nil {
	return err
}
req, err := http.NewRequest("POST", apiURL, bytes.NewReader(payload))
if err != nil {
	return err
}
req.Header.Set("Content-Type", "application/json")
resp, err := client.Do(req)
if err != nil {
	return err
}
defer resp.Body.Close()
bytes.NewReader and strings.NewReader both implement io.Reader, which is what http.NewRequest wants for the body.
Parsing JSON responses
type SearchResult struct {
	Total int `json:"total"`
	Items []struct {
		ID    string `json:"id"`
		Title string `json:"title"`
	} `json:"items"`
}
var result SearchResult
if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
	return err
}
fmt.Println(result.Total)
// Guard the index: Items[0] panics when the result set is empty.
if len(result.Items) > 0 {
	fmt.Println(result.Items[0].Title)
}
The json:"total" struct tags map JSON field names to Go field names. Go field names must be capitalised, that is exported; otherwise encoding/json can't see them.
For deeply nested or unknown JSON shapes, use map[string]any:
var data map[string]any
json.NewDecoder(resp.Body).Decode(&data)
// data["items"].([]any)[0].(map[string]any)["title"].(string)
It works but it's painful. Define a struct when you can.
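If a struct really is not an option, the chained assertions can at least be made safe. A minimal sketch, assuming the same `items[0].title` shape as the comment above; `firstTitle` and `decodeFirstTitle` are hypothetical helpers. The comma-ok form of a type assertion returns `false` instead of panicking when a key is missing or has an unexpected type.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// firstTitle extracts items[0].title from loosely typed JSON. Every step
// uses a comma-ok assertion, so bad input yields ok == false, not a panic.
func firstTitle(data map[string]any) (string, bool) {
	items, ok := data["items"].([]any)
	if !ok || len(items) == 0 {
		return "", false
	}
	first, ok := items[0].(map[string]any)
	if !ok {
		return "", false
	}
	title, ok := first["title"].(string)
	return title, ok
}

// decodeFirstTitle decodes raw JSON into a map and looks up the first title.
func decodeFirstTitle(raw string) (string, bool, error) {
	var data map[string]any
	if err := json.NewDecoder(strings.NewReader(raw)).Decode(&data); err != nil {
		return "", false, err
	}
	title, ok := firstTitle(data)
	return title, ok, nil
}

func main() {
	inputs := []string{
		`{"items":[{"title":"Laptop"}]}`,
		`{"items":[]}`,
		`{"items":"not a list"}`,
	}
	for _, raw := range inputs {
		title, ok, err := decodeFirstTitle(raw)
		fmt.Printf("%-32s title=%q ok=%v err=%v\n", raw, title, ok, err)
	}
}
```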
Parsing HTML
Go's stdlib doesn't include an HTML parser. Use golang.org/x/net/html (semi-official) or the more ergonomic github.com/PuerkitoBio/goquery (jQuery-style selectors).
import "github.com/PuerkitoBio/goquery"
doc, err := goquery.NewDocumentFromReader(resp.Body)
if err != nil {
	return err
}
doc.Find(".product").Each(func(i int, s *goquery.Selection) {
	title := s.Find("h2").Text()
	price, _ := s.Find(".price").Attr("data-price")
	fmt.Println(title, price)
})
goquery's API is modelled on jQuery. If you've used CSS selectors with BS4's select(), you can write goquery.
Cookies, persistence
By default, http.Client does not persist cookies. Each request is stateless. To survive logins and session-based sites, attach a cookie jar:
import "net/http/cookiejar"
jar, _ := cookiejar.New(nil)
client := &http.Client{
	Jar:     jar,
	Timeout: 30 * time.Second,
}
// Now any Set-Cookie on a response is stored, and any subsequent request
// whose URL matches the cookie's domain and path carries it.
If you want to inspect or set cookies manually:
// parsedURL is a *url.URL for the site, for example from url.Parse.
for _, c := range jar.Cookies(parsedURL) {
	fmt.Println(c.Name, c.Value)
}
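Setting cookies by hand goes through `jar.SetCookies`. The following sketch shows both directions against a local `httptest` server; the `/login` and `/profile` paths, the `session` cookie name, and the helpers `newSessionServer`, `fetch`, and `sessionDemo` are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/cookiejar"
	"net/http/httptest"
	"net/url"
)

// newSessionServer starts a local test server: /login sets a session cookie,
// every other path echoes the session cookie value it receives.
func newSessionServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/login", func(w http.ResponseWriter, r *http.Request) {
		http.SetCookie(w, &http.Cookie{Name: "session", Value: "abc123", Path: "/"})
	})
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		if c, err := r.Cookie("session"); err == nil {
			io.WriteString(w, c.Value)
		}
	})
	return httptest.NewServer(mux)
}

// fetch performs a GET and returns the body as a string.
func fetch(client *http.Client, target string) (string, error) {
	resp, err := client.Get(target)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// sessionDemo logs in, reads the echoed cookie, then overwrites the cookie
// by hand and reads it again. It returns both echoed values.
func sessionDemo(base string) (string, string, error) {
	jar, err := cookiejar.New(nil)
	if err != nil {
		return "", "", err
	}
	client := &http.Client{Jar: jar}

	if _, err := fetch(client, base+"/login"); err != nil {
		return "", "", err
	}
	fromLogin, err := fetch(client, base+"/profile")
	if err != nil {
		return "", "", err
	}

	u, err := url.Parse(base)
	if err != nil {
		return "", "", err
	}
	// Same name and path as the server's cookie, so this replaces it.
	jar.SetCookies(u, []*http.Cookie{{Name: "session", Value: "manual", Path: "/"}})
	manual, err := fetch(client, base+"/profile")
	if err != nil {
		return "", "", err
	}
	return fromLogin, manual, nil
}

func main() {
	srv := newSessionServer()
	defer srv.Close()

	fromLogin, manual, err := sessionDemo(srv.URL)
	if err != nil {
		panic(err)
	}
	fmt.Println("after login:      ", fromLogin)
	fmt.Println("after SetCookies: ", manual)
}
```

Manually seeding the jar is handy when a session cookie was captured elsewhere, for instance from a browser session.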
Proxies
You configure proxies on the Transport, not the Client:
import (
	"net/http"
	"net/url"
)
proxyURL, _ := url.Parse("http://user:pass@proxy.example.com:8080")
transport := &http.Transport{
	Proxy: http.ProxyURL(proxyURL),
}
client := &http.Client{
	Transport: transport,
	Timeout:   30 * time.Second,
}
For dynamic per-request proxies (rotation):
// pickRandomProxy is a placeholder for your own selection logic; it must return a *url.URL.
transport.Proxy = func(req *http.Request) (*url.URL, error) {
	return pickRandomProxy(), nil
}
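The Transport picks the proxy type from the URL scheme you pass to http.ProxyURL: HTTP proxies, HTTPS targets tunnelled through CONNECT, and SOCKS5 proxies (a socks5:// URL) are all supported natively. You only need golang.org/x/net/proxy if you want to build a custom dialer yourself.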
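As one possible shape for the rotation function, here is a round-robin sketch. `roundRobinProxy` is a hypothetical helper, not part of net/http, and the proxy hostnames are placeholders. The atomic counter matters because the Transport may call `Proxy` from many goroutines at once.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/url"
	"sync/atomic"
)

// roundRobinProxy parses the given proxy URLs once and returns a function
// suitable for Transport.Proxy that cycles through them, one per call.
func roundRobinProxy(rawURLs []string) (func(*http.Request) (*url.URL, error), error) {
	if len(rawURLs) == 0 {
		return nil, errors.New("no proxies configured")
	}
	proxies := make([]*url.URL, 0, len(rawURLs))
	for _, raw := range rawURLs {
		u, err := url.Parse(raw)
		if err != nil {
			return nil, fmt.Errorf("parse proxy %q: %w", raw, err)
		}
		proxies = append(proxies, u)
	}

	var next uint64 // incremented atomically; safe for concurrent requests
	return func(*http.Request) (*url.URL, error) {
		i := atomic.AddUint64(&next, 1) - 1
		return proxies[i%uint64(len(proxies))], nil
	}, nil
}

func main() {
	pick, err := roundRobinProxy([]string{
		"http://proxy-a.example.com:8080", // placeholder hosts
		"http://proxy-b.example.com:8080",
	})
	if err != nil {
		panic(err)
	}

	// Attach it exactly like the static http.ProxyURL case.
	transport := &http.Transport{Proxy: pick}
	_ = &http.Client{Transport: transport}

	// Call the picker directly to show the rotation order.
	for i := 0; i < 3; i++ {
		u, _ := pick(nil)
		fmt.Println(u.Host)
	}
}
```

Keep in mind that idle connections are pooled per proxy, so rotating across many proxies also spreads keep-alive connections across them.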
Timeouts at three levels
A robust scraper sets timeouts at multiple levels:
- Per-client overall timeout (`Client.Timeout`).
- Per-request context (`context.WithTimeout`), if you want different timeouts on different requests sharing one client.
- Per-stage transport timeouts (`Transport.DialContext`, `Transport.ResponseHeaderTimeout`, etc.) for fine control.
A reasonable default:
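import (
	"net"
	"net/http"
	"time"
)
transport := &http.Transport{
	DialContext: (&net.Dialer{
		Timeout: 5 * time.Second, // TCP connect
	}).DialContext,
	TLSHandshakeTimeout:   5 * time.Second,
	ResponseHeaderTimeout: 10 * time.Second, // wait for headers once the request is written
	IdleConnTimeout:       30 * time.Second, // how long an idle keep-alive connection is kept
	MaxIdleConns:          100,
	MaxIdleConnsPerHost:   10,
}
client := &http.Client{
	Transport: transport,
	Timeout:   30 * time.Second, // overall cap, including reading the body
}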
Tune to your workload. http.DefaultClient has no overall timeout, and http.DefaultTransport's defaults (a 30-second dial timeout and no ResponseHeaderTimeout at all) are too generous for scrapers.
Connection reuse
Like most HTTP clients, net/http keeps connections alive across requests by default. Use one Client (or one Transport) for the entire lifetime of your program, not a new one per request. Creating a new transport per request defeats keep-alive and triggers a fresh TCP + TLS handshake every time. A connection only goes back to the pool once its response body has been read to the end and closed.
This is the same lesson as Foundations' "Connection: keep-alive": reuse, don't reconnect.
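Reuse can be observed with net/http/httptrace, whose `GotConnInfo.Reused` field says whether a request ran on an already-open connection. The sketch below is an illustration under assumptions: it targets a local `httptest` server, and `newServer`, `getReused`, and `reuseDemo` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
)

// newServer starts a local test server with a tiny response body.
func newServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "hello")
	}))
}

// getReused performs a GET and reports whether the transport served it
// over a connection that was already open.
func getReused(client *http.Client, target string) (bool, error) {
	var reused bool
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
	}
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return false, err
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

	resp, err := client.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	// Drain the body so the connection can return to the idle pool.
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return false, err
	}
	return reused, nil
}

// reuseDemo sends n requests through one shared client, then n requests
// through a brand-new client each time, and counts the reused connections.
func reuseDemo(target string, n int) (int, int, error) {
	shared, fresh := 0, 0

	sharedClient := &http.Client{Transport: &http.Transport{}}
	for i := 0; i < n; i++ {
		reused, err := getReused(sharedClient, target)
		if err != nil {
			return 0, 0, err
		}
		if reused {
			shared++
		}
	}

	for i := 0; i < n; i++ {
		tr := &http.Transport{} // new transport means an empty connection pool
		reused, err := getReused(&http.Client{Transport: tr}, target)
		tr.CloseIdleConnections()
		if err != nil {
			return 0, 0, err
		}
		if reused {
			fresh++
		}
	}
	return shared, fresh, nil
}

func main() {
	srv := newServer()
	defer srv.Close()

	shared, fresh, err := reuseDemo(srv.URL, 5)
	if err != nil {
		panic(err)
	}
	fmt.Println("reused with one shared client:", shared)
	fmt.Println("reused with a client per call:", fresh)
}
```

With a shared client, every request after the first would normally report a reused connection, while the per-call clients never can, because each starts with an empty pool.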
Following redirects
By default, a Client follows up to 10 consecutive redirects and then gives up with an error. You can override that with CheckRedirect:
client := &http.Client{
	CheckRedirect: func(req *http.Request, via []*http.Request) error {
		// via holds the requests made so far. After 5 hops, stop and
		// hand back the last redirect response instead of an error.
		if len(via) >= 5 {
			return http.ErrUseLastResponse
		}
		return nil
	},
}
// Or to disable redirects entirely:
client.CheckRedirect = func(req *http.Request, via []*http.Request) error {
	return http.ErrUseLastResponse
}
Disabling redirects is sometimes useful when scraping URL shorteners or when you want to capture every intermediate URL.
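Both uses can be sketched against a local redirect chain. Everything here is illustrative: the `/a`, `/b`, `/c` paths and the helpers `newRedirectServer`, `traceRedirects`, and `firstLocation` are assumptions, and the `httptest` server stands in for a URL shortener.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newRedirectServer starts a local test server where /a redirects to /b,
// /b redirects to /c, and /c answers 200.
func newRedirectServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.Handle("/a", http.RedirectHandler("/b", http.StatusFound))
	mux.Handle("/b", http.RedirectHandler("/c", http.StatusFound))
	mux.HandleFunc("/c", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "done")
	})
	return httptest.NewServer(mux)
}

// traceRedirects follows redirects from start and returns the path of each
// redirect target in order, plus the final status code.
func traceRedirects(start string) ([]string, int, error) {
	var hops []string
	client := &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			if len(via) >= 10 {
				return errors.New("stopped after 10 redirects")
			}
			hops = append(hops, req.URL.Path) // req is the upcoming request
			return nil
		},
	}
	resp, err := client.Get(start)
	if err != nil {
		return nil, 0, err
	}
	defer resp.Body.Close()
	return hops, resp.StatusCode, nil
}

// firstLocation disables redirects and returns the status code and the
// Location header of the very first response.
func firstLocation(start string) (int, string, error) {
	client := &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	resp, err := client.Get(start)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	return resp.StatusCode, resp.Header.Get("Location"), nil
}

func main() {
	srv := newRedirectServer()
	defer srv.Close()

	hops, status, err := traceRedirects(srv.URL + "/a")
	if err != nil {
		panic(err)
	}
	fmt.Println("hops:", hops, "final status:", status)

	code, loc, err := firstLocation(srv.URL + "/a")
	if err != nil {
		panic(err)
	}
	fmt.Println("first response:", code, loc)
}
```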
What this all looks like together
package main
import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/cookiejar"
	"time"
)
type APIItem struct {
	ID    string `json:"id"`
	Title string `json:"title"`
}
func main() {
	jar, _ := cookiejar.New(nil)
	client := &http.Client{
		Jar:     jar,
		Timeout: 30 * time.Second,
	}
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	req, _ := http.NewRequestWithContext(ctx, "GET",
		"https://practice.scrapingcentral.com/api/items", nil)
	req.Header.Set("User-Agent", "Mozilla/5.0")
	req.Header.Set("Accept", "application/json")
	resp, err := client.Do(req)
	if err != nil {
		fmt.Println("err:", err)
		return
	}
	defer resp.Body.Close()
	var items []APIItem
	if err := json.NewDecoder(resp.Body).Decode(&items); err != nil {
		fmt.Println("decode:", err)
		return
	}
	for _, it := range items {
		fmt.Println(it.ID, it.Title)
	}
}
Roughly 40 lines. Cookies persisted. Timeouts at two levels. Real UA. JSON decoded into a typed struct. This is what production Go scrapers look like, with tls-client swapped in (GO5) when fingerprinting matters.
Where to practice
- Convert your favourite Python `requests`-based scraper to Go. Don't over-optimise; just match it function-for-function.
- Add a `cookiejar.Jar` to the client and verify cookies persist across requests against `practice.scrapingcentral.com`.
- Read Go by Example: HTTP Client and Go by Example: JSON. Both are short, both are exactly the patterns above.
Next: GO5 covers the scraping-specific reason this sub-path exists, TLS fingerprinting with utls and tls-client.