Geographic Targeting
Country-, region-, and city-level proxy targeting. When geography matters, how to specify it correctly, and the gotchas every scraper hits.
What you’ll learn
- Configure geographic proxy targeting in Python and PHP.
- Detect when a site is serving you region-specific content.
- Verify actual egress location after rotation.
Many sites serve different content per region: pricing, availability, language, search results, regulatory disclosures. A scraper that doesn't control egress geography sees a different site than the one it intended to scrape.
Why geography matters
| Scenario | Why it matters |
|---|---|
| E-commerce pricing | Same product, different price in US vs DE vs IN |
| SERP scraping | Google results are heavily personalized by location |
| Streaming catalogues | Netflix, Spotify, etc. catalogues vary per region |
| News and media | Articles get blocked or modified per jurisdiction |
| Travel sites | Flight prices, hotel availability vary |
| Regulatory data | Some pages only appear for jurisdictions where they apply |
If you're scraping pricing data and inadvertently use a proxy in Vietnam, you'll see Vietnam prices, not the US prices you wanted.
Provider-level targeting
Most providers expose geo-targeting through the proxy username:
# Smartproxy / Decodo style
http://user-country-de:pass@gw.provider.com:7777 # Germany
http://user-country-us-state-california:pass@... # California
http://user-country-us-city-newyork:pass@... # New York City
# Bright Data style
http://brd-customer-X-zone-Y-country-de:pass@brd.superproxy.io:22225
# Oxylabs style
http://customer-X-country-DE:pass@pr.oxylabs.io:7777
Each provider's exact syntax differs but the concept is universal. Confirm in their docs.
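The username conventions above can be wrapped in a small helper so the geography is specified in one place. The following is a minimal sketch, assuming the Smartproxy / Decodo style `user-country-xx-state-yy-city-zz` pattern; the function name `build_proxy_url` is hypothetical, and other providers use different keys and ordering, so check their docs before reusing it.

```python
from urllib.parse import quote


def build_proxy_url(user, password, host, port, country=None, state=None, city=None):
    """Build a proxy URL with geo parameters encoded in the username.

    Assumes a 'user-country-xx-state-yy-city-zz' convention (hypothetical
    for any provider other than the style shown in this lesson).
    """
    parts = [user]
    if country:
        parts += ["country", country.lower()]
    if state:
        parts += ["state", state.lower().replace(" ", "")]
    if city:
        parts += ["city", city.lower().replace(" ", "")]
    username = "-".join(parts)
    # Percent-encode credentials so characters like '@' or ':' cannot break the URL.
    return f"http://{quote(username, safe='')}:{quote(password, safe='')}@{host}:{port}"


print(build_proxy_url("user", "pass", "gw.provider.com", 7777, country="DE"))
# http://user-country-de:pass@gw.provider.com:7777
print(build_proxy_url("user", "pass", "gw.provider.com", 7777, country="us", city="New York"))
# http://user-country-us-city-newyork:pass@gw.provider.com:7777
```

Keeping the format in one function means a provider switch only touches this helper, not every place a proxy string is assembled.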
Country, state, city precision
Cost typically increases with precision:
- Country-level: free or small premium.
- State/region: small premium, smaller pool.
- City-level: largest premium, smallest pool.
Use the broadest level that solves your problem. City-level pinning makes sense for hyper-local SERP work but rarely for general e-commerce.
Verifying egress location
Never trust the configuration alone. Verify:
import httpx

# httpx 0.26+ takes proxy=; the older proxies= argument was removed in 0.28.
client = httpx.Client(proxy="http://user-country-de:pass@gw.provider.com:7777")
# Check IP and inferred country
r = client.get("https://ipinfo.io/json")
print(r.json())
# {"ip": "84.x.x.x", "country": "DE", "city": "Berlin"...}
ipinfo.io and ip-api.com are common verification endpoints that return location data; httpbin.org/ip confirms the egress IP but reports no country. Add a verification call at scraper startup, and alert if the country doesn't match what was configured.
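def assert_egress(client, expected_country):
    # Ask a geo-IP endpoint where this client's traffic appears to originate.
    info = client.get("https://ipinfo.io/json").json()
    country = info.get("country")  # may be missing on error or rate-limited responses
    if country != expected_country:
        raise RuntimeError(f"egress {country} != {expected_country}")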
This check catches:
- Misconfigured proxy strings.
- Provider issues (your "Germany" gateway accidentally serving from Netherlands).
- DNS-level geo-steering (the IP is German but the connection routed via a different egress).
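With a rotating gateway, one successful check says little about the next IP, so it can help to sample the egress several times and tally what comes back. This is a rough sketch, not a definitive implementation: `sample_egress` and its `fetch_info` parameter are hypothetical names, and the lookups are faked here so the example runs offline. In practice `fetch_info` might be something like `lambda: client.get("https://ipinfo.io/json").json()`.

```python
from collections import Counter


def sample_egress(fetch_info, expected_country, samples=5):
    """Call fetch_info() several times and tally the reported countries.

    fetch_info is any zero-argument callable returning a dict with a
    'country' key (an assumption about the verification endpoint's payload).
    """
    seen = Counter()
    for _ in range(samples):
        info = fetch_info()
        seen[info.get("country", "unknown")] += 1
    # Anything other than the expected country counts as a mismatch.
    mismatches = {c: n for c, n in seen.items() if c != expected_country}
    return seen, mismatches


# Stand-in responses so the sketch runs without a proxy or network access.
fake_responses = iter([{"country": "DE"}, {"country": "DE"}, {"country": "NL"}])
seen, mismatches = sample_egress(lambda: next(fake_responses), "DE", samples=3)
print(dict(seen))   # {'DE': 2, 'NL': 1}
print(mismatches)   # {'NL': 1}
```

A non-empty `mismatches` dict is a reasonable trigger for an alert, or for discarding the responses fetched through the mismatched sessions.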
Locale matters too
Geographic targeting via IP is necessary but not sufficient. Many sites also use:
- Accept-Language header. A German IP requesting Accept-Language: en-US sometimes triggers a localized response anyway.
- Cookies. Region/language cookies from previous visits.
- URL/subdomain. amazon.de vs amazon.com.
For coherent scraping, set the IP, language, and explicit region URL together:
client = httpx.Client(
    # proxy= replaces the proxies= argument, which httpx removed in 0.28
    proxy="http://user-country-de:pass@...",
    headers={
        "Accept-Language": "de-DE,de;q=0.9,en;q=0.5",
        "User-Agent": "Mozilla/5.0 ... (German Chrome version)",
    },
)
r = client.get("https://example.com/de/produkte")
All three signals should agree. Inconsistencies (US IP, German language header, .com URL) look like a scraper.
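One way to keep the signals from drifting apart is to define them together in a per-region profile and derive every request setting from it. The sketch below is illustrative only: `REGION_PROFILES`, `request_settings`, the proxy template, and the `https://example.com/us` base URL are assumptions made for the example, not values from any provider or site.

```python
# Hypothetical per-region profiles: proxy country, language header, and
# site entry point are declared side by side so they always agree.
REGION_PROFILES = {
    "de": {
        "country": "de",
        "accept_language": "de-DE,de;q=0.9,en;q=0.5",
        "base_url": "https://example.com/de",
    },
    "us": {
        "country": "us",
        "accept_language": "en-US,en;q=0.9",
        "base_url": "https://example.com/us",
    },
}

# Placeholder gateway in the username-targeting style used in this lesson.
PROXY_TEMPLATE = "http://user-country-{country}:pass@gw.provider.com:7777"


def request_settings(region):
    """Return proxy, headers, and base URL that all point at one region."""
    profile = REGION_PROFILES[region]
    return {
        "proxy": PROXY_TEMPLATE.format(country=profile["country"]),
        "headers": {"Accept-Language": profile["accept_language"]},
        "base_url": profile["base_url"],
    }


settings = request_settings("de")
print(settings["proxy"])
print(settings["headers"]["Accept-Language"])
print(settings["base_url"])
```

Because a region is chosen once and everything else is looked up from it, a German proxy can never be paired with a US language header by accident.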
City-level targeting, when it matters
Most use cases don't need it. Where it does:
- Local SERP. Google "restaurants near me" results vary by city block.
- Real-estate. Some real-estate sites lock listings to nearby zip codes.
- Last-mile delivery quotes. Pricing/availability set by warehouse proximity.
For these, city-level pinning is essential. Test that your provider's city targeting actually serves the city's content: sometimes "city: Berlin" still routes to a generic German IP, not a Berlin-specific one.
Geo-IP databases
Anti-bot systems use IP-to-geography databases (MaxMind GeoIP, IPinfo, etc.) to determine your country. These databases can lag reality by weeks or months. A new IP range added by a residential provider may take time to be classified correctly.
Consequence: an IP your provider says is in Germany might still be flagged "US" by an old GeoIP database the target uses. This is sometimes why geographic content appears wrong despite correct proxy config.
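Since databases can disagree, a single lookup is weak evidence. A possible approach is to query more than one source for the same IP and flag any disagreement. The sketch below covers only the comparison step: `geo_consensus` is a hypothetical helper, the source labels are arbitrary, and the country codes are made-up inputs rather than real lookup results.

```python
from collections import Counter


def geo_consensus(lookups):
    """Summarize what several geo-IP sources say about one IP.

    lookups maps a source label to an ISO country code, or None when the
    lookup failed. Returns (most_common_country, unanimous).
    """
    votes = Counter(code.upper() for code in lookups.values() if code)
    if not votes:
        return None, False
    top_country, _ = votes.most_common(1)[0]
    return top_country, len(votes) == 1


# Made-up results for illustration; real values would come from each source's API.
country, unanimous = geo_consensus({"source_a": "DE", "source_b": "US", "source_c": "de"})
print(country, unanimous)  # DE False
```

When the sources are not unanimous, the target site may be relying on the dissenting database, which is worth ruling out before blaming the proxy configuration.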
Common gotchas
- Country mismatch with TLD. Scraping amazon.com from a German IP shows region-localized content based on cookies and IP, not the .com TLD. Both signals matter.
- CDN routing surprises. Cloudflare and others route to the nearest edge, not the origin server's nominal country. A German IP can still hit a US-edge CDN.
- Language cookies persist. If a previous request set a lang=fr cookie, subsequent requests are localized even from a US IP. Clear cookies when switching regions.
- Some sites geo-fence by ASN, not IP location. A residential IP from a "European" ASN routed through a US datacenter can be classified inconsistently.
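The cookie gotcha is easiest to avoid by never sharing a cookie jar between regions. Here is a minimal sketch using only the standard library (`urllib.request` and `http.cookiejar`) so it runs without extra dependencies; the helper name `make_region_opener` and the gateway URLs are placeholders. With httpx, creating a fresh `Client` per region achieves the same isolation.

```python
import urllib.request
from http.cookiejar import CookieJar


def make_region_opener(proxy_url):
    """Return (opener, jar): an opener bound to one proxy with its own empty cookie jar."""
    jar = CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url}),
        urllib.request.HTTPCookieProcessor(jar),
    )
    return opener, jar


# Placeholder gateways; nothing is fetched here, so no network access is needed.
us_opener, us_jar = make_region_opener("http://user-country-us:pass@gw.provider.com:7777")
de_opener, de_jar = make_region_opener("http://user-country-de:pass@gw.provider.com:7777")

print(us_jar is de_jar)  # False: cookies set via the US opener never reach the DE one
print(len(de_jar))       # 0: each region starts with a clean jar
```

Requests made through `de_opener.open(...)` would then carry only cookies that were set during German sessions.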
Geographic strategy patterns
Per-region worker fleets
For systematic multi-region scrapes, run separate worker pools per region:
WORKERS = [
    {"region": "us", "proxy": "...user-country-us..."},
    {"region": "de", "proxy": "...user-country-de..."},
    {"region": "jp", "proxy": "...user-country-jp..."},
]
Each worker scrapes its own region and writes to a region-tagged table. Clean separation, easy comparison.
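The fleet pattern could be sketched with one thread per region, each tagging its rows before they are stored. This is an assumption-laden outline: `run_region`, `run_fleet`, and the injected `scrape` callable are hypothetical names, the proxy URLs are placeholders, and the scrape function is faked so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder proxy URLs in the username-targeting style used in this lesson.
WORKERS = [
    {"region": "us", "proxy": "http://user-country-us:pass@gw.provider.com:7777"},
    {"region": "de", "proxy": "http://user-country-de:pass@gw.provider.com:7777"},
    {"region": "jp", "proxy": "http://user-country-jp:pass@gw.provider.com:7777"},
]


def run_region(worker, urls, scrape):
    """Scrape every URL through one region's proxy and tag each row with the region."""
    return [
        {"region": worker["region"], "url": url, "data": scrape(url, worker["proxy"])}
        for url in urls
    ]


def run_fleet(workers, urls, scrape):
    """Run one pool thread per region and return all rows in worker order."""
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        futures = [pool.submit(run_region, w, urls, scrape) for w in workers]
        return [row for future in futures for row in future.result()]


def fake_scrape(url, proxy):
    # Stand-in for a real fetch; it only records which proxy would be used.
    return {"proxy_used": proxy}


rows = run_fleet(WORKERS, ["https://example.com/item/1"], fake_scrape)
print([row["region"] for row in rows])  # ['us', 'de', 'jp']
```

Because every row carries its region, the results can go into a single table and still be compared region by region.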
Rotating geography
For tasks where region doesn't matter and you just want IP diversity, let the provider default to "any country":
http://user:pass@gw.provider.com:7777 # no country constraint
Cheaper, larger pool, randomized.
Hands-on lab
Pick a price-comparison or e-commerce target:
- Scrape one product page using a US proxy.
- Scrape the same URL using a UK proxy.
- Scrape using a German proxy.
- Compare the prices and product availability.
You'll see the difference. The exercise builds intuition for when geographic config matters, and when it doesn't.
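Once the three scrapes are done, the comparison step might look like the sketch below. It assumes you have already parsed each page into a price, a currency, and an availability flag; `compare_regions` is a hypothetical helper, and the sample numbers are invented purely to show the output shape.

```python
def compare_regions(results):
    """Summarize differences between per-region scrapes of the same product.

    results maps region -> {'price': float, 'currency': str, 'available': bool}.
    """
    currencies = {r["currency"] for r in results.values()}
    return {
        # More than one currency is a strong hint that content is region-specific.
        "currencies_differ": len(currencies) > 1,
        "unavailable_in": sorted(
            region for region, r in results.items() if not r["available"]
        ),
        "prices": {region: (r["price"], r["currency"]) for region, r in results.items()},
    }


# Invented sample values, not real scrape results.
sample = {
    "us": {"price": 19.99, "currency": "USD", "available": True},
    "uk": {"price": 17.49, "currency": "GBP", "available": True},
    "de": {"price": 21.99, "currency": "EUR", "available": False},
}
summary = compare_regions(sample)
print(summary["currencies_differ"])  # True
print(summary["unavailable_in"])     # ['de']
```

If all regions return the same currency, price, and availability, the target probably does not localize that page, and the cheaper unconstrained pool may be enough.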