Using Playwright with Proxies

Learn to configure Playwright with HTTP, SOCKS5, and rotating proxies for anonymous web scraping and IP rotation.

When you scrape at scale from a single IP address, sites are likely to rate-limit or block it. Proxies route your requests through different IP addresses, which makes it harder for a website to identify and block your scraper. Playwright has built-in proxy support at both the browser level and the context level.

Browser-Level Proxy

Set a proxy for the entire browser instance:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(
        headless=True,
        proxy={
            "server": "http://proxy-server.com:8080",
            "username": "your_username",
            "password": "your_password"
        }
    )
    page = browser.new_page()
    page.goto("https://httpbin.org/ip")
    print(page.inner_text("body"))  # JSON with the IP the site saw; this should be the proxy's IP
    browser.close()
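
To confirm that traffic really goes through the proxy, it helps to extract the reported address instead of printing the whole page. The sketch below assumes the response body is the small JSON document that https://httpbin.org/ip returns, with the address under the "origin" key. The function names extract_origin and check_proxy are illustrative, not part of Playwright.

```python
import json


def extract_origin(body_text: str) -> str:
    """Return the IP address reported in an httpbin-style JSON body."""
    return json.loads(body_text)["origin"]


def check_proxy(proxy: dict) -> str:
    """Open https://httpbin.org/ip through `proxy` and return the IP it reports."""
    # Imported here so extract_origin() can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True, proxy=proxy)
        try:
            page = browser.new_page()
            page.goto("https://httpbin.org/ip")
            return extract_origin(page.inner_text("body"))
        finally:
            browser.close()
```

Calling check_proxy({"server": "http://proxy-server.com:8080"}) and comparing the result with the address you see without a proxy is a quick sanity check before a long scraping run.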

Context-Level Proxy

Use different proxies for different browser contexts within the same browser:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)

    # Context 1 with proxy A
    context1 = browser.new_context(proxy={
        "server": "http://proxy-a.com:8080"
    })

    # Context 2 with proxy B
    context2 = browser.new_context(proxy={
        "server": "http://proxy-b.com:8080"
    })

    page1 = context1.new_page()
    page2 = context2.new_page()

    page1.goto("https://httpbin.org/ip")
    page2.goto("https://httpbin.org/ip")

    print("Proxy A IP:", page1.inner_text("body"))
    print("Proxy B IP:", page2.inner_text("body"))

    browser.close()
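
With more than a couple of URLs, writing out one context per proxy by hand gets tedious. One possible approach, sketched here under the assumption that each proxy is a dict in the same shape as the examples above, is to pair URLs with proxies in round-robin order and open a short-lived context for each pair. The helper names assign_proxies and scrape_with_contexts are made up for this illustration.

```python
from itertools import cycle


def assign_proxies(urls, proxies):
    """Pair each URL with a proxy, cycling through the proxies in order."""
    return list(zip(urls, cycle(proxies)))


def scrape_with_contexts(urls, proxies):
    """Visit each URL in its own context, all sharing one browser process."""
    # Imported here so assign_proxies() can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            for url, proxy in assign_proxies(urls, proxies):
                context = browser.new_context(proxy=proxy)
                try:
                    page = context.new_page()
                    page.goto(url)
                    print(url, "->", page.title())
                finally:
                    # Closing the context also discards its cookies and cache
                    context.close()
        finally:
            browser.close()
```

Because the browser is launched once and only the contexts change, this is usually cheaper than starting a new browser for every proxy.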

SOCKS5 Proxy

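Playwright supports SOCKS5 proxies as well. Use the socks5:// scheme in the server URL (p below is the object from sync_playwright(), as in the earlier examples). Username and password authentication has generally not been supported for SOCKS5 proxies, so expect this to work only with endpoints that need no credentials, such as IP-allowlisted ones: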

browser = p.chromium.launch(
    proxy={"server": "socks5://proxy-server.com:1080"}
)

Rotating Proxies

For large-scale scraping, rotate through a pool of proxies:

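from playwright.sync_api import sync_playwright
import random

# Credentials go in the username/password fields rather than inside the server URL.
PROXY_LIST = [
    {"server": "http://proxy1.com:8080", "username": "user", "password": "pass"},
    {"server": "http://proxy2.com:8080", "username": "user", "password": "pass"},
    {"server": "http://proxy3.com:8080", "username": "user", "password": "pass"},
    {"server": "http://proxy4.com:8080", "username": "user", "password": "pass"},
]

urls_to_scrape = [
    "https://example.com/page/1",
    "https://example.com/page/2",
    "https://example.com/page/3",
]

with sync_playwright() as p:
    for url in urls_to_scrape:
        # Pick a random proxy for each URL
        proxy = random.choice(PROXY_LIST)
        browser = p.chromium.launch(headless=True, proxy=proxy)

        try:
            page = browser.new_page()
            page.goto(url, timeout=30000)  # timeout is in milliseconds
            page.wait_for_selector("body")
            print(f"Scraped {url} via {proxy['server']}")
        except Exception as e:
            print(f"Failed {url}: {e}")
        finally:
            # Always release the browser, even when the page fails to load
            browser.close()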
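
The loop above gives up on a URL as soon as its one randomly chosen proxy fails. A common refinement is to retry the same URL through a different proxy. The following is a minimal sketch of that idea, not a definitive implementation: fetch_with_retries and its parameters are hypothetical names, and fetch stands for any function you supply that loads a URL through a given proxy and raises on failure.

```python
import random


def fetch_with_retries(url, proxies, fetch, max_attempts=3, rng=random):
    """Try `url` through up to `max_attempts` distinct proxies.

    `fetch(url, proxy)` is supplied by the caller and must raise on failure.
    Returns (result, proxy) on success and raises RuntimeError if every
    attempted proxy fails.
    """
    # Sample without replacement so the same proxy is never tried twice
    candidates = rng.sample(proxies, k=min(max_attempts, len(proxies)))
    last_error = None
    for proxy in candidates:
        try:
            return fetch(url, proxy), proxy
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            last_error = exc
    raise RuntimeError(
        f"All {len(candidates)} proxies failed for {url}"
    ) from last_error
```

In practice, fetch could wrap the launch, goto, and close sequence from the loop above and return whatever you extract from the page. Keeping the retry logic separate from the browser code also makes it easy to test with a stand-in function.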

Proxy with Authentication

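Proxy credentials belong in the username and password fields of the proxy setting, as in the browser-level example. The http_credentials option solves a different problem: it answers HTTP authentication prompts from the target site itself. If the site behind the proxy is protected this way, set both on the context:

context = browser.new_context(
    proxy={
        "server": "http://proxy-server.com:8080",
        "username": "proxy_user",
        "password": "proxy_pass"
    },
    http_credentials={
        "username": "site_user",
        "password": "site_pass"
    }
)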
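
Many providers hand out proxies as a single URL with the credentials embedded, such as http://user:pass@host:port, while the proxy examples in this guide pass the server, username, and password as separate fields. A small converter, sketched here with only the standard library, can bridge the two. The function name proxy_settings_from_url is an assumption for illustration, and the sketch does not handle IPv6 literal hosts.

```python
from urllib.parse import unquote, urlsplit


def proxy_settings_from_url(proxy_url: str) -> dict:
    """Split a proxy URL into the server/username/password fields."""
    parts = urlsplit(proxy_url)
    if not parts.scheme or not parts.hostname:
        raise ValueError(f"Not a usable proxy URL: {proxy_url!r}")

    # Rebuild the server address without the credentials
    server = f"{parts.scheme}://{parts.hostname}"
    if parts.port:
        server += f":{parts.port}"

    settings = {"server": server}
    # Credentials in URLs are percent-encoded, so decode them before use
    if parts.username:
        settings["username"] = unquote(parts.username)
    if parts.password:
        settings["password"] = unquote(parts.password)
    return settings
```

The returned dict can be passed directly as the proxy argument to launch() or new_context().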

Built-In Proxy Services

Managing your own proxy pool is complex. ScraperAPI provides automatic proxy rotation and geo-targeting through a single API endpoint: you pass your target URL to their API, and they handle proxy selection and rotation. ScrapingAnt similarly offers residential proxies through its scraping API, removing the need to source and manage proxies yourself.

Next Steps

  • Set up Selenium with proxies
  • Learn browser fingerprinting and stealth techniques
  • Explore parallel browser scraping