How to Scrape Twitter/X Data in 2026

A guide to scraping Twitter/X data in 2026 covering available methods, API options, and practical approaches for tweet and profile extraction.

Twitter (now X) has significantly tightened its anti-scraping measures since the platform's 2023 changes. API access has become more restrictive and expensive. Here is the current landscape and your options.

The Current State of X/Twitter Data Access

Since Elon Musk's acquisition, the platform has:

Made the API tiered and expensive (Basic plan starts at $200/month)
Aggressively blocked scrapers and unauthorized access
Rate-limited even authenticated API users
Required login to view most content

Method 1: Official X API (Recommended for Compliance)

The official API is the safest approach, though pricing is steep:

import requests

BEARER_TOKEN = "YOUR_X_BEARER_TOKEN"

headers = {
    "Authorization": f"Bearer {BEARER_TOKEN}",
    "Content-Type": "application/json"
}

# Search recent tweets
params = {
    "query": "web scraping -is:retweet lang:en",
    "max_results": 10,
    "tweet.fields": "created_at,public_metrics,author_id"
}

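response = requests.get(
    "https://api.twitter.com/2/tweets/search/recent",
    headers=headers,
    params=params,
    timeout=30
)
# Fail loudly on 401/403/429 instead of silently printing nothing
response.raise_for_status()

data = response.json()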
for tweet in data.get("data", []):
    print(f"Tweet: {tweet['text'][:100]}...")
    print(f"Likes: {tweet['public_metrics']['like_count']}")
    print("---")

Method 2: Scraping with a Browser API

For publicly visible profiles and tweets, you can fetch the page through a scraping API that renders JavaScript in a headless browser:

import requests
from bs4 import BeautifulSoup

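# X requires JavaScript rendering, so ask ScraperAPI to render the page
response = requests.get("https://api.scraperapi.com", params={
    "api_key": "YOUR_SCRAPERAPI_KEY",
    "url": "https://x.com/elonmusk",
    "render": "true"
}, timeout=70)  # rendered requests can be slow
response.raise_for_status()

# Note: X's heavy JS may require additional handling; an empty result
# usually means the timeline had not rendered or a login wall was served
soup = BeautifulSoup(response.text, "html.parser")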
tweets = soup.select('[data-testid="tweet"]')

for tweet in tweets:
    text_el = tweet.select_one('[data-testid="tweetText"]')
    if text_el:
        print(text_el.text[:200])
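Because the timeline is rendered client-side, a fetch can succeed and still contain no tweet elements. One way to handle this, sketched below, is to retry with exponential backoff until a check on the result passes. fetch_until, fetch, and is_complete are hypothetical names used for illustration, and the delays are arbitrary starting points, not values recommended by any provider.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def fetch_until(
    fetch: Callable[[], T],
    is_complete: Callable[[T], bool],
    attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call fetch() until is_complete(result) is true or attempts run out.

    Waits base_delay, then 2x, 4x, ... between attempts. Returns the first
    complete result, or None if every attempt came back incomplete.
    """
    for attempt in range(attempts):
        result = fetch()
        if is_complete(result):
            return result
        if attempt < attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return None


if __name__ == "__main__":
    # Simulated pages: the first render has no tweets, the second does
    pages = iter(["<html></html>", '<article data-testid="tweet"></article>'])
    html = fetch_until(
        fetch=lambda: next(pages),
        is_complete=lambda page: 'data-testid="tweet"' in page,
        sleep=lambda seconds: None,  # skip real waiting in the demo
    )
    print(html)
```

In practice fetch would wrap the rendering request and return response.text, while is_complete would check for the tweet selector. Keep attempts low, since each retry is a billed request with most scraping APIs.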

Method 3: Third-Party Data Providers

Several services provide X/Twitter data as managed scrapers or pre-collected datasets:

Apify Twitter Scrapers: pre-built actors for tweet and profile data
Bright Data Twitter Dataset: pre-collected datasets
Zyte: custom extraction for social media

Using ScrapingAnt for X

import requests

response = requests.get("https://api.scrapingant.com/v2/general", params={
    "x-api-key": "YOUR_SCRAPINGANT_KEY",
    "url": "https://x.com/search?q=web%20scraping&f=live",
    "browser": "true",
    "proxy_type": "residential",
    "wait_for_selector": '[data-testid="tweet"]'
})

# The v2 general endpoint returns the rendered HTML as the response body
print(response.text[:1000])
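The search URL in the request above is hand-encoded. When the query comes from user input or contains search operators, building it with the standard library avoids encoding mistakes. A small sketch follows; build_search_url is a hypothetical helper, and the q and f=live parameters are simply the ones that appear in the URL above.

```python
from urllib.parse import quote, urlencode


def build_search_url(query: str, live: bool = True) -> str:
    """Build an x.com search URL with the query percent-encoded."""
    params = {"q": query}
    if live:
        params["f"] = "live"  # same parameter as in the example URL above
    # quote_via=quote encodes spaces as %20 rather than "+"
    return "https://x.com/search?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(build_search_url("web scraping"))
    print(build_search_url("#python lang:en"))
```

The result can be passed as the url parameter of the scraping request; requests encodes it a second time for transport, and the provider decodes it back before fetching.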

Important Considerations

Legal risk: X's Terms of Service explicitly prohibit scraping
Rate limiting: aggressive even for legitimate API users
Data freshness: cached/indexed data may be stale
Cost: official API plans are expensive for high-volume needs
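On rate limiting specifically: X API responses carry x-rate-limit-remaining and x-rate-limit-reset headers, the latter a Unix timestamp for when the current window resets. A sketch of turning those headers into a wait time follows; seconds_to_wait is a hypothetical helper and the one-second padding is an arbitrary choice.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(
    headers: Mapping[str, str],
    now: Optional[float] = None,
    padding: float = 1.0,
) -> float:
    """Return how long to pause before the next request.

    Returns 0.0 while requests remain in the current window (or when the
    headers are missing); otherwise the time until the reset plus padding.
    """
    now = time.time() if now is None else now
    remaining = headers.get("x-rate-limit-remaining")
    reset = headers.get("x-rate-limit-reset")
    if remaining is None or reset is None:
        return 0.0
    if int(remaining) > 0:
        return 0.0
    return max(0.0, float(reset) - now) + padding


if __name__ == "__main__":
    sample = {"x-rate-limit-remaining": "0", "x-rate-limit-reset": "1700000060"}
    print(seconds_to_wait(sample, now=1700000000.0))
```

With requests, response.headers can be passed in directly because its lookups are case-insensitive. Sleeping for the returned value before the next call avoids spending requests on 429 responses.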

Verdict

For X/Twitter data, the official API is the safest route despite its cost. For public data, headless browser scraping via ScraperAPI or ScrapingAnt can work but requires careful implementation. Always consider the legal implications before scraping social media platforms.