How to Scrape Twitter/X Data in 2026
A guide to scraping Twitter/X data in 2026 covering available methods, API options, and practical approaches for tweet and profile extraction.
Twitter (now X) has significantly tightened its anti-scraping measures since the platform's 2023 changes. API access has become more restrictive and expensive. Here is the current landscape and your options.
The Current State of X/Twitter Data Access
Since Elon Musk's acquisition, the platform has:
- Made the API tiered and expensive (Basic plan starts at $200/month)
- Aggressively blocked scrapers and unauthorized access
- Rate-limited even authenticated API users
- Required login to view most content
Method 1: Official X API (Recommended for Compliance)
The official API is the safest approach, though pricing is steep:
import requests

BEARER_TOKEN = "YOUR_X_BEARER_TOKEN"
headers = {
    "Authorization": f"Bearer {BEARER_TOKEN}",
    "Content-Type": "application/json"
}

# Search recent tweets (excluding retweets, English only)
params = {
    "query": "web scraping -is:retweet lang:en",
    "max_results": 10,
    "tweet.fields": "created_at,public_metrics,author_id"
}
response = requests.get(
    "https://api.twitter.com/2/tweets/search/recent",
    headers=headers,
    params=params,
    timeout=30
)
# Fail loudly on auth or rate-limit errors instead of parsing an error body
response.raise_for_status()
data = response.json()

# "data" is absent when the search returns no results
for tweet in data.get("data", []):
    print(f"Tweet: {tweet['text'][:100]}...")
    print(f"Likes: {tweet['public_metrics']['like_count']}")
    print("---")
Method 2: Scraping with a Browser API
For publicly visible profiles and tweets, you can fetch the page through a scraping API that renders it in a headless browser for you, then parse the returned HTML:
import requests
from bs4 import BeautifulSoup

# X requires JavaScript rendering, so ask the API to render the page.
# keep_headers is left out: it forwards the client's own headers, and none are set here.
response = requests.get("https://api.scraperapi.com", params={
    "api_key": "YOUR_SCRAPERAPI_KEY",
    "url": "https://x.com/elonmusk",
    "render": "true"
}, timeout=90)
response.raise_for_status()

# Note: X's heavy JS may require additional handling
soup = BeautifulSoup(response.text, "html.parser")
tweets = soup.select('[data-testid="tweet"]')
for tweet in tweets:
    text_el = tweet.select_one('[data-testid="tweetText"]')
    if text_el:
        print(text_el.text[:200])
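Because X often serves a login wall or a page whose tweets have not rendered yet, it helps to have the extraction step as a function that returns a list, so an empty result can be detected and handled. The sketch below is one possible dependency-free variant built on the standard library's `html.parser` instead of BeautifulSoup. It assumes the same `data-testid="tweetText"` attribute used above, which is an observation about X's current markup and may change without notice; `TweetTextParser` and `extract_tweet_texts` are illustrative names.

```python
from html.parser import HTMLParser
from typing import List

# Void elements have no closing tag, so they must not affect depth tracking
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TweetTextParser(HTMLParser):
    """Collect the text inside every element marked data-testid="tweetText"."""

    def __init__(self) -> None:
        super().__init__()
        self.texts: List[str] = []
        self._depth = 0  # nesting depth inside the current tweetText element
        self._chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
        elif dict(attrs).get("data-testid") == "tweetText":
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # The tweetText element just closed: store its accumulated text
            self.texts.append("".join(self._chunks).strip())

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_tweet_texts(html: str) -> List[str]:
    """Return the text of each tweet found in the rendered HTML."""
    parser = TweetTextParser()
    parser.feed(html)
    parser.close()
    return parser.texts
```

An empty list from `extract_tweet_texts(response.text)` is a reasonable signal that the page did not render or that a login prompt was returned, in which case retrying or falling back to the official API is likely the better move.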
Method 3: Third-Party Data Providers
Several services sell X/Twitter data or run the collection for you:
- Apify Twitter Scrapers: pre-built actors for tweet and profile data
- Bright Data Twitter Dataset: pre-collected datasets
- Zyte: custom extraction for social media
Using ScrapingAnt for X
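import requests

response = requests.get("https://api.scrapingant.com/v2/general", params={
    "x-api-key": "YOUR_SCRAPINGANT_KEY",
    "url": "https://x.com/search?q=web%20scraping&f=live",
    "browser": "true",
    "proxy_type": "residential",
    "wait_for_selector": '[data-testid="tweet"]'
}, timeout=90)
response.raise_for_status()
# Assumption: the v2 /general endpoint returns the rendered HTML as the response
# body rather than a JSON envelope, so read response.text instead of response.json()
print(response.text[:1000])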
Important Considerations
- Legal risk: X's Terms of Service explicitly prohibit scraping
- Rate limiting is aggressive even for legitimate API users
- Data freshness: cached or indexed data may be stale
- Cost: official API plans are expensive for high-volume needs
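On the rate-limiting point, one common way to cope is to wait until the current window resets rather than retrying immediately. The sketch below assumes the response carries an `x-rate-limit-reset` header holding a Unix timestamp, as the official X API does on rate-limited endpoints; the function names, the 60-second fallback, and the one-second slack are illustrative choices, not values taken from X's documentation.

```python
import time
from typing import Callable, Mapping, Optional


def seconds_until_reset(
    headers: Mapping[str, str],
    now: Optional[float] = None,
    fallback: float = 60.0,
) -> float:
    """Return how long to sleep before retrying after an HTTP 429."""
    now = time.time() if now is None else now
    reset_at = headers.get("x-rate-limit-reset")
    if reset_at is None:
        return fallback  # header missing: use a conservative default
    try:
        wait = float(reset_at) - now
    except ValueError:
        return fallback  # header present but not a number
    return max(wait, 0.0) + 1.0  # one second of slack past the reset time


def get_with_rate_limit(
    send: Callable[[], object],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
):
    """Call send() until it returns a non-429 response or attempts run out."""
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt < max_attempts - 1:
            sleep(seconds_until_reset(response.headers))
    return response  # still rate-limited after the final attempt
```

With `requests`, `send` could be something like `lambda: requests.get(url, headers=headers, params=params, timeout=30)`. Third-party scraping APIs signal throttling in their own ways, so the header name here should not be assumed to apply to them.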
Verdict
For X/Twitter data, the official API is the safest route despite its cost. For public data, headless browser scraping via ScraperAPI or ScrapingAnt can work but requires careful implementation. Always consider the legal implications before scraping social media platforms.