Lead Generation with Web Scraping
Learn how to use web scraping for B2B lead generation. Extract business contacts, emails, and company data from directories and social platforms.
Web scraping is one of the most effective ways to build targeted B2B lead lists. Here is how to do it efficiently and ethically.
What Lead Data to Collect
- Company name and website
- Business email addresses
- Phone numbers
- Industry and company size
- Key decision makers
- Social media profiles
- Technology stack (via BuiltWith/Wappalyzer)
Best Sources for Lead Data
| Source | Data Available | Difficulty |
|---|---|---|
| Google Maps | Business name, phone, address, website | Medium |
| LinkedIn | Professional profiles, companies | Hard |
| Yellow Pages | Business listings by category | Easy |
| Crunchbase | Startup data, funding, founders | Medium |
| Industry directories | Niche-specific businesses | Easy |
| Company websites | Contact info, team pages | Easy |
Scraping Business Directories
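import requests
from bs4 import BeautifulSoup

API_KEY = "YOUR_SCRAPERAPI_KEY"

# Example: scraping a business directory
url = "https://www.yellowpages.com/search?search_terms=plumbing&geo_location_terms=Chicago"

# Pass the target URL via params so requests URL-encodes it; otherwise the
# "&geo_location_terms=..." part would be read as a ScraperAPI parameter.
resp = requests.get(
    "http://api.scraperapi.com",
    params={"api_key": API_KEY, "url": url},
    timeout=60,
)
resp.raise_for_status()  # stop early on HTTP errors

soup = BeautifulSoup(resp.text, "html.parser")
listings = soup.select(".result")  # one element per business listing

for listing in listings:
    name = listing.select_one(".business-name")
    phone = listing.select_one(".phones")
    # Fall back to 'N/A' when a listing has no name or phone element
    print(f"{name.text.strip() if name else 'N/A'}: {phone.text.strip() if phone else 'N/A'}")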
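Printing is fine for a first look, but later stages need the records stored. The following is a minimal sketch of writing them to CSV with the standard library, assuming each listing has already been reduced to a dict; the `name` and `phone` keys and the `write_leads` helper are hypothetical names, not part of any scraper's output.

```python
import csv
import io

FIELDS = ["name", "phone"]  # hypothetical column names chosen for this sketch


def write_leads(rows, fileobj):
    """Write lead dicts to an open text file as CSV, using 'N/A' for missing values."""
    writer = csv.DictWriter(
        fileobj, fieldnames=FIELDS, restval="N/A", extrasaction="ignore"
    )
    writer.writeheader()
    for row in rows:
        # Drop empty values so restval ('N/A') fills them in
        writer.writerow({key: value for key, value in row.items() if value})


# Usage with an in-memory buffer; for a real file, pass
# open("leads.csv", "w", newline="", encoding="utf-8") instead.
buffer = io.StringIO()
write_leads(
    [
        {"name": "Acme Plumbing", "phone": "(312) 555-0100"},
        {"name": "Pipe Pros", "phone": None},
    ],
    buffer,
)
print(buffer.getvalue())
```

Writing to a file object rather than a path keeps the helper easy to test and lets the same code target a file, a buffer, or an upload stream.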
Finding Email Addresses
Once you have a company domain, work out which email pattern the company uses and build candidate addresses from it:
# Common email patterns
patterns = [
"firstname@domain.com",
"firstname.lastname@domain.com",
"f.lastname@domain.com",
"firstnamel@domain.com",
]
# These are only guesses: verify each candidate with an email verification API before use
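The patterns above can be turned into concrete candidates for one person. This is a rough sketch, assuming you already have a first name, last name, and domain; `candidate_emails` is a hypothetical helper, and its output is a list of guesses that still needs verification.

```python
def candidate_emails(first, last, domain):
    """Return candidate addresses for one person, in the order of the common patterns."""
    first, last = first.strip().lower(), last.strip().lower()
    if not first or not last:
        raise ValueError("first and last name are both required")
    local_parts = [
        first,                 # firstname@
        f"{first}.{last}",     # firstname.lastname@
        f"{first[0]}.{last}",  # f.lastname@
        f"{first}{last[0]}",   # firstnamel@
    ]
    return [f"{local}@{domain.strip().lower()}" for local in local_parts]


print(candidate_emails("Jane", "Doe", "example.com"))
# ['jane@example.com', 'jane.doe@example.com', 'j.doe@example.com', 'janed@example.com']
```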
Enriching Lead Data
After collecting basic leads, enrich them with additional data:
- Company website: scrape for tech stack, team size, and recent news
- Social profiles: LinkedIn and Twitter for engagement data
- Review sites: Glassdoor for company health indicators
- News mentions: recent press coverage and announcements
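As one illustration of website enrichment, a very rough tech stack check can look for well-known marker strings in a page's HTML. The sketch below assumes you have already fetched the HTML; the `SIGNATURES` table and `detect_tech` helper are hypothetical, and the markers are a small illustrative set rather than the rules that BuiltWith or Wappalyzer actually use.

```python
# Illustrative markers only; dedicated tools rely on far larger rule sets.
SIGNATURES = {
    "WordPress": ["wp-content/", "wp-includes/"],
    "Shopify": ["cdn.shopify.com"],
    "HubSpot": ["js.hs-scripts.com"],
    "Google Tag Manager": ["googletagmanager.com"],
}


def detect_tech(html):
    """Return the sorted names of technologies whose markers appear in the HTML."""
    lowered = html.lower()
    return sorted(
        name
        for name, markers in SIGNATURES.items()
        if any(marker in lowered for marker in markers)
    )


sample = (
    '<link href="/wp-content/themes/x/style.css">'
    '<script src="https://www.googletagmanager.com/gtm.js"></script>'
)
print(detect_tech(sample))  # ['Google Tag Manager', 'WordPress']
```

Substring matching like this produces false positives and misses anything loaded dynamically, so treat the result as a hint to confirm rather than a fact to store.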
Using ScraperAPI for Lead Generation
ScraperAPI is a good fit for lead generation scraping because it handles the anti-bot measures on directories and social platforms for you.
For scraping Google Maps results specifically, ScrapingAnt offers strong residential proxy support that helps avoid Google's aggressive blocking.
Building Your Pipeline
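Sources → Scraping → Cleaning → Enrichment → CRM
| Sources | Scraping | Cleaning | Enrichment | CRM |
|---|---|---|---|---|
| Directories | ScraperAPI | Dedup | APIs | Salesforce |
| Google Maps | Python | Validate | Scraping | HubSpot |
| LinkedIn | Playwright | Format | LLMs | Outreach |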
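The cleaning stage (dedup, validate, format) might be sketched as follows. This is a simplified example under stated assumptions: leads are dicts with hypothetical `name`, `email`, and `phone` keys, phone numbers are US-style 10-digit numbers, and the email check is a loose syntax test that says nothing about deliverability.

```python
import re

# Loose syntax check only; it does not prove the mailbox exists.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def normalize_phone(raw):
    """Keep digits only; return None unless 10 digits remain (US-style assumption)."""
    digits = re.sub(r"\D", "", raw or "")
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    return digits if len(digits) == 10 else None


def clean_leads(leads):
    """Format, validate, and de-duplicate leads, keeping the first of each duplicate."""
    seen, cleaned = set(), []
    for lead in leads:
        name = " ".join((lead.get("name") or "").split())  # collapse whitespace
        email = (lead.get("email") or "").strip().lower()
        if not EMAIL_RE.match(email):
            email = None
        phone = normalize_phone(lead.get("phone"))
        if not name or not (email or phone):
            continue  # no usable way to contact this lead
        key = (name.lower(), email or phone)  # duplicate = same name and contact
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "email": email, "phone": phone})
    return cleaned


rows = [
    {"name": " Acme  Plumbing ", "email": "Info@Acme.com", "phone": "(312) 555-0100"},
    {"name": "acme plumbing", "email": "info@acme.com", "phone": "312-555-0100"},
    {"name": "No Contact", "email": "bad", "phone": "123"},
]
print(clean_leads(rows))
```

Normalizing before comparing is what makes the duplicate check work: the first two rows differ only in case, spacing, and phone formatting, so only one survives.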
Legal and Ethical Guidelines
- Only collect business contact info, not private personal data
- Comply with CAN-SPAM and GDPR, especially for email outreach; under GDPR, a named work email still counts as personal data
- Honor opt-out requests: immediately remove people who ask
- Do not scrape private data; collect only publicly listed information
- Verify data quality: bad leads waste everyone's time
- Add value in outreach: nobody wants spam, so offer something relevant
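Honoring opt-outs is easier to enforce when every export passes through a suppression list. Below is a minimal sketch, assuming leads are dicts with a hypothetical `email` key and that opt-outs are stored as either full addresses or whole domains; `apply_suppression` is an illustrative helper, not a compliance tool.

```python
def apply_suppression(leads, opted_out):
    """Drop leads whose email, or whose email domain, is on the opt-out list."""
    blocked = {entry.strip().lower() for entry in opted_out if entry.strip()}
    kept = []
    for lead in leads:
        email = (lead.get("email") or "").strip().lower()
        domain = email.rpartition("@")[2]  # text after the last "@"
        if email in blocked or domain in blocked:
            continue  # this contact or company asked not to be contacted
        kept.append(lead)
    return kept


leads = [
    {"name": "Acme", "email": "info@acme.example"},
    {"name": "Pipe Pros", "email": "sales@pipepros.example"},
]
print(apply_suppression(leads, ["INFO@acme.example"]))
# [{'name': 'Pipe Pros', 'email': 'sales@pipepros.example'}]
```

Running this filter at export time, rather than only when a request arrives, keeps a removed contact from reappearing after the next scrape.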