Scraping Central is reader-supported. When you buy through links on our site, we may earn an affiliate commission.

Beginner · 5 min read

Python, pip, venv, uv, Modern Toolchain

Install Python correctly, isolate every project, and meet the tooling that actually makes Python pleasant in 2026.

What you’ll learn

  • Install a recent Python (3.11+) without breaking your system Python.
  • Create a virtual environment per project and understand why.
  • Install dependencies with pip, and know when uv is the better choice.
  • Pin versions in a requirements file so your scraper is reproducible.

The Python landscape is messy. There are multiple installers, two package managers, a half-dozen ways to make a virtual environment, and contradictory advice everywhere. Here's the short version that works.

Install a recent Python

Don't use your operating system's bundled Python. macOS ships an old one for backward compatibility; Linux distros ship one for the OS itself; Windows often ships none. Install a fresh one alongside.

| Platform | Recommended installer | Why |
| --- | --- | --- |
| macOS | Homebrew: brew install python@3.12 | Easy upgrades, no admin password |
| Linux | Distro package (e.g. apt install python3.12) or pyenv | OS package works; pyenv lets you keep multiple versions |
| Windows | python.org installer, check "Add Python to PATH" | Cleanest; or use the Microsoft Store version |

Verify:

python3 --version
# Python 3.12.x

For scraping you want 3.10 at minimum (for structural pattern matching) and ideally 3.11+ for performance; Python 3.11 is roughly 25% faster than 3.10 on average.
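If you want your scripts to notice an old interpreter instead of failing mysteriously later, a minimal stdlib check works. This is a sketch; the (3, 11) floor is this article's recommendation, not a hard requirement:

```python
import sys

def require(minimum=(3, 11)):
    """Return True if the running interpreter meets `minimum`."""
    return sys.version_info >= minimum

# Report the running version and whether it meets the recommended floor.
print("Python", sys.version.split()[0], "meets 3.11 floor:", require())
```

You could also raise SystemExit when the check fails, so CI stops before any scraping code runs.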

Why virtual environments

If you pip install requests globally, every Python project on your machine shares that exact version of requests. The moment two projects need different versions, you're stuck.

A virtual environment is a private directory with its own python and pip that installs into a per-project library folder. One project, one venv, no version collisions.

Create one with venv (built-in)

cd /path/to/my-scraper-project
python3 -m venv .venv
source .venv/bin/activate  # macOS/Linux
# .venv\Scripts\activate  # Windows
pip install requests beautifulsoup4

.venv (the dot-prefixed name) is the modern convention; it's git-ignored by default in most project templates.

After activate, your terminal prompt usually shows (.venv) and python/pip point at the venv binaries. To leave: deactivate.
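If you're ever unsure which interpreter is active, you can ask Python itself. A small stdlib sketch: inside a venv, sys.prefix points into the venv while sys.base_prefix points at the interpreter the venv was built from; outside a venv the two are equal.

```python
import sys

def in_venv():
    """True when running inside a virtual environment."""
    return sys.prefix != sys.base_prefix

# Which binary is running, and is it a venv's?
print("interpreter:  ", sys.executable)
print("inside a venv:", in_venv())
```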

Always git-ignore the venv

# .gitignore
.venv/
__pycache__/
*.pyc

Never commit a venv. It contains absolute paths and OS-specific binaries, so it can't be shared across machines. What you commit is the list of packages (next section).

pip and requirements files

pip install is the standard installer. Pin what you've installed:

pip install requests beautifulsoup4 lxml
pip freeze > requirements.txt

requirements.txt now contains exact versions:

beautifulsoup4==4.12.3
certifi==2024.2.2
charset-normalizer==3.3.2
idna==3.6
lxml==5.1.0
requests==2.31.0
soupsieve==2.5
urllib3==2.2.0

On another machine (or in CI):

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Same versions, reproducible install. The pinning matters: requests 2.31 and 2.32 are not guaranteed to behave identically.
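A quick way to sanity-check that a requirements file really is fully pinned. This sketch assumes a simple flat file, one requirement per line, with no -r includes, extras, or environment markers:

```python
def unpinned(lines):
    """Return requirement lines not pinned to an exact version."""
    specs = (line.strip() for line in lines)
    # Skip blanks and comments; flag anything without an exact == pin.
    return [s for s in specs
            if s and not s.startswith("#") and "==" not in s]

sample = ["requests==2.31.0", "lxml>=5.0", "# dev tools", ""]
print(unpinned(sample))  # ['lxml>=5.0']
```

Run it over open("requirements.txt") before a release to catch loose pins that slipped in by hand.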

uv, the modern alternative

uv (from Astral, makers of ruff) is a drop-in replacement for pip + venv that's 10–100× faster. As of late 2025 it's mature enough to recommend for new projects.

# Install uv (one-time)
curl -LsSf https://astral.sh/uv/install.sh | sh
# or: pip install uv

# Create + activate a venv
uv venv

# Install dependencies (no need to activate first)
uv pip install requests beautifulsoup4 lxml

# Lock to a file
uv pip freeze > requirements.txt

uv mirrors pip's command-line interface for the common commands, so you can mechanically replace pip with uv pip in scripts and CI. The speed difference matters most when you have a dozen scrapers each rebuilding their venv in CI.

A typical project layout

my-scraper/
├── .venv/                ← gitignored
├── .gitignore
├── README.md
├── requirements.txt      ← pinned dependencies
├── pyproject.toml        ← (optional) project metadata
├── src/
│   └── my_scraper/
│       ├── __init__.py
│       ├── client.py     ← HTTP session, retries
│       ├── parsers.py    ← BeautifulSoup / lxml selectors
│       └── store.py      ← write to CSV / SQLite
├── scripts/
│   └── crawl.py          ← entry point
└── tests/
    └── test_parsers.py

None of this is mandatory; small one-file scrapers don't need it. But as a project grows past one file, this is the shape most professional scraping projects converge on.

pyproject.toml (when ready)

Beyond requirements.txt, pyproject.toml is the modern packaging manifest. Useful when:

  • You want to install your scraper as a CLI command (pip install -e .)
  • You're publishing to PyPI
  • You're using tools that respect pyproject (ruff, black, mypy, pytest)

Minimal version:

[project]
name = "my-scraper"
version = "0.1.0"
dependencies = [
  "requests>=2.31",
  "beautifulsoup4>=4.12",
  "lxml>=5.0",
]

For now, requirements.txt is enough. Move to pyproject.toml when the project warrants it.

Common gotchas

  1. python vs python3. On many macOS/Linux systems, python doesn't exist, points at a legacy Python 2, or points at a different 3.x than you expect. Use python3 and pip3 unless you've explicitly set up an alias.

  2. System-wide pip install. If you ever see error: externally-managed-environment on a fresh Python install, that's the OS protecting itself. Use a venv. Never sudo pip install.

  3. Venv doesn't auto-activate. Each new terminal needs source .venv/bin/activate (or use direnv / mise to auto-activate per directory).

  4. pip freeze vs pip list. freeze outputs in requirements.txt format. list is human-readable. Always freeze for the file you commit.
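When one of these gotchas bites, the fastest diagnostic is asking the interpreter where it lives and where pip would install packages:

```python
import sys
import sysconfig

# Which binary is running, and where would `pip install` put packages?
# If these point at the system Python while you expected .venv, the
# environment isn't activated in this shell.
print("executable:   ", sys.executable)
print("site-packages:", sysconfig.get_path("purelib"))
```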

Hands-on lab

Create a directory, set up a venv, install requests and beautifulsoup4, save a requirements.txt. Then write a short script:

import requests
from bs4 import BeautifulSoup

r = requests.get("https://practice.scrapingcentral.com/")
soup = BeautifulSoup(r.text, "html.parser")
print(soup.title.string)

Run it inside your venv. You should see the page title printed. You now have a working Python scraping environment.

Quiz, check your understanding

Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.


Why is a per-project virtual environment recommended for every scraping project?
