How to Get Spotify Data - API and Scraping Guide

Learn how to extract Spotify data including tracks, playlists, artist info, and listening trends using the Spotify Web API and Python.

Spotify's Web API exposes track metadata, audio features, playlists, and artist information through documented JSON endpoints. Here is how to query it from Python with the Spotipy client library.

Setting Up Spotify API Access

  1. Create an app at https://developer.spotify.com/dashboard
  2. Note your Client ID and Client Secret
  3. Install the Spotipy library
pip install spotipy

Basic Authentication

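import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# Client Credentials flow: app-level access to public catalog data.
# It cannot read a user's private playlists or listening history.
auth_manager = SpotifyClientCredentials(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET"
)
# Spotipy requests and refreshes the access token through auth_manager
sp = spotipy.Spotify(auth_manager=auth_manager)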
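
Hardcoding the secret in a script makes it easy to leak through version control. The following is a minimal sketch of reading both values from environment variables instead. The helper name `load_credentials` is hypothetical; the variable names `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` are the ones Spotipy itself looks for when `SpotifyClientCredentials` is created without arguments.

```python
import os


def load_credentials(env=None):
    """Return (client_id, client_secret) read from environment variables.

    Raises RuntimeError naming the missing variables so that a
    misconfigured environment fails before any request is made.
    """
    env = os.environ if env is None else env
    names = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]
```

The returned pair can be passed as `client_id` and `client_secret` to `SpotifyClientCredentials` in place of the placeholder strings.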

Extracting Track Data

# Search for tracks
results = sp.search(q="web scraping", type="track", limit=10)

for track in results["tracks"]["items"]:
    print(f"Track: {track['name']}")
    print(f"Artist: {track['artists'][0]['name']}")
    print(f"Album: {track['album']['name']}")
    print(f"Popularity: {track['popularity']}/100")
    # preview_url is None when Spotify offers no 30-second clip
    print(f"Preview: {track['preview_url'] or 'not available'}")
    print("---")
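
The loop above prints only the first credited artist. If the results are going to be stored rather than printed, a small flattening helper may be more convenient. This is a sketch: `summarize_track` is a hypothetical name, and the sample dictionary is hand-written with made-up values shaped like one item of `results["tracks"]["items"]`.

```python
def summarize_track(track):
    """Flatten one track object from a search response into a plain dict."""
    return {
        "name": track["name"],
        # Join every credited artist, not just the first one.
        "artists": ", ".join(a["name"] for a in track.get("artists", [])),
        "album": (track.get("album") or {}).get("name"),
        "popularity": track.get("popularity"),
        # preview_url can be None, so substitute an explicit marker.
        "preview": track.get("preview_url") or "no preview",
    }


# Illustrative input only; the values are invented for the example.
sample = {
    "name": "Example Song",
    "artists": [{"name": "First Artist"}, {"name": "Second Artist"}],
    "album": {"name": "Example Album"},
    "popularity": 42,
    "preview_url": None,
}
print(summarize_track(sample))
```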

Scraping Playlist Data

def get_playlist_tracks(playlist_id):
    """Extract all tracks from a Spotify playlist, 100 items per request."""
    tracks = []
    offset = 0

    while True:
        # "fields" trims the response to what we use; "total" drives pagination
        results = sp.playlist_items(
            playlist_id,
            offset=offset,
            limit=100,
            fields="items(track(name,artists,album,popularity,duration_ms)),total"
        )

        for item in results["items"]:
            track = item["track"]
            # Removed or unavailable entries come back as None; skip them
            if not track:
                continue
            # Local files can have an empty artist list or no album
            artists = track.get("artists") or [{}]
            tracks.append({
                "name": track["name"],
                "artist": artists[0].get("name"),
                "album": (track.get("album") or {}).get("name"),
                "popularity": track.get("popularity"),
                "duration_min": round(track["duration_ms"] / 60000, 2)
            })

        offset += 100
        if offset >= results["total"]:
            break

    return tracks

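# Spotify's "Today's Top Hits" playlist.
# Apps registered after Spotify's November 2024 API changes may get a 404
# for Spotify-owned editorial playlists; use a user-created playlist ID then.
playlist = get_playlist_tracks("37i9dQZF1DXcBWIGoYBM5M")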
print(f"Total tracks: {len(playlist)}")
for t in playlist[:5]:
    print(f"  {t['name']} - {t['artist']} (Popularity: {t['popularity']})")
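
Once the playlist is a list of dictionaries, it can be saved for later analysis. Below is a minimal sketch using the standard `csv` module. The function name `write_tracks_csv` is hypothetical, the column list mirrors the keys built in `get_playlist_tracks`, and the sample row contains made-up values.

```python
import csv
import io

# Column order matches the dictionaries built by get_playlist_tracks.
FIELDS = ["name", "artist", "album", "popularity", "duration_min"]


def write_tracks_csv(tracks, fileobj):
    """Write track dictionaries to an open text file object as CSV."""
    # extrasaction="ignore" drops any keys that are not listed in FIELDS.
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(tracks)
    return len(tracks)


# Demonstration with an in-memory buffer and an invented row.
buffer = io.StringIO()
write_tracks_csv(
    [{"name": "Example Song", "artist": "Example Artist",
      "album": "Example Album", "popularity": 42, "duration_min": 3.5}],
    buffer,
)
print(buffer.getvalue())
```

To write to disk, open the target with `open("playlist.csv", "w", newline="", encoding="utf-8")` and pass that file object instead of the buffer.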

Audio Features Analysis

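Spotify computes audio features for each track: danceability, energy, and valence are scored from 0.0 to 1.0, alongside tempo in BPM, key, and mode. Spotify announced in November 2024 that the audio features endpoint is no longer available to newly registered apps, so the call below may fail unless your app already had access.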

def analyze_tracks(track_ids):
    """Get audio features for multiple tracks."""
    features = sp.audio_features(track_ids)

    for f in features:
        if f:
            print(f"Track ID: {f['id']}")
            print(f"  Danceability: {f['danceability']}")
            print(f"  Energy: {f['energy']}")
            print(f"  Tempo: {f['tempo']} BPM")
            print(f"  Valence: {f['valence']} (happiness)")
            print(f"  Key: {f['key']}, Mode: {'Major' if f['mode'] else 'Minor'}")
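
`analyze_tracks` passes the whole ID list to a single call, which only works for up to 100 IDs, and it prints the key as a bare integer. The sketch below splits longer lists into batches and turns the key and mode into a readable label. The names `chunked`, `key_name`, and `fetch_audio_features` are hypothetical; `client` is assumed to be a Spotipy client such as `sp`, and the key mapping assumes Spotify's pitch class notation (0 = C, 1 = C#, and so on, with -1 meaning no key was detected).

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chunked(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def key_name(key, mode):
    """Translate integer key and mode into a label such as 'A minor'."""
    if key is None or key < 0:
        return "unknown"
    return f"{PITCH_CLASSES[key]} {'major' if mode == 1 else 'minor'}"


def fetch_audio_features(client, track_ids, batch_size=100):
    """Collect audio features for any number of IDs, one batch per request."""
    features = []
    for batch in chunked(list(track_ids), batch_size):
        # Entries for unknown IDs come back as None and are skipped.
        features.extend(f for f in client.audio_features(batch) if f)
    return features
```

With this in place, 250 track IDs cost three requests instead of 250 single-track calls.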

Artist Data Extraction

def get_artist_data(artist_name):
    """Look up an artist by name; return profile data, or None if no match."""
    results = sp.search(q=f"artist:{artist_name}", type="artist", limit=1)
    items = results["artists"]["items"]
    if not items:
        return None
    artist = items[0]

    # Top tracks are market-specific; Spotipy defaults to country="US"
    top_tracks = sp.artist_top_tracks(artist["id"])

    return {
        "name": artist["name"],
        "followers": artist["followers"]["total"],
        "popularity": artist["popularity"],
        "genres": artist["genres"],
        "top_tracks": [t["name"] for t in top_tracks["tracks"][:5]]
    }

artist = get_artist_data("Radiohead")
if artist:
    print(f"{artist['name']}: {artist['followers']} followers")
    print(f"Genres: {', '.join(artist['genres'])}")
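
Taking the first search hit can return a similarly named artist or a tribute act. One possible refinement, sketched here under the assumption that you request several results (for example `limit=5`), is to prefer an exact case-insensitive name match and fall back to the most popular candidate. `pick_artist` is a hypothetical helper, not part of Spotipy.

```python
def pick_artist(items, wanted_name):
    """Choose the most plausible artist from a list of search results."""
    if not items:
        return None
    wanted = wanted_name.strip().casefold()
    # Prefer exact name matches; otherwise consider every result.
    exact = [a for a in items if a["name"].casefold() == wanted]
    candidates = exact or items
    # Break ties by Spotify's popularity score (0 to 100).
    return max(candidates, key=lambda a: a.get("popularity", 0))
```

Inside `get_artist_data`, the list in `results["artists"]["items"]` would be passed to `pick_artist` together with `artist_name`.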

Rate Limits and Tips

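  • Spotify does not publish a fixed quota: the limit is calculated over a rolling 30-second window, and figures such as 180 requests per minute are unofficial estimates. When you exceed it, the API returns HTTP 429 with a Retry-After header giving the wait in seconds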
  • Use batch endpoints (audio_features accepts up to 100 IDs) to minimize requests
  • Cache responses locally to avoid redundant API calls
  • For data not available via the API (like monthly listener counts), you would need to scrape the JavaScript-rendered artist page, for example using ScraperAPI with rendering enabled
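
The caching tip can be sketched with a small JSON file on disk. This is one possible approach, not a Spotipy feature: `JsonCache` and `get_or_fetch` are hypothetical names, and the cached values are assumed to be JSON-serializable, which holds for the dictionaries and lists the API returns.

```python
import json
from pathlib import Path


class JsonCache:
    """Tiny disk-backed cache that stores API responses in one JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        # Load earlier results if the cache file already exists.
        if self.path.exists():
            self.data = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.data = {}

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch() only on a miss."""
        if key not in self.data:
            self.data[key] = fetch()
            # Persist after every miss so an interrupted run keeps its progress.
            self.path.write_text(json.dumps(self.data), encoding="utf-8")
        return self.data[key]
```

A call such as `cache.get_or_fetch("artist:Radiohead", lambda: get_artist_data("Radiohead"))` then hits the API only the first time. Rewriting the whole file on each miss is simple but slow for large caches; a database such as SQLite may suit bigger jobs better.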