How to Get Spotify Data - API and Scraping Guide
Learn how to extract Spotify data including tracks, playlists, artist info, and listening trends using the Spotify Web API and Python.
Spotify's Web API provides access to track metadata, audio features, playlists, and artist information. Here is how to use it effectively from Python with the Spotipy library.
Setting Up Spotify API Access
- Create an app at https://developer.spotify.com/dashboard
- Note your Client ID and Client Secret
- Install the Spotipy library
pip install spotipy
Basic Authentication
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
auth_manager = SpotifyClientCredentials(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET"
)
sp = spotipy.Spotify(auth_manager=auth_manager)
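Hard-coding the Client ID and Client Secret is fine for a first test, but it is easy to commit them by accident. The sketch below reads them from environment variables instead. The helper name `load_spotify_credentials` is made up for this example; the variable names `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` are the ones Spotipy itself looks for when `SpotifyClientCredentials()` is called without arguments.

```python
import os


def load_spotify_credentials(env=os.environ):
    """Return (client_id, client_secret) read from a mapping of environment variables.

    `load_spotify_credentials` is a hypothetical helper, not part of Spotipy.
    """
    client_id = env.get("SPOTIPY_CLIENT_ID")
    client_secret = env.get("SPOTIPY_CLIENT_SECRET")

    # Collect the names of any variables that are unset or empty.
    missing = [
        name
        for name, value in (
            ("SPOTIPY_CLIENT_ID", client_id),
            ("SPOTIPY_CLIENT_SECRET", client_secret),
        )
        if not value
    ]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return client_id, client_secret


# Usage (needs spotipy installed and real credentials exported in the shell):
# client_id, client_secret = load_spotify_credentials()
# auth_manager = SpotifyClientCredentials(client_id=client_id, client_secret=client_secret)
# sp = spotipy.Spotify(auth_manager=auth_manager)
```

Failing early with a clear message tends to be easier to debug than an authentication error raised later by the first API call.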
Extracting Track Data
# Search for tracks
results = sp.search(q="web scraping", type="track", limit=10)
for track in results["tracks"]["items"]:
    print(f"Track: {track['name']}")
    print(f"Artist: {track['artists'][0]['name']}")
    print(f"Album: {track['album']['name']}")
    print(f"Popularity: {track['popularity']}/100")
    print(f"Preview: {track['preview_url']}")
    print("---")
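The loop above prints only the first artist and shows `None` when a track has no preview. As a rough sketch, a small helper can flatten each track object into a dictionary that lists every credited artist and handles a missing preview. The name `summarize_track` and the sample data are made up for illustration; the keys match the ones used in the search example.

```python
def summarize_track(track):
    """Flatten one track object from a search response into a small dict."""
    return {
        "name": track["name"],
        # Join every credited artist instead of only the first one.
        "artists": ", ".join(artist["name"] for artist in track["artists"]),
        "album": track["album"]["name"],
        "popularity": track.get("popularity"),
        # preview_url can be None, so substitute a readable placeholder.
        "preview_url": track.get("preview_url") or "no preview available",
    }


# Made-up sample shaped like one item of results["tracks"]["items"].
sample_track = {
    "name": "Example Song",
    "artists": [{"name": "Artist A"}, {"name": "Artist B"}],
    "album": {"name": "Example Album"},
    "popularity": 42,
    "preview_url": None,
}

print(summarize_track(sample_track))
```

With a live client, the same helper could be applied as `[summarize_track(t) for t in results["tracks"]["items"]]`.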
Scraping Playlist Data
def get_playlist_tracks(playlist_id):
    """Extract all tracks from a Spotify playlist."""
    tracks = []
    offset = 0
    while True:
        results = sp.playlist_items(
            playlist_id,
            offset=offset,
            limit=100,
            fields="items(track(name,artists,album,popularity,duration_ms)),total"
        )
        for item in results["items"]:
            track = item["track"]
            if track:
                tracks.append({
                    "name": track["name"],
                    "artist": track["artists"][0]["name"],
                    "album": track["album"]["name"],
                    "popularity": track["popularity"],
                    "duration_min": round(track["duration_ms"] / 60000, 2)
                })
        offset += 100
        if offset >= results["total"]:
            break
    return tracks
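# Spotify's "Today's Top Hits" playlist
# Note: Spotify-owned editorial playlists may return 404 for apps created
# after November 2024; substitute any public playlist ID if that happens.
playlist = get_playlist_tracks("37i9dQZF1DXcBWIGoYBM5M")
print(f"Total tracks: {len(playlist)}")
for t in playlist[:5]:
    print(f" {t['name']} - {t['artist']} (Popularity: {t['popularity']})")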
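Once the playlist is in a list of dictionaries, saving it to disk makes later analysis possible without calling the API again. Here is a minimal sketch using the standard `csv` module; the function name `write_tracks_csv` is made up for this example, and the column names assume the dictionaries built by `get_playlist_tracks`.

```python
import csv

# Column order for the output file; matches the keys built by get_playlist_tracks.
FIELDNAMES = ["name", "artist", "album", "popularity", "duration_min"]


def write_tracks_csv(tracks, fileobj):
    """Write a list of track dicts to an open text file and return the row count."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(tracks)
    return len(tracks)


# Usage with the playlist fetched above:
# with open("playlist.csv", "w", newline="", encoding="utf-8") as f:
#     write_tracks_csv(playlist, f)
```

Passing a file object rather than a path keeps the helper easy to test with an in-memory buffer such as `io.StringIO`.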
Audio Features Analysis
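Spotify exposes per-track audio features such as danceability, energy, tempo, and valence. Note that Spotify restricted this endpoint in November 2024, so apps registered after that date may receive a 403 error.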
def analyze_tracks(track_ids):
    """Get audio features for multiple tracks."""
    features = sp.audio_features(track_ids)
    for f in features:
        if f:
            print(f"Track ID: {f['id']}")
            print(f" Danceability: {f['danceability']}")
            print(f" Energy: {f['energy']}")
            print(f" Tempo: {f['tempo']} BPM")
            print(f" Valence: {f['valence']} (happiness)")
            print(f" Key: {f['key']}, Mode: {'Major' if f['mode'] else 'Minor'}")
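Printing features track by track is useful for inspection, but an analysis usually needs an aggregate. One possible approach, sketched below, averages selected features across a list of feature dictionaries and skips `None` entries, which the code above also guards against. The helper name `average_features` and the choice of keys are assumptions for illustration.

```python
from statistics import mean

# Features to average; any numeric key from the feature dicts would work.
FEATURE_KEYS = ("danceability", "energy", "valence", "tempo")


def average_features(features, keys=FEATURE_KEYS):
    """Return the mean of each selected feature, ignoring None entries."""
    valid = [f for f in features if f]
    if not valid:
        return {}
    return {key: round(mean(f[key] for f in valid), 3) for key in keys}


# Usage with a live client:
# profile = average_features(sp.audio_features(track_ids))
# print(profile)
```

Applied to a whole playlist, the result gives a rough "mood profile" that can be compared across playlists.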
Artist Data Extraction
def get_artist_data(artist_name):
    """Look up an artist by name and return basic stats plus top tracks."""
    results = sp.search(q=f"artist:{artist_name}", type="artist", limit=1)
    items = results["artists"]["items"]
    if not items:
        return None  # no artist matched the query
    artist = items[0]
    top_tracks = sp.artist_top_tracks(artist["id"])
    return {
        "name": artist["name"],
        "followers": artist["followers"]["total"],
        "popularity": artist["popularity"],
        "genres": artist["genres"],
        "top_tracks": [t["name"] for t in top_tracks["tracks"][:5]]
    }

artist = get_artist_data("Radiohead")
if artist:
    print(f"{artist['name']}: {artist['followers']} followers")
    print(f"Genres: {', '.join(artist['genres'])}")
Rate Limits and Tips
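- Spotify does not publish a fixed quota; the limit is calculated over a rolling 30-second window, and roughly 180 requests per minute is a commonly cited estimate. Exceeding it returns HTTP 429 with a Retry-After header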
- Use batch endpoints (audio_features accepts up to 100 IDs) to minimize requests
- Cache responses locally to avoid redundant API calls
- For data not available via the API (like monthly listener counts), you would need to scrape the website using ScraperAPI with rendering enabled
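The batching and caching tips above can be combined in a few lines. The sketch below splits IDs into batches of at most 100 and only requests IDs that are not already cached. The names `chunked` and `fetch_with_cache` are made up for this example, and `fetch_batch` is assumed to be any callable that takes a list of IDs and returns results in the same order, as `sp.audio_features` does.

```python
def chunked(ids, size=100):
    """Yield successive lists of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_with_cache(ids, fetch_batch, cache, batch_size=100):
    """Return {id: result}, calling fetch_batch only for IDs missing from cache.

    `cache` is a plain dict that the caller may persist between runs.
    """
    # dict.fromkeys removes duplicates while preserving order.
    missing = [item_id for item_id in dict.fromkeys(ids) if item_id not in cache]
    for batch in chunked(missing, batch_size):
        # Results are assumed to come back in the same order as the batch.
        for item_id, result in zip(batch, fetch_batch(batch)):
            cache[item_id] = result
    return {item_id: cache[item_id] for item_id in ids}


# Usage with a live client:
# cache = {}
# features_by_id = fetch_with_cache(track_ids, sp.audio_features, cache)
```

Writing the cache dictionary to disk with `json.dump` and reloading it with `json.load` at startup would let it survive between runs.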