spotify's api

creating an app and getting credentials for the api

head over to spotify's developer dashboard
click the Create app button
enter all the necessary information: App name, App description, Website, etc.
for Redirect URI enter http://localhost:5555
tick the terms of service agreement checkbox
click the Settings button to find your Client ID, then press View client secret to reveal your Client secret; save both of these, you'll need them later
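
since the client secret should not end up hardcoded in a script, one option is to read both values from environment variables. a minimal sketch, assuming the variables are called SPOTIFY_CLIENT_ID and SPOTIFY_CLIENT_SECRET (these names and the load_credentials helper are made up for illustration, they are not something spotify defines):

```python
import os

def load_credentials(env=os.environ):
    # SPOTIFY_CLIENT_ID / SPOTIFY_CLIENT_SECRET are assumed names, pick your own
    try:
        return env['SPOTIFY_CLIENT_ID'], env['SPOTIFY_CLIENT_SECRET']
    except KeyError as missing:
        raise RuntimeError(f'missing environment variable {missing}') from None

# usage (needs both variables to be exported in your shell first):
# CLIENT_ID, CLIENT_SECRET = load_credentials()
```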

api authorization

first, we need to gain access "keys" so we can make requests to the various api backends/routes. because we need to read private data of the user, we have to use the Authorization Code Flow: the first step is getting an auth code, which we can then exchange for a refresh token, which in turn we can use to fetch an access token. with the access token we can make requests to almost any of spotify's api backends.

to get an auth code we need to generate a url for the user (us) to visit in the browser:

import urllib.parse

params = {'client_id': CLIENT_ID,
          'scope': 'user-read-private user-read-playback-state user-modify-playback-state user-read-currently-playing app-remote-control playlist-read-private playlist-modify-private playlist-modify-public user-read-playback-position user-library-modify user-library-read',
          'redirect_uri': 'http://localhost:5555',
          'response_type': 'code'}
url = 'https://accounts.spotify.com/authorize?' + urllib.parse.urlencode(params)
url

https://accounts.spotify.com/authorize?client_id=e934fee88c884e9ea1e4ad8a37bae1df&scope=user-read-private+user-read-playback-state+user-modify-playback-state+user-read-currently-playing+app-remote-control+playlist-read-private+playlist-modify-private+playlist-modify-public+user-read-playback-position+user-library-modify+user-library-read&redirect_uri=http%3A%2F%2Flocalhost%3A5555&response_type=code

but before opening the url in the browser, to be able to grab the auth code after making the request, we need a local http server running, to which spotify will pass the auth code:

python -m http.server 5555
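
as an alternative to reading the code off the server's log, a few lines of python can play the role of the local server and capture the code directly. this is only a sketch using the standard library; CodeHandler, make_server and wait_for_auth_code are names made up here, and it assumes the redirect uri is http://localhost:5555 as above:

```python
import urllib.parse
from http.server import BaseHTTPRequestHandler, HTTPServer

class CodeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # spotify redirects the browser to /?code=..., so the code is a query parameter
        query = urllib.parse.urlparse(self.path).query
        params = urllib.parse.parse_qs(query)
        if 'code' in params:
            self.server.auth_code = params['code'][0]
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain')
        self.end_headers()
        self.wfile.write(b'you can close this tab now')

    def log_message(self, format, *args):
        pass # keep the terminal quiet

def make_server(port=5555):
    server = HTTPServer(('127.0.0.1', port), CodeHandler)
    server.auth_code = None
    return server

def wait_for_auth_code(server):
    # handle requests one at a time until one of them carries a code
    while server.auth_code is None:
        server.handle_request()
    server.server_close()
    return server.auth_code
```

with this, AUTH_CODE = wait_for_auth_code(make_server()) blocks until you finish logging in in the browser, and it replaces the http.server command rather than running next to it (both want port 5555).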

with the webpage open and the local http server running, once you complete authentication with your spotify account you should see the auth code in your webserver's terminal output, and in your browser's address bar as well. it will look something like the following:

Serving HTTP on 0.0.0.0 port 5555 (http://0.0.0.0:5555/) ...
127.0.0.1 - - [09/Oct/2023 17:36:17] "GET /?code=AQCg0RZ0C8NXzy0n3JpngvNrzx8Fs-vY2BpPn6sZlufBkeUgESfzoaoiiymlsaTtz0dQ6T8OxpNKNNztpAZ-E_0ZjA7TzLG8gTxza27GYSmHswxCHwZM3AA_n7onaCUBWscD_nVII1jPfHadvfUe_FfLt3UGup8DHCfo5lMnBQEtIVNWKBfyuVmDZQX2TFhPrwC8pmp_JzOmLYnjXxxFRmwyv2VQZ4rtTjN0hpVoa91-1azmtdXWQw6gMXOe1T4SsGS9mXNZmwGUo-JJNRjH7K0FkfHfbytHM5sV3UI07IZpnI1yBFkMGqQIkeyNiX8UZNKwc3kaw-WjQZh8NBZXMo48U0XfLxybxE0R_k9IVlf2PUInBmy39kMGHWoFpJT8cJpayooR0pYw_gl4ubH_DBRVwGhuX14pF2CzgFmwlT4sXh5TN-4yPFpcP8sSnpwsN07eQimOMPliT2nzf3RPZ14hXtccAC9jxJ-m-ZwvcFlWiuZPLZf_G6TCeGdz_1Md HTTP/1.1" 200 -
127.0.0.1 - - [09/Oct/2023 17:36:17] code 404, message File not found
127.0.0.1 - - [09/Oct/2023 17:36:17] "GET /favicon.ico HTTP/1.1" 404 -

the desired code is everything after code= (up to the space before HTTP/1.1).

we're gonna need some libraries:

import requests
import base64
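
rather than copying the code out by hand, you can paste the whole redirect url (from the browser's address bar) or the request path (from the server log) and let python pull the code parameter out. a small sketch; extract_auth_code is a made-up helper name:

```python
import urllib.parse

def extract_auth_code(redirect_url):
    # works for a full url or for just the "/?code=..." path from the log
    query = urllib.parse.urlparse(redirect_url).query
    values = urllib.parse.parse_qs(query).get('code')
    if not values:
        raise ValueError('no code parameter in ' + redirect_url)
    return values[0]
```

for example, AUTH_CODE = extract_auth_code('http://localhost:5555/?code=...') with your own url pasted in.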

now that we have the auth code (stored in AUTH_CODE below), we can exchange it for a refresh token:

def fetch_refresh_token():
    auth = 'Basic ' + base64.b64encode(f'{CLIENT_ID}:{CLIENT_SECRET}'.encode()).decode()
    r = requests.post("https://accounts.spotify.com/api/token",
                      headers = {'Content-Type': 'application/x-www-form-urlencoded',
                                 'Authorization': auth},
                      data = {'grant_type': 'authorization_code',
                              'code': AUTH_CODE,
                              'redirect_uri': 'http://localhost:5555'})
    return r.json()['refresh_token']

refresh_token = fetch_refresh_token()
refresh_token

AQAi4IB-FWWHGM20nQ5UkCfqLzupzkhXAdhQ_Z6MQ9SPRavN_vgUw8h91Zm9kJSuY9QXCLA_3GbE4L0Prdqmd0NSkJYqdqxjFq9yRkhbqXnHWjOKsCK8RYB7ug0U9Gl4yJk

the response from that route also contains an access token, but access tokens are only valid for one hour, so we need a function that uses the refresh token to fetch a fresh access token whenever the old one expires:

def fetch_access_token(refresh_token):
    # spotify expects 'Basic ' followed by base64(client_id:client_secret) in the Authorization header
    auth = 'Basic ' + base64.b64encode(f'{CLIENT_ID}:{CLIENT_SECRET}'.encode()).decode()
    r = requests.post("https://accounts.spotify.com/api/token",
                      headers = {'Content-Type': 'application/x-www-form-urlencoded',
                                 'Authorization': auth},
                      data = {'refresh_token': refresh_token,
                              'grant_type': 'refresh_token'})
    return r.json()['access_token']

access_token = fetch_access_token(refresh_token)
access_token

BQDLhfDBmXx-rTS378MgNs4DjCG6Fm2I8v7F6yFpI8tw4SqX6O4XJv-lV6vTMAdVHOGTo59aDInXWXz6o8N68wDLLk4KSVXgm6fu9BjtsHU8timRhSJi5Sax3zX1hCUOTI8nTEl3ZKhME1iV-L_GaxbAqbXse4zetJiI0QUZX9n7HGTenDQcqOnEY25CWp-MVNUciZVoqSLRE4w2J_hvRVo3P2IYV88Z6Z-3p_Oa1IqwmM7b3W6Zqz65pXvlfdKw-Aqai5uo9cT9JmMz9iFDgxE0V-QftVxT6c9OSGVt-NdgpiJQuxcRzeUW5A
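
since access tokens expire after an hour, long-running scripts need to notice when it's time to fetch a new one. a minimal sketch of one way to do that, assuming the one hour lifetime mentioned above; AccessTokenCache and its parameters are made up for illustration and are not part of requests or spotify's api:

```python
import time

class AccessTokenCache:
    def __init__(self, fetch, lifetime=3600, margin=60, clock=time.monotonic):
        self._fetch = fetch        # zero-argument callable returning a new access token
        self._lifetime = lifetime  # seconds a token stays valid (one hour, per the text above)
        self._margin = margin      # refresh a bit early so we never use a token about to expire
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            self._token = self._fetch()
            self._expires_at = now + self._lifetime - self._margin
        return self._token
```

usage would be something like cache = AccessTokenCache(lambda: fetch_access_token(refresh_token)), and then cache.get() wherever an access token is needed.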

api requests

with this access token we can now make requests to other routes of spotify's web api, e.g. to the /search route:

def search(query, access_token, object_type='album'):
    auth = f'Bearer {access_token}'
    r = requests.get('https://api.spotify.com/v1/search',
                     params = {'type': object_type,
                               'q': query},
                     headers = {'Authorization': auth})
    return r.json()

import pprint
search_results = search('hymn to the immortal wind', access_token)
pprint.pprint([(album['name'],album['artists'][0]['name']) for album in search_results['albums']['items'][:5]])

[('Hymn to the Immortal Wind (Anniversary Edition)', 'MONO'),
 ('Hymn to the Sea A Capella', 'Andrea Krux'),
 ('Hymn Of Heaven (Acoustic Sessions)', 'Phil Wickham'),
 ('Hymn to the Sea (From "Titanic")', "Jacob's Piano"),
 ('My Immortal', 'Savella')]
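
the results above are fairly noisy because the query is matched as free text. spotify's search syntax also accepts field filters such as artist: and album: inside the q parameter, which can narrow things down. a small sketch for building such a query string; build_query is a made-up helper:

```python
def build_query(text='', **filters):
    # free text first, then field:value filters, all separated by spaces
    parts = [text] if text else []
    parts += [f'{field}:{value}' for field, value in filters.items()]
    return ' '.join(parts)
```

for example, search(build_query('hymn to the immortal wind', artist='MONO'), access_token) sends the query 'hymn to the immortal wind artist:MONO'.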

this function fetches your entire spotify library (your saved tracks) into a list of tracks, following the paginated responses 50 tracks at a time:

import time

def fetch_library(access_token, tracks=None, url="https://api.spotify.com/v1/me/tracks?limit=50"):
    if tracks is None: # a mutable default ([]) would be shared between calls
        tracks = []
    while url:
        r = requests.get(url, headers={'Authorization': 'Bearer ' + access_token})
        if r.status_code == 429: # we've hit the rate limit, wait and retry the same page
            time.sleep(int(r.headers.get('Retry-After', 1)))
            continue
        r.raise_for_status() # any other failure, e.g. 401 for an expired access token
        r_data = r.json()
        for item in r_data['items']:
            track = item['track']
            data = {
                'id': track['id'],
                'name': track['name'],
                'images': track['album']['images'],
                'artist': track['artists'][0]['name'],
                'album': track['album']['name']
            }
            tracks.append(data)
        url = r_data.get('next') # None once we've reached the last page
    return tracks

example usage:

import json

track_list = fetch_library(access_token)
with open('tracks.json', 'w') as data_file:
    json.dump(track_list, data_file, indent=2)
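
each entry in tracks.json is a flat dict with the keys id, name, images, artist and album, so the file is easy to inspect afterwards. as a quick illustration, a sketch that counts tracks per artist; load_tracks and tracks_per_artist are made-up helper names:

```python
import json
from collections import Counter

def load_tracks(path='tracks.json'):
    with open(path) as data_file:
        return json.load(data_file)

def tracks_per_artist(tracks):
    # 'artist' is the first listed artist of each track, as stored by fetch_library
    return Counter(track['artist'] for track in tracks)
```

tracks_per_artist(load_tracks()).most_common(10) then gives the ten artists with the most saved tracks.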

downloading with spotdl

now that we have our beloved tracks library in tracks.json, we can use a tool like spotdl to download our music for offline listening (independently of spotify).

this script downloads all tracks using the track ids in tracks.json:

import json
import os
from multiprocessing import Pool

def download_track(track_id):
    COMMAND = "spotdl download 'https://open.spotify.com/track/" + track_id + "' --output '{artist}/{album}/{title}--{track-id}' --print-errors --save-errors errors.spotdl --save-file saved.spotdl --no-cache --lyrics --m3u '{list}' --max-retries 3 --add-unavailable --force-update-metadata --generate-lrc"
    os.system(COMMAND)

# i think ThreadPoolExecutor would've been a better option here, since each worker just waits on an external spotdl process
def download_all(tracks):
    track_idx = 0
    POOL_SIZE = 10
    while True:
        with Pool(POOL_SIZE) as pool:
            pool.map(download_track,
                    [track['id'] for track in tracks[track_idx:track_idx+POOL_SIZE]])
            track_idx = track_idx + POOL_SIZE
            if track_idx >= len(tracks): # we're done
                return

if __name__ == '__main__': # guard needed where multiprocessing spawns fresh interpreters (windows, macos)
    with open('tracks.json') as data_file:
        track_list = json.load(data_file)

    download_all(track_list)
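
following up on the ThreadPoolExecutor comment in the script, here is a sketch of what that variant could look like. it also passes the command as an argument list to subprocess.run instead of building a shell string for os.system. build_command and the runner parameter are made up for this example, and only a subset of the spotdl flags from the script is kept to keep it short:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def build_command(track_id):
    # argument list instead of one shell string, so no shell quoting is needed
    return ['spotdl', 'download', f'https://open.spotify.com/track/{track_id}',
            '--output', '{artist}/{album}/{title}--{track-id}',
            '--print-errors', '--max-retries', '3']

def download_track(track_id, runner=subprocess.run):
    # returns the exit code of the spotdl process
    return runner(build_command(track_id)).returncode

def download_all(tracks, max_workers=10, runner=subprocess.run):
    # threads are enough here: each worker only waits for its spotdl process to finish
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(lambda track: download_track(track['id'], runner), tracks))
```

a single executor handles the whole list, so there is no need to slice the tracks into batches of 10 by hand, and the returned list of exit codes shows which downloads failed.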

a similar script, but one that also downloads the albums the tracks belong to (the only difference is the --fetch-albums flag):

import json
import os
from multiprocessing import Pool

def download_album_by_track_id(track_id):
    COMMAND = "spotdl download 'https://open.spotify.com/track/" + track_id + "' --output '{artist}/{album}/{title}--{track-id}' --print-errors --save-errors errors.spotdl --save-file saved.spotdl --no-cache --lyrics --m3u '{list}' --max-retries 3 --add-unavailable --force-update-metadata --generate-lrc --fetch-albums"
    os.system(COMMAND)

# i think ThreadPoolExecutor would've been a better option here, since each worker just waits on an external spotdl process
def download_all(tracks):
    track_idx = 0
    POOL_SIZE = 10
    while True:
        with Pool(POOL_SIZE) as pool:
            pool.map(download_album_by_track_id,
                    [track['id'] for track in tracks[track_idx:track_idx+POOL_SIZE]])
            track_idx = track_idx + POOL_SIZE
            if track_idx >= len(tracks): # we're done
                return

if __name__ == '__main__': # guard needed where multiprocessing spawns fresh interpreters (windows, macos)
    with open('tracks.json') as data_file:
        track_list = json.load(data_file)

    download_all(track_list)

note that all of this can be done with spotdl alone, so if your only intention is to download your music library, all of this code is superfluous