Spotipy automatically hits max retries

0 votes
1 answer
31 views
Asked 2025-04-12 19:33

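I'm trying to use Spotipy to fetch audio features for all of my songs (5,985 of them) in batches of 100. An earlier, simpler version of my code had no exception handling; it worked for the first 1,416 songs and then stalled completely. So I tried to work around the problem with the code below: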

import time

import pandas as pd
from spotipy.exceptions import SpotifyException


def exponential_backoff_retry(func, *args, max_retries=3, base_delay=1, **kwargs):
    # Retry func on HTTP 429, doubling the wait after each failed attempt
    for attempt in range(max_retries):
        try:
            return func(*args, **kwargs)
        except SpotifyException as e:
            if e.http_status == 429:
                delay = base_delay * 2 ** attempt
                print(f"Rate limited. Retrying in {delay} seconds.")
                time.sleep(delay)
            else:
                # Any other API error is not retried
                raise
    print("Max retries exceeded. Unable to fetch track features.")
    return None


def get_user_saved_track_features(sp, ids, start_track_id=None, batch_size=100):
    tracks = []

    # Split the IDs into batches (start_track_id is an index into ids, not a Spotify ID)
    batches = [ids[i:i+batch_size] for i in range(start_track_id or 0, len(ids), batch_size)]
    for batch in batches:
        # Store each track's metadata by ID so it can be matched to its features below
        meta_by_id = {}
        for track_id in batch:
            meta = exponential_backoff_retry(sp.track, track_id)
            if meta is None:
                print(f"Skipping track ID {track_id} because its metadata could not be fetched")
                continue
            name = meta['name']
            album = meta['album']['name']
            artist = meta['album']['artists'][0]['name']
            release_date = meta['album']['release_date']
            length = meta['duration_ms']
            popularity = meta['popularity']
            meta_by_id[track_id] = [name, album, artist, release_date, length, popularity]
            print(f"Processed meta for track ID {track_id}")

        print("Processed all metatracks")

        batch_features = exponential_backoff_retry(sp.audio_features, batch)

        if batch_features:
            # Assumes results come back in request order, with None for tracks without features
            for track_id, features in zip(batch, batch_features):
                if features and track_id in meta_by_id:
                    print(f"Processing features {track_id}")
                    acousticness = features['acousticness']
                    danceability = features['danceability']
                    energy = features['energy']
                    instrumentalness = features['instrumentalness']
                    liveness = features['liveness']
                    loudness = features['loudness']
                    speechiness = features['speechiness']
                    tempo = features['tempo']
                    valence = features['valence']
                    time_signature = features['time_signature']
                    key = features['key']
                    mode = features['mode']
                    uri = features['uri']

                    tracks.append(meta_by_id[track_id] + [
                                   acousticness, danceability, energy, instrumentalness,
                                   liveness, loudness, speechiness, tempo, valence,
                                   time_signature, key, mode, uri])
                    print(f"Processed track ID audio features {track_id}")
                else:
                    print(f"Skipping track ID {track_id} because its features or metadata are missing")

                time.sleep(1)  # Sleep for 1 second per song
        elif batch_features is None:
            print("Skipping batch due to error")

        time.sleep(1)  # Sleep for 1 second per batch to avoid rate limiting

    # Create DataFrame from the list of track features
    df = pd.DataFrame(tracks, columns=['name', 'album', 'artist', 'release_date',
                                       'length', 'popularity', 'acousticness', 'danceability',
                                       'energy', 'instrumentalness', 'liveness', 'loudness',
                                       'speechiness', 'tempo', 'valence', 'time_signature',
                                       'key', 'mode', 'uri'])

    return df

Strangely, the requests in this part succeed every time:

        for track_id in batch:
            meta = exponential_backoff_retry(sp.track, track_id)
            name = meta['name']
            album = meta['album']['name']
            artist = meta['album']['artists'][0]['name']
            release_date = meta['album']['release_date']
            length = meta['duration_ms']
            popularity = meta['popularity']
            print(f"Processed meta for track ID {track_id}")

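But starting at batch_features = exponential_backoff_retry(sp.audio_features, batch), the 'Max retries reached' error is raised automatically, from the very first batch and before even the first feature is processed.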

I also tried changing start_track_id, but that didn't help.

1 Answer

0

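According to the error you posted, the response code is 429. This is an HTTP status code meaning that the client has sent too many requests within a given period of time (known as "rate limiting"). Servers typically return it to prevent abuse and to ensure fair use of resources.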
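As a rough illustration of how a client might react to a 429: rate-limiting servers often send a Retry-After header saying how many seconds to wait. The sketch below is not Spotipy code. RateLimited, call_with_retry, and the sleep parameter are hypothetical names, and it assumes the raised exception carries http_status and headers attributes (I believe Spotipy's SpotifyException exposes both, but check your installed version). When the header is missing or unreadable it falls back to exponential backoff.

```python
import time


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP client error such as SpotifyException."""

    def __init__(self, http_status, headers=None):
        super().__init__(f"HTTP {http_status}")
        self.http_status = http_status
        self.headers = headers or {}


def call_with_retry(func, *args, max_retries=5, base_delay=1.0,
                    sleep=time.sleep, **kwargs):
    """Call func, waiting and retrying whenever it signals HTTP 429."""
    for attempt in range(max_retries):
        try:
            return func(*args, **kwargs)
        except RateLimited as exc:
            if exc.http_status != 429:
                raise  # not a rate limit: let the caller see it
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the 429 instead of hiding it
            # Prefer the server's own hint; otherwise back off exponentially.
            retry_after = exc.headers.get("Retry-After")
            try:
                delay = float(retry_after)
            except (TypeError, ValueError):
                delay = base_delay * 2 ** attempt
            sleep(delay)
```

With Spotipy you would catch SpotifyException in place of RateLimited. If the server asks for a very long wait, retrying every few seconds will not help, so it is worth printing the delay before sleeping.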

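In short, you can:

  1. Check the site's request limits and stay within them.
  2. Gradually reduce the number of requests until you reach a level the server accepts.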
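One concrete way to send fewer requests is to ask for several items per call. The loop in the question calls sp.track once per ID, so every batch of 100 songs costs 100 metadata requests plus one for the features. The sketch below groups IDs into chunks and makes one call per chunk. chunked and fetch_in_batches are hypothetical helper names, and fetch_many stands for any function that accepts a list of IDs; Spotipy's sp.tracks, which to my knowledge takes up to 50 IDs per call and returns them under a 'tracks' key, would be one candidate, but verify that against the documentation.

```python
import time


def chunked(items, size):
    """Yield consecutive slices of items, each at most size long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_in_batches(fetch_many, ids, batch_size=50, pause=0.5, sleep=time.sleep):
    """Call fetch_many once per chunk of ids, pausing between calls."""
    results = []
    for index, chunk in enumerate(chunked(ids, batch_size)):
        if index:
            sleep(pause)  # spread the calls out rather than sending them in a burst
        results.extend(fetch_many(chunk))
    return results
```

With Spotipy this might be called as fetch_in_batches(lambda chunk: sp.tracks(chunk)['tracks'], ids). For the 5,985 songs in the question, that would be 120 metadata requests instead of 5,985.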

It would probably help to find out what those limits are first.

By the way, rate limiting is very common.
