Spotipy automatically reaches max retries
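I am trying to use Spotipy to fetch audio features for all of my songs (5,985 tracks) in batches of 100. A simpler version of the code, without exception handling, worked for the first 1,416 tracks and then stalled completely. I then tried to work around the problem with the code below: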
import time

from spotipy.exceptions import SpotifyException


def exponential_backoff_retry(func, *args, max_retries=3, base_delay=1, **kwargs):
    """Call func, retrying with exponential backoff on HTTP 429."""
    for attempt in range(max_retries):
        try:
            return func(*args, **kwargs)
        except SpotifyException as e:
            if e.http_status != 429:
                raise  # not a rate-limit error: propagate unchanged
            delay = base_delay * 2 ** attempt
            print(f"Rate limited. Retrying in {delay} seconds.")
            time.sleep(delay)
    print("Max retries exceeded. Unable to fetch track features.")
    return None
import pandas as pd

META_COLUMNS = ['name', 'album', 'artist', 'release_date', 'length', 'popularity']
FEATURE_COLUMNS = ['acousticness', 'danceability', 'energy', 'instrumentalness',
                   'liveness', 'loudness', 'speechiness', 'tempo', 'valence',
                   'time_signature', 'key', 'mode', 'uri']


def get_user_saved_track_features(sp, ids, start_track_id=None, batch_size=100):
    tracks = []
    # Iterate through each batch of track IDs
    batches = [ids[i:i+batch_size] for i in range(start_track_id or 0, len(ids), batch_size)]
    for batch in batches:
        meta_rows = {}  # metadata per track ID, so rows are not all filled from the last track
        for track_id in batch:
            meta = exponential_backoff_retry(sp.track, track_id)
            name = meta['name']
            album = meta['album']['name']
            artist = meta['album']['artists'][0]['name']
            release_date = meta['album']['release_date']
            length = meta['duration_ms']
            popularity = meta['popularity']
            meta_rows[track_id] = [name, album, artist, release_date, length, popularity]
            print(f"Processed meta for track ID {track_id}")
        print("Processed all metatracks")
        batch_features = exponential_backoff_retry(sp.audio_features, batch)
        if batch_features:
            # audio_features returns one entry per requested ID, in order; an entry may be None
            for track_id, features in zip(batch, batch_features):
                if features:  # features is a dict, so features[0] would raise KeyError
                    tracks.append(meta_rows[track_id] + [features[col] for col in FEATURE_COLUMNS])
                    print(f"Processed track ID audio features {track_id}")
                else:
                    print(f"Skipping track ID {track_id} because no audio features were returned")
        else:
            print("Skipping batch due to error")
        time.sleep(1)  # Sleep for 1 second per batch to avoid rate limiting
    # Create DataFrame from the list of track features
    return pd.DataFrame(tracks, columns=META_COLUMNS + FEATURE_COLUMNS)
Strangely, the requests in this part succeed every time:
for track_id in batch:
    meta = exponential_backoff_retry(sp.track, track_id)
    name = meta['name']
    album = meta['album']['name']
    artist = meta['album']['artists'][0]['name']
    release_date = meta['album']['release_date']
    length = meta['duration_ms']
    popularity = meta['popularity']
    print(f"Processed meta for track ID {track_id}")
But starting at batch_features = exponential_backoff_retry(sp.audio_features, batch), the 'Max retries reached' error is caught automatically, from the very first batch and even the very first feature item.
I also tried changing start_id, but without success.
1 Answer
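According to the error message you posted, the response code is 429. This is the HTTP status code "Too Many Requests": it means the client has sent too many requests within a given period of time, which is known as rate limiting. Servers commonly use it to prevent abuse and to ensure fair use of resources.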
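A 429 response often carries a Retry-After header that tells the client how long to wait. Below is a minimal sketch, not Spotipy's own mechanism, of a retry helper that prefers that header and falls back to exponential backoff. The name call_with_retry_after is hypothetical, and the code assumes the raised exception exposes http_status and (optionally) headers attributes; check both against the client library you use.

```python
import time


def call_with_retry_after(func, *args, max_retries=3, default_delay=1.0,
                          sleep=time.sleep, **kwargs):
    """Call func, waiting out HTTP 429 errors before retrying.

    Assumption: the exception raised on a rate limit has an `http_status`
    attribute and, possibly, a `headers` mapping.
    """
    for attempt in range(max_retries + 1):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            # Re-raise anything that is not a rate limit, or the last failure.
            if getattr(exc, "http_status", None) != 429 or attempt == max_retries:
                raise
            headers = getattr(exc, "headers", None) or {}
            try:
                # Use the server's hint when it is a plain number of seconds.
                delay = float(headers.get("Retry-After"))
            except (TypeError, ValueError):
                # Header missing or not numeric: fall back to exponential backoff.
                delay = default_delay * 2 ** attempt
            sleep(delay)
```

The sleep parameter is injectable only so that the helper can be exercised without actually waiting. If the server asks for a very long wait, retrying in a loop will not help; the wait has to be respected.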
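In short, you can:
- Look up the site's request limits and stay within them.
- Gradually reduce the number of requests until you reach a level the server accepts.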
It would be best to find out what those limits actually are.
By the way, rate limiting is very common.
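The second suggestion, sending fewer requests, can be sketched as batching the IDs and spacing the calls out. This is an illustrative sketch only: chunked and throttled_map are hypothetical helper names, and the one-second interval is an arbitrary assumption, not a documented Spotify limit.

```python
import time


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def throttled_map(func, batches, min_interval=1.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Call func(batch) for each batch, leaving at least min_interval
    seconds between the start of consecutive calls."""
    results = []
    last_call = None
    for batch in batches:
        if last_call is not None:
            remaining = min_interval - (clock() - last_call)
            if remaining > 0:
                sleep(remaining)
        last_call = clock()
        results.append(func(batch))
    return results
```

With 5,985 IDs and a batch size of 100 this makes 60 calls, for example throttled_map(sp.audio_features, chunked(ids, 100)). The per-track sp.track calls in the question are the larger cost, at one request per song; Spotipy's sp.tracks accepts a list of IDs (the Web API documents a maximum of 50 per call), which would cut those 5,985 requests to 120.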