I'm using the url_analysis tool from the Spotify API (via the spotipy wrapper, with the client bound to sp) to process tracks, with the following code:
def loudness_drops(track_ids):
    names = set()
    tids = set()
    tracks_with_drop_name = set()
    tracks_with_drop_id = set()
    for id_ in track_ids:
        track_id = sp.track(id_)['uri']
        tids.add(track_id)
        track_name = sp.track(id_)['name']
        names.add(track_name)
        # get audio features
        features = sp.audio_features(tids)
        # and then the audio analysis URLs
        urls = {x['analysis_url'] for x in features if x}
        print(len(urls))
        # fetch analysis data
        for url in urls:
            # print(len(urls))
            analysis = sp._get(url)
            # extract loudness sections from analysis
            x = [_['start'] for _ in analysis['segments']]
            print(len(x))
            l = [_['loudness_max'] for _ in analysis['segments']]
            print(len(l))
            # get max and min values
            min_l = min(l)
            max_l = max(l)
            # normalize stream
            norm_l = [(_ - min_l)/(max_l - min_l) for _ in l]
            # define silence as a value below 0.1
            silence = [l[i] for i in range(len(l)) if norm_l[i] < .1]
            # more than one silence means one of them happens in the middle of the track
            if len(silence) > 1:
                tracks_with_drop_name.add(track_name)
                tracks_with_drop_id.add(track_id)
    return tracks_with_drop_id
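The code works, but if the number of songs I search for is set to, say, limit=20, the time it takes to process all the audio segments x and l makes the processing too expensive, for example:
time.time() prints 452.175742149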
Question:
How can I drastically reduce the complexity here?
I tried using sets instead of lists, but set objects do not support indexing.
Edit: 10 URLs:
[u'https://api.spotify.com/v1/audio-analysis/5H40slc7OnTLMbXV6E780Z', u'https://api.spotify.com/v1/audio-analysis/72G49GsqYeWV6QVAqp4vl0', u'https://api.spotify.com/v1/audio-analysis/6jvFK4v3oLMPfm6g030H0g', u'https://api.spotify.com/v1/audio-analysis/351LyEn9dxRxgkl28GwQtl', u'https://api.spotify.com/v1/audio-analysis/4cRnjBH13wSYMOfOF17Ddn', u'https://api.spotify.com/v1/audio-analysis/2To3PTOTGJUtRsK3nQemP4', u'https://api.spotify.com/v1/audio-analysis/4xPRxqV9qCVeKLQ31NxhYz', u'https://api.spotify.com/v1/audio-analysis/1G1MtHxrVngvGWSQ7Fj4Oj', u'https://api.spotify.com/v1/audio-analysis/3du9aoP5vPGW1h70mIoicK', u'https://api.spotify.com/v1/audio-analysis/6VIIBKYJAKMBNQreG33lBF']
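This is what I can see, without knowing much about Spotify:
You should never have any URL twice. You do, because you keep all the tracks in tids, and then for each track you process everything in tids, which turns the complexity into O(n^2). In general, when trying to reduce complexity, always look for loops inside loops.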
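To make the quadratic growth concrete, here is a small counting sketch. It does not call Spotify at all; the two function names are made up for illustration, and it assumes every track ID is distinct. It only counts how many analysis fetches each loop shape would trigger.

```python
def count_analysis_fetches_nested(n_tracks):
    """Fetches made when every iteration re-processes all tracks seen so far."""
    seen = set()
    fetches = 0
    for track in range(n_tracks):
        seen.add(track)
        # The inner loop runs over the whole accumulated set each time.
        fetches += len(seen)
    return fetches


def count_analysis_fetches_flat(n_tracks):
    """Fetches made when each track is analysed exactly once."""
    return len(set(range(n_tracks)))


for n in (10, 20):
    print(n, count_analysis_fetches_nested(n), count_analysis_fetches_flat(n))
```

The nested shape makes n(n+1)/2 fetches, so 20 tracks cost 210 analysis requests instead of 20.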
I believe that in this case, if audio_features accepts a collection of IDs, calling it once for all the tracks, outside the per-track loop, should work.
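A minimal sketch of that restructuring follows; it is not necessarily the code originally posted with this answer. It assumes sp is a spotipy.Spotify-like client passed in explicitly (the question uses a global), that audio_features accepts a list of track IDs, and that each returned feature dict carries 'uri' and 'analysis_url'. The helper has_loudness_drop and its threshold parameter are names introduced here for illustration. Track names are not collected, which removes the sp.track calls altogether.

```python
def has_loudness_drop(loudness, threshold=0.1):
    """True if more than one segment is below `threshold` after min-max scaling."""
    if not loudness:
        return False
    min_l, max_l = min(loudness), max(loudness)
    span = max_l - min_l
    if span == 0:
        return False  # completely flat track, nothing to normalise
    quiet = sum(1 for v in loudness if (v - min_l) / float(span) < threshold)
    return quiet > 1


def loudness_drops(sp, track_ids):
    """Return the URIs of tracks with more than one quiet segment.

    `sp` is assumed to behave like a spotipy.Spotify client.
    """
    # One audio_features call for the whole batch, outside any loop.
    features = sp.audio_features(list(track_ids))
    tracks_with_drop_id = set()
    seen_urls = set()
    for feat in features:
        # Skip missing entries and any URL that was already analysed.
        if not feat or feat['analysis_url'] in seen_urls:
            continue
        seen_urls.add(feat['analysis_url'])
        # Same fetch as in the question: one request per track.
        analysis = sp._get(feat['analysis_url'])
        loudness = [seg['loudness_max'] for seg in analysis['segments']]
        if has_loudness_drop(loudness):
            tracks_with_drop_id.add(feat['uri'])
    return tracks_with_drop_id
```

With this shape each analysis URL is fetched exactly once, so the number of requests grows linearly with the number of tracks. The remaining cost is dominated by the network round trips themselves rather than by the list comprehensions over the segments.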