How do I access values in one column based on another column?

3 answers

User

Answer 1 · edited 2024-06-16 09:19:31


In [2]: df = pd.DataFrame({'num': {0: 3234, 1: 3433, 2: 4443},
   ...:  'URL': {0: 'http://example.com/images/41456gn7L.jpg',
   ...:   1: 'http://example.com/images/31mndfg.jpg',
   ...:   2: 'http://example.com/images/dsfsdf8587eh.jpg'},
   ...:  'meta_data': {0: "[{'id': 0, 'imageUrl': 'http://example.com/images/41dY3ASVn7L.jpg' 'score': 54.09280014038086}, {'id': 0, 'imageUrl': 'http://example.com/images/41dY3ASVn7L.jpg', 'score': 54.09280014038086}]",
   ...:   1: "[{'id': 0, 'imageUrl': 'http://example.com/images/31mnLrB5IHL.jpg' 'score': 99.902099609375}, {'id': 0, 'imageUrl': 'http://example.com/images/31mnLrB5IHL.jpg', 'score': 99.902099609375}]",
   ...:   2: "[{'id': 0, 'imageUrl': 'http://example.com/images/4189TDx0e0L.jpg' 'score': 97.33160400390625}, {'id': 0, 'imageUrl': 'http://example.com/images/4189TDx0e0L.jpg', 'score': 97.33160400390625}]"}})
   ...: file_names = ["41456gn7L.jpg","31mndfg.jpg","dsfsdf8587eh.jpg"]
   ...: df
Out[2]: 
    num                                         URL                                          meta_data
0  3234     http://example.com/images/41456gn7L.jpg  [{'id': 0, 'imageUrl': 'http://example.com/ima...
1  3433       http://example.com/images/31mndfg.jpg  [{'id': 0, 'imageUrl': 'http://example.com/ima...
2  4443  http://example.com/images/dsfsdf8587eh.jpg  [{'id': 0, 'imageUrl': 'http://example.com/ima...

In [3]: df['Score'] = df.loc[df.URL.apply(lambda x:x.split("/")[-1]).isin(file_names), :].meta_data.apply(lambda x:x.split(",")[-1]).str.extract(r"([\d]*[.][\d]+)")

In [4]: df
Out[4]: 
    num                                         URL                                          meta_data              Score
0  3234     http://example.com/images/41456gn7L.jpg  [{'id': 0, 'imageUrl': 'http://example.com/ima...  54.09280014038086
1  3433       http://example.com/images/31mndfg.jpg  [{'id': 0, 'imageUrl': 'http://example.com/ima...    99.902099609375
2  4443  http://example.com/images/dsfsdf8587eh.jpg  [{'id': 0, 'imageUrl': 'http://example.com/ima...  97.33160400390625
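
The one-liner in the first answer does three things at once: it builds a boolean mask from the file name at the end of each URL, pulls the score out of the meta_data string with a regular expression, and assigns the result back by index. The sketch below separates those steps on a small made-up frame (the file names a.jpg and b.jpg and the shortened meta_data strings are assumptions for illustration, not the original data). It also converts the extracted text to float, since str.extract returns strings, and shows that rows whose file name is not in the list end up as NaN.

```python
import pandas as pd

# Hypothetical, shortened data: meta_data holds strings, as in the first answer.
df = pd.DataFrame({
    "URL": ["http://example.com/images/a.jpg", "http://example.com/images/b.jpg"],
    "meta_data": ["[{'id': 0, 'score': 54.5}]", "[{'id': 0, 'score': 99.25}]"],
})
file_names = ["a.jpg"]  # only the first row matches

# Step 1: boolean mask, True where the last path segment is in file_names.
mask = df["URL"].str.rsplit("/", n=1).str[-1].isin(file_names)

# Step 2: extract the number after 'score' from the matching rows only.
scores = df.loc[mask, "meta_data"].str.extract(r"'score':\s*(\d+\.\d+)", expand=False)

# Step 3: assignment aligns on the index, so unmatched rows become NaN.
df["Score"] = scores.astype(float)
print(df[["URL", "Score"]])
```

Anchoring the pattern on the 'score' key, rather than taking the last comma-separated piece, is a design choice made here so that the extraction does not depend on where the key sits inside the string.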

User

Answer 2 · edited 2024-06-16 09:19:31

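Judging by the sample df you provided, the values in meta_data look like strings. Assuming they are actually lists of dictionaries, as you mention in the question: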

import pandas as pd

file_names = ["41456gn7L.jpg","31mndfg.jpg","dsfsdf8587eh.jpg"]
df = pd.DataFrame({'url':['http://example.com/images/41456gn7L.jpg','http://example.com/images/31mndfg.jpg','http://example.com/images/dsfsdf8587eh.jpg'],
                'meta_data':[[{'id': 0, 'imageUrl': 'http://example.com/images/41dY3ASVn7L.jpg', 'score': 54.09280014038086}, {'id': 0, 'imageUrl': 'http://example.com/images/41dY3ASVn7L.jpg', 'score': 54.09280014038086}],[{'id': 0, 'imageUrl': 'http://example.com/images/31mnLrB5IHL.jpg', 'score': 99.902099609375}, {'id': 0, 'imageUrl': 'http://example.com/images/31mnLrB5IHL.jpg', 'score': 99.902099609375}],[{'id': 0, 'imageUrl': 'http://example.com/images/4189TDx0e0L.jpg' ,'score': 97.33160400390625}, {'id': 0, 'imageUrl': 'http://example.com/images/4189TDx0e0L.jpg', 'score': 97.33160400390625}]]})

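You can select the rows whose file name appears in the file_names list, then take the first element of each list and read the value stored under the key 'score':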

df['score'] = df.loc[df['url'].str.rsplit('/').str[-1].isin(file_names), 'meta_data'].apply(lambda x: x[0]['score'])

    url                                         meta_data         score
0   http://example.com/images/41456gn7L.jpg     [{'id': 0, 'imageUrl': 'http://example.com/ima...   54.092800
1   http://example.com/images/31mndfg.jpg       [{'id': 0, 'imageUrl': 'http://example.com/ima...   99.902100
2   http://example.com/images/dsfsdf8587eh.jpg  [{'id': 0, 'imageUrl': 'http://example.com/ima...   97.331604
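
If meta_data really does hold strings, the approach in this answer needs a conversion step first. A minimal sketch, assuming each string is a valid Python literal: ast.literal_eval from the standard library turns it into a list of dictionaries. Note that the strings shown in the first answer are missing a comma before 'score' in their first dictionary, so they would not parse as they stand; the single-row string below is a corrected, made-up example.

```python
import ast
import pandas as pd

# Hypothetical one-row frame whose meta_data is a well-formed string.
df = pd.DataFrame({
    "url": ["http://example.com/images/41456gn7L.jpg"],
    "meta_data": [
        "[{'id': 0, 'imageUrl': 'http://example.com/images/41dY3ASVn7L.jpg', "
        "'score': 54.09280014038086}]"
    ],
})

# Parse each string into a real list of dicts (raises ValueError or
# SyntaxError on malformed input, rather than guessing).
df["meta_data"] = df["meta_data"].apply(ast.literal_eval)

# The lookup from the answer now works: first element, then the 'score' key.
first_score = df["meta_data"].apply(lambda items: items[0]["score"])
print(first_score)
```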

User

Answer 3 · edited 2024-06-16 09:19:31

Hope this helps:

import pandas as pd

metadata = [
[{'id': 0, 'imageUrl': 'http://example.com/images/41dY3ASVn7L.jpg', 'score': 54.09280014038086}, 
 {'id': 0, 'imageUrl': 'http://example.com/images/41dY3ASVn7L.jpg', 'score': 54.09280014038086}],
[{'id': 0, 'imageUrl': 'http://example.com/images/31mnLrB5IHL.jpg' ,'score': 99.902099609375}, 
 {'id': 0, 'imageUrl': 'http://example.com/images/31mnLrB5IHL.jpg', 'score': 99.902099609375}],
[{'id': 0, 'imageUrl': 'http://example.com/images/4189TDx0e0L.jpg' ,'score': 97.33160400390625}, 
 {'id': 0, 'imageUrl': 'http://example.com/images/4189TDx0e0L.jpg', 'score': 97.33160400390625}]
]
# Extract metadata into dataframe
df = pd.DataFrame([a[0] for a in metadata])

# List of filenames NOTE: added last file so match is found
fnList = ["41456gn7L.jpg","31mndfg.jpg","dsfsdf8587eh.jpg","31mnLrB5IHL.jpg"]

# show DF
print(df)
print("\n  \n Matching Filename\n")

# Generate list of matching scores
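for f in fnList:
    # regex=False matches the file name literally, so "." is not a wildcard
    matches = df.loc[df.imageUrl.str.contains(f, regex=False), 'score']
    # v is None when no imageUrl contains this file name
    v = matches.iloc[0] if not matches.empty else None
    print(f, v)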
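
The loop above can also be written as a single lookup. This is only a sketch of an alternative, not part of the answer: it indexes the scores by the file name at the end of imageUrl and uses reindex to look up every name at once. It assumes the file names in imageUrl are unique (reindex raises an error on duplicate labels), and the two records and the fn_list name are shortened examples.

```python
import pandas as pd

# Two records in the shape produced by pd.DataFrame([a[0] for a in metadata]).
records = [
    {"imageUrl": "http://example.com/images/41dY3ASVn7L.jpg", "score": 54.09280014038086},
    {"imageUrl": "http://example.com/images/31mnLrB5IHL.jpg", "score": 99.902099609375},
]
df = pd.DataFrame(records)
fn_list = ["41456gn7L.jpg", "31mnLrB5IHL.jpg"]  # only the second one exists

# Index the scores by the last path segment of imageUrl.
scores_by_file = df.set_index(df["imageUrl"].str.rsplit("/", n=1).str[-1])["score"]

# reindex returns one entry per requested name; missing names become NaN.
matched = scores_by_file.reindex(fn_list)
print(matched)
```

Unlike str.contains, this compares whole file names, so a name that merely appears as a substring of another file name does not count as a match.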
