Extracting genres from a DataFrame column into lists

Published 2024-06-16 14:16:10


I have a DataFrame that looks like this:

id  genres
1   [{'id': 35, 'name': 'Comedy'}]
2   [{'id': 35, 'name': 'Comedy'}, {'id': 18, 'name': 'Drama'}, {'id': 10751, 'name': 'Family'}, {'id': 10749, 'name': 'Romance'}]
3   [{'id':31, 'name':'Romance'}]

I want to extract the genre names from each row and store them in a list. For example:

id  genres
1   ['Comedy']
2   ['Comedy','Drama','Family','Romance']
3   ['Romance']

I tried [j['name'] for i in data['genres'] for j in i], but it puts the names from all rows into a single list.


3 Answers

Use a nested list comprehension:

data['genres'] = [[j['name'] for j in i] for i in data['genres']]

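For a more general solution, dict.get is the better choice: if the name key is missing, it does not raise an error but returns None or another value you specify: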

data['genres'] = [[j.get('name') for j in i] for i in data['genres']]

data['genres'] = [[j.get('name', 'missing') for j in i] for i in data['genres']]

print(data)
   id                            genres
0   1                          [Comedy]
1   2  [Comedy, Drama, Family, Romance]
2   3                         [Romance]
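
To make the difference between the flat comprehension from the question and the nested one concrete, here is a minimal sketch. The sample data is made up for illustration; in particular, the dict without a name key is a hypothetical case added only to show what get does.

```python
import pandas as pd

# Hypothetical sample data; the second row contains a dict with no 'name' key.
data = pd.DataFrame({
    "id": [1, 2, 3],
    "genres": [
        [{"id": 35, "name": "Comedy"}],
        [{"id": 35, "name": "Comedy"}, {"id": 18}],
        [{"id": 31, "name": "Romance"}],
    ],
})

# Flat comprehension (as in the question): both loops feed one outer list,
# so the row boundaries are lost.
flat = [j.get("name") for i in data["genres"] for j in i]

# Nested comprehension: the inner list is built once per row,
# so each row keeps its own list of names.
nested = [[j.get("name", "missing") for j in i] for i in data["genres"]]

print(flat)    # ['Comedy', 'Comedy', None, 'Romance']
print(nested)  # [['Comedy'], ['Comedy', 'missing'], ['Romance']]
```

With j['name'] instead of j.get('name'), the second row would raise a KeyError.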

Another possible approach is to use apply():

df['genres'] = df['genres'].apply(lambda x: [d.get('name') for d in x])

Use apply.

For example:

import pandas as pd

df = pd.DataFrame({"genres": [[{'id': 35, 'name': 'Comedy'}],[{'id': 35, 'name': 'Comedy'}, {'id': 18, 'name': 'Drama'}, {'id': 10751, 'name': 'Family'}, {'id': 10749, 'name': 'Romance'}],[{'id':31, 'name':'Comedy'}]]})
df["genres"] = df["genres"].apply(lambda x: [i["name"] for i in x])
print(df)

Output:

                             genres
0                          [Comedy]
1  [Comedy, Drama, Family, Romance]
2                          [Comedy]
