I have a dataframe that looks like this:
       tags                                         categories classification
0     label  ['legislative', 'law, govt and politics', 'exe...           None
0  document  ['legislative', 'law, govt and politics', 'exe...            NaN
0      text  ['legislative', 'law, govt and politics', 'exe...            NaN
0     paper  ['legislative', 'law, govt and politics', 'exe...            NaN
0    poster  ['legislative', 'law, govt and politics', 'exe...            NaN
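I want to create a new dataframe that collapses the one above into the one below, so that the values in the "tags" and "classification" columns are each combined into a single row holding the individual items as a list, for example: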
                                              tags                                         categories                        classification
0  ['label', 'document', 'text', 'paper', 'poster']  ['legislative', 'law, govt and politics', 'exe...  ['None', 'NaN', 'NaN', 'NaN', 'NaN']
How can I do this? Can I use stack or groupby to get this result? Thanks in advance.
*Here is the output of df.to_dict():
{'tags': {0: ' letter',
1: ' head',
2: ' water',
3: ' art',
4: ' indoors',
5: ' flyer',
6: ' poster',
...},
'categories': {0: "['legislative', 'law, govt and politics',
'executive branch', 'work', 'society', 'government']",
1: "['unrest and war', 'society', 'religion and spirituality',
'buddhism']",
2: '[]',
3: '[]',
4: "['unemployment', 'society', 'law, govt and politics',
'foreign policy', 'work', 'politics', 'armed forces']",
5: '[]',
6: "['sports', 'law, govt and politics', 'wrestling']",
...},
'classfication': {0: nan,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: nan,
...}}
I don't fully understand your question, but is this the kind of thing you want?
df:
Transformed df:
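The answer's code and output did not survive on this page, so the following is only a minimal sketch of one way to get the collapsed shape, not necessarily what the answerer wrote. It assumes the rows to be collapsed share the same "categories" value and that "categories" holds strings, as in the to_dict() output. Real Python lists are unhashable and would need converting (for example with str) before grouping. The sample frame is hypothetical, modeled on the first table in the question, and uses the column name "classification" as shown there.

```python
import numpy as np
import pandas as pd

# Hypothetical sample modeled on the question's first table; the
# categories string is shortened here because the original is truncated.
df = pd.DataFrame({
    "tags": ["label", "document", "text", "paper", "poster"],
    "categories": ["['legislative', 'law, govt and politics']"] * 5,
    "classification": [None, np.nan, np.nan, np.nan, np.nan],
})

# Group on the shared categories value and gather the other two
# columns into Python lists, one list per group.
collapsed = (
    df.groupby("categories", sort=False)
      .agg({"tags": list, "classification": list})
      .reset_index()[["tags", "categories", "classification"]]
)

print(collapsed)
```

If every row should end up in one list regardless of "categories", grouping is not needed at all: df["tags"].tolist() and df["classification"].tolist() give the two lists directly.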