Suppose I have a table like the one below. For example, the row under id 2 only shows the status and date history for that id.
id   status          date        Type
2    dissolved       2016/03/19  T1
nan  active          NaT         nan
3    dissolved       2016/03/19  T3
nan  active          2012/03/16  nan
4    in liquidation  2017/03/19  T2
nan  dissolved       NaT         nan
What I want to do is combine the rows by id. For example, for the first id (2), I would get:
id   status               date                Type
2    [dissolved, active]  [2016/03/19, None]  T1
I tried:
data.groupby(['id']).agg(lambda x: tuple(x)).applymap(list).reset_index()
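But this sets the id column to [2, nan], whereas I only need 2. What should I do? I only want the list format for my status and date values, not tuples for every column.
Input df: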
import pandas as pd

df = pd.DataFrame(
    [[2, "dissolved", "2016/03/19", "T1"],
     [float("nan"), "active", "NaT", float("nan")],
     [3, "dissolved", "2016/03/19", "T3"],
     [float("nan"), "active", "2012/03/16", float("nan")],
     [4, "in liquidation", "2017/03/19", "T2"],
     [float("nan"), "dissolved", "NaT", float("nan")]],
    columns=["id", "status", "date", "Type"],
)
Try the following:
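One possible approach, given as a minimal sketch rather than a definitive solution: forward-fill `id` so that each history row inherits the id above it, then group by `id` and aggregate `status` and `date` into lists while keeping the first non-missing `Type`. It assumes that a row with a missing id always belongs to the record directly above it, and that the literal string "NaT" in `date` should become `None`. The names `filled`, `result`, and `dates_to_list` are illustrative and not from the original post.

```python
import pandas as pd

df = pd.DataFrame(
    [
        [2, "dissolved", "2016/03/19", "T1"],
        [float("nan"), "active", "NaT", float("nan")],
        [3, "dissolved", "2016/03/19", "T3"],
        [float("nan"), "active", "2012/03/16", float("nan")],
        [4, "in liquidation", "2017/03/19", "T2"],
        [float("nan"), "dissolved", "NaT", float("nan")],
    ],
    columns=["id", "status", "date", "Type"],
)


def dates_to_list(values):
    """Collect a group's dates into a list, mapping the string "NaT" to None."""
    return [None if v == "NaT" else v for v in values]


# Assumption: a row with a missing id continues the record directly above it,
# so forward-filling gives every history row its parent id.
filled = df.copy()
filled["id"] = filled["id"].ffill().astype(int)

result = (
    filled.groupby("id", sort=False)
    .agg(
        status=("status", list),       # all statuses for the id, as a list
        date=("date", dates_to_list),  # all dates for the id, as a list
        Type=("Type", "first"),        # first non-missing Type in the group
    )
    .reset_index()
)

print(result)
```

Because `id` is the grouping key rather than an aggregated column, it stays a single scalar per row. For id 2 the row holds status `['dissolved', 'active']`, date `['2016/03/19', None]`, and Type `T1`.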
Depending on your dataset, you may need to adjust it slightly. Its output is as follows: