I have a DataFrame (df) like the one below.
Input:
ShipID CustomerCode
['USWPR04-20210429-S-00001', 'USWPR04-20210429-S-00002','USWPR04-20210429-S-00006'] USWPR04
['MSLPR04-20210429-S-00001', 'MSLPR04-20210429-S-00002'] MSLPR04
I need to create a new column, df['LinkID'], that holds a nested array built from the columns above.
Output:
df['LinkID']
[{ "shipID": "USWPR04-20210429-S-00001", "customerCode": "USWPR04", "shipNumber": "20210429-S-00001" },
{ "shipID": "USWPR04-20210429-S-00002", "customerCode": "USWPR04", "shipNumber": "20210429-S-00002" },
{ "shipID": "USWPR04-20210429-S-00006", "customerCode": "USWPR04", "shipNumber": "20210429-S-00006" }]
[{ "shipID": "MSLPR04-20210429-S-00001", "customerCode": "MSLPR04", "shipNumber": "20210429-S-00001" },
{ "shipID": "MSLPR04-20210429-S-00002", "customerCode": "MSLPR04", "shipNumber": "20210429-S-00002" }]
Final DataFrame output:
ShipID CustomerCode LinkID
['USWPR04-20210429-S-00001', 'USWPR04-20210429-S-00002','USWPR04-20210429-S-00006'] USWPR04 [{ "shipID": "USWPR04-20210429-S-00001", "customerCode": "USWPR04", "shipNumber": "20210429-S-00001" },{ "shipID": "USWPR04-20210429-S-00002", "customerCode": "USWPR04", "shipNumber": "20210429-S-00002" },{ "shipID": "USWPR04-20210429-S-00006", "customerCode": "USWPR04", "shipNumber": "20210429-S-00006" }]
['MSLPR04-20210429-S-00001', 'MSLPR04-20210429-S-00002'] MSLPR04 [{ "shipID": "MSLPR04-20210429-S-00001", "customerCode": "MSLPR04", "shipNumber": "20210429-S-00001" },{ "shipID": "MSLPR04-20210429-S-00002", "customerCode": "MSLPR04", "shipNumber": "20210429-S-00002" }]
How can I do this?
Updated answer:
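Steps:
1. Apply eval to ShipID (needed when the lists are stored as strings rather than as real lists).
2. Explode the DataFrame on ShipID.
3. Use the .str.split method to extract shipNumber.
4. Call to_dict('records') and load the result back into the DataFrame.
5. Use groupby and agg with list to convert it back to the original structure.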
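The steps above could be sketched roughly as follows. This is a minimal illustration rather than the answerer's exact code: it assumes ShipID arrives as string-encoded lists, uses ast.literal_eval in place of plain eval as a safer way to parse them, and assumes shipNumber is everything after the first hyphen. The intermediate names exploded, records, and result are made up for the example.

```python
import ast

import pandas as pd

# Sample input; ShipID is assumed to be stored as string-encoded lists.
df = pd.DataFrame({
    "ShipID": [
        "['USWPR04-20210429-S-00001', 'USWPR04-20210429-S-00002', 'USWPR04-20210429-S-00006']",
        "['MSLPR04-20210429-S-00001', 'MSLPR04-20210429-S-00002']",
    ],
    "CustomerCode": ["USWPR04", "MSLPR04"],
})

# Step 1: parse the strings into real lists (skip if they already are lists).
df["ShipID"] = df["ShipID"].apply(ast.literal_eval)

# Step 2: one row per ship ID.
exploded = df.explode("ShipID").reset_index(drop=True)

# Step 3: split at the first hyphen only and keep the remainder.
exploded["shipNumber"] = exploded["ShipID"].str.split("-", n=1).str[1]

# Step 4: build one dict per row and store it back as a column.
records = (
    exploded.rename(columns={"ShipID": "shipID", "CustomerCode": "customerCode"})
    [["shipID", "customerCode", "shipNumber"]]
    .to_dict("records")
)
exploded["LinkID"] = pd.Series(records, index=exploded.index)

# Step 5: collapse back to one row per customer, collecting values into lists.
result = (
    exploded.groupby("CustomerCode", sort=False)[["ShipID", "LinkID"]]
    .agg(list)
    .reset_index()[["ShipID", "CustomerCode", "LinkID"]]
)

print(result.to_string())
```

If ShipID already contains Python lists, the literal_eval step can be dropped. Grouping by CustomerCode works here because each customer code appears on a single input row; with repeated codes, grouping on the original row index would be the safer choice.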