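From a pandas DataFrame to a dictionary: values in the first column become the keys, and all corresponding values from the second column are collected in a list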

             t    gid
    0   2010.0  67290
    1   2020.0  92780
    2   2040.0  92780
    3   2060.0  92780
    4   2090.0  92780
    5   2110.0  92780
    6   2140.0  92780
    7   2190.0  92780
    8   2010.0  69110
    9   2010.0  78420
    10  2020.0  78420
    11  2020.0  78420
    12  2030.0  78420
    13  2040.0  78420

    import pandas as pd
    df1 = pd.read_pickle('stack.pkl')

    %timeit -n 2 df1.groupby('gid')['t'].apply(list).to_dict()
    2 loops, best of 3: 4.76 s per loop

    %timeit -n 2 df1.groupby('gid')['t'].apply(lambda x: x.tolist()).to_dict()
    2 loops, best of 3: 4.21 s per loop

    %timeit -n 2 df1.groupby('gid', sort=False)['t'].apply(list).to_dict()
    2 loops, best of 3: 4.84 s per loop

    %timeit -n 2 {name: group.tolist() for name, group in df1.groupby('gid')['t']}
    2 loops, best of 3: 4 s per loop

    %timeit -n 2 {name: group.tolist() for name, group in df1.groupby('gid', sort=False)['t']}
    2 loops, best of 3: 3.96 s per loop

    %timeit -n 2 {name: group['t'].tolist() for name, group in df1.groupby('gid', sort=False)}
    2 loops, best of 3: 7.16 s per loop

2 Answers

User
#1 · edited 2024-04-24 08:56:28

Here is one more answer, this one without apply:
    d = {name: group.tolist() for name, group in df.groupby('gid')['t']}

    {67290: [2010.0],
     69110: [2010.0],
     78420: [2010.0, 2020.0, 2020.0, 2030.0, 2040.0],
     92780: [2020.0, 2040.0, 2060.0, 2090.0, 2110.0, 2140.0, 2190.0]}
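For anyone who wants to try this without the original pickle file, here is a minimal sketch. It assumes the 14 sample rows shown in the question are representative; the DataFrame is rebuilt by hand rather than loaded from 'stack.pkl'.

```python
import pandas as pd

# Sample rows copied from the question; the real data comes from 'stack.pkl'.
df = pd.DataFrame({
    "t": [2010.0, 2020.0, 2040.0, 2060.0, 2090.0, 2110.0, 2140.0, 2190.0,
          2010.0, 2010.0, 2020.0, 2020.0, 2030.0, 2040.0],
    "gid": [67290] + [92780] * 7 + [69110] + [78420] * 5,
})

# Iterating over the grouped column yields (group key, Series) pairs,
# so each Series can be turned into a plain list directly.
d = {name: group.tolist() for name, group in df.groupby("gid")["t"]}

print(d[78420])
```

The comprehension skips the intermediate Series of lists that apply(list) builds, which may be why it timed slightly faster in the question.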

User
#2 · edited 2024-04-24 08:56:28

Try building the dictionary from a Series of lists, created with the groupby call below:
    # if necessary, convert the column to int
    df.t = df.t.astype(int)

    d = df.groupby('gid')['t'].apply(list).to_dict()
    print(d)
    {92780: [2020, 2040, 2060, 2090, 2110, 2140, 2190], 67290: [2010], 78420: [2010, 2020, 2020, 2030, 2040], 69110: [2010]}

    print(d[78420])
    [2010, 2020, 2020, 2030, 2040]
If performance matters, add the sort=False parameter to groupby:
    d = df.groupby('gid', sort=False)['t'].apply(list).to_dict()

    d = {name: group.tolist() for name, group in df.groupby('gid', sort=False)['t']}

    d = {name: group['t'].tolist() for name, group in df.groupby('gid', sort=False)}
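A quick way to check that the three variants agree is sketched below. It assumes the sample rows from the question (rebuilt by hand here) and Python 3.7+, where dicts keep insertion order; the names by_apply, by_series and by_frame are only illustrative.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "t": [2010.0, 2020.0, 2040.0, 2060.0, 2090.0, 2110.0, 2140.0, 2190.0,
          2010.0, 2010.0, 2020.0, 2020.0, 2030.0, 2040.0],
    "gid": [67290] + [92780] * 7 + [69110] + [78420] * 5,
})

# The three sort=False variants from the answer.
by_apply = df.groupby("gid", sort=False)["t"].apply(list).to_dict()
by_series = {name: group.tolist()
             for name, group in df.groupby("gid", sort=False)["t"]}
by_frame = {name: group["t"].tolist()
            for name, group in df.groupby("gid", sort=False)}

# Same mapping in all three cases.
assert by_apply == by_series == by_frame

# sort=False keeps group keys in order of first appearance;
# the default (sort=True) returns them sorted.
unsorted_keys = list(by_apply)
sorted_keys = list(df.groupby("gid")["t"].apply(list).to_dict())
print(unsorted_keys)
print(sorted_keys)
```

So sort=False only changes the order of the keys, not the contents. Whether it actually saves time depends on the data: in the timings posted in the question the difference was small.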
