Avoiding loops when building a DataFrame

# Note: DataFrame.append was removed in pandas 2.0, so this loop only runs on older versions
transformed = pd.DataFrame(columns=['from', 'to', 'obj'])
for index, row in origin.iterrows():
    for obj in row['obj']:
        transformed = transformed.append(
            pd.Series({'from': row['from'], 'to': row['to'], 'obj': obj}),
            ignore_index=True)

2 answers

User

Answer #1 · edited 2024-04-24 14:41:53

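Essentially, you are either repeating or chaining values, depending on the column.

So you can use numpy.repeat and itertools.chain as appropriate. This solution is efficient for a small number of columns, as in your example.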
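To make the two building blocks concrete before they are applied to a DataFrame, here is a minimal sketch of what np.repeat and chain.from_iterable each do on their own. The names values, counts, and nested are made up for illustration and are not part of the question.

```python
import numpy as np
from itertools import chain

# Hypothetical toy inputs, chosen only for illustration
values = ['abc', 'def']
counts = [2, 1]
nested = [['foo', 'bar'], ['gee']]

# np.repeat repeats each element as many times as the matching count
repeated = np.repeat(values, counts)
print(repeated.tolist())   # -> ['abc', 'abc', 'def']

# chain.from_iterable concatenates the inner lists into one flat sequence
flattened = list(chain.from_iterable(nested))
print(flattened)           # -> ['foo', 'bar', 'gee']
```

Because the counts are the lengths of the inner lists, the repeated values and the flattened list always end up the same length and line up row by row.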

import numpy as np
import pandas as pd
from itertools import chain

# set up dataframe
df = pd.DataFrame({'from': ['abc', 'def', 'gfhi'],
                   'to': ['xyz', 'uvw', 'rst'],
                   'obj': [['foo', 'bar'], ['gee'], ['foo', 'bar', 'baz']]})

# calculate length of each list in obj
lens = df['obj'].map(len)

# calculate result, repeating or chaining as appropriate
res = pd.DataFrame({'from': np.repeat(df['from'], lens),
                    'to': np.repeat(df['to'], lens),
                    'obj': list(chain.from_iterable(df['obj']))})

print(res)

   from   to  obj
0   abc  xyz  foo
0   abc  xyz  bar
1   def  uvw  gee
2  gfhi  rst  foo
2  gfhi  rst  bar
2  gfhi  rst  baz
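As a possible alternative, not part of the answer above: on pandas 0.25 or later, DataFrame.explode should produce the same rows without naming each column by hand. The sketch below assumes the same sample data as the answer; res_explode is a name introduced here for illustration.

```python
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame({'from': ['abc', 'def', 'gfhi'],
                   'to': ['xyz', 'uvw', 'rst'],
                   'obj': [['foo', 'bar'], ['gee'], ['foo', 'bar', 'baz']]})

# explode() emits one row per list element and repeats the other columns;
# reset_index(drop=True) replaces the duplicated index with 0..n-1
res_explode = df.explode('obj').reset_index(drop=True)

print(res_explode)
```

This may be the more convenient choice when there are many columns to carry along, since only the list column has to be mentioned.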
