在迭代i时更新pandas数据帧的更好方法

print df id | A | B 01 | 374 | 2014-02-01 04:45:04.401502 02 | 284 | 2014-03-12 21:23:12.401502 03 | 183 | 2014-02-01 09:12:08.401502 for row in df.itertuples(): id = row[1] col_a = row[2] col_b = row[3] N = random.randint(2,5) for i in xrange(0, N): new_col_a = col_a + 1 new_col_b = datetime.datetime.now() #update dataframe's A, B respectively df.ix[df['id'] == id, ['A', 'B']] = [col_a, col_b] print df id | A | B 01 | 374 | 2014-02-01 04:45:04.401502 01 | 375 | 2016-12-07 07:45:04.401502 01 | 376 | 2016-12-07 07:45:04.401502 01 | 377 | 2014-12-07 07:45:04.401502 02 | 284 | 2014-03-12 21:23:12.401502 02 | 285 | 2016-12-07 07:45:04.401502 02 | 286 | 2016-12-07 07:45:04.401502 03 | 183 | 2014-02-01 09:12:08.401502 03 | 184 | 2016-12-07 07:45:04.401502 03 | 185 | 2016-12-07 07:45:04.401502 03 | 186 | 2016-12-07 07:45:04.401502

1条回答

网友

1楼 · 发布于 2024-04-20 13:11:19

不是很好的解决方案，因为循环。在

因此，首先为df自定义函数的每一行apply，其中创建新的DataFrame，并将其附加到列表dfs。然后^{}它和apply函数，它可以在自定义函数之外工作：

np.random.seed(10)
dfs = []
def expand(x):
    N = np.random.choice([2,3,4])
    df = pd.DataFrame([x.values.tolist()], columns=x.index).reindex(range(N))
    df.A = df.A.fillna(1).cumsum()
    df.insert(1,'prevA', df.A.shift())
    dfs.append(df)

df.apply(expand, axis=1)

df1 = pd.concat(dfs, ignore_index=True)
df1.A = df1.A.astype(int)
df1.id = df1.id.ffill().astype(int)
df1.prevA = df1.prevA.bfill().astype(int)
df1.B = df1.B.fillna(pd.datetime.now())

print (df1)

   id  prevA    A                          B
0   1    374  374 2014-02-01 04:45:04.401502
1   1    374  375 2016-12-07 10:48:14.299336
2   1    375  376 2016-12-07 10:48:14.299336
0   2    284  284 2014-03-12 21:23:12.401502
1   2    284  285 2016-12-07 10:48:14.299336
2   2    285  286 2016-12-07 10:48:14.299336
0   3    183  183 2014-02-01 09:12:08.401502
1   3    183  184 2016-12-07 10:48:14.299336

相关问题更多 >

编程相关推荐

热门问题

热门文章