Pythonic incrementing numbers in an arbitrary slice of a DataFrame

Posted 2024-04-23 07:38:44


This code builds the DataFrame I want to end up with:

import pandas as pd

df = pd.DataFrame(index=pd.date_range(start='4/1/2012', periods=10))
df['foo'] = 7
df['what_i_want'] = [0, 0, 0, 0, 1, 2, 3, 0, 0, 0]

The result looks like this:

            foo  what_i_want
2012-04-01    7            0
2012-04-02    7            0
2012-04-03    7            0
2012-04-04    7            0
2012-04-05    7            1
2012-04-06    7            2
2012-04-07    7            3
2012-04-08    7            0
2012-04-09    7            0
2012-04-10    7            0

I'm trying to find a way to create these 1, 2, ..., n sequences on an arbitrary slice of the series. For example: df['2012-04-05':'2012-04-07'] = magic_function()

But I can't see how to do this without using a loop.


3 answers

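First take the index of the slice, then build a new Series from a range of the slice's length and reindex it against the full index, filling the remaining rows with 0: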

idx = df.loc['2012-04-05':'2012-04-07'].index
df['new'] = pd.Series(range(1, len(idx)+1), index=idx).reindex(df.index, fill_value=0)

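Alternatively, assign a range directly. The rows outside the slice are then NaN, so they have to be filled with 0 and the column converted to int: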

n = len(df.loc['2012-04-05':'2012-04-07'].index)
df.loc['2012-04-05':'2012-04-07', 'new'] = range(1, n + 1)
df['new'] = df['new'].fillna(0).astype(int)
print(df)
            foo  new
2012-04-01    7    0
2012-04-02    7    0
2012-04-03    7    0
2012-04-04    7    0
2012-04-05    7    1
2012-04-06    7    2
2012-04-07    7    3
2012-04-08    7    0
2012-04-09    7    0
2012-04-10    7    0
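
The fill-and-convert step is needed because the column does not exist before the assignment. A minimal sketch of that behaviour, assuming the same frame as in the question; the name missing_before is only illustrative, and a plain list is used here in place of range:

```python
import pandas as pd

df = pd.DataFrame(index=pd.date_range(start='4/1/2012', periods=10))
df['foo'] = 7

# 'new' does not exist yet, so pandas creates it and leaves
# every row outside the slice as NaN.
df.loc['2012-04-05':'2012-04-07', 'new'] = [1, 2, 3]
missing_before = int(df['new'].isna().sum())  # rows that were not assigned

# Replace the NaN rows with 0 and convert the column to integers.
df['new'] = df['new'].fillna(0).astype(int)
print(missing_before)
print(df['new'].tolist())
```

Creating the column with 0 first, as the next answer does, avoids the NaN step altogether.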

IIUC, you can slice with loc and assign a range.

df['what_i_want'] = 0
df.loc['2012-04-05':'2012-04-07', 'what_i_want'] = range(1, 4)

df

            foo  what_i_want
2012-04-01    7            0
2012-04-02    7            0
2012-04-03    7            0
2012-04-04    7            0
2012-04-05    7            1
2012-04-06    7            2
2012-04-07    7            3
2012-04-08    7            0
2012-04-09    7            0
2012-04-10    7            0
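
The range(1, 4) above is hard-coded for a three-row slice. For an arbitrary slice, the length can be measured first. The following is only a sketch: number_slice is a hypothetical helper name, not a pandas function, and it assumes the slice labels work with loc on the frame's index.

```python
import numpy as np
import pandas as pd

def number_slice(df, start, end, column):
    """Hypothetical helper: write 1..n into df[column] over start:end, 0 elsewhere."""
    df[column] = 0                      # integer column, so no NaN appears
    n = len(df.loc[start:end])          # number of rows in the label slice
    df.loc[start:end, column] = np.arange(1, n + 1)
    return df

df = pd.DataFrame(index=pd.date_range(start='4/1/2012', periods=10))
df['foo'] = 7
number_slice(df, '2012-04-05', '2012-04-07', 'what_i_want')
print(df)
```

The helper modifies the frame in place and returns it for convenience; calling it with a different start and end produces a sequence of the matching length.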

You could do it like this:

# Use a single .loc call: chained indexing (df.loc[...]['col'] = ...) may assign to a copy.
df.loc['2012-04-08':'2012-04-10', 'what_i_want'] = \
    df.loc['2012-04-08':'2012-04-10'].apply(lambda x: 1, axis=1).cumsum()

This maps every selected row to 1 and then takes the cumulative sum of those values.
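
The same cumulative-sum idea can be written without apply by using a boolean mask. This is a sketch of a swapped-in variant, not the answerer's code; the names mask, start and end are illustrative, and the dates are those from the question.

```python
import pandas as pd

df = pd.DataFrame(index=pd.date_range(start='4/1/2012', periods=10))
df['foo'] = 7

start, end = pd.Timestamp('2012-04-05'), pd.Timestamp('2012-04-07')

# True for rows inside the slice, False elsewhere.
mask = (df.index >= start) & (df.index <= end)

# cumsum counts the True values seen so far; multiplying by the mask
# resets every row outside the slice back to 0.
df['what_i_want'] = mask.cumsum() * mask
print(df)
```

Because the mask covers the whole index, no separate fill step is needed for the rows outside the slice.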
