Creating a range of numbers in Pandas based on a single column

Posted 2024-05-23 08:06:15


I have a DataFrame:

df2 = pd.DataFrame({'ID':['A','B','C','D','E'], 'loc':['Lon','Tok','Ber','Ams','Rom'], 'start':[20,10,30,40,43]})


    ID  loc     start
0   A   Lon     20
1   B   Tok     10
2   C   Ber     30
3   D   Ams     40
4   E   Rom     43

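I want to add a column named range that takes each value in 'start' and produces the descending sequence from that value down to the value 10 less than it (both ends included, so 11 values), all in the same row.

Desired output: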

    ID  loc     start    range
0   A   Lon     20       20,19,18,17,16,15,14,13,12,11,10
1   B   Tok     10       10,9,8,7,6,5,4,3,2,1,0
2   C   Ber     30       30,29,28,27,26,25,24,23,22,21,20
3   D   Ams     40       40,39,38,37,36,35,34,33,32,31,30
4   E   Rom     43       43,42,41,40,39,38,37,36,35,34,33

I tried:

df2['range'] = [i for i in range(df2.start, df2.start -10)]

def create_range2(row):
  
  return df2['start'].between(df2.start, df2.start - 10)
  

df2.loc[:, 'range'] = df2.apply(create_range2, axis = 1)

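However, I can't seem to get the desired output. I plan to apply this solution to several DataFrames, one of which has more than 2 million rows.

Thanks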


2 Answers

You can write a function that builds the range and .apply it to the start column, like this:

import pandas as pd
df2 = pd.DataFrame({'ID':['A','B','C','D','E'], 'loc':['Lon','Tok','Ber','Ams','Rom'], 'start':[20,10,30,40,43]})
def make_10(x):
    return list(range(x, x-10-1, -1))
df2["range"] = df2["start"].apply(make_10)
print(df2)

Output

  ID  loc  start                                         range
0  A  Lon     20  [20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10]
1  B  Tok     10            [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
2  C  Ber     30  [30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20]
3  D  Ams     40  [40, 39, 38, 37, 36, 35, 34, 33, 32, 31, 30]
4  E  Rom     43  [43, 42, 41, 40, 39, 38, 37, 36, 35, 34, 33]

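Explanation: the .apply method of a pandas.Series (a column of a pandas.DataFrame) accepts a function and applies it element-wise. Note the two -1 values in the range call. The first is in the stop argument x-10-1: range includes its start but excludes its stop, so the stop has to be one step past x-10 for x-10 to be included. The second -1 is the step, because you want descending values.

Does this work?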
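To make the half-open behaviour of range concrete, here is a small sketch using only the built-in range. The variable names short and full are illustrative and do not come from the answer.

```python
start = 20

# stop = start - 10 is excluded, so the sequence ends at 11 (ten values)
short = list(range(start, start - 10, -1))

# stop = start - 10 - 1 pushes the excluded bound one step further,
# so the sequence ends at 10 (eleven values)
full = list(range(start, start - 10 - 1, -1))

print(short)
print(full)
```

The second form is the one used in make_10 above.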

df2['range'] = df2.apply(lambda row: list(range(row['start'],row['start']-11,-1)),axis=1)
df2

Output


    ID  loc start   range
0   A   Lon 20  [20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10]
1   B   Tok 10  [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
2   C   Ber 30  [30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20]
3   D   Ams 40  [40, 39, 38, 37, 36, 35, 34, 33, 32, 31, 30]
4   E   Rom 43  [43, 42, 41, 40, 39, 38, 37, 36, 35, 34, 33]

Or, if you need a comma-separated string:

df2['range'] = df2.apply(lambda row: ','.join([str(v) for v in range(row['start'],row['start']-11,-1)]),axis=1)

This gives

    ID  loc start   range
0   A   Lon 20  20,19,18,17,16,15,14,13,12,11,10
1   B   Tok 10  10,9,8,7,6,5,4,3,2,1,0
2   C   Ber 30  30,29,28,27,26,25,24,23,22,21,20
3   D   Ams 40  40,39,38,37,36,35,34,33,32,31,30
4   E   Rom 43  43,42,41,40,39,38,37,36,35,34,33
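Since the question mentions a DataFrame with more than 2 million rows, a row-wise apply with axis=1 may be slow there. The following is a sketch of an alternative that neither answer proposes: it builds all the values at once with NumPy broadcasting. The names offsets, values and range_str are illustrative, and whether it is actually faster is an assumption worth timing on your own data.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({'ID': ['A', 'B', 'C', 'D', 'E'],
                    'loc': ['Lon', 'Tok', 'Ber', 'Ams', 'Rom'],
                    'start': [20, 10, 30, 40, 43]})

# Offsets 0..10; broadcasting subtracts them from every start value at once,
# giving a 2-D integer array of shape (number of rows, 11).
offsets = np.arange(11)
values = df2['start'].to_numpy()[:, None] - offsets

# One Python list per row
df2['range'] = values.tolist()

# Comma-separated string per row (hypothetical extra column for illustration)
df2['range_str'] = [','.join(map(str, row)) for row in values.tolist()]

print(df2)
```

The string join is still a Python-level loop, so most of the potential saving is in producing the numbers themselves. If the 2-D array is enough for the downstream work, skipping the list and string columns avoids storing Python objects in the DataFrame.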
