How do I initialize a new column with None and conditionally update its values with a tuple?

2 Answers

User

Answer 1 · edited 2024-05-16 10:08:22

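Since you cannot know in advance whether the condition will match one row or several, it is best to build a Series of tuples and repeat the tuple once for each row the condition matches: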

condition = df['points'] == 50
df.loc[condition, 'auditor'] = pd.Series([(1, 2)]).repeat(condition.sum()).values

print(df)

   points  time    year     month  points_h1 auditor
0    50.0  5:00  2010.0       NaN        NaN  (1, 2)
1    25.0  6:00     NaN  february        NaN    None
2    90.0  9:00     NaN   january        NaN    None
3     NaN   NaN     NaN      june       20.0    None

To see what I mean, consider the case where the second row also has points equal to 50:

import pandas as pd

d = [{'points': 50, 'time': '5:00', 'year': 2010},
     {'points': 50, 'time': '6:00', 'month': "february"},
     {'points': 90, 'time': '9:00', 'month': 'january'},
     {'points_h1': 20, 'month': 'june'}]
df = pd.DataFrame(d)
df['auditor'] = None  # start the new column as all None (object dtype)
print(df, '\n\n')

condition = df['points'] == 50
df.loc[condition, 'auditor'] = pd.Series([(1, 2)]).repeat(condition.sum()).values
print(df)

   points  time    year     month  points_h1 auditor
0    50.0  5:00  2010.0       NaN        NaN    None
1    50.0  6:00     NaN  february        NaN    None
2    90.0  9:00     NaN   january        NaN    None
3     NaN   NaN     NaN      june       20.0    None 


   points  time    year     month  points_h1 auditor
0    50.0  5:00  2010.0       NaN        NaN  (1, 2)
1    50.0  6:00     NaN  february        NaN  (1, 2)
2    90.0  9:00     NaN   january        NaN    None
3     NaN   NaN     NaN      june       20.0    None
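The repeat-by-match-count idea above can be wrapped in a small helper. This is only a sketch: `set_tuple_where` is a hypothetical name, not a pandas function, and it assumes the target column already exists with object dtype (for example, after `df['auditor'] = None`).

```python
import pandas as pd

def set_tuple_where(df, condition, column, value):
    """Write the same tuple into `column` for every row where `condition` is True.

    Hypothetical helper for illustration; modifies `df` in place and returns it.
    """
    n_matches = int(condition.sum())
    # One object element per matching row, so the tuple is stored whole
    # instead of being unpacked across rows.
    df.loc[condition, column] = pd.Series([value]).repeat(n_matches).values
    return df

# One matching row.
one = pd.DataFrame({'points': [50, 25, 90]})
one['auditor'] = None
set_tuple_where(one, one['points'] == 50, 'auditor', (1, 2))

# Several matching rows.
many = pd.DataFrame({'points': [50, 50, 90]})
many['auditor'] = None
set_tuple_where(many, many['points'] == 50, 'auditor', (1, 2))

print(one)
print(many)
```

The same call covers both the single-match and the multi-match case, because the right-hand side always has as many elements as there are matching rows.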

User

Answer 2 · edited 2024-05-16 10:08:22

You can also use np.where(), which is a handy function for conditional assignment:

df['auditor'] = np.where((df['points'] == 50), pd.Series([(1, 2)]), None)

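Or do it in one line with .assign() while creating the DataFrame. Pass a callable so the condition is evaluated on the newly built frame rather than on a df that may not exist yet:

df = pd.DataFrame(d).assign(auditor=lambda x: np.where(x['points'] == 50, pd.Series([(1, 2)]), None))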

import numpy as np
import pandas as pd

d = [{'points': 50, 'time': '5:00', 'year': 2010},
     {'points': 25, 'time': '6:00', 'month': "february"},
     {'points': 90, 'time': '9:00', 'month': 'january'},
     {'points_h1': 20, 'month': 'june'}]
# The lambda receives the freshly built frame, so no pre-existing df is needed.
df = pd.DataFrame(d).assign(auditor=lambda x: np.where(x['points'] == 50, pd.Series([(1, 2)]), None))
df

Out[34]: 
   points  time    year     month  points_h1 auditor
0    50.0  5:00  2010.0       NaN        NaN  (1, 2)
1    25.0  6:00     NaN  february        NaN    None
2    90.0  9:00     NaN   january        NaN    None
3     NaN   NaN     NaN      june       20.0    None
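It may help to see why the tuple is wrapped in pd.Series([...]) before it goes into np.where(). The sketch below uses a hand-written boolean array as a stand-in for df['points'] == 50. It shows that NumPy converts a bare tuple into a two-element numeric array, while the Series wrapper yields a single object element that broadcasts against the condition.

```python
import numpy as np
import pandas as pd

# A bare tuple is unpacked into two integers: shape (2,).
bare = np.asarray((1, 2))

# Wrapped in a one-element Series, the tuple stays whole: shape (1,), dtype object.
wrapped = np.asarray(pd.Series([(1, 2)]))

print(bare.shape, bare.dtype)
print(wrapped.shape, wrapped.dtype)

# Stand-in for df['points'] == 50 on a four-row frame (an assumption for this sketch).
condition = np.array([True, False, False, False])

# The single object element broadcasts across all four rows.
auditor = np.where(condition, wrapped, None)
print(auditor)
```

With the bare tuple, the shape (2,) array would not broadcast against a four-row condition, which is why the wrapper is used.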

Following up on your comment: if you want to define the conditions and results manually and then loop over them with np.where(), you can do this:

import numpy as np
import pandas as pd

d = [{'points': 50, 'time': '5:00', 'year': 2010},
     {'points': 25, 'time': '6:00', 'month': "february"},
     {'points': 90, 'time': '9:00', 'month': 'january'},
     {'points_h1': 20, 'month': 'june'}]
df = pd.DataFrame(d)

# Manually set conditions and results
c1 = (df['points'] == 50)
r1 = pd.Series([(1, 2)])
c2 = (df['points'] == 25)
r2 = pd.Series([(1, 3)])
conditions = [c1,c2]
results = [r1,r2]

df['auditor'] = None
for c, r in zip(conditions, results):
    df['auditor'] = np.where(c, r, df['auditor'])
df

Out[39]: 
   points  time    year     month  points_h1 auditor
0    50.0  5:00  2010.0       NaN        NaN  (1, 2)
1    25.0  6:00     NaN  february        NaN  (1, 3)
2    90.0  9:00     NaN   january        NaN    None
3     NaN   NaN     NaN      june       20.0    None

See Anky's comment. Instead of:

df['auditor'] = None
for c, r in zip(conditions, results):
    df['auditor'] = np.where(c, r, df['auditor'])

You can use np.select to avoid the loop. It is a more Pythonic and more efficient way to do this:

df['auditor'] = np.select(conditions, results, default=None)
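One difference is worth keeping in mind when swapping the loop for np.select: if a row satisfies more than one condition, np.select keeps the result of the first matching condition, while the np.where loop overwrites on every pass and so keeps the last. The sketch below uses deliberately overlapping conditions (an assumption for illustration; the conditions in the answer do not overlap) to show the two behaviours side by side.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'points': [50, 25, 90, np.nan]})

# Overlapping on purpose: row 0 satisfies both conditions.
conditions = [df['points'] == 50, df['points'] >= 25]
results = [pd.Series([(1, 2)]), pd.Series([(1, 3)])]

# np.select: the first condition that matches a row wins.
selected = np.select(conditions, results, default=None)

# np.where loop: each pass overwrites earlier values, so the last match wins.
looped = np.full(len(df), None, dtype=object)
for c, r in zip(conditions, results):
    looped = np.where(c, r, looped)

print(list(selected))
print(list(looped))
```

When the conditions are mutually exclusive, as in the answer above, both approaches give the same column.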
