How do I forward-fill values conditional on another column?

Posted 2024-04-20 02:45:20

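I am analyzing the table below. When the event column shows transaction, the offer_id column contains several None values. I want to forward-fill a None only when the preceding event is offer_viewed; otherwise the None should be filled with 0 or left as None.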

The DataFrame is:

import pandas as pd
df = pd.DataFrame({'event': ['offer_received', 'offer_viewed', 'transaction', 'transaction', 'offer_received', 'transaction'], 'user': ['A', 'A', 'A', 'A', 'A', 'A'], 'value': [0, 0, 1.09, 2.55, 0, 3.02], 'offer_id': ['0b1e1539f2cc45b7b9fa7c272da2e1d7', '0b1e1539f2cc45b7b9fa7c272da2e1d7', None, None, '3f207df678b143eea3cee63160fa8bed', None], 'days': [0, 0.25, 9.75, 11, 0, 9.75]})
event           user   value    offer_id                            days
offer_received  A      0.00     0b1e1539f2cc45b7b9fa7c272da2e1d7    0.00
offer_viewed    A      0.00     0b1e1539f2cc45b7b9fa7c272da2e1d7    0.25
transaction     A      1.09     None                                9.75
transaction     A      2.55     None                                11
offer_received  A      0.00     3f207df678b143eea3cee63160fa8bed    0.00
transaction     A      3.02     None                                9.75

I tried df.offer_id.fillna(method='ffill'), but I can't get it to work: I don't know how to put a condition on the event column so that offer_id is forward-filled for a transaction only when the previous event is offer_viewed.

My expected result is:

event           user   value    offer_id                            days
offer_received  A      0.00     0b1e1539f2cc45b7b9fa7c272da2e1d7    0.00
offer_viewed    A      0.00     0b1e1539f2cc45b7b9fa7c272da2e1d7    0.25
transaction     A      1.09     0b1e1539f2cc45b7b9fa7c272da2e1d7    9.75
transaction     A      2.55     0b1e1539f2cc45b7b9fa7c272da2e1d7    11
offer_received  A      0.00     3f207df678b143eea3cee63160fa8bed    0.00
transaction     A      3.02     None                                9.75


1 answer
Answer 1 · posted 2024-04-20 02:45:20

I think you can do this with shift(), ffill(), and where():

import numpy as np
import pandas as pd

df = pd.DataFrame({'e': ['r', 'v', 't', 'r', 't'], 'oid': [1, 1, np.nan, 2, np.nan]})
df
#    e  oid
# 0  r  1.0
# 1  v  1.0
# 2  t  NaN
# 3  r  2.0
# 4  t  NaN
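# Take the forward-filled value only where the previous row's event is 'v';
# every other row keeps its original oid.
df.oid = df.oid.ffill().where(df.e.shift() == 'v', df.oid)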
df
#    e  oid
# 0  r  1.0
# 1  v  1.0
# 2  t  1.0
# 3  r  2.0
# 4  t  NaN

You can even skip ffill() and use shift() twice:

df = pd.DataFrame({'e': ['r', 'v', 't', 'r', 't'], 'oid': [1, 1, np.nan, 2, np.nan]})

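# Take the previous row's oid where the previous row's event is 'v';
# every other row keeps its original oid.
df.oid = df.oid.shift().where(df.e.shift() == 'v', df.oid)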
df
#    e  oid
# 0  r  1.0
# 1  v  1.0
# 2  t  1.0
# 3  r  2.0
# 4  t  NaN
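
Note that both snippets above fill only the row immediately after the viewed event, while the expected output in the question also fills the second of two consecutive transactions. Here is a minimal sketch of one way to cover that case. It assumes the goal is "fill every transaction whose most recent non-transaction event, for the same user, is offer_viewed". That reading of the question, and the names is_tx, last_event, filled and fill_here, are my own assumptions.

```python
import pandas as pd

# The question's data, with real None values instead of the string 'None'.
df = pd.DataFrame({
    'event': ['offer_received', 'offer_viewed', 'transaction',
              'transaction', 'offer_received', 'transaction'],
    'user': ['A'] * 6,
    'offer_id': ['0b1e1539f2cc45b7b9fa7c272da2e1d7',
                 '0b1e1539f2cc45b7b9fa7c272da2e1d7',
                 None, None,
                 '3f207df678b143eea3cee63160fa8bed',
                 None],
})

is_tx = df['event'] == 'transaction'

# Most recent non-transaction event seen so far, computed per user.
last_event = df['event'].where(~is_tx).groupby(df['user']).ffill()

# Forward-filled offer_id, also computed per user.
filled = df.groupby('user')['offer_id'].ffill()

# Use the filled value only for transactions whose last non-transaction
# event was 'offer_viewed'; every other row keeps its original value.
fill_here = is_tx & (last_event == 'offer_viewed')
df['offer_id'] = filled.where(fill_here, df['offer_id'])
```

With the question's data this should fill both transactions that follow offer_viewed and leave the last transaction empty. Grouping by user is there so that one user's offer does not carry over into another user's rows; if the data only ever holds a single user, the groupby calls can be dropped.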
