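So, I'd like to know how to split a column based on a condition. My goal is to study user activity, but to do that I need to apply a condition. I have this DataFrame: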
import pandas as pd

df = pd.DataFrame({'User': ["juan","juan","juan","juan","petter","petter","petter","petter","petter","petter","petter","petter","ana","ana","ana","ana","raul","raul","raul","raul"],
'time': ["2/1/2019","3/1/2019","4/1/2019","6/1/2019","2/1/2019","5/1/2019","6/1/2019","10/1/2019","11/1/2019","12/1/2019","13/1/2019","14/1/2019","8/1/2019","10/1/2019","15/1/2019","20/1/2019","15/1/2019","17/1/2019","18/1/2019","19/1/2019"],
'activity': ["fly", "hotel","car","jump","fly", "hotel","jump","car","fly", "car","hotel","car","car", "hotel","car","hotel","fly", "hotel","car","car"],
'%timeper_user': ["4 days","4 days","4 days","4 days","8 days","8 days","8 days","8 days","3 days","3 days","3 days","3 days","12 days","12 days","12 days","12 days","4 days","4 days","4 days","4 days"]})
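As you can see, each row has a date column (time) and a per-user time window column (%timeper_user). There is also a column (activity) holding the activity the user performed on that date. The idea is to do a "conditional split" that places each of a user's activities in its own column: act1, act2, act3, act4. But when a user performs an activity outside the window (time + %timeper_user), the activity should go into a different set of columns, for example act21, act22, act23, act24. I'd like it to look like this: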
df2 = pd.DataFrame({'User': ["juan","petter","ana","raul"],
"act1":["fly","fly","car","fly"],
"act2":["hotel","hotel","hotel","hotel"],
"act3":["car","jump","car","car"],
"act4":["jump","car","hotel","car"],
"actn":["","","",""],
"act21":["","fly","",""],
"act22":["","car","",""],
"act23":["","hotel","",""],
"act24":["","car","",""]})
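(df2) is the output I want. Look at user petter: he goes past the window (2/1/2019 + 8 days) = 10/1/2019, so from 11/1/2019 onward his activities are placed in act21, act22, act23, act24. I have many users, so I don't know how to write a function that does this for all of them, user by user. I'd really appreciate any help. Thanks.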
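To make the petter arithmetic concrete, here is a minimal check of that cutoff. It assumes the dates are day-first (2/1/2019 is 2 January 2019) and that an activity exactly on the cutoff day still counts as inside the window, which is what df2 implies for the 10/1/2019 row.

```python
import pandas as pd

# Assumption: dates are day-first, so "2/1/2019" is 2 January 2019.
start = pd.to_datetime("2/1/2019", dayfirst=True)
window = pd.to_timedelta("8 days")
cutoff = start + window

print(cutoff.date())  # 2019-01-10

# The 10/1/2019 activity is on the cutoff, so it stays in the first range;
# the 11/1/2019 activity is past it, so it starts the second range.
on_cutoff = pd.to_datetime("10/1/2019", dayfirst=True)
later = pd.to_datetime("11/1/2019", dayfirst=True)
print(on_cutoff > cutoff)  # False
print(later > cutoff)      # True
```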
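My idea is this: if a user's events fall within the range (time + %timeper_user), all of those activities belong to activity range 1 (act1, act2, act3, act4). If the date is later, the user moves into activity range 2 (act21, act22, act23, act24). Put simply: if petter travels from the US to Madrid, he might go to a hotel, rent a car and take a flight while he is there. But once he is back in the US, petter might buy a second flight, and that would fall into activity range 2 (act21, act22, act23, act24). If you run the code, df is my DataFrame and df2 is the data I want.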
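One possible way to get from df to something shaped like df2 is sketched below. It is not a definitive solution and rests on several assumptions: dates are day-first; each user's window starts at their earliest date and uses the %timeper_user value from their first row (so 8 days for petter, ignoring the later "3 days" rows); there are only two ranges; and "hotl" in the sample data is a typo for "hotel". The helper columns (window, period, n, col) are names made up for this sketch, and the empty "actn" placeholder column from df2 is not produced.

```python
import pandas as pd

df = pd.DataFrame({
    "User": ["juan"] * 4 + ["petter"] * 8 + ["ana"] * 4 + ["raul"] * 4,
    "time": ["2/1/2019", "3/1/2019", "4/1/2019", "6/1/2019",
             "2/1/2019", "5/1/2019", "6/1/2019", "10/1/2019",
             "11/1/2019", "12/1/2019", "13/1/2019", "14/1/2019",
             "8/1/2019", "10/1/2019", "15/1/2019", "20/1/2019",
             "15/1/2019", "17/1/2019", "18/1/2019", "19/1/2019"],
    # "hotl" in the original data is assumed to be a typo for "hotel".
    "activity": ["fly", "hotel", "car", "jump",
                 "fly", "hotel", "jump", "car",
                 "fly", "car", "hotel", "car",
                 "car", "hotel", "car", "hotel",
                 "fly", "hotel", "car", "car"],
    "%timeper_user": ["4 days"] * 4 + ["8 days"] * 4 + ["3 days"] * 4
                     + ["12 days"] * 4 + ["4 days"] * 4,
})

user_order = df["User"].unique()  # keep users in their original order

# Parse day-first dates and the "N days" strings.
df["time"] = pd.to_datetime(df["time"], dayfirst=True)
df["window"] = pd.to_timedelta(df["%timeper_user"])
df = df.sort_values("time", kind="stable")

# Range 1 if the activity is on or before (first date + first window), else range 2.
by_user = df.groupby("User", sort=False)
cutoff = by_user["time"].transform("min") + by_user["window"].transform("first")
df["period"] = (df["time"] > cutoff).astype(int) + 1

# Number the activities within each (user, range) and build the column names:
# act1, act2, ... for range 1 and act21, act22, ... for range 2.
df["n"] = df.groupby(["User", "period"]).cumcount() + 1
df["col"] = [f"act{n}" if p == 1 else f"act{p}{n}"
             for p, n in zip(df["period"], df["n"])]

# Column order: all range-1 columns first, then range-2 columns.
col_order = df.sort_values(["period", "n"])["col"].drop_duplicates().tolist()

wide = (df.pivot(index="User", columns="col", values="activity")
          .reindex(index=user_order, columns=col_order)
          .fillna("")
          .reset_index())
wide.columns.name = None

print(wide)
```

With the sample data this puts petter's last four activities in act21 to act24 and leaves those columns empty for the other users. Supporting more than two ranges would need a different rule for where each new window starts, which the question does not specify.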