Capturing rows of activity up to the first response in Python

senddate    userid  content  response
2016-06-01     100    50505       NaN
2016-06-01     100    50505       NaN
2016-06-01     100    50505         1
2016-06-01     100    50505         1
2016-06-02     100    50505       NaN
2016-06-02     100    50505         1
2016-06-02     100    50505         1

senddate    userid  content  response
2016-06-01     100    50505       NaN
2016-06-01     100    50505       NaN
2016-06-01     100    50505         1
2016-06-02     100    50505       NaN
2016-06-02     100    50505         1

1 Answer

User

#1 · Posted on 2024-04-23 10:25:22

You can use first_valid_index to achieve this:

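If you group by the 'senddate' column, you can build a boolean mask inside each group by comparing the index against first_valid_index(), which keeps every row up to and including the first non-NaN response. This relies on the index being sorted in ascending order. The result has a MultiIndex whose first level is the date and whose second level is the original row index of the rows that were kept. get_level_values(1) retrieves that second level, and loc then uses those labels to select the matching rows from df: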

In [17]:
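import pandas as pd
# your_file_path is a placeholder for the path to your CSV file
df = pd.read_csv(your_file_path)
# keep, per senddate, the rows up to and including the first non-NaN response
df.loc[df.groupby('senddate')['response'].apply(lambda x: x[x.index <= x.first_valid_index()]).index.get_level_values(1)]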

Out[17]:
    senddate  userid  content  response
0 2016-06-01     100    50505       NaN
1 2016-06-01     100    50505       NaN
2 2016-06-01     100    50505       1.0
4 2016-06-02     100    50505       NaN
5 2016-06-02     100    50505       1.0
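
One caveat with the one-liner in In [17]: first_valid_index() returns None for a group that has no non-NaN response, so the comparison against the index has nothing to work with. The following is a minimal sketch of a guarded variant, not the answerer's code. The helper name up_to_first_response and the choice to keep every row of a group that never got a response are assumptions; the DataFrame is rebuilt from the sample data in the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "senddate": pd.to_datetime(["2016-06-01"] * 4 + ["2016-06-02"] * 3),
    "userid": 100,
    "content": 50505,
    "response": [np.nan, np.nan, 1, 1, np.nan, 1, 1],
})


def up_to_first_response(group):
    """Return the rows of one senddate group up to and including the first response.

    Hypothetical helper; assumes the group's index is unique and sorted.
    """
    first = group["response"].first_valid_index()
    if first is None:
        # Assumption: a group with no response at all is kept in full.
        return group
    # Label-based slicing with .loc includes the end label.
    return group.loc[:first]


# Apply the helper to each senddate group and stitch the pieces back together.
pieces = [up_to_first_response(g) for _, g in df.groupby("senddate")]
guarded_result = pd.concat(pieces)
print(guarded_result)
```

If dropping groups without any response fits your data better, return group.iloc[0:0] in the None branch instead.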

A breakdown of the expression in In [17]:

In [18]:
df.groupby('senddate')['response'].apply(lambda x: x[x.index <= x.first_valid_index()])

Out[18]:
senddate     
2016-06-01  0    NaN
            1    NaN
            2    1.0
2016-06-02  4    NaN
            5    1.0
Name: response, dtype: float64

In [19]:
df.groupby('senddate')['response'].apply(lambda x: x[x.index <= x.first_valid_index()]).index.get_level_values(1)

Out[19]:
Int64Index([0, 1, 2, 4, 5], dtype='int64')
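
As an alternative sketch that avoids apply altogether, the same selection can be expressed with a per-group running count of responses: a row is kept when no earlier row in its senddate group has a response. This is a swapped-in technique rather than the approach shown in the answer. The variable names are illustrative, and, as an assumption, groups with no response at all are kept in full.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "senddate": pd.to_datetime(["2016-06-01"] * 4 + ["2016-06-02"] * 3),
    "userid": 100,
    "content": 50505,
    "response": [np.nan, np.nan, 1, 1, np.nan, 1, 1],
})

# 1 where a response exists, 0 where it is NaN.
responded = df["response"].notna().astype(int)

# Running count of responses within each senddate group, minus the current
# row, gives the number of responses seen strictly before each row.
prior_responses = responded.groupby(df["senddate"]).cumsum() - responded

# Keep rows that have no earlier response in their group.
cumsum_result = df[prior_responses == 0]
print(cumsum_result)
```

This version depends only on the row order within each group, not on the index labels being sorted.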
