Pandas: converting a for loop with if/else conditions into an apply call (lambda function)

Posted 2024-06-09 15:16:01


I have the following function, which uses a for loop:

def add_CQI_iterrows(df):
    # Date of the first row, converted to a string
    previous_row = df['Date'].astype(str)[0]
    CQI_index = 0
    series = []

    for index, row in df.iterrows():
        if row['Date'] == previous_row:
            # Same date as the previous row: keep the current index
            previous_row = row['Date']
            print(CQI_index)
        else:
            # The date changed: move on to the next index
            CQI_index += 1
            previous_row = row['Date']
        series.append(CQI_index)
    df['CQI'] = series

    return df

I would like to find a way to convert this for loop into an apply call. Something like this (which does not work):

def add_CQI_apply(df):
    previous_row = df['Date'].astype(str)[0]
    CQI_index = 1
    series = []
    
    df['CQI'] = df.apply(lambda row: previous_row = row['Date'] if row['Date'] == previous_row else CQI_index += 1 and previous_row = row['Date'], axis=1)
    
    return df

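I want to make this conversion because I would like to see how fast the apply method is, and whether the apply approach can be vectorized over a pandas Series.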

Here is my data (data.json):

[
 {
   "Date": "9/20/2020 8:50",
   "UE": 1
 },
 {
   "Date": "9/20/2020 8:50",
   "UE": 2
 },
 {
   "Date": "9/20/2020 8:50",
   "UE": 3
 },
 {
   "Date": "9/20/2020 8:57",
   "UE": 1
 },
 {
   "Date": "9/20/2020 8:57",
   "UE": 8
 },
 {
   "Date": "9/20/2020 8:57",
   "UE": 2
 },
 {
   "Date": "9/20/2020 9:12",
   "UE": 1
 },
 {
   "Date": "9/20/2020 9:12",
   "UE": 5
 },
 {
   "Date": "9/20/2020 9:12",
   "UE": 3
 },
 {
   "Date": "9/20/2020 9:20",
   "UE": 1
 },
 {
   "Date": "9/20/2020 9:20",
   "UE": 4
 },
 {
   "Date": "9/20/2020 9:20",
   "UE": 3
 }
]

Finally, here is the function that loads this data:

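import numpy as np
import pandas as pd

def upload_data(file):
    df = pd.read_json(file)
    # The sample dates look like "9/20/2020 8:50": month/day/year hour:minute
    df['Date'] = pd.to_datetime(df['Date'], format="%m/%d/%Y %H:%M")
    df['CQI'] = np.nan  # placeholder, filled in later
    return df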

1 answer
User
#1 · Posted 2024-06-09 15:16:01

df['CQI'] = (df['Date'] != df['Date'].shift()).cumsum()

In [120]: (df['Date'] != df['Date'].shift()).cumsum()
Out[120]:
0     1
1     1
2     1
3     2
4     2
5     2
6     3
7     3
8     3
9     4
10    4
11    4
Name: Date, dtype: int64
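
To make the one-liner above easier to follow, here is a minimal sketch that unpacks it into steps on an in-memory copy of the question's data, and then compares it with an apply-based counter like the one the question asks about. The names `previous`, `changed`, `make_counter` and `cqi_apply` are hypothetical and used only for illustration. The apply variant assumes that `Series.apply` calls the function once per element, in row order, so treat it as a sketch for timing comparisons rather than a recommended pattern.

```python
import pandas as pd

# Sample data mirroring the Date and UE columns from the question.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["9/20/2020 8:50"] * 3 + ["9/20/2020 8:57"] * 3
        + ["9/20/2020 9:12"] * 3 + ["9/20/2020 9:20"] * 3,
        format="%m/%d/%Y %H:%M",
    ),
    "UE": [1, 2, 3, 1, 8, 2, 1, 5, 3, 1, 4, 3],
})

# Step 1: shift() moves every Date down one row; the first row becomes NaT.
previous = df["Date"].shift()

# Step 2: True wherever the Date differs from the row above.
# The first row is compared against NaT, so it is always True.
changed = df["Date"] != previous

# Step 3: cumsum() counts the True values seen so far, giving a 1-based index.
df["CQI"] = changed.cumsum()


def make_counter():
    """Hypothetical helper: build a stateful function for Series.apply."""
    state = {"previous": None, "index": 0}

    def count(date):
        # Bump the index on the first element and whenever the date changes.
        if state["previous"] is None or date != state["previous"]:
            state["index"] += 1
            state["previous"] = date
        return state["index"]

    return count


# apply-based equivalent, useful mainly for comparing speed with the above.
cqi_apply = df["Date"].apply(make_counter())
```

Both approaches give the same index here. The shift/cumsum version works on whole columns at once, while the apply version still runs a Python function for every row.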
