Getting the maximum of non-missing dates

Posted 2024-04-19 06:39:33

Tags: python
1 answer
User
#1 · Posted 2024-04-19 06:39:33

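You first need to convert both columns with pd.to_datetime, because they hold mixed values: dates and NaN. The NaN values are then converted to NaT, and max() skips them: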

df.A = pd.to_datetime(df.A)
df.B = pd.to_datetime(df.B)
print (df)
           A          B
0        NaT 2016-01-01
1 2016-01-02        NaT
2        NaT 2016-01-03

print (df.max())

A   2016-01-02
B   2016-01-03
dtype: datetime64[ns]

print (df.max(axis=1))
0   2016-01-01
1   2016-01-02
2   2016-01-03
dtype: datetime64[ns]
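
The steps above can be reproduced end to end with a small self-contained sketch. The sample frame here is an assumption, modeled on the printed output (object columns mixing datetime.date values and NaN); the names converted, col_max and row_max are only illustrative.

```python
import datetime

import numpy as np
import pandas as pd

# Assumed sample data: object columns that mix datetime.date and NaN.
df = pd.DataFrame({
    "A": [np.nan, datetime.date(2016, 1, 2), np.nan],
    "B": [datetime.date(2016, 1, 1), np.nan, datetime.date(2016, 1, 3)],
})

# Convert column by column on a copy; NaN becomes NaT and the
# columns get a datetime64 dtype.
converted = df.copy()
converted["A"] = pd.to_datetime(converted["A"])
converted["B"] = pd.to_datetime(converted["B"])

col_max = converted.max()        # latest date in each column, NaT skipped
row_max = converted.max(axis=1)  # latest date in each row, NaT skipped

print(col_max)
print(row_max)
```

Working on a copy keeps the original object columns available for comparison; assigning back to df, as the answer does, works the same way.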

A more dynamic solution with pd.concat and a list comprehension:

df[['A','B','C']] = pd.concat([pd.to_datetime(df[col]) for col in df[['A','B','C']]], axis=1)
print (df)
           A          B
0        NaT 2016-01-01
1 2016-01-02        NaT
2        NaT 2016-01-03

Or use apply:

df[['A','B','C']] = df[['A','B','C']].apply(pd.to_datetime)
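
As a quick check that the two dynamic variants agree, the following sketch runs both on a small three-column frame and compares the results. The sample data and the names raw, cols, via_concat and via_apply are assumptions made for illustration.

```python
import datetime

import numpy as np
import pandas as pd

cols = ["A", "B", "C"]

# Assumed sample data with three mixed date/NaN columns.
raw = pd.DataFrame({
    "A": [np.nan, datetime.date(2016, 1, 2), np.nan],
    "B": [datetime.date(2016, 1, 1), np.nan, datetime.date(2016, 1, 3)],
    "C": [datetime.date(2016, 1, 1), np.nan, datetime.date(2016, 1, 3)],
})

# Variant 1: convert each column, then glue the Series back together.
via_concat = pd.concat([pd.to_datetime(raw[col]) for col in cols], axis=1)

# Variant 2: let apply call pd.to_datetime once per column.
via_apply = raw[cols].apply(pd.to_datetime)

# Both should hold the same datetime values, with NaT in the same places.
print(via_concat.equals(via_apply))
print(via_apply.max(axis=1))
```

Neither variant modifies raw; assigning the result back to the selected columns, as in the answer, is what updates the frame.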

Timings:

In [28]: %timeit (c(df2))
100 loops, best of 3: 4.55 ms per loop

In [29]: %timeit (b(df1))
100 loops, best of 3: 12.8 ms per loop

In [30]: %timeit (a(df))
100 loops, best of 3: 12.8 ms per loop

Code used for the timings:

import datetime

import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [np.nan,
                         datetime.date(2016, 1, 2),
                         np.nan],
                   "B": [datetime.date(2016, 1, 1),
                         np.nan,
                         datetime.date(2016, 1, 3)],
                   "C": [datetime.date(2016, 1, 1),
                         np.nan,
                         datetime.date(2016, 1, 3)]
                   })

# repeat the 3 sample rows 100000 times
df = pd.concat([df]*100000).reset_index(drop=True)
print (df)
#[300000 rows x 3 columns]
df1 = df.copy()
df2 = df.copy()

def a(df):
    df[['A','B','C']] = pd.concat([pd.to_datetime(df[col]) for col in df[['A','B','C']]], axis=1)
    return df

def b(df):
    df[['A','B','C']] = df[['A','B','C']].apply(pd.to_datetime)
    return df

def c(df):
    df.A = pd.to_datetime(df.A)
    df.B = pd.to_datetime(df.B)
    df.C = pd.to_datetime(df.C)
    return df
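
The functions a, b and c assign into the frame they receive, so after the first %timeit loop the columns are already datetime64 and later loops may be measuring a much cheaper conversion. A hedged sketch of a comparison that avoids this uses the standard timeit module with non-mutating variants. The function names, the smaller row count and the repeat settings are assumptions, and via_columns pays for an extra copy, so the numbers are not directly comparable to the ones above.

```python
import datetime
import timeit

import numpy as np
import pandas as pd

cols = ["A", "B", "C"]

base = pd.DataFrame({
    "A": [np.nan, datetime.date(2016, 1, 2), np.nan],
    "B": [datetime.date(2016, 1, 1), np.nan, datetime.date(2016, 1, 3)],
    "C": [datetime.date(2016, 1, 1), np.nan, datetime.date(2016, 1, 3)],
})
# 3000 rows instead of 300000 so the sketch finishes quickly.
big = pd.concat([base] * 1000).reset_index(drop=True)

def via_concat(df):
    # Returns a new frame; the input keeps its object columns.
    return pd.concat([pd.to_datetime(df[col]) for col in cols], axis=1)

def via_apply(df):
    return df[cols].apply(pd.to_datetime)

def via_columns(df):
    out = df.copy()  # copy so the input is never converted in place
    for col in cols:
        out[col] = pd.to_datetime(out[col])
    return out

timings = {}
for func in (via_concat, via_apply, via_columns):
    # Best of 3 repeats, 5 calls each, always on unconverted input.
    runs = timeit.repeat(lambda: func(big), number=5, repeat=3)
    timings[func.__name__] = min(runs) / 5

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1000:.2f} ms per call")
```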
