Getting the name of the day with pandas

Posted 2024-05-13 17:36:03


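If I use pd.DatetimeIndex(dfTrain['datetime']).weekday I get the number of the day, but I can't find any function that returns the name of the day. So I need to convert 0 to Monday, 1 to Tuesday, and so on.

Here is a sample of my DataFrame: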

                  datetime  season  holiday  workingday  weather   temp   atemp  humidity  windspeed  count
    0  2011-01-01 00:00:00       1        0           0        1   9.84  14.395        81     0.0000     16
    1  2011-01-01 01:00:00       1        0           0        1   9.02  13.635        80     0.0000     40
    2  2011-01-01 02:00:00       1        0           0        1   9.02  13.635        80     0.0000     32
    3  2011-01-01 03:00:00       1        0           0        1   9.84  14.395        75     0.0000     13
    4  2011-01-01 04:00:00       1        0           0        1   9.84  14.395        75     0.0000      1
    5  2011-01-01 05:00:00       1        0           0        2   9.84  12.880        75     6.0032      1
    6  2011-01-01 06:00:00       1        0           0        1   9.02  13.635        80     0.0000      2
    7  2011-01-01 07:00:00       1        0           0        1   8.20  12.880        86     0.0000      3
    8  2011-01-01 08:00:00       1        0           0        1   9.84  14.395        75     0.0000      8
    9  2011-01-01 09:00:00       1        0           0        1  13.12  17.425        76     0.0000     14

One more question: what is the difference between pandas.DatetimeIndex.dayofweek and pandas.DatetimeIndex.weekday?


3 answers

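In version 0.18.1 you can use the new accessor dt.weekday_name (later deprecated in 0.23.0 and removed in pandas 1.0, where dt.day_name() takes its place):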

# dt.weekday_name exists in pandas 0.18.1 through 0.25.x; it was removed in pandas 1.0
df['weekday'] = df['datetime'].dt.weekday_name
print(df)
             datetime  season  holiday  workingday  weather   temp   atemp  \
0 2011-01-01 00:00:00       1        0           0        1   9.84  14.395   
1 2011-01-01 01:00:00       1        0           0        1   9.02  13.635   
2 2011-01-01 02:00:00       1        0           0        1   9.02  13.635   
3 2011-01-01 03:00:00       1        0           0        1   9.84  14.395   
4 2011-01-01 04:00:00       1        0           0        1   9.84  14.395   
5 2011-01-01 05:00:00       1        0           0        2   9.84  12.880   
6 2011-01-01 06:00:00       1        0           0        1   9.02  13.635   
7 2011-01-01 07:00:00       1        0           0        1   8.20  12.880   
8 2011-01-01 08:00:00       1        0           0        1   9.84  14.395   
9 2011-01-01 09:00:00       1        0           0        1  13.12  17.425   

   humidity  windspeed  count   weekday  
0        81     0.0000     16  Saturday  
1        80     0.0000     40  Saturday  
2        80     0.0000     32  Saturday  
3        75     0.0000     13  Saturday  
4        75     0.0000      1  Saturday  
5        75     6.0032      1  Saturday  
6        80     0.0000      2  Saturday  
7        86     0.0000      3  Saturday  
8        75     0.0000      8  Saturday  
9        76     0.0000     14  Saturday  

One approach, provided the datetime column already has a datetime dtype, is to apply datetime.strftime to get the weekday name as a string:

In [105]:

import datetime as dt  # the lambda below calls dt.datetime.strftime
df['weekday'] = df[['datetime']].apply(lambda x: dt.datetime.strftime(x['datetime'], '%A'), axis=1)
df
Out[105]:
             datetime  season  holiday  workingday  weather   temp   atemp  \
0 2011-01-01 00:00:00       1        0           0        1   9.84  14.395   
1 2011-01-01 01:00:00       1        0           0        1   9.02  13.635   
2 2011-01-01 02:00:00       1        0           0        1   9.02  13.635   
3 2011-01-01 03:00:00       1        0           0        1   9.84  14.395   
4 2011-01-01 04:00:00       1        0           0        1   9.84  14.395   
5 2011-01-01 05:00:00       1        0           0        2   9.84  12.880   
6 2011-01-01 06:00:00       1        0           0        1   9.02  13.635   
7 2011-01-01 07:00:00       1        0           0        1   8.20  12.880   
8 2011-01-01 08:00:00       1        0           0        1   9.84  14.395   
9 2011-01-01 09:00:00       1        0           0        1  13.12  17.425   

   humidity  windspeed  count   weekday  
0        81     0.0000     16  Saturday  
1        80     0.0000     40  Saturday  
2        80     0.0000     32  Saturday  
3        75     0.0000     13  Saturday  
4        75     0.0000      1  Saturday  
5        75     6.0032      1  Saturday  
6        80     0.0000      2  Saturday  
7        86     0.0000      3  Saturday  
8        75     0.0000      8  Saturday  
9        76     0.0000     14  Saturday  
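
As a hedged alternative to the row-wise apply above, the same '%A' format can be applied to the whole column through the .dt accessor. This is only a sketch: the three-row frame is a made-up stand-in for the question's data, and the weekday names come out in English only if the process locale is English (or the default C locale).

```python
import pandas as pd

# Hypothetical sample frame standing in for the question's data.
df = pd.DataFrame({
    "datetime": [
        "2011-01-01 00:00:00",
        "2011-01-01 01:00:00",
        "2011-01-03 09:00:00",
    ]
})

# Ensure the column has a datetime64 dtype before using the .dt accessor.
df["datetime"] = pd.to_datetime(df["datetime"])

# Series.dt.strftime formats every value in the column in one call;
# '%A' is the full weekday name in the current locale.
df["weekday"] = df["datetime"].dt.strftime("%A")
print(df)
```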

As for your other question, there is no difference between dayofweek and weekday.
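
A quick way to check this yourself is to compare the two attributes on the same dates. The sketch below uses an arbitrary one-week range starting on 2011-01-01 (a Saturday); the variable names are illustrative only.

```python
import pandas as pd

# One week of daily timestamps, starting on Saturday 2011-01-01.
idx = pd.date_range("2011-01-01", periods=7, freq="D")
s = pd.Series(idx)

# Compare the two attributes on a DatetimeIndex and on a Series via .dt.
same_on_index = bool((idx.dayofweek == idx.weekday).all())
same_on_series = s.dt.dayofweek.equals(s.dt.weekday)

# Both use Monday=0 ... Sunday=6.
day_numbers = idx.dayofweek.tolist()
print(same_on_index, same_on_series, day_numbers)
```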

Defining a mapping from weekday numbers to their string equivalents and calling map on the weekday numbers is faster:

dayOfWeek={0:'Monday', 1:'Tuesday', 2:'Wednesday', 3:'Thursday', 4:'Friday', 5:'Saturday', 6:'Sunday'}
df['weekday'] = df['datetime'].dt.dayofweek.map(dayOfWeek)
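
If you would rather not type the dictionary by hand, one possible variation builds it from the standard library's calendar.day_name, which is indexed Monday=0 through Sunday=6 just like dt.dayofweek. This is a sketch on made-up sample rows; note that calendar.day_name follows the current locale, so the names are English only under an English or default C locale.

```python
import calendar

import pandas as pd

# Hypothetical sample rows; the real frame comes from the question.
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2011-01-01 00:00:00",
        "2011-01-02 12:00:00",
        "2011-01-03 09:00:00",
    ])
})

# calendar.day_name[0] is Monday, matching pandas' Monday=0 convention.
day_of_week = dict(enumerate(calendar.day_name))

# Map each weekday number to its name.
df["weekday"] = df["datetime"].dt.dayofweek.map(day_of_week)
print(df)
```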

For versions before 0.15.0, the following should work:

import datetime as dt
df['weekday'] = df['datetime'].apply(lambda x: dt.datetime.strftime(x, '%A'))

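Version 0.18.1 and later

There is now a new convenience accessor, dt.weekday_name, that does the above.

Version 0.23.0 and later

weekday_name is now deprecated in favor of dt.day_name().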

dt.weekday_name has been deprecated since pandas 0.23.0; use dt.day_name() instead:

df.datetime.dt.day_name()

0    Saturday
1    Saturday
2    Saturday
3    Saturday
4    Saturday
5    Saturday
6    Saturday
7    Saturday
8    Saturday
9    Saturday
Name: datetime, dtype: object
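
If the same code has to run on both old and new pandas releases, one option is a small wrapper that prefers day_name() and falls back to weekday_name. The helper name weekday_names and the sample data below are hypothetical, not part of pandas or of the answers above; treat this as a sketch rather than a recommended pattern.

```python
import pandas as pd


def weekday_names(series):
    """Hypothetical helper: weekday names for a datetime64 Series."""
    accessor = series.dt
    if hasattr(accessor, "day_name"):
        # pandas 0.23.0 and later
        return accessor.day_name()
    # pandas 0.18.1 up to 0.22.x, where only weekday_name exists
    return accessor.weekday_name


# Made-up sample column covering a Saturday, a Sunday and a Monday.
datetimes = pd.Series(pd.to_datetime(["2011-01-01", "2011-01-02", "2011-01-03"]))
names = weekday_names(datetimes)
print(names)
```

day_name() also accepts an optional locale argument for names in another language, provided that locale is installed on the system.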
