How to draw seaborn boxplots for each month and year

Posted 2024-04-19 13:41:19

I have a time series dataframe of precipitation values:

print(rain_df)
          date  precip
0   2017-01-10     0.0
1   2017-01-17     1.0
2   2017-01-24     1.0
3   2017-01-31     4.0
4   2017-02-07     1.0
..         ...     ...
218 2021-04-27     1.7
219 2021-05-03    22.7
220 2021-05-10     0.0
221 2021-05-17     2.0
222 2021-05-25     0.2

# one-hot encode the month and the year of each date
rain_df = rain_df.join(rain_df['date'].dt.month.astype(str).str.get_dummies())
rain_df = rain_df.join(rain_df['date'].dt.year.astype(str).str.get_dummies())
rain_df = rain_df[rain_df['precip']>0]
rain_df.reset_index(inplace=True,drop=True)

print(rain_df)
          date  precip  1  10  11  12  2  3  4  5  6  7  8  9  2017  2018  \
0   2017-01-17     1.0  1   0   0   0  0  0  0  0  0  0  0  0     1     0   
1   2017-01-24     1.0  1   0   0   0  0  0  0  0  0  0  0  0     1     0   
2   2017-01-31     4.0  1   0   0   0  0  0  0  0  0  0  0  0     1     0   
3   2017-02-07     1.0  0   0   0   0  1  0  0  0  0  0  0  0     1     0   
4   2017-02-14    22.9  0   0   0   0  1  0  0  0  0  0  0  0     1     0   
..         ...     ... ..  ..  ..  .. .. .. .. .. .. .. .. ..   ...   ...   
175 2021-03-31    18.3  0   0   0   0  0  1  0  0  0  0  0  0     0     0   
176 2021-04-27     1.7  0   0   0   0  0  0  1  0  0  0  0  0     0     0   
177 2021-05-03    22.7  0   0   0   0  0  0  0  1  0  0  0  0     0     0   
178 2021-05-17     2.0  0   0   0   0  0  0  0  1  0  0  0  0     0     0   
179 2021-05-25     0.2  0   0   0   0  0  0  0  1  0  0  0  0     0     0   

     2019  2020  2021  
0       0     0     0  
1       0     0     0  
2       0     0     0  
3       0     0     0  
4       0     0     0  
..    ...   ...   ...  
175     0     0     1  
176     0     0     1  
177     0     0     1  
178     0     0     1  
179     0     0     1 

How can I create a boxplot with month-year on the x-axis and the precip values on the y-axis?

This is my attempt:

# reverse one-hot encoding
rain_df['month-year'] = (rain_df.iloc[:, 2:] == 1).idxmax(1)

rain_df = rain_df.melt(id_vars='month-year',value_vars='precip', value_name='precip')

print(rain_df)
    month-year variable  precip
0            1   precip     1.0
1            1   precip     1.0
2            1   precip     4.0
3            2   precip     1.0
4            2   precip    22.9
..         ...      ...     ...
175          3   precip    18.3
176          4   precip     1.7
177          5   precip    22.7
178          5   precip     2.0
179          5   precip     0.2
ax=sn.boxplot(x='month-year', y='precip', hue='variable', data=rain_df, palette="Set3", linewidth=1)
ax.set_title('Joliette')
ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))

The problem is that it only plots the month on the x-axis, with no information about the year. Did I mess up my melt call?
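A minimal sketch of why only the month appears, assuming the dummy columns are laid out as in the printout above (month columns first, then year columns). idxmax(axis=1) returns the label of the first column holding the row maximum, so the year columns are never reached. The tiny `dummies` frame and the names `first_hit`, `month`, `year` and `label` are made up for illustration.

```python
import pandas as pd

# Hypothetical two-row excerpt: month dummies first, then year dummies
dummies = pd.DataFrame({"1": [1, 0], "2": [0, 1],
                        "2017": [1, 0], "2021": [0, 1]})

# idxmax(axis=1) stops at the first column equal to 1, which is a month
first_hit = (dummies == 1).idxmax(axis=1)

# Decoding the month block and the year block separately keeps both parts
month = (dummies[["1", "2"]] == 1).idxmax(axis=1)
year = (dummies[["2017", "2021"]] == 1).idxmax(axis=1)
label = month + "-" + year

print(first_hit.tolist())
print(label.tolist())
```

If the original 'date' column is still available, deriving the label from it directly is likely simpler than reversing the one-hot encoding.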


3 answers
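  • I think the simplest solution is to call sns.boxplot and pass the appropriate .dt components to x and hue.
  • The code below was tested with 'date' in df as a datetime dtype.
    • Convert your actual 'date' column to a datetime dtype with df.date = pd.to_datetime(df.date).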
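As a rough illustration of the conversion step, here is a small sketch that assumes the dates arrive as 'YYYY-MM-DD' strings. The two-row frame and the names `month_values` and `year_values` are invented for the example.

```python
import pandas as pd

# Hypothetical frame whose dates are still plain strings
df = pd.DataFrame({"date": ["2017-01-10", "2021-05-25"],
                   "precip": [0.0, 0.2]})

# Convert to a datetime dtype so the .dt accessor becomes available
df["date"] = pd.to_datetime(df["date"])

month_values = df["date"].dt.month  # integer month, 1 to 12
year_values = df["date"].dt.year    # four-digit year

print(df.dtypes)
print(month_values.tolist(), year_values.tolist())
```

These two Series are the kind of values that can be passed to x and hue.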

Imports and test dataframe

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt  # needed for plt.subplots and plt.show
from calendar import month_abbr as months  # optional
import numpy as np  # for test data

# test dataframe
np.random.seed(365)
rows = 250

dates = pd.bdate_range('2017-01-01', '2021-07-21', freq='D')
data = {'date': np.random.choice(dates, size=(rows)),
        'precip': np.random.randint(0, 31, size=(rows))}

df = pd.DataFrame(data)

# display(df.head())
        date  precip
0 2017-01-10     0.0
1 2017-01-17     1.0
2 2017-01-24     1.0
3 2017-01-31     4.0
4 2017-02-07     1.0

With month on the x-axis

Plot

# get month names; optional step for renaming the xticklabels
months = list(months)[1:]

# now just plot the dataframe with seaborn
fig, ax = plt.subplots(figsize=(15, 7))

sns.boxplot(x=df.date.dt.month, y=df.precip, hue=df.date.dt.year, ax=ax)
ax.legend(title='Year', bbox_to_anchor=(1, 1), loc='upper left')
ax.set(xlabel='Month', xticklabels=months)  # setting the xticklabels is optional
plt.show()

With year on the x-axis

fig, ax = plt.subplots(figsize=(20, 7))

sns.boxplot(x=df.date.dt.year, y=df.precip, hue=df.date.dt.month, ax=ax)
ax.legend(title='Month', bbox_to_anchor=(1, 1), loc='upper left')
ax.set(xlabel='Year')
plt.show()

Use dt.strftime to create the month-year labels. For example:

>>> pd.to_datetime(pd.Series(['1918-11-11'])).dt.strftime('%b-%Y')
0    Nov-1918
dtype: object

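Do this on the rain_df['date'] column and assign the result to month-year. If that does not work, your data is probably not in datetime64 format; fix that by calling pd.to_datetime on the column before calling .dt.strftime. Then plot again using the new month-year column.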
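The steps above could be sketched roughly as follows. The four-row frame is a made-up stand-in for rain_df, and the explicit `order` list is an extra precaution that the answer does not mention: the labels are strings, so sorting them alphabetically would not give chronological order.

```python
import pandas as pd

# Hypothetical stand-in for rain_df, with string dates and unsorted rows
rain_df = pd.DataFrame({
    "date": ["2017-02-07", "2017-01-17", "2017-01-24", "2018-01-09"],
    "precip": [1.0, 1.0, 4.0, 2.5],
})

# Make sure the column is datetime64 before using .dt.strftime
rain_df["date"] = pd.to_datetime(rain_df["date"])
rain_df = rain_df.sort_values("date")
rain_df["month-year"] = rain_df["date"].dt.strftime("%b-%Y")

# Labels in chronological order: first appearance in the sorted frame
order = rain_df["month-year"].drop_duplicates().tolist()
print(order)
```

The resulting list can then be passed as order=order to the boxplot call, together with x='month-year' and y='precip', so that the boxes follow the calendar.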

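Try this, although I have not been able to test it myself. I am a little unsure about the column type of date. melt should not be necessary.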

rain_df['month_year'] = rain_df['date'].apply(lambda x: x.strftime('%b %Y')) # e.g. Jul 2021

rain_df = rain_df[rain_df['precip'] > 0][['month_year', 'precip']] # df now consists of these two columns

ax = sn.boxplot(x='month_year', y='precip', data=rain_df, palette="Set3", linewidth=1)

ax.set_title('Joliette')
ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))
