How can I display a daily histogram of Pandas data with Pyecharts or another library?

Posted 2024-05-19 00:22:36


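Following on from this question, I want to get each item's share of the total amount per id for July. I am using the same dataset as in that question: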

    id       date  num     name  type price
0    1   7/6/2020   10      pen  abcd    $1
1    1   7/6/2020    2      abc   efg    $3
2    1   7/6/2020    3      bcd   efg    $5
3    2   7/6/2020    3      pen  abcd    $1
4    2   7/6/2020    1   pencil  abcd    $3
5    2   7/6/2020    2     disk  abcd    $1
6    2   7/6/2020    2    paper  abcd    $1
7    3   7/6/2020    2       ff   pag  $100
8    3   7/6/2020   10    water   kml    $5
9    4  7/15/2020    5       gg   kml    $5
10   4  7/15/2020   10  cofffee    oo    $5
11   5  7/15/2020    5       pp    oo    $4
12   6  7/15/2020    2      abc   efg    $3
13   6  7/15/2020    3      bcd   efg    $5
14   6  7/15/2020    4       aa   efg    $5
15   6  7/15/2020    5       bb   efg    $6
16   7  7/15/2020    1      bag  abcd   $50
17   7  7/15/2020    1      box  abcd   $20
18   8  7/15/2020    1   pencil  abcd    $3
19   8  7/15/2020    2     disk  abcd    $1
20   8  7/15/2020    2    paper  abcd    $1
21   8  7/15/2020    2       ff  hijk  $100
22   9  8/15/2020   10    water   kml    $5
23   9  8/15/2020    5       gg   kml    $5
24   9  8/15/2020   10  cofffee    oo    $5
25   9  8/15/2020    5       pp    oo    $4
26   9  8/15/2020    2      abc   efg    $3
27  10  8/15/2020    3      bcd   efg    $5
28  10  8/15/2020    4       aa   efg    $5
29  10  8/15/2020    5       bb   efg    $6
30  11  8/15/2020    1      bag  abcd   $50
31  11  8/15/2020    1      box  abcd   $20

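I want to show a daily histogram of the total amount with Pyecharts or another plotting library, similar to this screenshot. The code below is not correct: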

import pandas as pd
import xlrd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_excel ('./orders.xlsx', sheet_name='Sheet1')
df.groupby(by=['type']).sum()


df['price'] = df['price'].replace('$','', regex=True).astype(int)
df['new'] = df['price'].mul(df['num'])

df1 = df.groupby(by=['name'], as_index=False)['new'].sum()

# df1
# df1['new'] = df1.apply(lambda x: x.sum(), axis=1)
# df1.loc['new'] = df1.apply(lambda x: x.sum()).dropna()

Thanks a lot for any suggestions.


1 answer
User
#1 · Posted 2024-05-19 00:22:36

First, I recommend using the datetime type to handle dates and times:

df['date'] = pd.to_datetime(df['date'])
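
To see why the conversion matters, here is a small sketch (the variable names are illustrative, not from the question). As plain strings the dates sort character by character, so 7/15/2020 lands before 7/6/2020; once parsed, they sort chronologically. The explicit format string is an assumption based on the month/day/year layout of the sample data.

```python
import pandas as pd

# Three date strings taken from the question's table.
dates = pd.Series(["7/6/2020", "7/15/2020", "8/15/2020"])

# Sorting the raw strings compares characters, not calendar dates.
as_text = sorted(dates)

# Assumed format: month/day/year, as in the sample data.
parsed = pd.to_datetime(dates, format="%m/%d/%Y")
as_dates = parsed.sort_values().dt.strftime("%Y-%m-%d").tolist()

print(as_text)   # ['7/15/2020', '7/6/2020', '8/15/2020']
print(as_dates)  # ['2020-07-06', '2020-07-15', '2020-08-15']
```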

Now, to answer your question: if you only need the July data, you can extract it like this:

July_df = df[df['date'].dt.to_period('M')=='2020-07'].copy()

You can then go on to plot July_df.
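
For the daily totals the question asks about, one possible approach is to sum the amount per day and draw a bar chart. This is only a sketch under a few assumptions: it uses a handful of rows copied from the question's table instead of orders.xlsx, the names july_df, daily_total and the output file name are made up for illustration, and the Agg backend is selected so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no GUI available)
import matplotlib.pyplot as plt
import pandas as pd

# A few rows copied from the question's table (ids 3, 5, 7 and 11).
df = pd.DataFrame({
    "id": [3, 3, 5, 7, 7, 11, 11],
    "date": ["7/6/2020", "7/6/2020", "7/15/2020", "7/15/2020",
             "7/15/2020", "8/15/2020", "8/15/2020"],
    "num": [2, 10, 5, 1, 1, 1, 1],
    "name": ["ff", "water", "pp", "bag", "box", "bag", "box"],
    "price": ["$100", "$5", "$4", "$50", "$20", "$50", "$20"],
})
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")
# regex=False: treat '$' as a literal character, not a regex anchor.
df["total"] = df["price"].str.replace("$", "", regex=False).astype(float) * df["num"]

# Keep July only, then sum the amount for each day.
july_df = df[df["date"].dt.to_period("M") == "2020-07"]
daily_total = july_df.groupby(july_df["date"].dt.strftime("%Y-%m-%d"))["total"].sum()

# One bar per day.
fig, ax = plt.subplots(figsize=(6, 4))
daily_total.plot.bar(ax=ax, rot=0)
ax.set_xlabel("date")
ax.set_ylabel("total amount ($)")
ax.set_title("Daily total, July 2020")
fig.savefig("daily_total_july.png")  # hypothetical output file name
```

With the full dataset loaded from the spreadsheet, only the DataFrame construction would change; the grouping and plotting steps stay the same.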

If you want a plot for each month, you can use groupby:

# regex=False so that '$' is treated as a literal character, not a regex anchor
df['total'] = df['price'].str.replace('$', '', regex=False).astype(float) * df['num']

(df.groupby([pd.Grouper(key='date',freq='M'),'name'])['total'].sum()
   .reset_index(level='date')
   .groupby('date')
   .plot.pie(subplots=True, autopct='%.2f%%')
)

This gives you two pie charts, one for each month.
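
If you need the percentages themselves rather than only the chart labels, they can be computed directly. The sketch below is an illustration, not part of the original approach: it uses a few rows copied from the question's table, and the names month, per_item and share are made up here.

```python
import pandas as pd

# A few rows copied from the question's table (ids 3, 5, 7 and 11).
df = pd.DataFrame({
    "date": ["7/6/2020", "7/6/2020", "7/15/2020", "7/15/2020",
             "7/15/2020", "8/15/2020", "8/15/2020"],
    "num": [2, 10, 5, 1, 1, 1, 1],
    "name": ["ff", "water", "pp", "bag", "box", "bag", "box"],
    "price": ["$100", "$5", "$4", "$50", "$20", "$50", "$20"],
})
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")
df["total"] = df["price"].str.replace("$", "", regex=False).astype(float) * df["num"]

# Label every row with its month, e.g. '2020-07'.
month = df["date"].dt.strftime("%Y-%m").rename("month")

# Total amount per (month, item).
per_item = df.groupby([month, "name"])["total"].sum()

# Divide each item's total by the total of its month to get a percentage.
share = per_item / per_item.groupby(level="month").transform("sum") * 100

print(share.round(2))
```

Within each month the values of share add up to 100, which is the same quantity that autopct prints on the pie slices.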

If you iterate over the groupby, you can also add more formatting:

# notice the difference in first groupby
groups = (df.groupby([df.date.dt.strftime('%b-%Y'),'name'])['total'].sum()
   .reset_index(level='date')
   .groupby('date')
)
fig, axes = plt.subplots(1,2, figsize=(10,5))
for ax, (month, data) in zip(axes, groups):
    data['total'].plot.pie(autopct='%.2f%%', ax=ax)
    ax.set_title(f'data in {month}')

Output: one figure with two pie charts side by side, each titled with its month.
