How to plot multiple line charts from a Pandas DataFrame
I'm trying to make a set of line charts from a DataFrame that looks like this:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({
    'CITY': np.random.choice(['PHOENIX','ATLANTA','CHICAGO', 'MIAMI', 'DENVER'], 10000),
    'DAY': np.random.choice(['Monday','Tuesday','Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'], 10000),
    'TIME_BIN': np.random.randint(1, 86400, size=10000),
    'COUNT': np.random.randint(1, 700, size=10000)})
df['TIME_BIN'] = pd.to_datetime(df['TIME_BIN'], unit='s').dt.round('10min').dt.strftime('%H:%M:%S')
print(df)
CITY COUNT DAY TIME_BIN
0 ATLANTA 270 Wednesday 10:50:00
1 CHICAGO 375 Wednesday 12:20:00
2 MIAMI 490 Thursday 11:30:00
3 MIAMI 571 Sunday 23:30:00
4 DENVER 379 Saturday 07:30:00
... ... ... ... ...
9995 ATLANTA 107 Saturday 21:10:00
9996 DENVER 127 Tuesday 15:00:00
9997 DENVER 330 Friday 06:20:00
9998 PHOENIX 379 Saturday 19:50:00
9999 CHICAGO 628 Saturday 01:30:00
This is what I have so far:
piv = df.pivot(columns="DAY").plot(x='TIME_BIN', kind="Line", subplots=True)
plt.show()
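However, the x-axis formatting is messy, and I need each city to have its own line. How can I fix this? I suspect I need to loop over the days of the week instead of trying to build the whole array of plots in one line. I tried seaborn, without success. In summary, what I want is:
- TIME_BIN on the x-axis
- COUNT on the y-axis
- a differently colored line for each city
- one chart per day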
1 Answer
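I don't see why a pivot table is needed here, since in the end the data has to be split twice: once by day of the week, with each day going into its own subplot, and once by city, with each city drawn as its own colored line. At that point, pandas' built-in plotting is somewhat limited.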
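To make the two-level split concrete, here is a minimal sketch that only separates the data and does not plot anything. It assumes a smaller frame than the one in the question (300 rows, three cities, two days, a seeded generator), and the `lines` dictionary is a hypothetical name used purely for illustration.

```python
import numpy as np
import pandas as pd

# Reduced stand-in for the question's data (sizes and seed are assumptions).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "CITY": rng.choice(["PHOENIX", "ATLANTA", "CHICAGO"], 300),
    "DAY": rng.choice(["Monday", "Tuesday"], 300),
    "TIME_BIN": rng.integers(1, 86400, size=300),
    "COUNT": rng.integers(1, 700, size=300),
})
df["TIME_BIN"] = pd.to_datetime(df["TIME_BIN"], unit="s").dt.round("10min")

# First split: one group per day (this would become one subplot).
# Second split: one group per city within that day (one colored line).
lines = {}
for day, day_df in df.groupby("DAY"):
    for city, city_df in day_df.groupby("CITY"):
        # Sort by time so a line plot would run left to right.
        lines[(day, city)] = city_df.sort_values("TIME_BIN")

for (day, city), sub in lines.items():
    print(day, city, len(sub))
```

Each entry in `lines` holds the rows for exactly one day and one city, which is the unit that ends up as a single line in a single subplot.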
Matplotlib
With matplotlib, you can loop over the two categories, day and city, and plot the data directly.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates
df = pd.DataFrame({
    'CITY': np.random.choice(['PHOENIX','ATLANTA','CHICAGO', 'MIAMI', 'DENVER'], 10000),
    'DAY': np.random.choice(['Monday','Tuesday','Wednesday', 'Thursday',
                             'Friday', 'Saturday', 'Sunday'], 10000),
    'TIME_BIN': np.random.randint(1, 86400, size=10000),
    'COUNT': np.random.randint(1, 700, size=10000)})
df['TIME_BIN'] = pd.to_datetime(df['TIME_BIN'], unit='s').dt.round('10min')
days = ['Monday','Tuesday','Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
cities = np.unique(df["CITY"])
fig, axes = plt.subplots(nrows=len(days), figsize=(13,8), sharex=True)
# loop over days (one could use groupby here, but that would lead to days unsorted)
for i, day in enumerate(days):
    ddf = df[df["DAY"] == day].sort_values("TIME_BIN")
    # loop over cities
    for city in cities:
        dddf = ddf[ddf["CITY"] == city]
        axes[i].plot(dddf["TIME_BIN"], dddf["COUNT"], label=city)
    axes[i].margins(x=0)
    axes[i].set_title(day)
fmt = matplotlib.dates.DateFormatter("%H:%M")
axes[-1].xaxis.set_major_formatter(fmt)
axes[0].legend(bbox_to_anchor=(1.02,1))
fig.subplots_adjust(left=0.05,bottom=0.05, top=0.95,right=0.85, hspace=0.8)
plt.show()
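The comment in the loop above notes that groupby would leave the days unsorted. One possible workaround, not part of the original code, is to convert DAY to an ordered categorical so that groupby follows weekday order. The sketch below assumes a reduced, seeded frame and only compares the group order. The names `alphabetical` and `weekday_order` are illustrative.

```python
import numpy as np
import pandas as pd

days = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Reduced stand-in for the data above (size and seed are assumptions).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "DAY": rng.choice(days, 500),
    "COUNT": rng.integers(1, 700, size=500),
})

# With plain strings, groupby sorts the keys alphabetically.
alphabetical = [day for day, _ in df.groupby("DAY")]

# With an ordered categorical, groupby follows the category order instead.
df["DAY"] = pd.Categorical(df["DAY"], categories=days, ordered=True)
weekday_order = [day for day, _ in df.groupby("DAY", observed=True)]

print(alphabetical)
print(weekday_order)
```

With the categorical in place, `enumerate(df.groupby("DAY", observed=True))` could stand in for the outer loop, although the explicit `days` list is arguably easier to read.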
Seaborn
Roughly the same result can be achieved with seaborn's FacetGrid.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates
import seaborn as sns
df = pd.DataFrame({
    'CITY': np.random.choice(['PHOENIX','ATLANTA','CHICAGO', 'MIAMI', 'DENVER'], 10000),
    'DAY': np.random.choice(['Monday','Tuesday','Wednesday', 'Thursday',
                             'Friday', 'Saturday', 'Sunday'], 10000),
    'TIME_BIN': np.random.randint(1, 86400, size=10000),
    'COUNT': np.random.randint(1, 700, size=10000)})
df['TIME_BIN'] = pd.to_datetime(df['TIME_BIN'], unit='s').dt.round('10min')
days = ['Monday','Tuesday','Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
cities = np.unique(df["CITY"])
g = sns.FacetGrid(data=df.sort_values('TIME_BIN'),
                  row="DAY", row_order=days,
                  hue="CITY", hue_order=cities, sharex=True, aspect=5)
g.map(plt.plot, "TIME_BIN", "COUNT")
g.add_legend()
g.fig.subplots_adjust(left=0.05,bottom=0.05, top=0.95,hspace=0.8)
fmt = matplotlib.dates.DateFormatter("%H:%M")
g.axes[-1,-1].xaxis.set_major_formatter(fmt)
plt.show()