Group a DataFrame and iterate over subplots for each group

-4 votes
1 answer
61 views
Asked 2025-04-13 02:34

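I have a DataFrame. First I want to group it by the "Main" column, which yields two DataFrames (call them df_M1 and df_M2), and create a set of subplots for each of them. In a second step I group each of those (df_M1 and df_M2) by the "Sub" column, which yields four DataFrames each (df_S1, df_S2, df_S3 and df_S4). In a third step I iterate over two columns (col1 and col2) of each sub-DataFrame and plot them on one of the axes. In other words, step one produces two DataFrames and step two produces four more for each of them. So I want one figure for df_M1 and one for df_M2, each with four subplots in a 2x2 grid, one subplot per df_S1, df_S2, df_S3 and df_S4. Each subplot should contain line plots of col1 and col2, with the Date column on the x-axis and the col1 and col2 values on the y-axis.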

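I wrote the script below, but I got stuck defining the subplots for the line plots. Can anyone tell me how to define the subplot (axes) for each sub-DataFrame and plot col1 and col2 on it?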

Create a sample DataFrame:

import matplotlib.pyplot as plt
import pandas as pd

data = {'Date': [2000,2001,2000,2001,2000,2001,2000,2001,2000,2001,2000,2001,2000,2001,2000,2001],
        'Main': ['A','A','A','A','A','A','A','A','B','B','B','B','B','B','B','B'],
       'Sub' : ['A1','A1','A2','A2','A3','A3','A4','A4','B1','B1','B2','B2','B3','B3','B4','B4'],
       'col1' : [1,2,4,5,1,2,6,4,8,5,7,2,4,5,1,2],
       'col2' : [5,6,1,4,5,4,5,1,5,4,5,6,4,5,8,4]}

df = pd.DataFrame(data)
df_M = [x for _, x in df.groupby(['Main'])]
for i in df_M:
    fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(12,6),
                             sharex=True, linewidth=1, edgecolor='black')
    df_S = [z for _, z in i.groupby(['Sub'])]
    for j in df_S:
        for col in j.columns.values[3:5]:
            # This is where it fails: only `axes` (the 2x2 array) exists, `ax` is never defined
            ax.plot(j[col])
            plt.show()

1 Answer

1

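You need to address each individual Axes inside the axes object: axes[0, 0], for example, is the Axes in the first row and first column of your 2x2 grid of subplots.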
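To make the indexing concrete, here is a minimal sketch of what plt.subplots returns for a 2x2 grid. The "Agg" backend and the dummy data points are assumptions added only so the snippet runs without a display; they are not part of the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

fig, axes = plt.subplots(nrows=2, ncols=2)

# `axes` is a 2-D NumPy array of Axes objects with shape (2, 2).
print(axes.shape)

# axes[row, col] selects one panel; this draws on the top-left one.
axes[0, 0].plot([2000, 2001], [1, 2])  # dummy points for illustration

# axes.flat walks the panels row by row: [0, 0], [0, 1], [1, 0], [1, 1].
flat_axes = list(axes.flat)
for idx, ax in enumerate(flat_axes):
    row, col = divmod(idx, 2)  # running index -> (row, col) in a 2-column grid
    ax.set_title(f"axes[{row}, {col}]")

plt.close(fig)
```

So a running index over the four sub-groups can be turned into a grid position with divmod(idx, 2), or the index can be skipped entirely by looping over axes.flat.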

Edit: added the date as the x-axis.

import matplotlib.pyplot as plt
import pandas as pd

data = {
    "Date": [2000,2001,2000,2001,2000,2001,2000,2001,2000,2001,2000,2001,2000,2001,2000,2001,],
    "Main": ["A","A","A","A","A","A","A","A","B","B","B","B","B","B","B","B",],
    "Sub": ["A1","A1","A2","A2","A3","A3","A4","A4","B1","B1","B2","B2","B3","B3","B4","B4",],
    "col1": [1, 2, 4, 5, 1, 2, 6, 4, 8, 5, 7, 2, 4, 5, 1, 2],
    "col2": [5, 6, 1, 4, 5, 4, 5, 1, 5, 4, 5, 6, 4, 5, 8, 4],
}

df = pd.DataFrame(data)
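# Iterate over the groups directly; a separate list of sub-frames is not needed.
for groupName, groupdf in df.groupby("Main"):
    fig, axes = plt.subplots(
        nrows=2, ncols=2, figsize=(12, 6), sharex=True, linewidth=1, edgecolor="black"
    )
    fig.suptitle(f"Main = {groupName}")

    for idx, (subGroupName, subGroupdf) in enumerate(groupdf.groupby("Sub")):
        # Map the running index 0..3 onto the 2x2 grid: (0,0), (0,1), (1,0), (1,1).
        row, col = divmod(idx, 2)
        for plottingCol in ["col1", "col2"]:
            axes[row, col].plot(subGroupdf["Date"], subGroupdf[plottingCol], label=plottingCol)
        axes[row, col].set_title(subGroupName)
        axes[row, col].legend()

    # Show the figure once all four panels of this group are drawn.
    plt.show()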
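As a possible variation, the row/column arithmetic can be avoided by pairing axes.flat with the sub-groups through zip. The sketch below is one way to do it, not the only one: the helper name plot_main_groups, its value_cols parameter and the "Agg" backend are assumptions introduced for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd


def plot_main_groups(df, value_cols=("col1", "col2")):
    """Hypothetical helper: one 2x2 figure per 'Main' group, one panel per 'Sub' group."""
    figures = {}
    for main_name, main_df in df.groupby("Main"):
        fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(12, 6), sharex=True)
        fig.suptitle(f"Main = {main_name}")
        # zip stops at the shorter input, so at most four sub-groups are drawn.
        for ax, (sub_name, sub_df) in zip(axes.flat, main_df.groupby("Sub")):
            for col in value_cols:
                ax.plot(sub_df["Date"], sub_df[col], label=col)
            ax.set_title(sub_name)
            ax.legend()
        figures[main_name] = fig
    return figures


df = pd.DataFrame({
    "Date": [2000, 2001] * 8,
    "Main": ["A"] * 8 + ["B"] * 8,
    "Sub": ["A1", "A1", "A2", "A2", "A3", "A3", "A4", "A4",
            "B1", "B1", "B2", "B2", "B3", "B3", "B4", "B4"],
    "col1": [1, 2, 4, 5, 1, 2, 6, 4, 8, 5, 7, 2, 4, 5, 1, 2],
    "col2": [5, 6, 1, 4, 5, 4, 5, 1, 5, 4, 5, 6, 4, 5, 8, 4],
})

figures = plot_main_groups(df)
```

Returning the figures in a dict keyed by the "Main" value leaves it to the caller to show or save them, for example with figures["A"].savefig("A.png"). Note that a group with more than four "Sub" values would be silently truncated by zip, so the grid size would need to be adjusted for other data.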
