How to add data to a plot while looping over a DataFrame

Posted 2024-04-24 16:42:09


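I have some clinical data containing values from multiple visits for multiple subjects. I wrote a script that loops over the subjects and creates one plot per subject showing the value at each visit. Now I need to add data to each subject's plot: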

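  1. For each subject, add a new marker (a star) that identifies only the baseline values (bcva_OS and bcva_OD). So far I can only get the marker to show on all values. How do I subset to just the baseline? See the comments in the code. Using the following line gives a syntax error: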

    plt.plot_date(sub_df['visit_date'] if sub_df[sub_df.visit_label == 'Visit 2 - Baseline'],

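  2. For each subject, how do I add an entirely new type of data so that both types are overlaid on that subject's plot? I think I could do this with the data for a single subject, but inside the loop...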

Sample code:

for subject, sub_df in new_od_df.groupby(by='subject'):

    # Plot fellow eye
    plt.plot(sub_df['visit_date'], sub_df['bcva_OS'], marker='^', 
        label='OS (fellow) ', color=sns.xkcd_rgb['pale red'])

    # Plot treated eye
    plt.plot(sub_df['visit_date'], sub_df['bcva_OD'], marker='o',
        label='OD (treated) ', color=sns.xkcd_rgb['denim blue']) 

    # Trying to plot only the baseline values
    #plt.plot_date(sub_df['visit_date'] if sub_df[sub_df.visit_label == 'Visit 2 - Baseline'], 

    # Plot fellow eye
    plt.plot_date(sub_df['visit_date'], sub_df['bcva_OS'], 
        marker='*', markersize=10,
        label='BL (fellow) ', color=sns.xkcd_rgb['light pink'])

    # Plot treated eye
    plt.plot_date(sub_df['visit_date'], sub_df['bcva_OD'], 
        marker='*', markersize=10,
        label='BL (treated) ', color=sns.xkcd_rgb['baby blue'])

    # Legend the old way
    plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0)

    # Display each chart separately
    plt.show()

Sample data:

       subject treated_eye              visit_label  visit_date  bcva_OD  bcva_OS         refract_OD         refract_OS
index                                                                                                                  
108       1101          OD      Visit 1 - Screening  2016-01-07     27.0     41.0    + 5 + 0.75 X 27    + 5 + 1.75 X 45
115       1101          OD       Visit 2 - Baseline  2016-01-25     35.0     41.0    + 5 + 0.75 X 27  + 5.5 + 1.75 X 40
120       1101          OD  Baseline - VA Session 2  2016-01-25     35.0     41.0    + 5 + 0.75 X 27  + 5.5 + 1.75 X 40
125       1101          OD          Visit 4 - Day 1  2016-02-02     32.0     42.0    + 5 + 0.75 X 27    + 5 + 1.75 X 30
123       1101          OD          Visit 5 - Day 7  2016-02-08     40.0     43.0    + 5 + 0.75 X 28    + 5 + 1.75 X 30
111       1101          OD         Visit 6 - Day 14  2016-02-16     33.0     44.0    + 5 + 0.75 X 27    + 5 + 1.75 X 40
124       1101          OD              Unscheduled  2016-02-24     37.0     44.0  + 4.5 + 1.25 X 30    + 5 + 1.75 X 40
118       1101          OD        Visit 7 - Month 1  2016-02-29     37.0     40.0  + 4.5 + 1.25 X 30    + 5 + 1.75 X 43

Sample plot: [image]


1 answer
User
Answer #1 · posted 2024-04-24 16:42:09

Note: this is only a partial answer, addressing point 1:

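I am not sure I fully understand your request, especially point 2 about creating a new data type. Please edit your question to make point 2 clearer. For now, my guess is that you want to plot the OD and OS values after baseline subtraction, is that right?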
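If that guess is right, the change from baseline could be computed per subject before plotting. Here is a minimal sketch, assuming the baseline row is the one labelled 'Visit 2 - Baseline' and using only a few rows from the sample data. The `BASELINE_LABEL` constant and the `_change` column names are made up for illustration and are not part of the original data.

```python
import pandas as pd

# A few rows taken from the sample data in the question
df = pd.DataFrame({
    "subject": [1101, 1101, 1101, 1101],
    "visit_label": ["Visit 1 - Screening", "Visit 2 - Baseline",
                    "Visit 4 - Day 1", "Visit 5 - Day 7"],
    "bcva_OD": [27.0, 35.0, 32.0, 40.0],
    "bcva_OS": [41.0, 41.0, 42.0, 43.0],
})

BASELINE_LABEL = "Visit 2 - Baseline"  # assumed baseline visit label

# One baseline row per subject, indexed by subject
baseline = (df[df["visit_label"] == BASELINE_LABEL]
            .drop_duplicates("subject")
            .set_index("subject")[["bcva_OD", "bcva_OS"]])

# Look up each row's baseline through its subject, then subtract it
for col in ["bcva_OD", "bcva_OS"]:
    df[col + "_change"] = df[col] - df["subject"].map(baseline[col])

print(df[["visit_label", "bcva_OD_change", "bcva_OS_change"]])
```

The new columns can then be plotted inside the per-subject loop in the same way as the raw values.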

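Regarding point 1, the solution below picks out the baseline values and draws them as dashed horizontal lines. Note that I also added a plot title and, after creating the figure with fig, ax = plt.subplots(), changed the plt. calls to ax. calls. This may come in handy later, and fig.autofmt_xdate() already needs it.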

import io

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
mpl.style.use('ggplot')
import seaborn as sns

data="""index,subject,treated_eye,visit_label,visit_date,bcva_OD,bcva_OS,refract_OD,refract_OS
108, 1101,    OD,  Visit 1 - Screening, 2016-01-07,  27.0,   41.0,    + 5 + 0.75 X 27, + 5 + 1.75 X 45
115, 1101,    OD,  Visit 2 - Baseline,  2016-01-25,  35.0,   41.0,    + 5 + 0.75 X 27, + 5.5 + 1.75 X 40
120, 1101,    OD,  Baseline - VA Session 2 ,2016-01-25, 35.0,   41.0,    + 5 + 0.75 X 27, + 5.5 + 1.75 X 40
125, 1101,    OD,  Visit 4 - Day 1 ,2016-02-02, 32.0,   42.0,    + 5 + 0.75 X 27, + 5 + 1.75 X 30
123, 1101,    OD,  Visit 5 - Day 7 ,2016-02-08, 40.0,   43.0,    + 5 + 0.75 X 28, + 5 + 1.75 X 30
111, 1101,    OD,  Visit 6 - Day 14    ,2016-02-16,33.0,   44.0,    + 5 + 0.75 X 27, + 5 + 1.75 X 40
124, 1101,    OD,  Unscheduled ,2016-02-24, 37.0,   44.0,    + 4.5 + 1.25 X 30,   + 5 + 1.75 X 40
118, 1101,    OD,  Visit 7 - Month 1 ,  2016-02-29 , 37.0,   40.0,    + 4.5 + 1.25 X 30,   + 5 + 1.75 X 43
"""

## DataFrame cleanup
# pd.compat.StringIO is gone from newer pandas releases; io.StringIO is the stdlib equivalent
df=pd.read_csv(io.StringIO(data),sep=",",index_col=0)
# Strip stray whitespace from every string column
df_obj = df.select_dtypes(['object'])
df[df_obj.columns] = df_obj.apply(lambda x: x.str.strip())

# Parse the dates so matplotlib treats the x axis as a time axis
df['visit_date']=pd.to_datetime(df['visit_date'])

for subject, sub_df in df.groupby(by='subject'):
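    # Boolean mask selecting this subject's baseline visit
    mask=(sub_df.visit_label == 'Visit 2 - Baseline')
    # Take the first matching row as a scalar, which is what axhline expects
    bcva_OS_baseline=sub_df.loc[mask,'bcva_OS'].iloc[0]
    bcva_OD_baseline=sub_df.loc[mask,'bcva_OD'].iloc[0]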

    fig,ax=plt.subplots()    

    # Plot fellow eye
    ax.plot(sub_df['visit_date'], sub_df['bcva_OS'], marker='^', 
        label='OS (fellow) ', color=sns.xkcd_rgb['pale red'])

    # Plot treated eye
    ax.plot(sub_df['visit_date'], sub_df['bcva_OD'], marker='o',
        label='OD (treated) ', color=sns.xkcd_rgb['denim blue']) 

    # Plot fellow eye
    ax.plot_date(sub_df['visit_date'], sub_df['bcva_OS'], 
        marker='*', markersize=10,
        label='BL (fellow) ', color=sns.xkcd_rgb['light pink'])

    # Plot treated eye
    ax.plot_date(sub_df['visit_date'], sub_df['bcva_OD'], 
        marker='*', markersize=10,
        label='BL (treated) ', color=sns.xkcd_rgb['baby blue'])

    # Plot baseline
    ax.axhline(bcva_OS_baseline,color=sns.xkcd_rgb['pale red'],linestyle="dashed")
    ax.axhline(bcva_OD_baseline,color=sns.xkcd_rgb['denim blue'],linestyle="dashed")

    # Legend the old way
    ax.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0)

    # Display each chart separately
    ax.set_title('subject {0}'.format(subject))
    fig.autofmt_xdate()
    plt.tight_layout()
    plt.show()

Result: [image]
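The code above still draws the star markers on every visit and marks the baseline with dashed lines instead. To put stars on the baseline visit only, as point 1 asks, the same boolean-mask idea can be applied to the x and y values together. The line in the question fails because `if` cannot be used inside a function argument like that; filtering the DataFrame first avoids it. This is a minimal sketch using a few rows of the sample data. The `figures` dict and the plain `tab:` colours are my own choices for illustration (they replace the seaborn xkcd colours so the sketch needs only pandas and matplotlib).

```python
import matplotlib.pyplot as plt
import pandas as pd

# A few rows taken from the sample data in the question
df = pd.DataFrame({
    "subject": [1101, 1101, 1101, 1101],
    "visit_label": ["Visit 1 - Screening", "Visit 2 - Baseline",
                    "Visit 4 - Day 1", "Visit 5 - Day 7"],
    "visit_date": ["2016-01-07", "2016-01-25", "2016-02-02", "2016-02-08"],
    "bcva_OD": [27.0, 35.0, 32.0, 40.0],
    "bcva_OS": [41.0, 41.0, 42.0, 43.0],
})
df["visit_date"] = pd.to_datetime(df["visit_date"])

figures = {}  # hypothetical container: subject -> Figure
for subject, sub_df in df.groupby("subject"):
    # Rows for the baseline visit only
    baseline = sub_df[sub_df["visit_label"] == "Visit 2 - Baseline"]

    fig, ax = plt.subplots()

    # All visits, one line per eye
    ax.plot(sub_df["visit_date"], sub_df["bcva_OS"], marker="^",
            color="tab:red", label="OS (fellow)")
    ax.plot(sub_df["visit_date"], sub_df["bcva_OD"], marker="o",
            color="tab:blue", label="OD (treated)")

    # Stars come from the baseline subset, so they appear on that visit only
    ax.plot(baseline["visit_date"], baseline["bcva_OS"], linestyle="none",
            marker="*", markersize=14, color="tab:red", label="BL (fellow)")
    ax.plot(baseline["visit_date"], baseline["bcva_OD"], linestyle="none",
            marker="*", markersize=14, color="tab:blue", label="BL (treated)")

    ax.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0)
    ax.set_title("subject {0}".format(subject))
    fig.autofmt_xdate()
    figures[subject] = fig

# plt.show()  # uncomment to display the figures
```

If a subject has no row with that visit label, `baseline` is simply empty and no star is drawn for that subject.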
