tableone
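tableone is a package for creating "Table 1" summary statistics for a patient population. It was inspired by the R package of the same name by Yoshida and Bohn.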
Documentation
The documentation is available on readthedocs. An executable demonstration of the package is available on GitHub as a Jupyter Notebook.
Suggested citation
If you use tableone in your study, please cite the following paper:
Tom J Pollard, Alistair E W Johnson, Jesse D Raffa, Roger G Mark; tableone: An open source Python package for producing summary statistics for research papers, JAMIA Open, Volume 1, Issue 1, 1 July 2018, Pages 26–31, https://doi.org/10.1093/jamiaopen/ooy012
Download the BibTeX file from: https://academic.oup.com/jamiaopen/downloadcitation/5001910?format=bibtex
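A note for users of tableone
While we have tried to use best practices in creating this package, automation of even basic statistical tasks can be unsound if done without supervision. We encourage use of tableone alongside other methods of descriptive statistics and, in particular, visualization to ensure appropriate data handling.
It is beyond the scope of our documentation to provide detailed guidance on summary statistics, but as a primer we provide some considerations for choosing parameters when creating a summary table in the documentation.
Guidance should be sought from a statistician when using tableone for a research study, especially prior to submitting the study for publication.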
Installation
To install the package with pip, run:
pip install tableone
To install the package with conda, run:
conda install -c conda-forge tableone
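To confirm from Python that the installation succeeded, a check along the following lines may help. This is a minimal sketch that uses only the standard library's importlib.metadata; the helper name installed_version is made up for this illustration and is not part of tableone.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if tableone is installed, otherwise None.
print(installed_version("tableone"))
```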
Example
Import the libraries:
from tableone import TableOne
import pandas as pd
Load sample data into a pandas dataframe:
url = "https://raw.githubusercontent.com/tompollard/data/master/primary-biliary-cirrhosis/pbc.csv"
data = pd.read_csv(url)
Optionally, a list of columns to be included in Table 1:
columns = ['age','bili','albumin','ast','platelet','protime', 'ascites','hepato','spiders','edema','sex', 'trt']
Optionally, a list of columns containing categorical variables:
categorical = ['ascites','hepato','edema','sex','spiders','trt']
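Categorical variables appear in the resulting table as a count and percentage per level, labelled "n (%)". The sketch below illustrates that kind of summary with the standard library only. The helper count_and_percent is hypothetical and is not tableone's implementation, and the assumption that percentages are taken over the non-missing values of one group is ours; the counts are the sex row for trt 1.0 in the example output.

```python
from collections import Counter


def count_and_percent(values):
    """Summarize a categorical variable as 'n (%)' for each level."""
    counts = Counter(values)
    total = sum(counts.values())
    return {
        level: f"{n} ({100 * n / total:.2f})"
        for level, n in sorted(counts.items())
    }


# Counts taken from the sex row for trt 1.0 in the example table.
sex = ["f"] * 137 + ["m"] * 21
print(count_and_percent(sex))  # {'f': '137 (86.71)', 'm': '21 (13.29)'}
```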
Optionally, a categorical variable for stratification and a list of non-normal variables:
groupby = 'trt'
nonnormal = ['bili']
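Create an instance of TableOne with the input arguments (keyword arguments keep the call unambiguous if the positional order of the parameters differs between versions):
mytable = TableOne(data, columns=columns, categorical=categorical, groupby=groupby, nonnormal=nonnormal)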
Type the name of the instance in an interpreter:
mytable
…which prints the following table to screen:
Stratified by trt
                       1.0                2.0                isnull
---------------------  -----------------  -----------------  --------
n                      158                154                106
time (mean (std))      2015.62 (1094.12)  1996.86 (1155.93)  0
age (mean (std))       51.42 (11.01)      48.58 (9.96)       0
bili (median [IQR])    1.40 [0.80,3.20]   1.30 [0.72,3.60]   0
chol (mean (std))      365.01 (209.54)    373.88 (252.48)    134
albumin (mean (std))   3.52 (0.44)        3.52 (0.40)        0
copper (mean (std))    97.64 (90.59)      97.65 (80.49)      108
alk.phos (mean (std))  2021.30 (2183.44)  1943.01 (2101.69)  106
ast (mean (std))       120.21 (54.52)     124.97 (58.93)     106
trig (mean (std))      124.14 (71.54)     125.25 (58.52)     136
platelet (mean (std))  258.75 (100.32)    265.20 (90.73)     11
protime (mean (std))   10.65 (0.85)       10.80 (1.14)       2
status (n (%))                                               0
0                      83 (52.53)         85 (55.19)
1                      10 (6.33)          9 (5.84)
2                      65 (41.14)         60 (38.96)
ascites (n (%))                                              106
0.0                    144 (91.14)        144 (93.51)
1.0                    14 (8.86)          10 (6.49)
hepato (n (%))                                               106
0.0                    85 (53.80)         67 (43.51)
1.0                    73 (46.20)         87 (56.49)
spiders (n (%))                                              106
0.0                    113 (71.52)        109 (70.78)
1.0                    45 (28.48)         45 (29.22)
edema (n (%))                                                0
0.0                    132 (83.54)        131 (85.06)
0.5                    16 (10.13)         13 (8.44)
1.0                    10 (6.33)          10 (6.49)
stage (n (%))                                                6
1.0                    12 (7.59)          4 (2.60)
2.0                    35 (22.15)         32 (20.78)
3.0                    56 (35.44)         64 (41.56)
4.0                    55 (34.81)         54 (35.06)
sex (n (%))                                                  0
f                      137 (86.71)        139 (90.26)
m                      21 (13.29)         15 (9.74)
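In this output, bili is reported as median [IQR] because it was listed in nonnormal, while the other continuous variables are reported as mean (std). The following sketch shows the two styles of summary side by side using the standard library. The helper summarize and the sample values are hypothetical, and the choices of sample standard deviation and "inclusive" quartile interpolation are assumptions that may not match tableone's internals exactly.

```python
import statistics


def summarize(values, nonnormal=False):
    """Format a continuous variable as 'mean (std)' or 'median [Q1,Q3]'."""
    if nonnormal:
        # Quartiles with linear interpolation between data points.
        q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
        return f"{median:.2f} [{q1:.2f},{q3:.2f}]"
    # Sample standard deviation (n - 1 in the denominator).
    return f"{statistics.mean(values):.2f} ({statistics.stdev(values):.2f})"


# Hypothetical right-skewed measurements, not taken from the PBC dataset.
values = [0.5, 0.8, 1.1, 1.4, 2.0, 3.2, 14.0]
print(summarize(values))                  # 3.29 (4.81)
print(summarize(values, nonnormal=True))  # 1.40 [0.95,2.60]
```

A single large value pulls the mean and standard deviation well away from the bulk of the data, while the median and quartiles barely move, which is why skewed variables are usually better described by the second form.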
Compute p-values by setting the pval argument to True:
mytable = TableOne(data, columns=columns, categorical=categorical, groupby=groupby, nonnormal=nonnormal, pval=True)
…which prints:
Stratified by trt
                       1.0                2.0                isnull    pval    testname
---------------------  -----------------  -----------------  --------  ------  --------------
n                      158                154                106
time (mean (std))      2015.62 (1094.12)  1996.86 (1155.93)  0         0.883   One_way_ANOVA
age (mean (std))       51.42 (11.01)      48.58 (9.96)       0         0.018   One_way_ANOVA
bili (median [IQR])    1.40 [0.80,3.20]   1.30 [0.72,3.60]   0         0.842   Kruskal-Wallis
chol (mean (std))      365.01 (209.54)    373.88 (252.48)    134       0.748   One_way_ANOVA
albumin (mean (std))   3.52 (0.44)        3.52 (0.40)        0         0.874   One_way_ANOVA
copper (mean (std))    97.64 (90.59)      97.65 (80.49)      108       0.999   One_way_ANOVA
alk.phos (mean (std))  2021.30 (2183.44)  1943.01 (2101.69)  106       0.747   One_way_ANOVA
ast (mean (std))       120.21 (54.52)     124.97 (58.93)     106       0.460   One_way_ANOVA
trig (mean (std))      124.14 (71.54)     125.25 (58.52)     136       0.886   One_way_ANOVA
platelet (mean (std))  258.75 (100.32)    265.20 (90.73)     11        0.555   One_way_ANOVA
protime (mean (std))   10.65 (0.85)       10.80 (1.14)       2         0.197   One_way_ANOVA
status (n (%))                                               0         0.894   Chi-squared
0                      83 (52.53)         85 (55.19)
1                      10 (6.33)          9 (5.84)
2                      65 (41.14)         60 (38.96)
ascites (n (%))                                              106       0.567   Chi-squared
0.0                    144 (91.14)        144 (93.51)
1.0                    14 (8.86)          10 (6.49)
hepato (n (%))                                               106       0.088   Chi-squared
0.0                    85 (53.80)         67 (43.51)
1.0                    73 (46.20)         87 (56.49)
spiders (n (%))                                              106       0.985   Chi-squared
0.0                    113 (71.52)        109 (70.78)
1.0                    45 (28.48)         45 (29.22)
edema (n (%))                                                0         0.877   Chi-squared
0.0                    132 (83.54)        131 (85.06)
0.5                    16 (10.13)         13 (8.44)
1.0                    10 (6.33)          10 (6.49)
stage (n (%))                                                6         0.201   Chi-squared
1.0                    12 (7.59)          4 (2.60)
2.0                    35 (22.15)         32 (20.78)
3.0                    56 (35.44)         64 (41.56)
4.0                    55 (34.81)         54 (35.06)
sex (n (%))                                                  0         0.421   Chi-squared
f                      137 (86.71)        139 (90.26)
m                      21 (13.29)         15 (9.74)
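The testname column names the test behind each p-value. To give a feel for the Chi-squared entries, here is a rough, standard-library sketch of a chi-squared test of independence for the 2x2 case only, with a Yates continuity correction. The function chi2_2x2_pvalue is hypothetical and is not tableone's code; that the package computes its value this way is an assumption, although the result does agree with the printed pval for hepato. For real analyses, a library routine such as scipy.stats.chi2_contingency is the safer choice.

```python
import math


def chi2_2x2_pvalue(table, yates=True):
    """P-value of a chi-squared test of independence on a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                # Continuity correction: shrink each deviation by 0.5, not below 0.
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / expected
    # With 1 degree of freedom, the chi-squared survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2))


# Rows are hepato = 0.0 and 1.0; columns are trt = 1.0 and 2.0 (counts from the table).
hepato = [[85, 67], [73, 87]]
print(f"{chi2_2x2_pvalue(hepato):.3f}")  # 0.088
```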
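Tables can be exported to file in various formats, including LaTeX, CSV, and HTML. Files are exported by calling the corresponding to_format method (for example to_latex, to_csv, or to_html) on the table. For example, mytable can be exported to a CSV file named 'mytable.csv' with the following command: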
mytable.to_csv('mytable.csv')