F-test with p-value in Python

Published 2024-03-28 14:54:38

R lets us compute an F-test between two populations:

> d1 = c(2.5579227634, 1.7774243136, 2.0025207896, 1.9518876366, 0.0, 4.1984191803, 5.6170403364, 0.0)
> d2 = c(16.93800333, 23.2837045311, 1.2674791828, 1.0889208427, 1.0447584137, 0.8971380534, 0.0, 0.0)
> var.test(d1,d2)

    F test to compare two variances

data:  d1 and d2
F = 0.0439, num df = 7, denom df = 7, p-value = 0.000523
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.008789447 0.219288957
sample estimates:
ratio of variances 
        0.04390249 

Note that it also reports the p-value.

For another example, R gives:

^{pr2}$

What is the equivalent in Python? I checked this documentation, but it doesn't seem to give what I want.

This code gives a different p-value (especially for example 2):

import statistics as stats
import scipy.stats as ss
def Ftest_pvalue(d1,d2):
    """docstring for Ftest_pvalue"""
    df1 = len(d1) - 1
    df2 = len(d2) - 1
    F = stats.variance(d1) / stats.variance(d2)
    single_tailed_pval = ss.f.cdf(F,df1,df2)
    double_tailed_pval = single_tailed_pval * 2
    return double_tailed_pval

Python gives:

In [45]: d1 = [2.5579227634, 1.7774243136, 2.0025207896, 1.9518876366, 0.0, 4.1984191803, 5.6170403364, 0.0]
In [20]: d2 = [16.93800333, 23.2837045311, 1.2674791828, 1.0889208427, 1.0447584137, 0.8971380534, 0.0, 0.0]
In [64]: x1 = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 68.7169110318]
In [65]: x2 = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.1863361211]

In [69]: Ftest_pvalue(d1,d2)
Out[69]: 0.00052297887612346176

In [70]: Ftest_pvalue(x1,x2)
Out[70]: 1.9999999987772916

2 answers

An rpy2 implementation:

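import rpy2.robjects as robjects
def Ftest_pvalue_rpy2(d1,d2):
    """Return the p-value reported by R's var.test for d1 and d2."""
    # Convert the Python sequences to R numeric vectors
    rd1 = robjects.FloatVector(d1)
    rd2 = robjects.FloatVector(d2)
    rvtest = robjects.r['var.test']
    # The third element of the returned htest list is p.value
    return rvtest(rd1,rd2)[2][0]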

The result is:

^{pr2}$
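
For a SciPy-only route, here is a minimal sketch. It assumes the two-sided rule that R's `var.test` applies, namely doubling the smaller of the two tail probabilities. The function name `ftest_pvalue_two_sided` is made up for illustration; `scipy.stats.f.cdf` and `scipy.stats.f.sf` are the CDF and survival function of the F distribution.

```python
import statistics
from scipy import stats as ss


def ftest_pvalue_two_sided(d1, d2):
    """Two-sided F-test p-value for equal variances (illustrative sketch)."""
    df1 = len(d1) - 1
    df2 = len(d2) - 1
    # Ratio of sample variances; assumes the variance of d2 is non-zero
    f_stat = statistics.variance(d1) / statistics.variance(d2)
    lower = ss.f.cdf(f_stat, df1, df2)  # P(F <= f_stat)
    upper = ss.f.sf(f_stat, df1, df2)   # P(F > f_stat), precise for tiny tails
    # Assumed rule: double the smaller tail, so the result never exceeds 1
    return 2 * min(lower, upper)


d1 = [2.5579227634, 1.7774243136, 2.0025207896, 1.9518876366,
      0.0, 4.1984191803, 5.6170403364, 0.0]
d2 = [16.93800333, 23.2837045311, 1.2674791828, 1.0889208427,
      1.0447584137, 0.8971380534, 0.0, 0.0]
x1 = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 68.7169110318]
x2 = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.1863361211]

print(ftest_pvalue_two_sided(d1, d2))
print(ftest_pvalue_two_sided(x1, x2))
```

For d1 and d2 this should agree with the 0.000523 that R reports in the question. For x1 and x2 it gives a value on the order of 1e-9 instead of something close to 2, because the upper tail is the smaller one there.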

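I'd like to point out that xalglib is a package full of statistical methods that allows this: http://www.alglib.net/ (see http://www.alglib.net/hypothesistesting/variancetests.php), although it is less flexible than the original scipy-based approach.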

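I'd also point out that the correct procedure for the two-tailed computation can be found in variancetests.c, as follows:

stat = ae_minreal(xvar/yvar, yvar/xvar, _state);
*bothtails = 1-(fdistribution(df1, df2, 1/stat, _state)-fdistribution(df1, df2, stat, _state));

What @Amit Kumar Gupta describes in his comment, by contrast, is wrong: if you simply double the difference between 1 and the one-sided p-value, you can get values above 1.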
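
The two-tailed expression above can be sketched in Python with SciPy. This is only a rough port, assuming `scipy.stats.f.cdf` plays the role of ALGLIB's F-distribution CDF. The names `both_tails` and `doubled_complement` are hypothetical, and both sample variances are assumed to be non-zero.

```python
import statistics
from scipy import stats as ss


def both_tails(xvar, yvar, df1, df2):
    """Two-tailed p-value following the variancetests.c expression (sketch)."""
    stat = min(xvar / yvar, yvar / xvar)  # the smaller ratio, so stat <= 1
    # 1 minus the probability mass lying between stat and 1/stat
    return 1 - (ss.f.cdf(1 / stat, df1, df2) - ss.f.cdf(stat, df1, df2))


def doubled_complement(xvar, yvar, df1, df2):
    """Doubling (1 - one-sided p-value); shown only because it can exceed 1."""
    return 2 * (1 - ss.f.cdf(xvar / yvar, df1, df2))


d1 = [2.5579227634, 1.7774243136, 2.0025207896, 1.9518876366,
      0.0, 4.1984191803, 5.6170403364, 0.0]
d2 = [16.93800333, 23.2837045311, 1.2674791828, 1.0889208427,
      1.0447584137, 0.8971380534, 0.0, 0.0]
v1, v2 = statistics.variance(d1), statistics.variance(d2)
df1, df2 = len(d1) - 1, len(d2) - 1

print(both_tails(v1, v2, df1, df2))          # stays within [0, 1]
print(doubled_complement(v1, v2, df1, df2))  # greater than 1 for this data
```

Because the expression subtracts two CDF values that are both close to 0 or 1, it loses precision for extremely small p-values; a survival-function call would be one way to reduce that.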
