Applying a function across DataFrame columns

Posted 2024-06-09 11:18:59


Similar questions seem to have been answered already, but I can't get any of those answers to work for my case.

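I have a pandas DataFrame that looks like sig_vars below. It has a VAF column and a Background column. I want to use the ztest function from statsmodels to fill a new p-value column with one p-value per row.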

The p-value for each row is computed like this:

from statsmodels.stats.weightstats import ztest
p_value = ztest(sig_vars.Background,value=sig_vars.VAF)[1]
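For reference, here is a minimal standard-library sketch of the textbook one-sample z-test that, as far as I understand it, the call above performs for a single row with ztest's default settings (two-sided, ddof=1). The function name one_sample_ztest_pvalue is made up for illustration and is not part of statsmodels; the numbers are row 1 of sig_vars.

```python
import math
import statistics


def one_sample_ztest_pvalue(sample, value):
    """Two-sided p-value for H0: mean(sample) == value (hypothetical helper)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error of the mean, using the sample standard deviation (ddof=1).
    std_err = statistics.stdev(sample) / math.sqrt(n)
    z = (mean - value) / std_err
    # 2 * (1 - Phi(|z|)), written with erfc for numerical stability.
    return math.erfc(abs(z) / math.sqrt(2))


# Background values and VAF taken from row 1 of sig_vars.
background = [0.00018832391713747646, 0.0002114408734430263, 0.000247843759294141]
vaf = 0.0009442313366774859
p = one_sample_ztest_pvalue(background, vaf)
print(p)
```

Note that the sample needs at least two values, otherwise the sample standard deviation is undefined.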

I tried something like this, but couldn't get it to work:

def calc(x):
    return ztest(x.Background, value=x.VAF.astype(float))[1]

sig_vars.dropna().assign(pval = lambda x: calc(x)).head()

What I find odd is that this version works fine:

def calc(x):
    return ztest([0.0001,0.0002,0.0001], value=x.VAF.astype(float))[1]

sig_vars.dropna().assign(pval = lambda x: calc(x)).head()

Here is my DataFrame sig_vars:

import pandas as pd
from numpy import nan

sig_vars = pd.DataFrame({'AO': {0: 4.0, 1: 16.0, 2: 12.0, 3: 19.0, 4: 2.0},
 'Background': {0: nan,
  1: [0.00018832391713747646, 0.0002114408734430263, 0.000247843759294141],
  2: nan,
  3: [0.00023965141612200435,
   0.00018864365214110544,
   0.00036566589684372596,
   0.0005452562704471102],
  4: [0.00017349063150589867]},
 'Change': {0: 'T>A', 1: 'T>C', 2: 'T>A', 3: 'T>C', 4: 'C>A'},
 'Chrom': {0: 'chr1', 1: 'chr1', 2: 'chr1', 3: 'chr1', 4: 'chr1'},
 'ConvChange': {0: 'T>A', 1: 'T>C', 2: 'T>A', 3: 'T>C', 4: 'C>A'},
 'DP': {0: 16945.0, 1: 16945.0, 2: 16969.0, 3: 16969.0, 4: 16969.0},
 'Downstream': {0: 'NaN', 1: 'NaN', 2: 'NaN', 3: 'NaN', 4: 'NaN'},
 'Gene': {0: 'TIIIa', 1: 'TIIIa', 2: 'TIIIa', 3: 'TIIIa', 4: 'TIIIa'},
 'ID': {0: '86.fastq/onlyProbedRegions.vcf',
  1: '86.fastq/onlyProbedRegions.vcf',
  2: '86.fastq/onlyProbedRegions.vcf',
  3: '86.fastq/onlyProbedRegions.vcf',
  4: '86.fastq/onlyProbedRegions.vcf'},
 'Individual': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1},
 'IntEx': {0: 'TIII', 1: 'TIII', 2: 'TIII', 3: 'TIII', 4: 'TIII'},
 'Loc': {0: 115227854, 1: 115227854, 2: 115227855, 3: 115227855, 4: 115227856},
 'Upstream': {0: 'NaN', 1: 'NaN', 2: 'NaN', 3: 'NaN', 4: 'NaN'},
 'VAF': {0: 0.00023605783416937148,
  1: 0.0009442313366774859,
  2: 0.0007071719017031057,
  3: 0.0011196888443632507,
  4: 0.00011786198361718427},
 'Var': {0: 'A', 1: 'C', 2: 'A', 3: 'C', 4: 'A'},
 'WT': {0: 'T', 1: 'T', 2: 'T', 3: 'T', 4: 'C'}})
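One possible approach, sketched under some assumptions: assign passes the whole frame to calc, so x.Background is a Series of lists rather than a single list, which ztest presumably cannot average. The hard-coded version likely works only because the flat list is one sample and the VAF Series is broadcast through the value argument. Calling ztest once per row with DataFrame.apply(axis=1) avoids this. The helper name row_pvalue and the small stand-in frame df are hypothetical; the values are copied from sig_vars.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.weightstats import ztest

# Small stand-in for sig_vars: only the two columns the test needs.
df = pd.DataFrame({
    "Background": [
        np.nan,
        [0.00018832391713747646, 0.0002114408734430263, 0.000247843759294141],
        [0.00023965141612200435, 0.00018864365214110544,
         0.00036566589684372596, 0.0005452562704471102],
        [0.00017349063150589867],
    ],
    "VAF": [0.00023605783416937148, 0.0009442313366774859,
            0.0011196888443632507, 0.00011786198361718427],
})


def row_pvalue(row):
    """One-sample z-test of this row's Background values against its VAF."""
    background = row["Background"]
    # Missing backgrounds are a float NaN, not a list; a single value
    # has no sample variance, so both cases get NaN (an assumption).
    if not isinstance(background, (list, tuple, np.ndarray)) or len(background) < 2:
        return np.nan
    return ztest(background, value=float(row["VAF"]))[1]


# axis=1 hands each row to row_pvalue as a Series.
df["pval"] = df.apply(row_pvalue, axis=1)
print(df[["VAF", "pval"]])
```

Returning NaN for rows with fewer than two background values is a design choice of this sketch; dropping those rows first with dropna, as in the attempts above, would handle the missing ones but not the single-value row.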
