<p>Here's one trick: build a <code>pd.Series(1, index=…)</code> for each row and let pandas align them:</p>
<pre><code>In [11]: s = df["Pos"].apply(lambda x: pd.Series(1, pos_to_gene[x])).stack(0)
In [12]: s
Out[12]:
0  GENE1    1
1  GENE1    1
   GENE2    1
3  GENE3    1
dtype: float64
</code></pre>
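<p>To see what "letting pandas align" means, it can help to look at the intermediate frame before stacking. The sketch below uses made-up stand-ins for <code>df</code> and <code>pos_to_gene</code> (the real ones are not shown here), and adds <code>.dropna()</code> after <code>stack</code> as a precaution, since whether <code>stack</code> drops NaN on its own depends on the pandas version:</p>

```python
import pandas as pd

# Hypothetical stand-ins for the df and pos_to_gene used in this answer.
df = pd.DataFrame({
    "Pos": ["chr1_-_12200", "chr1_-_12600", "chr1_-_48400", "chr1_-_172800"],
})
pos_to_gene = {
    "chr1_-_12200": ["GENE1"],
    "chr1_-_12600": ["GENE1", "GENE2"],
    "chr1_-_48400": [],  # assumed: a position with no gene maps to an empty list
    "chr1_-_172800": ["GENE3"],
}

# Each row becomes a small Series indexed by its genes. pandas aligns those
# Series on the union of their indexes, giving one column per gene and NaN
# wherever a row does not have that gene.
wide = df["Pos"].apply(lambda x: pd.Series(1, index=pos_to_gene[x]))
print(wide)

# Stacking turns the gene columns into an inner index level; dropna() keeps
# only the (row, gene) pairs that actually exist.
s = wide.stack().dropna()
print(s)
```

<p>Row 2 is all NaN in <code>wide</code>, which is why it has no entry in the stacked result.</p>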
<p>You can then reset the index and simply join:</p>
<pre><code>In [13]: s.index.names = [None, "Gene"]
In [14]: gene = s.reset_index("Gene")[["Gene"]]
In [15]: gene
Out[15]:
    Gene
0  GENE1
1  GENE1
1  GENE2
3  GENE3
In [16]: gene.join(df)
Out[16]:
    Gene            Pos  MedialIIvsD  LateralIIvsD  MedialP02IIvsD  MedialP09IIvsD
0  GENE1   chr1_-_12200     0.557431      0.066554        0.738343        0.029935
1  GENE1   chr1_-_12600     0.737887      0.069167        0.829568        0.409495
1  GENE2   chr1_-_12600     0.737887      0.069167        0.829568        0.409495
3  GENE3  chr1_-_172800     0.729035      0.035198        0.866111        0.385711
</code></pre>
<p>If you want to keep the NaN rows (positions with no gene, which the result above leaves out), use an outer join:</p>
<pre><code>In [17]: gene.join(df, how="outer")
Out[17]:
    Gene            Pos  MedialIIvsD  LateralIIvsD  MedialP02IIvsD  MedialP09IIvsD
0  GENE1   chr1_-_12200     0.557431      0.066554        0.738343        0.029935
1  GENE1   chr1_-_12600     0.737887      0.069167        0.829568        0.409495
1  GENE2   chr1_-_12600     0.737887      0.069167        0.829568        0.409495
2    NaN   chr1_-_48400     0.349833      0.600912        0.964103        0.765195
3  GENE3  chr1_-_172800     0.729035      0.035198        0.866111        0.385711
</code></pre>
<hr/>
<p>Alternatively, you can build <code>gene</code> in pure Python (rather than using <code>apply</code>):</p>
<pre><code>inds, gens = [], []
for i, p in df["Pos"].iteritems():
for g in pos_to_gene[p]:
inds.append(i)
gens.append(g)
gene = pd.Series(gens, inds)
</code></pre>
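<p>On pandas 0.25 or later, a different route to the same result is <code>DataFrame.explode</code>. This is not the approach used above, just a minimal sketch of an alternative; the <code>df</code> and <code>pos_to_gene</code> below are cut-down stand-ins, and the mapping is assumed to hold a (possibly empty) list of genes for every position:</p>

```python
import pandas as pd

# Cut-down stand-ins for the df and pos_to_gene used in this answer.
df = pd.DataFrame({
    "Pos": ["chr1_-_12200", "chr1_-_12600", "chr1_-_48400", "chr1_-_172800"],
    "MedialIIvsD": [0.557431, 0.737887, 0.349833, 0.729035],
})
pos_to_gene = {
    "chr1_-_12200": ["GENE1"],
    "chr1_-_12600": ["GENE1", "GENE2"],
    "chr1_-_48400": [],  # assumed: no gene for this position
    "chr1_-_172800": ["GENE3"],
}

# Attach each row's list of genes, then give every gene its own row.
# The original row labels are repeated, just as in the join above.
exploded = df.assign(Gene=df["Pos"].map(pos_to_gene)).explode("Gene")

# An empty list explodes to NaN, so `exploded` corresponds to the outer join.
# Dropping those rows gives the equivalent of the plain join.
matched = exploded.dropna(subset=["Gene"])

print(exploded)
print(matched)
```

<p>Unlike the join, this puts <code>Gene</code> last rather than first; reorder the columns if that matters.</p>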