Subsetting a DataFrame using another DataFrame of a different length

>>> df1['pos1'].between(df2.start, df2.stop)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/pandas/core/series.py", line 2412, in between
    lmask = self >= left
  File "/usr/lib/python2.7/dist-packages/pandas/core/ops.py", line 699, in wrapper
    raise ValueError('Series lengths must match to compare')
ValueError: Series lengths must match to compare

2 answers

User

Answer 1 · edited 2024-06-06 08:03:14

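Someone may have a more elegant solution, but off the top of my head I would merge df2 into df1 twice, so that everything ends up in a single dataset and the comparison becomes easy.

df2 is essentially a lookup table: df2.chr should be matched against df1.chr1 and df1.chr2 respectively.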

df_all = df1.merge(df2,
                   how='inner',
                   left_on='chr1',
                   right_on='chr') \
            .merge(df2,
                   how='inner',
                   left_on='chr2',
                   right_on='chr',
                   suffixes=('_r1', '_r2'))

Note the suffixes. As a result, pos1 is tested against the start_r1 to stop_r1 range, and pos2 against the start_r2 to stop_r2 range.
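The range test described above might look like the following minimal sketch. The toy data and the names in_r1, in_r2, and result are made up for illustration, and the sketch assumes df1 has no columns of its own named chr, start, or stop. Because both bounds now live in the same merged frame as the positions, Series.between compares Series of equal length.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the answer.
df1 = pd.DataFrame({
    'chr1': ['a', 'a', 'b'],
    'pos1': [5, 50, 15],
    'chr2': ['b', 'a', 'b'],
    'pos2': [12, 7, 99],
})
df2 = pd.DataFrame({
    'chr': ['a', 'b'],
    'start': [1, 10],
    'stop': [10, 20],
})

# Same double merge as in the answer: the second merge renames the
# overlapping chr/start/stop columns with the _r1 and _r2 suffixes.
df_all = (
    df1.merge(df2, how='inner', left_on='chr1', right_on='chr')
       .merge(df2, how='inner', left_on='chr2', right_on='chr',
              suffixes=('_r1', '_r2'))
)

# Row-wise range checks; between() is inclusive on both ends by default.
in_r1 = df_all['pos1'].between(df_all['start_r1'], df_all['stop_r1'])
in_r2 = df_all['pos2'].between(df_all['start_r2'], df_all['stop_r2'])

# Keep only rows where both positions fall inside their ranges.
result = df_all.loc[in_r1 & in_r2, ['chr1', 'pos1', 'chr2', 'pos2']]
print(result)
```

If only one of the two positions needs to fall inside its range, combining the masks with | instead of & would be the corresponding change.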

