How do I combine lines from two files using a condition in Python?

12319000 -64,7357668067227 -0,1111052148685535
12319000 -79,68527661064425 -0,13231739777754026
12319000 -94,69642857142858 -0,15117839559513543
12319000 -109,59301470588237 -0,18277783185642743
12319001 99,70264355742297 0,48329515727315125
12319001 84,61113445378152 0,4060446341409862
12319001 69,7032037815126 0,29803063228455073
12319001 54,93886554621849 0,20958105041136763
12319001 39,937394957983194 0,13623056582981297
12319001 25,05574229691877 0,07748669438398018
12319001 9,99716386554622 0,028110643107892755

12319000 -94,69642857142858 -0,15117839559513543 mutant 1
12319000 -109,59301470588237 -0,18277783185642743 mutant 1
12319001 99,70264355742297 0,48329515727315125 mutant 2
12319001 84,61113445378152 0,4060446341409862 mutant 2

oocytes = open(file_with_oocytes, 'r')
results = open(os.path.join(path, 'results.csv'), 'r')
results_new = open(os.path.join(path, 'results_with_oocytes.csv'), 'w')
for line in results:
    for lines in oocytes:
        if lines[0:7] in line:
            print line + lines[12:]

12319000 99,4952380952381 0,3011778623990699 mutant 1
12319000 99,4952380952381 0,3011778623990699 mutant 2
12319000 99,4952380952381 0,3011778623990699 mutant 3

3 Answers

User

#1 · edited 2024-04-27 08:27:44

Note that this solution does not depend on the length of any field, apart from the length of the file extension in the second file.

# make a dict keyed on the filename before the extension,
# with the other two fields as its value
file2dict = dict((row[0][:-4], row[1:])
                     for row in (line.split() for line in file2))

# then append to each row of the first file
# the values that match its first column
output = [row + file2dict[row[0]] for row in (line.split() for line in file1)]

For testing purposes only, I used stand-ins for the two files; you should just use ordinary file objects. The output for the test data is:

   [['12319000', '-64,7357668067227', '-0,1111052148685535', 'mutant', '1'],
    ['12319000', '-79,68527661064425', '-0,13231739777754026', 'mutant', '1'],
    ['12319000', '-94,69642857142858', '-0,15117839559513543', 'mutant', '1'],
    ['12319000', '-109,59301470588237', '-0,18277783185642743', 'mutant', '1'],
    ['12319001', '99,70264355742297', '0,48329515727315125', 'mutant', '2'],
    ['12319001', '84,61113445378152', '0,4060446341409862', 'mutant', '2'],
    ['12319001', '69,7032037815126', '0,29803063228455073', 'mutant', '2'],
    ['12319001', '54,93886554621849', '0,20958105041136763', 'mutant', '2'],
    ['12319001', '39,937394957983194', '0,13623056582981297', 'mutant', '2'],
    ['12319001', '25,05574229691877', '0,07748669438398018', 'mutant', '2'],
    ['12319001', '9,99716386554622', '0,028110643107892755', 'mutant', '2']]
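
To try this approach without files on disk, in-memory text streams can stand in for the two files. This is only a minimal sketch under assumptions: Python 3, `io.StringIO` as the stand-in (not necessarily what this answer used), and second-file lines of the form `12319000.tif mutant 1`. The `.tif` extension is a guess; only its four-character length matters to the `[:-4]` slice.

```python
import io

# Hypothetical stand-ins for the two files; real code would use open(...)
file1 = io.StringIO(
    "12319000 -64,7357668067227 -0,1111052148685535\n"
    "12319001 99,70264355742297 0,48329515727315125\n"
)
file2 = io.StringIO(
    "12319000.tif mutant 1\n"
    "12319001.tif mutant 2\n"
)

# Key: filename without its 4-character extension; value: remaining fields
file2dict = {row[0][:-4]: row[1:] for row in (line.split() for line in file2)}

# Append the matching fields to every row of the first file
output = [row + file2dict[row[0]] for row in (line.split() for line in file1)]

for row in output:
    print(" ".join(row))
```

A row whose ID is missing from the second file would raise `KeyError` here; `file2dict.get(row[0], [])` is one way to tolerate that if unmatched rows should be kept.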

User

#2 · edited 2024-04-27 08:27:44

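File handles in Python have state; that is, they do not work the way lists do. You can iterate over a list repeatedly and get all of its values every time. A file, on the other hand, has a position, and the next read() starts from that position. Iterating over a file reads one line per step. Once you have reached the last line, the file pointer sits at the end of the file, and a read() at the end of the file returns the empty string ''. That is why the inner loop over oocytes only produces lines during the first pass of the outer loop.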
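
The exhaustion effect is easy to observe directly. The following is a small sketch, assuming Python 3 and using `io.StringIO` as an in-memory stand-in for an open file; the sample lines (including the `.tif` extension) are made up for illustration.

```python
import io

# In-memory stand-in for an open file (an assumption for illustration)
oocytes = io.StringIO("12319000.tif mutant 1\n12319001.tif mutant 2\n")

first_pass = [line for line in oocytes]   # consumes the stream
second_pass = [line for line in oocytes]  # position is already at the end
at_end = oocytes.read()                   # reading at end of file gives ''

oocytes.seek(0)                           # rewind to the start
after_rewind = [line for line in oocytes]

print(len(first_pass), len(second_pass), repr(at_end), len(after_rewind))
```

Rewinding with `seek(0)` makes the nested loop work, but it rereads the file for every outer line, so reading it once into memory is usually the better choice.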

What you need to do is read the oocytes file once at the start and store its values, perhaps like this:

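# Read the oocytes file once, indexed by the 8-character ID
oodict = {}
for line in oocytes:
    oodict[line[0:8]] = line[12:].rstrip('\n')

# Look up each results line by its ID and append the stored fields
for line in results:
    results_key = line[0:8]
    if results_key in oodict:
        print(line.rstrip('\n') + oodict[results_key])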
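
For completeness, here is a sketch of the same dictionary idea that writes to an output file instead of printing. It rests on several assumptions: Python 3, a hypothetical helper name `combine`, whitespace-separated fields, and a second file whose first field is a filename with an extension (the `.tif` in the demo is a guess). It uses `split()` and `os.path.splitext` rather than fixed slice positions.

```python
import os
import tempfile


def combine(results_path, oocytes_path, out_path):
    """Append each oocyte annotation to the results lines sharing its ID."""
    oodict = {}
    with open(oocytes_path) as oocytes:
        for line in oocytes:  # read the oocytes file exactly once
            fields = line.split()
            if fields:
                # "12319000.tif" -> "12319000"
                key = os.path.splitext(fields[0])[0]
                oodict[key] = " ".join(fields[1:])
    with open(results_path) as results, open(out_path, "w") as out:
        for line in results:
            fields = line.split()
            if fields and fields[0] in oodict:
                out.write(line.rstrip("\n") + " " + oodict[fields[0]] + "\n")


# Small demo in a temporary directory (file names and contents are illustrative)
path = tempfile.mkdtemp()
results_path = os.path.join(path, "results.csv")
oocytes_path = os.path.join(path, "oocytes.txt")
out_path = os.path.join(path, "results_with_oocytes.csv")
with open(results_path, "w") as f:
    f.write("12319000 -64,7357668067227 -0,1111052148685535\n"
            "12319001 99,70264355742297 0,48329515727315125\n")
with open(oocytes_path, "w") as f:
    f.write("12319000.tif mutant 1\n12319001.tif mutant 2\n")

combine(results_path, oocytes_path, out_path)
with open(out_path) as f:
    combined_text = f.read()
print(combined_text, end="")
```

The `with` blocks close all three files even if an error occurs, which the bare `open()` calls in the question do not guarantee.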

User

#3 · edited 2024-04-27 08:27:44

Well, the simple part first: you are printing the newline at the end of line, and you can strip it with line[0:-1].

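Next, "lines[0:7]" tests only the first 7 characters of the line, and you want to test 8. That is why the same value of "line" is printed with 3 different mutant values.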
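
The difference between the two slices can be checked on a single results line. This is a small sketch; the three oocyte lines, their `.tif` extension, and the third ID `12319002` are assumptions chosen to mirror the three mutant values in the question's output.

```python
result_line = "12319000 99,4952380952381 0,3011778623990699"

# Hypothetical oocyte lines whose IDs differ only in the 8th character
oocyte_lines = [
    "12319000.tif mutant 1",
    "12319001.tif mutant 2",
    "12319002.tif mutant 3",
]

# 7 characters: "1231900" occurs in the results line for every oocyte line
seven = [o for o in oocyte_lines if o[0:7] in result_line]

# 8 characters: only the exact ID "12319000" occurs
eight = [o for o in oocyte_lines if o[0:8] in result_line]

print(len(seven), len(eight))
```

Note that `in` is a substring test, so it would also match an ID that happens to appear later in the line; comparing prefixes with `==` or using `str.startswith` is stricter.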

Finally, you need to close and reopen oocytes for each line of results. If you don't, output stops after the first line of results.

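Actually, the other answer is better: don't open and close oocytes for every line of results. Open it and read it (into a list) once, then iterate over that list for each line of results.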
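
A sketch of that list-based variant follows, assuming Python 3, `io.StringIO` stand-ins for the two open files, and oocyte lines with an 8-character ID followed by a 4-character extension (the `.tif` is a guess). It keeps the question's nested loop but reads oocytes only once.

```python
import io

# Hypothetical stand-ins for the two open files
oocytes = io.StringIO("12319000.tif mutant 1\n12319001.tif mutant 2\n")
results = io.StringIO(
    "12319000 -94,69642857142858 -0,15117839559513543\n"
    "12319001 99,70264355742297 0,48329515727315125\n"
)

# Read the oocytes once; a list can be iterated as often as needed
oocyte_lines = oocytes.read().splitlines()

combined = []
for line in results:
    line = line.rstrip("\n")              # drop the trailing newline
    for oocyte in oocyte_lines:
        if line[0:8] == oocyte[0:8]:      # compare the full 8-character ID
            combined.append(line + oocyte[12:])

for row in combined:
    print(row)
```

The nested loop does one comparison per pair of lines, so for large files the dictionary lookup from the second answer scales better.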
