Matching files with Python

Posted 2024-06-16 14:14:31


I have two text files.

file1.txt

ACVR2B

ASCL2

.

.

.

.

file2.txt

>
ACVR2B:chr3:38309062 [-1000..50](+) [human, Homo sapiens]
cacttggttgggtttaaatgaccactccccgccctagctgtgcctttgag
tgtgtggcaagggattgcagacgggagactgcttgtcaattcagggaagt
cagcctctttctgccacttaattcgcccatcagtagagatccgactttcc
cacggttcactgtcacccccattgcacaggtggggaatccaaggcacaga
ggcgtctgggccagagtcccgggactatccactcattccggggttgtagg
gcagcatgtgtcaggagcttgggctcgagcgtgcggggcactaattgcga
gtgcagtggccacaggctcccgggcaaagtggtcaggagcccattcccgc
cacctccgtccgcgcgccccgccctgtcctccagcaagttggtgcacgcg
cgtcctcccacagggcgcgggatgggggcggggcctccgtgttgttgttt
ctctggggccccgcccctcaggggggcgcggaccacgcggccggagggcc
cgcctcccctccccctccccgcccagcccgcccgcctctttgtatccaac
atccgggtttccccgctggctcggctccgtggccgccgctcaggagccat
tttggactcggttcagctcccctcccccccacccctccccccgttcatgg
cccctccggactcggcccctgcgcccggggcccgggcccagccccgccgc
gctatgcctgagtcgggcgcgccccggcccgtgccccgccgccgcccccc
ggcccccgcgtcgccccggagcccgggccgcagcctgcgccgcccgcagc
ggccctgagcccggccccgccgaccggcccttggagcccgaacgctgctc

>
ASCL2:chr11:2250253 [-1000..50](-) [human, Homo sapiens]
ggccttacagaatgtgatcgcgcgagggggagggcgaagcgtggcgggag
ggcgaggcgaaggaaggagggcgtgagaaaggcgacggcggcggcgcgga
ggagggttatctatacatttaaaaaccagccgcctgcgccgcgcctgcgg
agacctgggagagtccggccgcacgcgcgggacacgagcgtcccacgctc
cctggcgcgtacggcctgccaccactaggcctcctatccccgggctccag
acgacctaggacgcgtgccctggggagttgcctggcggcgccgtgccaga
agcccccttggggcgccacagttttccccgtcgcctccggttcctctgcc
tgcaccttcctgcggcgcgccgggacctggagcgggcgggtggatgcagg
cgcgatggacggcggcacactgcccaggtccgcgccccctgcgccccccg
tccctgtcggctgcgctgcccggcggagacccgcgtccccggaactgttg
cgctgcagccggcggcggcgaccggccaccgcagagaccggaggcggcgc
agcggccgtagcgcggcgcaatgagcgcgagcgcaaccgcgtgaagctgg
tgaacttgggcttccaggcgctgcggcagcacgtgccgcacggcggcgcc
agcaagaagctgagcaaggtggagacgctgcgctcagccgtggagtacat
ccgcgcgctgcagcgcctgctggccgagcacgacgccgtgcgcaacgcgc
tggcgggagggctgaggccgcaggccgtgcggccgtctgcgccccgcggg
ccgccagggaccaccccggtcgccgcctcgccctcccgcgcttcttcgtc
cccgggccgcgggggcagctcggagcccggctccccgcgttccgcctact
cgtcggacgacagcggctgcgaaggcgcgctgagtcctgcggagcgcgag
ctactcgacttctccagctggttagggggctactgagcgccctcgaccta
tgaggtaacagccgggaggcagggaggagggagggccgggggccggggtg
ggggacgaaggcgcaggaagcgcgcagggaacgagaccgaaggaaggagc
gggaaggagagcgcagccgccgcctggccctgcgcgccccgggagcgccg
tgcggccctgcccgcgggctccgggtgtgcgcggggcggcgccgcggaac
atgacggcgccctgggtggccctcgccctcctctggggatcgctgtgcgc
.
.
.
.
.

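Now my task is to write a Python script that matches file1 against file2. If file2 contains a name from file1, the script should print all of the data for that name from file2.

For example, if ASCL2 from file1 is in file2, it should give the following result: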

>
ASCL2:chr11:2250253 [-1000..50](-) [human, Homo sapiens]
ggccttacagaatgtgatcgcgcgagggggagggcgaagcgtggcgggag
ggcgaggcgaaggaaggagggcgtgagaaaggcgacggcggcggcgcgga
ggagggttatctatacatttaaaaaccagccgcctgcgccgcgcctgcgg
agacctgggagagtccggccgcacgcgcgggacacgagcgtcccacgctc
cctggcgcgtacggcctgccaccactaggcctcctatccccgggctccag
acgacctaggacgcgtgccctggggagttgcctggcggcgccgtgccaga
agcccccttggggcgccacagttttccccgtcgcctccggttcctctgcc
tgcaccttcctgcggcgcgccgggacctggagcgggcgggtggatgcagg
cgcgatggacggcggcacactgcccaggtccgcgccccctgcgccccccg
tccctgtcggctgcgctgcccggcggagacccgcgtccccggaactgttg
cgctgcagccggcggcggcgaccggccaccgcagagaccggaggcggcgc
agcggccgtagcgcggcgcaatgagcgcgagcgcaaccgcgtgaagctgg
tgaacttgggcttccaggcgctgcggcagcacgtgccgcacggcggcgcc
agcaagaagctgagcaaggtggagacgctgcgctcagccgtggagtacat
ccgcgcgctgcagcgcctgctggccgagcacgacgccgtgcgcaacgcgc
tggcgggagggctgaggccgcaggccgtgcggccgtctgcgccccgcggg
ccgccagggaccaccccggtcgccgcctcgccctcccgcgcttcttcgtc
cccgggccgcgggggcagctcggagcccggctccccgcgttccgcctact
cgtcggacgacagcggctgcgaaggcgcgctgagtcctgcggagcgcgag
ctactcgacttctccagctggttagggggctactgagcgccctcgaccta
tgaggtaacagccgggaggcagggaggagggagggccgggggccggggtg
ggggacgaaggcgcaggaagcgcgcagggaacgagaccgaaggaaggagc
gggaaggagagcgcagccgccgcctggccctgcgcgccccgggagcgccg
tgcggccctgcccgcgggctccgggtgtgcgcggggcggcgccgcggaac
atgacggcgccctgggtggccctcgccctcctctggggatcgctgtgcgc

..

Can anyone help me?

Thanks in advance.


1 answer
User
#1 · Posted 2024-06-16 14:14:31

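First, open file1 and read its contents into a set:

with open('file1.txt', 'r') as f:
    # split() with no argument splits on whitespace, so blank lines are ignored
    search_set = set(f.read().split())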

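Then extract the first part of file2's header and test it for membership in the set. Note that this checks only the first record in the file:

with open('file2.txt', 'r') as f:
    # Read the first header line; if '>' sits on its own line,
    # the header text is on the following line
    test_string = f.readline()
    if test_string.strip() == '>':
        test_string += f.readline()

    # The name is the text before the first ':', without the leading '>'
    name = test_string.lstrip('>').split(':')[0].strip()

    if name and name in search_set:
        # There's a match: print the header and the rest of the file
        print(test_string + f.read())
    else:
        print('no match')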
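
The snippet above tests only the first record in file2 and, on a match, prints everything that follows it. To print just the matching records wherever they appear, one possible approach is to split file2 into records and test each header in turn. The following is a minimal sketch, not a definitive implementation. It assumes every record starts with a line beginning with '>', that the header text may sit on that line or on the next one, and that the name is the text before the first ':'. The helper names (load_names, iter_records, record_name, matching_records) and the OTHER record in the sample data are made up for illustration.

```python
import io


def load_names(handle):
    """Return the set of non-empty, whitespace-stripped lines."""
    return {line.strip() for line in handle if line.strip()}


def iter_records(handle):
    """Yield each record as a list of lines; a new record starts at a '>' line."""
    record = []
    for line in handle:
        if not line.strip():
            continue  # ignore blank lines between records
        if line.startswith('>') and record:
            yield record
            record = []
        record.append(line.rstrip('\n'))
    if record:
        yield record


def record_name(record):
    """Return the header text before the first ':', without the leading '>'."""
    header = record[0].lstrip('>').strip()
    if not header and len(record) > 1:
        header = record[1].strip()  # '>' was on its own line
    return header.split(':')[0].strip()


def matching_records(names, handle):
    """Return the text of every record whose name is in `names`."""
    return ['\n'.join(rec) for rec in iter_records(handle)
            if record_name(rec) in names]


# Small in-memory demo; the sequences are shortened and OTHER is hypothetical.
sample_names = io.StringIO("ACVR2B\nASCL2\n")
sample_records = io.StringIO(
    ">\nASCL2:chr11:2250253 [-1000..50](-) [human, Homo sapiens]\nggccttacag\n"
    "\n>\nOTHER:chr1:1\nacgtacgt\n"
)
for text in matching_records(load_names(sample_names), sample_records):
    print(text)
```

To run this on the real data, replace the two io.StringIO objects with open('file1.txt') and open('file2.txt'). Because the records are read one at a time, only the current record has to be held in memory.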
