Using Python to search each line of text for a specific ID number and add matching lines to a list

Posted 2024-04-28 06:45:59


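I am looking for a way to search lines of text for an ID. Once the ID is found, I want to add the whole line of text to a list.

This is what I have so far: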

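import subprocess
import re

# Download the data with curl and append it to a file on the Desktop
bash = ("curl************************>> ~/Desktop/output.txt")
output = subprocess.check_output([bash], shell=True, stderr=subprocess.STDOUT)
with open('/******/output.txt', 'r') as myfile:
    # Join everything into one string so a record can be matched across line breaks
    everything = myfile.read().replace('\n', '')
    # Capture the text between each {"EGG" marker and the next SHELL marker
    brokendownbyline = re.findall(r'{\"EGG\"(.*?)SHELL', str(everything))
for i in brokendownbyline:
    print(i)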

This code prints something like the following:

"addjaid fja fahf ioah fa hdfh ahf 1234 asl kjas kdjf l akdjf"
"alkgad fganf daohdg o aunf g aoh oahf 9876 asl kdfna lk jfds"
"kl asdjfk ajsdfja sfiha flka jlkd jfakjfda ijf 4567 asdkf"
"asdkjfnajs dhfuioahfj a bnfgiuabf 3456asdkl fafaadhaa"
"ajsdfjaod ifjoa isdjfoia jdfoia hdgo iaf4637ads jfajis"

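Again, every line contains an ID. I am only looking for one particular ID. Once that ID number is found, I want the line (or lines) of text containing it to be added to a list; everything else can be ignored.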


3 answers

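I'm not sure why you are using bash for this. You also haven't explained your exact goal very well, but as I understand it, for a given ID, say 34, you want to find every line that contains ID 34 and add the whole line to a list.

This is easy to do, for example: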

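import os

line_list = []

# expanduser resolves "~" on any platform; the HOMEPATH variable exists only on Windows
path = os.path.expanduser("~/Desktop/output.txt")
with open(path) as f:
    for line in f:
        # Plain substring test: "34" also matches inside longer numbers such as "1234"
        if "34" in line:
            line_list.append(line.rstrip("\n"))

for l in line_list:
    print(l)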
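
One thing to watch with the substring test above: "34" in line is also true for a line containing 1234 or 3456. If the ID should only count when it stands as a complete number, a regex with digit lookarounds is one option. This is a minimal sketch; the function name lines_with_id and the sample lines are made up for illustration and are not from the question.

```python
import re

def lines_with_id(lines, target_id):
    """Return the lines in which target_id appears as a complete run of digits."""
    # (?<!\d) and (?!\d) reject a match that is part of a longer number,
    # so "34" does not match inside "1234" or "3456".
    pattern = re.compile(r"(?<!\d)" + re.escape(target_id) + r"(?!\d)")
    return [line for line in lines if pattern.search(line)]

# Hypothetical sample lines, only to show the difference
sample = [
    "ahf 1234 asl kjas",
    "bnfgiuabf 3456asdkl",
    "ijf 34 asdkf",
]

print(lines_with_id(sample, "34"))
```

With these sample lines the plain substring test would keep all three, while the lookaround version keeps only the last one. An ID glued to letters, as in 3456asdkl, still matches when you search for 3456, because only neighbouring digits are excluded.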

The other solutions here suggest using the in keyword, which is fast, but you cannot pass a list to compare against a string:

["f","b"] in "foo boo"  # raises TypeError
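
For reference, the usual way to keep the in keyword while checking several IDs is to test them one at a time with any(). This is only a sketch; contains_any is a hypothetical helper name, and like any substring test it will also match an ID inside a longer number.

```python
def contains_any(text, ids):
    """Return True if at least one of the strings in ids is a substring of text."""
    # any() evaluates the generator lazily and stops at the first hit
    return any(i in text for i in ids)

try:
    ["f", "b"] in "foo boo"  # the left operand must be a str when the right one is
except TypeError as exc:
    print("TypeError:", exc)

print(contains_any("foo boo", ["f", "b"]))
```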

Instead, I would use a regex to pull the numbers out of each row and then use set intersection to compare them against the IDs:

import re

# update this list as needed
IDS = ["1234", "4567", "4637"]
rows = ["addjaid fja fahf ioah fa hdfh ahf 1234 asl kjas kdjf l akdjf",
        "alkgad fganf daohdg o aunf g aoh oahf 9876 asl kdfna lk jfds",
        "kl asdjfk ajsdfja sfiha flka jlkd jfakjfda ijf 4567 asdkf",
        "asdkjfnajs dhfuioahfj a bnfgiuabf 3456asdkl fafaadhaa",
        "ajsdfjaod ifjoa isdjfoia jdfoia hdgo iaf4637ads jfajis"]

for row in rows:
    # re.findall returns every run of digits in the row; & is set intersection
    if set(IDS) & set(re.findall(r"[0-9]+", row)):
        print(row)

This prints only the matching rows:

addjaid fja fahf ioah fa hdfh ahf 1234 asl kjas kdjf l akdjf
kl asdjfk ajsdfja sfiha flka jlkd jfakjfda ijf 4567 asdkf
ajsdfjaod ifjoa isdjfoia jdfoia hdgo iaf4637ads jfajis
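
Since the question asks for the matching lines in a list rather than printed, the same idea can be wrapped in a function. This is a sketch under that assumption; the name rows_matching_ids is hypothetical, and the rows are the sample output from the question.

```python
import re

def rows_matching_ids(rows, ids):
    """Return every row that contains at least one of the IDs as a whole run of digits."""
    id_set = set(ids)  # built once instead of once per row
    matched = []
    for row in rows:
        numbers = set(re.findall(r"[0-9]+", row))  # every run of digits in the row
        if id_set & numbers:  # a non-empty intersection means an ID is present
            matched.append(row)
    return matched

rows = ["addjaid fja fahf ioah fa hdfh ahf 1234 asl kjas kdjf l akdjf",
        "alkgad fganf daohdg o aunf g aoh oahf 9876 asl kdfna lk jfds",
        "kl asdjfk ajsdfja sfiha flka jlkd jfakjfda ijf 4567 asdkf",
        "asdkjfnajs dhfuioahfj a bnfgiuabf 3456asdkl fafaadhaa",
        "ajsdfjaod ifjoa isdjfoia jdfoia hdgo iaf4637ads jfajis"]

matches = rows_matching_ids(rows, ["1234", "4567", "4637"])
print(len(matches))
```

Because whole digit runs are compared, a short ID such as 34 does not match a row that only contains 1234 or 3456.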

import subprocess
import re

listed = []
# Download the data with curl and append it to a file on the Desktop
bash = ("curl************************>> ~/Desktop/output.txt")
output = subprocess.check_output([bash], shell=True, stderr=subprocess.STDOUT)
with open('/******/output.txt', 'r') as myfile:
    everything = myfile.read()
    # Capture the text between each {"EGG" marker and the next SHELL marker
    brokendownbyline = re.findall(r'{\"EGG\"(.*?)SHELL', str(everything))
    for i in brokendownbyline:
        # Keep only the records that contain the ID being searched for
        if "1234564789" in i:
            listed.append(i)
for j in listed:
    print(j)

This is the final solution that worked for me.
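
One caveat with the regex in this solution: by default . does not match a newline, so a record that runs over more than one line would be skipped now that the newlines are no longer stripped. If that can happen in your data, re.DOTALL is one way to handle it. The sketch below uses a made-up sample string and a hypothetical function name, records_with_id; the real curl output may look different.

```python
import re

# re.DOTALL lets "." match newlines, so a record may span several lines
RECORD = re.compile(r'\{"EGG"(.*?)SHELL', re.DOTALL)

def records_with_id(text, target_id):
    """Return the EGG...SHELL records whose text contains target_id."""
    return [rec for rec in RECORD.findall(text) if target_id in rec]

# Hypothetical input: the first record is split across two lines
sample = '{"EGG" first record\n1234564789 SHELL{"EGG" second record 42 SHELL'

found = records_with_id(sample, "1234564789")
print(found)
```

Without the re.DOTALL flag the same pattern would find only the second record in this sample, and the search for 1234564789 would come back empty.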
