Remove quotes enclosing two words and remove the comma between them

Posted 2024-04-28 12:15:34

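Follow-up to "Python to replace a symbol between between 2 words in a quote"

Extended input and expected output:

I am trying to replace the comma between the two words Durango and PC in the second line with & and then remove the quotes. The same goes for Orbis and PC in the third line. The fourth line has two quoted word combinations that I want to process: "AAA - Character Tech, SOF - UPIs" and "Durango, Orbis, PC".

I want to keep the remaining lines as they are, using Python.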

Input

2,SIN-Rendering,Core Tech - Rendering,PC,147,Reopened
2,Kenny Chong,Core Tech - Rendering,"Durango, PC",55,Reopened
3,SIN-Audio,AAA - Audio,"Orbis, PC",13,Open
LTY-168499,[PC][PS4][XB1] Missing textures from Fort Capture NPC face,3,CTU-CharacterTechBacklog,"AAA - Character Tech, SOF - UPIs","Durango, Orbis, PC",29,Waiting For
...
... 
...

There can be 100 lines like this in my sample. The expected output is:

2,SIN-Rendering,Core Tech - Rendering,PC,147,Reopened
2,Kenny Chong,Core Tech - Rendering, Durango & PC,55,Reopened
3,SIN-Audio,AAA - Audio, Orbis & PC,13,Open
LTY-168499,[PC][PS4][XB1] Missing textures from Fort Capture NPC face,3,CTU-CharacterTechBacklog,AAA - Character Tech & SOF - UPIs,Durango, Orbis & PC,29,Waiting For
...
...
...

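So far, I can think of reading the file line by line and, if a line contains quotes, replacing them with nothing. Replacing the symbol inside the quotes is where I am stuck.

Here is what I have so far: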

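for line in lines:
    # Find the quoted fields on this line
    expr2 = re.findall('"(.*?)"', line)
    if len(expr2) != 0:
        # Splitting on the quote character puts the first quoted field at index 1
        expr3 = re.split('"', line)
        # Only expr3[1] is rewritten; everything after expr3[2] is dropped
        expr4 = expr3[0] + expr3[1].replace(",", " &") + expr3[2]
        print >>k, expr4
    else:
        print >>k, line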

But this does not handle the fourth-line case. There can also be more than three combinations, for example:

3,SIN-Audio,"AAA - Audio, xxxx, yyyy","Orbis, PC","13, 22",Open 

I want to turn this into: 3,SIN-Audio,AAA - Audio & xxxx & yyyy, Orbis & PC, 13 & 22,Open

How can I do this? Any suggestions? I am learning Python.


2 Answers

So, by treating the input file as a .csv, we can easily turn each line into something that is easy to work with.

For example,

2,Kenny Chong,Core Tech - Rendering,"Durango, PC",55,Reopened

is read as:

['2', 'Kenny Chong', 'Core Tech - Rendering', 'Durango, PC', '55', 'Reopened']

Then, replacing every "," with " &" (a space followed by &) inside each field gives the following:

['2', 'Kenny Chong', 'Core Tech - Rendering', 'Durango & PC', '55', 'Reopened']

This replaces multiple commas within a single field, and when the line is finally written out, the original double quotes are gone.
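The parse-then-replace steps above can be tried on a single in-memory line before touching any files. This is only a minimal sketch: the helper name convert_line is hypothetical (it is not part of the answer), and it assumes each line is a well-formed CSV record.

```python
import csv


def convert_line(line):
    """Parse one CSV line and replace commas inside each field with ' &'.

    convert_line is a hypothetical helper used only for illustration.
    """
    # csv.reader accepts any iterable of strings, so a one-element list works.
    fields = next(csv.reader([line]))
    # Joining with ',' writes the fields back without the original quotes.
    return ','.join(field.replace(',', ' &') for field in fields)


print(convert_line('2,Kenny Chong,Core Tech - Rendering,"Durango, PC",55,Reopened'))
print(convert_line('3,SIN-Audio,"AAA - Audio, xxxx, yyyy","Orbis, PC","13, 22",Open'))
```

The second call uses the line with three quoted fields from the question, which shows that every quoted field on a line is handled, not just the first one.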

Here is the code, assuming in.txt is your input file; it writes the result to out.txt.

import csv

with open('in.txt') as infile:
    reader = csv.reader(infile)

    with open('out.txt', 'w') as outfile:
        for line in reader:
            line = list(map(lambda s: s.replace(',', ' &'), line))
            outfile.write(','.join(line) + '\n')

The output for the fourth line is:

LTY-168499,[PC][PS4][XB1] Missing textures from Fort Capture NPC face,3,CTU-CharacterTechBacklog,AAA - Character Tech & SOF - UPIs,Durango & Orbis & PC,29,Waiting For
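As a possible variation on the same idea, the rows could be written back with csv.writer instead of a manual join, so that any field that still needs quoting is quoted correctly. The sketch below is an assumption-laden illustration rather than the answerer's code: the function name convert and the sample data are made up, io.StringIO stands in for real files, and the regular expression additionally normalizes whitespace around each comma so that "13,22" and "13, 22" both become "13 & 22".

```python
import csv
import io
import re


def convert(infile, outfile):
    """Copy CSV rows from infile to outfile, turning in-field commas into ' & '.

    convert is a hypothetical name; infile and outfile are any text file objects.
    """
    reader = csv.reader(infile)
    # lineterminator='\n' avoids the writer's default '\r\n' line endings.
    writer = csv.writer(outfile, lineterminator='\n')
    for row in reader:
        # Collapse whitespace around each comma so "a,b" and "a, b" both give "a & b".
        writer.writerow([re.sub(r'\s*,\s*', ' & ', field) for field in row])


# Illustrative data only; in-memory buffers stand in for in.txt and out.txt.
source = io.StringIO(
    '2,SIN-Rendering,Core Tech - Rendering,PC,147,Reopened\n'
    '3,SIN-Audio,"AAA - Audio, xxxx, yyyy","Orbis, PC","13,22",Open\n'
)
target = io.StringIO()
convert(source, target)
print(target.getvalue())
```

With real files, the same function could be called with the objects returned by open('in.txt', newline='') and open('out.txt', 'w', newline='').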

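Please check this. I could not find a single expression that does it, so my approach is a little more complicated. I will update the answer if I find a better way (Python 3).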

import re
st = "3,SIN-Audio,\"AAA - Audio, xxxx, yyyy\",\"Orbis, PC\",\"13, 22\",Open"
found = re.findall(r'\"(.*)\"',st)[0].split("\",\"")
final = ""
for word in found:
    final = final + (" &").join(word.split(","))+","
result = re.sub(r'\"(.*)\"',final[:-1],st)
print(result)
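Note that the greedy pattern \"(.*)\" above matches from the first quote on the line to the last one, so that code relies on all quoted fields sitting next to each other. One possible alternative, sketched here under the assumption that fields never contain escaped quotes, is to let re.sub call a function for each quoted field separately. The name replace_in_quotes is hypothetical.

```python
import re


def replace_in_quotes(line):
    """Strip double quotes and turn commas inside each quoted field into ' &'.

    Hypothetical helper; assumes no escaped quotes ("") inside a field.
    """
    # "([^"]*)" matches one quoted field at a time, so each is handled on its own.
    return re.sub(r'"([^"]*)"', lambda m: m.group(1).replace(',', ' &'), line)


line = ('LTY-168499,[PC][PS4][XB1] Missing textures from Fort Capture NPC face,'
        '3,CTU-CharacterTechBacklog,"AAA - Character Tech, SOF - UPIs",'
        '"Durango, Orbis, PC",29,Waiting For')
print(replace_in_quotes(line))
```

Because each match is processed independently, unquoted fields may appear between quoted ones, and lines without any quotes pass through unchanged.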
