Inserting quotes into a string using indexes in Python

Posted 2024-06-09 14:55:24


I want to insert quotes ("") around the date and the text in each line of a file, input.txt. This is my input file:

created_at : October 9, article :   ISTANBUL — Turkey is playing a risky game of chicken in its negotiations with NATO partners who want it to join combat operations against the Islamic State group — and it’s blowing back with violence in Turkish cities. As the Islamic militants rampage through Kurdish-held Syrian territory on Turkey’s border, Turkey says it won’t join the fight unless the U.S.-led coalition also goes after the government of Syrian President Bashar Assad.
created_at : October 9, article :    President Obama chairs a special meeting of the U.N. Security Council last month. (Timothy A. Clary/AFP/Getty Images)  When it comes to President Obama’s domestic agenda and his maneuvers to (try to) get things done, I get it. I understand what he’s up to, what he’s trying to accomplish, his ultimate endgame. But when it comes to his foreign policy, I have to admit to sometimes thinking “whut?” and agreeing with my colleague Ed Rogers’s assessment on the spate of books criticizing Obama’s foreign policy stewardship.

I want to put quotes around the date and the text, like this:

created_at : "October 9", article :   "ISTANBUL — Turkey is playing a risky game of chicken in its negotiations with NATO partners who want it to join combat operations against the Islamic State group — and it’s blowing back with violence in Turkish cities. As the Islamic militants rampage through Kurdish-held Syrian territory on Turkey’s border, Turkey says it won’t join the fight unless the U.S.-led coalition also goes after the government of Syrian President Bashar Assad".
created_at : "October 9", article :    "President Obama chairs a special meeting of the U.N. Security Council last month. (Timothy A. Clary/AFP/Getty Images)  When it comes to President Obama’s domestic agenda and his maneuvers to (try to) get things done, I get it. I understand what he’s up to, what he’s trying to accomplish, his ultimate endgame. But when it comes to his foreign policy, I have to admit to sometimes thinking “whut?” and agreeing with my colleague Ed Rogers’s assessment on the spate of books criticizing Obama’s foreign policy stewardship".

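Below is my code. It finds the index of the comma (the one after the date) and the index of "article"; I want to use those indexes to insert quotes around the date. I also want to insert quotes around the text, but how do I do that?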

with open("input.txt", "r") as f:
    for line in f:
        article_pos = line.find("article")
        print(article_pos)
        comma_pos = line.find(",")
        print(comma_pos)

2 answers

You can also take a look at the regex library, re. For example:

>>> import re
>>> print(re.sub(r'created_at:\s(.*), article:\s(.*)',
...              r'created_at: "\1", article: "\2"',
...              'created_at: October 9, article: ...'))
created_at: "October 9", article: "..."

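The first argument to re.sub is the pattern you are trying to match. The parentheses () capture parts of the match, which you can refer to in the second argument as \1, \2 and so on. The third argument is the line of text.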
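The example pattern above expects no space before each colon, while the lines in the question use " : ". A minimal sketch adapted to that layout follows; the helper name quote_fields and the short sample line are made up for illustration, and it assumes every line has the form "created_at : <date>, article : <text>".

```python
import re

# Assumption: each line looks like "created_at : <date>, article : <text>",
# with any amount of whitespace around the colons and after the comma.
PATTERN = re.compile(r'created_at\s*:\s*(.*?),\s*article\s*:\s*(.*)')

def quote_fields(line):
    # \1 is the date, \2 is the article text.
    # The replacement normalizes the spacing around the separators.
    return PATTERN.sub(r'created_at : "\1", article : "\2"', line)

# Made-up short line in the same layout as input.txt.
sample = 'created_at : October 9, article :   Some text'
print(quote_fields(sample))
```

Because (.*) does not match a newline, a trailing newline on a line read from a file stays outside the closing quote.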

While you can do this with low-level operations like find and slicing, that is neither the simple nor the idiomatic way.

First, I'll show you how to do it your way:

comma_pos = line.find(", ")
first_colon_pos = line.find(" : ")
second_colon_pos = line.find(" : ", comma_pos)
line = (line[:first_colon_pos+3] + 
        '"' + line[first_colon_pos+3:comma_pos] + '"' +
        line[comma_pos:second_colon_pos+3] +
        '"' + line[second_colon_pos+3:] + '"')

But it is easier to split the line into pieces, munge the pieces, and join them back together:

dateline, article = line.split(', ', 1)
key, value = dateline.split(' : ')
dateline = '{} : "{}"'.format(key, value)
key, value = article.split(' : ', 1)
article = '{} : "{}"'.format(key, value)
line = '{}, {}'.format(dateline, article)

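Then you can refactor the repeated part into a simple function, so you don't have to write the same thing twice (which could be useful if you later need to write it four times).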
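One way that refactoring could look is sketched below. The names quote_value and quote_line are hypothetical, and the sketch assumes two things the original code does not do: it strips whitespace around each value, and it drops the trailing newline before quoting.

```python
def quote_value(field):
    # Split on the first ' : ' only, so a ' : ' inside the text is left alone.
    key, value = field.split(' : ', 1)
    # Assumption: surrounding whitespace should not end up inside the quotes.
    return '{} : "{}"'.format(key, value.strip())

def quote_line(line):
    # The first ', ' separates the date field from the article field.
    fields = line.rstrip('\n').split(', ', 1)
    return ', '.join(quote_value(field) for field in fields)

# Made-up short line in the same layout as input.txt.
print(quote_line('created_at : October 9, article :   Some text\n'))
```

Keeping the per-field logic in one function means a line with more fields would only need a different split in quote_line.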

It is even easier with a regular expression, although that may not be as easy for a novice to understand:

import re

line = re.sub(r'(.*?:\s*)(.*?)(\s*,.*?:\s*)(.*)', r'\1"\2"\3"\4"', line)

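It works by capturing everything up to the first : (and any whitespace after it) in one group, then everything from there to the first comma in a second group, and so on: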

(.*?:\s*)(.*?)(\s*,.*?:\s*)(.*)
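To see what each group captures, the pattern can be run against a short line and the groups printed one by one. This is only an illustrative sketch; the sample line is made up to match the layout of input.txt.

```python
import re

pattern = re.compile(r'(.*?:\s*)(.*?)(\s*,.*?:\s*)(.*)')

# Made-up short line in the same layout as input.txt.
sample = 'created_at : October 9, article :   Some text'

match = pattern.match(sample)
# Print each captured group with repr() so the whitespace is visible.
for number, text in enumerate(match.groups(), start=1):
    print(number, repr(text))
```

Groups 1 and 3 hold the keys and separators, which the replacement copies unchanged; groups 2 and 4 hold the date and the text, which get wrapped in quotes.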


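Note that the advantage of the regex is that I can very simply say "any whitespace after it". With find or split, I had to specify explicitly that there is exactly one space on each side of the colon and exactly one space after the comma, because without a way to write something like \s*, searching for "zero or more spaces" is much harder.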
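The difference shows up as soon as the spacing changes. The sketch below runs a split-based version and the regex version on a made-up line with no spaces around the separators; the function names are hypothetical.

```python
import re

def quote_with_split(line):
    # Requires exactly ', ' between fields and ' : ' inside each field.
    dateline, article = line.split(', ', 1)
    parts = []
    for field in (dateline, article):
        key, value = field.split(' : ', 1)
        parts.append('{} : "{}"'.format(key, value))
    return ', '.join(parts)

def quote_with_regex(line):
    # \s* tolerates zero or more whitespace characters around the separators.
    return re.sub(r'(.*?:\s*)(.*?)(\s*,.*?:\s*)(.*)', r'\1"\2"\3"\4"', line)

# Made-up line without any spaces around the colon or after the comma.
tight = 'created_at:October 9,article:Some text'

print(quote_with_regex(tight))
try:
    print(quote_with_split(tight))
except ValueError as error:
    # split() returns a single piece, so the tuple unpacking fails.
    print('split-based version failed:', error)
```

The regex version still quotes both values, while the split-based version raises ValueError because the expected separators are not found.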
