I want to insert quotation marks ("") around the date and the text in each string (the strings are in the file input.txt). Here is my input file:
created_at : October 9, article : ISTANBUL — Turkey is playing a risky game of chicken in its negotiations with NATO partners who want it to join combat operations against the Islamic State group — and it’s blowing back with violence in Turkish cities. As the Islamic militants rampage through Kurdish-held Syrian territory on Turkey’s border, Turkey says it won’t join the fight unless the U.S.-led coalition also goes after the government of Syrian President Bashar Assad.
created_at : October 9, article : President Obama chairs a special meeting of the U.N. Security Council last month. (Timothy A. Clary/AFP/Getty Images) When it comes to President Obama’s domestic agenda and his maneuvers to (try to) get things done, I get it. I understand what he’s up to, what he’s trying to accomplish, his ultimate endgame. But when it comes to his foreign policy, I have to admit to sometimes thinking “whut?” and agreeing with my colleague Ed Rogers’s assessment on the spate of books criticizing Obama’s foreign policy stewardship.
I want to put quotation marks around the date and the text, like this:
created_at : "October 9", article : "ISTANBUL — Turkey is playing a risky game of chicken in its negotiations with NATO partners who want it to join combat operations against the Islamic State group — and it’s blowing back with violence in Turkish cities. As the Islamic militants rampage through Kurdish-held Syrian territory on Turkey’s border, Turkey says it won’t join the fight unless the U.S.-led coalition also goes after the government of Syrian President Bashar Assad".
created_at : "October 9", article : "President Obama chairs a special meeting of the U.N. Security Council last month. (Timothy A. Clary/AFP/Getty Images) When it comes to President Obama’s domestic agenda and his maneuvers to (try to) get things done, I get it. I understand what he’s up to, what he’s trying to accomplish, his ultimate endgame. But when it comes to his foreign policy, I have to admit to sometimes thinking “whut?” and agreeing with my colleague Ed Rogers’s assessment on the spate of books criticizing Obama’s foreign policy stewardship".
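Below is my code. It finds the index of the comma (the , after the date) and the index of article; using those, I want to insert quotation marks around the date. I also want to insert quotation marks around the text, but how can I do that?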
    with open("input.txt", "r") as f:
        for line in f:
            # index of the "article" key
            article_pos = line.find("article")
            print(article_pos)
            # index of the first comma (the one after the date)
            comma_pos = line.find(",")
            print(comma_pos)
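You could also look at the regex library, re. For example, the first argument to re.sub is the pattern you are trying to match. Parentheses () capture a match, which can then be referenced as \1 in the second argument. The third argument is the line of text.

While you can do this with low-level operations like find and slicing, that is neither the simplest nor the most idiomatic way. First, I'll show you how to do it your way: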
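A minimal sketch of the find-and-slice approach, assuming every line has the shape `created_at : DATE, article : TEXT` with exactly one space on each side of each colon. The function name `quote_with_find` and the sample line are made up for illustration.

```python
def quote_with_find(line):
    """Quote the date and the article text using only str.find and slicing."""
    line = line.rstrip("\n")
    # The date starts right after the first " : " and ends at the first comma.
    date_start = line.find(" : ") + len(" : ")
    comma_pos = line.find(",", date_start)
    # The article text starts right after the " : " that follows "article".
    article_pos = line.find("article", comma_pos)
    text_start = line.find(" : ", article_pos) + len(" : ")
    return (line[:date_start] + '"' + line[date_start:comma_pos] + '"'
            + line[comma_pos:text_start] + '"' + line[text_start:] + '"')


sample = "created_at : October 9, article : ISTANBUL. Turkey says no."
print(quote_with_find(sample))
```

This sketch quotes the whole article text, trailing period included; moving the period outside the closing quote, as in the desired output, would take one more step.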
But it is easier to split the line into pieces, munge the pieces, and join them back together:
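One way the split-and-join idea could look, again assuming the separators are exactly `", "` and `" : "` and that the date itself contains no comma. The name `quote_with_split` is hypothetical.

```python
def quote_with_split(line):
    """Quote the date and the article text by splitting and rejoining."""
    line = line.rstrip("\n")
    # Split at the first ", " only: the article text may contain more commas.
    date_part, text_part = line.split(", ", 1)
    # Split each part at its first " : " into a key and a value.
    date_key, date_value = date_part.split(" : ", 1)
    text_key, text_value = text_part.split(" : ", 1)
    # Munge the pieces, writing out the same quoting step twice.
    date_part = date_key + ' : "' + date_value + '"'
    text_part = text_key + ' : "' + text_value + '"'
    return ", ".join([date_part, text_part])


print(quote_with_split("created_at : October 9, article : I get it, mostly."))
```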
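Then you can refactor the repeated part into a simple function, so you don't have to write the same thing twice (which becomes especially useful if you later need to write it four times).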
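The refactoring might look something like the sketch below. The helper names `quote_value` and `quote_line` are invented for this example, and the same separator assumptions apply (exactly `", "` and `" : "`).

```python
def quote_value(part):
    """Turn 'key : value' into 'key : "value"' (hypothetical helper)."""
    key, value = part.split(" : ", 1)
    return '{} : "{}"'.format(key, value)


def quote_line(line):
    """Quote the value of each 'key : value' part of one input line."""
    # Split at the first ", " only, so commas inside the article survive.
    parts = line.rstrip("\n").split(", ", 1)
    return ", ".join(quote_value(part) for part in parts)


print(quote_line("created_at : October 9, article : I get it, mostly."))
```

With the quoting step in one place, handling a third or fourth field means changing only how the line is split.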
Using a regular expression is easier still, although it may be harder for a novice to understand:
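A sketch of one such regular expression, using `re.sub` with four capture groups and backreferences. The exact pattern, along with the names `PATTERN` and `quote_with_regex`, is an assumption made for illustration.

```python
import re

# Group 1: everything up to the first ":" plus any whitespace after it.
# Group 2: the date, up to the first comma.
# Group 3: the comma, the next key, its ":" and any whitespace after it.
# Group 4: the rest of the line (the article text).
PATTERN = re.compile(r'(.*?:\s*)(.*?)(\s*,.*?:\s*)(.*)')


def quote_with_regex(line):
    """Wrap groups 2 and 4 in double quotes, leaving groups 1 and 3 as-is."""
    return PATTERN.sub(r'\1"\2"\3"\4"', line.rstrip("\n"))


print(quote_with_regex("created_at : October 9, article : I get it, mostly."))
```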
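The regular expression works by capturing everything up to the first : (and any whitespace after it) in one group, then capturing everything from there to the first comma in a second group, and so on: Debuggex Demo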
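Note that the advantage of the regex is that I can say "any whitespace after it" very simply, whereas with find or split I had to specify explicitly that there is exactly one space on each side of the colon and exactly one space after the comma, because searching for "zero or more spaces" is much harder without a way to express something like \s*.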
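To make the whitespace point concrete, here is a small sketch comparing the two approaches on an irregularly spaced line. The pattern, the name `add_quotes`, and both sample strings are made up for this example.

```python
import re

# \s* matches zero or more whitespace characters around the separators.
PATTERN = re.compile(r'(.*?:\s*)(.*?)(\s*,.*?:\s*)(.*)')


def add_quotes(line):
    """Quote the date and the text, whatever the spacing around : and ,."""
    return PATTERN.sub(r'\1"\2"\3"\4"', line)


tidy = "created_at : October 9, article : Some text"
messy = "created_at:October 9,  article  :Some text"

print(add_quotes(tidy))
print(add_quotes(messy))
# The literal separator " : " never occurs in the messy line,
# so split returns the whole line as a single piece.
print(messy.split(" : "))
```

The regex copes with both lines unchanged, while the `split(" : ")` version would need extra code to handle each spacing variant.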