import re

# The with statement makes sure citi.txt is properly closed after it is opened.
# infl.read() returns the whole file as a single string, so there is no need to loop.
with open('citi.txt', 'r') as infl:
    hand = infl.read()

# Look for occurrences of quoted substrings.
match = re.findall('(?:"(.*?)")', hand)
if match:
    print(match)
For example, if the input is 'This is "a sample" line with "two quoted" substrings', this code prints ['a sample', 'two quoted'].
You can use
re.findall('(?:"(.*?)")', line)
to extract only the quoted text from a line instead of printing the whole line, even when a line contains more than one quoted substring. Your code can be modified as shown in the listing at the top of this answer.
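As a quick check of the extraction described above, here is a minimal sketch that runs the pattern on the sample line. The simplified pattern without the outer non-capturing group is an addition for illustration, not part of the original answer; it is included only to show that the wrapper does not change the result.

```python
import re

line = 'This is "a sample" line with "two quoted" substrings'

# Pattern from the answer: a non-capturing group wrapped around the quoted text.
original = re.findall('(?:"(.*?)")', line)

# Simplified variant (an assumption for illustration): the same lazy capture
# group without the non-capturing wrapper.
simplified = re.findall('"(.*?)"', line)

print(original)    # ['a sample', 'two quoted']
print(simplified)  # ['a sample', 'two quoted']
```

The lazy quantifier .*? matters here: a greedy .* would run from the first quotation mark to the last one on the line and return a single long match.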
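Edit: making it work with Unicode
Your quotation marks appear to be Unicode characters. Note the subtle difference between the curly quotes “ and ” and the straight quote " (I did not spot it at first either).
My original answer and your code example were based on ASCII strings, but you need a regular expression string written in terms of the Unicode quotation marks.
Explanation:
\u201d
denotes the right double quotation mark,
\u201c
denotes the left double quotation mark, and the
u
prefix marks the string as Unicode. With that change, the pattern works on the excerpt you provided.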
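The exact pattern from the edit is not preserved in this copy, so the following is only a sketch of what such a pattern could look like, based on the explanation above. The sample text is made up for illustration, and the code assumes Python 3, where every str is already Unicode; in Python 2 the pattern and the text would each need the u prefix.

```python
import re

# Hypothetical sample text using curly quotation marks (not the asker's excerpt).
text = 'He said \u201chello there\u201d and then \u201cgoodbye\u201d.'

# \u201c is the left double quotation mark, \u201d the right one.
# Assumed reconstruction: lazily capture everything between a left and a right mark.
pattern = '\u201c(.*?)\u201d'

matches = re.findall(pattern, text)
print(matches)  # ['hello there', 'goodbye']
```

If a file mixes straight and curly quotes, the two patterns can be run separately so that an opening mark of one kind is never paired with a closing mark of the other.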