Using regex in Python to get values from a multi-line string

import re
regex='<item> <obj1>grab1</obj1> <obj2>text<obj2> ... </item>'
pattern=re.compile(regex)
searchfile=open('data.dat')
filetext=searchfile.read()
text=re.findall(pattern,filetext)
print text

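2 Answers

User

#1 · Edited 2024-06-01 00:34:03

Multi-line strings are delimited by triple single quotes or triple double quotes. You do not need to add \n to represent a newline.

Your code becomes: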

import re

regex='''<item>
<obj1>grab1</obj1>
<obj2>text</obj2>
</item>'''
pattern = re.compile(regex)
# Read the whole file at once so the pattern can match across lines
with open('data.dat') as searchfile:
    filetext = searchfile.read()
text = pattern.findall(filetext)
print(text)

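That said, the third line of your regex may contain another error: the <obj2> element is never closed (<obj2>text<obj2> instead of <obj2>text</obj2>).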

Finally, if you want to parse an XML document, I would not recommend regular expressions. Instead, you may want to look at a library such as lxml.
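As a minimal sketch of the parser approach, the example below uses the standard library's xml.etree.ElementTree as a stand-in for lxml (lxml.etree exposes a largely compatible API). The inline SAMPLE string and the extract_items helper are assumptions made up for illustration, not part of the original question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input: two <item> elements, the second one indented.
SAMPLE = """<document>
<item>
<obj1>grab1</obj1>
<obj2>text</obj2>
</item>
<item>
  <obj1>grab1</obj1>
  <obj2>text</obj2>
</item>
</document>"""


def extract_items(xml_text):
    """Return one (obj1, obj2) text tuple per <item> element."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("obj1"), item.findtext("obj2"))
        for item in root.iter("item")
    ]


print(extract_items(SAMPLE))
```

Because the parser works on the element tree rather than on raw characters, whitespace between tags does not affect the result.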

Consider the following document, data.dat:

<document>
<item>
<obj1>grab1</obj1>
<obj2>text</obj2>
</item>
<otheritem></otheritem>
<item>
  <obj1>grab1</obj1>
  <obj2>text</obj2>
</item>
</document>

Running the Python code above gives: ['<item>\n<obj1>grab1</obj1>\n<obj2>text</obj2>\n</item>']

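Because of its indentation, the second <item> is ignored: the literal pattern only matches when the whitespace between tags is exactly a single newline.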
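If you do stay with a regular expression, one way to tolerate indentation is to allow arbitrary whitespace between tags. The following is only a sketch: the DATA string inlines the sample document, and the names ITEM_RE and grab_items are hypothetical. It also assumes each <item> contains exactly <obj1> followed by <obj2>.

```python
import re

# The sample document, inlined so the sketch is self-contained.
DATA = """<document>
<item>
<obj1>grab1</obj1>
<obj2>text</obj2>
</item>
<otheritem></otheritem>
<item>
  <obj1>grab1</obj1>
  <obj2>text</obj2>
</item>
</document>
"""

# \s* absorbs newlines and indentation between tags;
# each (.*?) group captures an element's text non-greedily.
ITEM_RE = re.compile(
    r"<item>\s*<obj1>(.*?)</obj1>\s*<obj2>(.*?)</obj2>\s*</item>"
)


def grab_items(text):
    """Return a list of (obj1, obj2) tuples, one per matching <item>."""
    return ITEM_RE.findall(text)


print(grab_items(DATA))
```

With two capture groups, findall returns tuples of the captured values rather than the whole matched text, so both items are reported regardless of indentation.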

User

#2 · Edited 2024-06-01 00:34:03

Try the following:

import re

regex = '''<item>
<obj1>grab1</obj1>
<obj2>text</obj2>
...
</item>'''

pattern = re.compile(regex)

with open('data.dat') as searchfile:
    filetext = searchfile.read()
    text = pattern.findall(filetext)
    print(text)
