Searching attributes with namesp in Beautiful Soup

<body:document-content>
  <style:style style:name="P1" style:family="paragraph" style:parent-style-name="Standard">
    <style:text-properties officeooo:paragraph-rsid="00118689"/>
  </style:style>
  <body:text>
    <text:sequence-decls>
      <text:sequence-decl text:display-outline-level="0" text:name="Illustration"/>
      <text:sequence-decl text:display-outline-level="0" text:name="Table"/>
    </text:sequence-decls>
    <text:p text:style-name="P1">This is example document</text:p>
    <text:p text:style-name="P1"/>
    <text:p text:style-name="P1">hello world</text:p>
    <text:p text:style-name="P1"/>
    <text:p text:style-name="P1"> <text:a xlink:type="simple" xlink:href="https://example.com">https://example.com</text:a> </text:p>
    <text:p text:style-name="P1"/>
    <text:p text:style-name="P1"/>
  </body:text>
</body:document-content>

1 answer

User

#1 · Posted on 2024-04-26 01:26:28

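This is an overly simple answer, but it isn't clear what you are trying to do or what variations your XML might have. If you don't need XPath for anything more complex, the XML in your example suggests you can simply search for the text:a element (the only element with {} attributes), assuming it really is the text:a "line" (element node) that you want to remove.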

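from bs4 import BeautifulSoup

with open('test.xml') as x:  # test.xml is the XML from your post
    # Name the parser explicitly; html.parser keeps prefixed tag names such as text:a as they are
    doc = BeautifulSoup(x, 'html.parser')

# print(doc.find_all('text:a'))  # check that it finds all text:a elements
for s in doc('text:a'):  # doc(...) is shorthand for doc.find_all(...)
    s.extract()  # detach each text:a element from the tree
print(doc)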
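
If you would rather match on the attribute than on the tag name, something like the following might work. This is only a sketch under a few assumptions: the XML is inlined as a trimmed-down copy of the sample from the question so the snippet runs without a file, `strip_linked_elements` is a hypothetical helper name, and `html.parser` is used because it treats `xlink:href` as an ordinary attribute name rather than resolving the namespace prefix.

```python
from bs4 import BeautifulSoup

# Trimmed-down copy of the XML from the question, inlined so the sketch runs on its own.
SAMPLE = (
    '<body:text>'
    '<text:p text:style-name="P1">hello world</text:p>'
    '<text:p text:style-name="P1">'
    '<text:a xlink:type="simple" xlink:href="https://example.com">https://example.com</text:a>'
    '</text:p>'
    '</body:text>'
)


def strip_linked_elements(markup):
    """Hypothetical helper: remove every element that has an xlink:href attribute.

    Returns the list of removed href values and the remaining markup as a string.
    """
    doc = BeautifulSoup(markup, 'html.parser')
    # attrs={name: True} matches any element that has the attribute, whatever its value.
    linked = doc.find_all(attrs={'xlink:href': True})
    hrefs = [el['xlink:href'] for el in linked]
    for el in linked:
        el.extract()  # detach the element from the tree
    return hrefs, str(doc)


hrefs, remaining = strip_linked_elements(SAMPLE)
print(hrefs)
print(remaining)
```

Matching on the attribute means it keeps working if other elements besides text:a ever carry a link, which may or may not be what you want.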
