BeautifulSoup navigation ignores the specified path

Posted 2024-04-23 14:28:06


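My BeautifulSoup parser seems to ignore the path of the element I request: it returns the first tag it finds whose name matches the last element in the path, regardless of the path leading up to it.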

XML:

<root>
    <firstcategory>
        <subcategory>
            <id>123</id>
            <name>SubcategX</name>
        </subcategory>
        <id>789</id>
        <name>Category1</name>
    </firstcategory>
</root>

Python code:

from bs4 import BeautifulSoup

testXML = "<root><firstcategory><subcategory><id>123</id><name>SubcategX</name></subcategory><id>789</id><name>Category1</name></firstcategory></root>"

# Name the parser explicitly; html.parser ships with Python
soup = BeautifulSoup(testXML, "html.parser")
# Expected: the <id> directly under <firstcategory>, i.e. 789
categID = soup.root.firstcategory.id
# Actual: prints <id>123</id>, which corresponds to the path root.firstcategory.subcategory.id, not root.firstcategory.id
print("categID = %s" % categID)

Why does BeautifulSoup return the first id tag in the hierarchy instead of respecting the specified path?


1 answer

#1 · posted 2024-04-23 14:28:06

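With the dot syntax, BeautifulSoup searches all descendants of the tag recursively, not just its direct children. It happens to find the <id> inside <subcategory> first.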
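To make that concrete, here is a minimal sketch showing that attribute-style access behaves like a recursive find() call. It assumes the html.parser backend and a well-formed, lowercase version of the XML from the question; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Same document as in the question, with a matching closing tag
test_xml = (
    "<root><firstcategory>"
    "<subcategory><id>123</id><name>SubcategX</name></subcategory>"
    "<id>789</id><name>Category1</name>"
    "</firstcategory></root>"
)

soup = BeautifulSoup(test_xml, "html.parser")
category = soup.root.firstcategory

# Attribute access is a shortcut for find(), which walks all descendants
# in document order and stops at the first match
via_attribute = category.id
via_find = category.find("id")

print(via_attribute is via_find)     # True
print(via_attribute.get_text())      # 123
print(via_attribute.parent.name)     # subcategory
```

Because <subcategory> comes before the direct <id> child in document order, its <id> is the first match.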

To prevent the recursive search, you can do this:

soup.firstcategory.find('id', recursive=False)

See the Beautiful Soup documentation for the recursive argument.
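As a fuller sketch of the same idea, the path can be followed one level at a time with recursive=False at every step. The helper find_by_path below is hypothetical (it is not part of Beautiful Soup), and the example again assumes html.parser and the corrected XML. The last lines show a CSS child-combinator selector as a possible alternative, assuming the installed Beautiful Soup version includes CSS selector support.

```python
from bs4 import BeautifulSoup

test_xml = (
    "<root><firstcategory>"
    "<subcategory><id>123</id><name>SubcategX</name></subcategory>"
    "<id>789</id><name>Category1</name>"
    "</firstcategory></root>"
)

soup = BeautifulSoup(test_xml, "html.parser")


def find_by_path(node, *names):
    """Hypothetical helper: follow tag names through direct children only.

    Returns None as soon as one step of the path is missing.
    """
    for name in names:
        if node is None:
            return None
        # recursive=False restricts the search to direct children of node
        node = node.find(name, recursive=False)
    return node


categ_id = find_by_path(soup, "root", "firstcategory", "id")
print(categ_id.get_text())  # 789

# <subcategory> is not a direct child of <root>, so this path does not match
missing = find_by_path(soup, "root", "subcategory")
print(missing)  # None

# Alternative: the ">" combinator in a CSS selector also matches direct children only
selected = soup.select_one("root > firstcategory > id")
print(selected.get_text())  # 789
```

Stepping through each level explicitly costs a few more calls than the dot syntax, but the result then depends on the full path rather than on document order.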
