Using BeautifulSoup with multiple tags that have the same name

<g class="1581 sqw_sv5" style="cursor: pointer;"> <path d="M397.696,126.554C397.696,126.554,404.57504,140.2417375,404.57504,140.2417375" stroke="#ffffff" style="stroke-width: 3.6; stroke-opacity: 0.5; stroke-linecap: round; fill-opacity: 0;"> </path> <path d="M397.696,126.554C397.696,126.554,404.57504,140.2417375,404.57504,140.2417375" stroke="#f95a0b" style="stroke-width: 1.2; stroke-linecap: round; fill-opacity: 0;"> </path>

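2 Answers

User

Answer #1 · edited 2024-06-16 08:25:09

Here is my solution to your problem. My concern is that the answer may be too specific: it only works if the value of style is always "stroke-width: 1.2; stroke-linecap: round; fill-opacity: 0;" and there is only one such path element in the whole document.

The idea behind this solution is to narrow down the candidates quickly by looking for whatever is unique about the element that holds the attribute you want.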

```python
from bs4 import BeautifulSoup

html = """<g class="1581 sqw_sv5" style="cursor: pointer;">
 <path d="M397.696,126.554C397.696,126.554,404.57504,140.2417375,404.57504,140.2417375" stroke="#ffffff" style="stroke-width: 3.6; stroke-opacity: 0.5; stroke-linecap: round; fill-opacity: 0;">
 </path>
 <path d="M397.696,126.554C397.696,126.554,404.57504,140.2417375,404.57504,140.2417375" stroke="#f95a0b" style="stroke-width: 1.2; stroke-linecap: round; fill-opacity: 0;">
 </path>"""

soup = BeautifulSoup(html, "html.parser")
# get the desired 'path' element using the 'style' value that identifies it
desired_element = soup.find("path", {"style": "stroke-width: 1.2; stroke-linecap: round; fill-opacity: 0;"})
# get the attribute value from the extracted element
desired_attribute = desired_element["stroke"]
print(desired_attribute)
# prints #f95a0b
```

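If this approach is not feasible, you will probably have to use BeautifulSoup's next_sibling or find-next methods. Basically, find the first path element, which your current code already does, and then "jump" from there to the next path element, which contains what you need.

Find next: Beautifulsoup - nextSibling

Next sibling: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#next-sibling-and-previous-sibling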
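The sibling-based fallback described above can be sketched roughly as follows. This is a minimal sketch, not the answerer's code: the markup is shortened (the `d="M0,0"` values are placeholders), and it uses `find_next_sibling("path")` rather than the bare `next_sibling` attribute on the assumption that whitespace sits between the two tags, in which case `next_sibling` returns that text node instead of the next element.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the markup in the question (placeholder "d" values).
html = """<g class="1581 sqw_sv5" style="cursor: pointer;">
 <path d="M0,0" stroke="#ffffff"></path>
 <path d="M0,0" stroke="#f95a0b"></path>
</g>"""

soup = BeautifulSoup(html, "html.parser")

# find() returns only the first matching element
first_path = soup.find("path")

# next_sibling here is the whitespace between the tags, not an element
whitespace_node = first_path.next_sibling

# find_next_sibling() skips text nodes and returns the next <path> tag
second_path = first_path.find_next_sibling("path")
second_stroke = second_path["stroke"]
print(second_stroke)
```

If the document may contain a lone path with no following sibling, `find_next_sibling` returns `None`, so a check before indexing would be needed.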

User

Answer #2 · edited 2024-06-16 08:25:09

You need to use find_all to find all of the paths first, then extract the last one:

```python
from bs4 import BeautifulSoup

h = """<g class="1581 sqw_sv5" style="cursor: pointer;">
 <path d="M397.696,126.554C397.696,126.554,404.57504,140.2417375,404.57504,140.2417375" stroke="#ffffff" style="stroke-width: 3.6; stroke-opacity: 0.5; stroke-linecap: round; fill-opacity: 0;">
 </path>
 <path d="M397.696,126.554C397.696,126.554,404.57504,140.2417375,404.57504,140.2417375" stroke="#f95a0b" style="stroke-width: 1.2; stroke-linecap: round; fill-opacity: 0;">
 </path>"""
# name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(h, "html.parser")
shots = soup.find_all('g')
for shot in shots:
    # keep only paths that have a stroke attribute, then take the last one
    print(shot.find_all("path", stroke=True)[-1]["stroke"])
```

Using shot.path['stroke'] is equivalent to using shot.find("path")['stroke'], which only returns the first path.
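To make that equivalence concrete, here is a small sketch on shortened markup (the `d="M0,0"` values and variable names are placeholders chosen for illustration). It shows that attribute-style access and `find` both stop at the first path, while `find_all` returns every match.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the markup in the question (placeholder "d" values).
markup = """<g class="1581 sqw_sv5" style="cursor: pointer;">
 <path d="M0,0" stroke="#ffffff"></path>
 <path d="M0,0" stroke="#f95a0b"></path>
</g>"""

soup = BeautifulSoup(markup, "html.parser")
shot = soup.find("g")

# Both forms return the first <path> only
via_attribute = shot.path["stroke"]
via_find = shot.find("path")["stroke"]

# find_all() returns every <path>, in document order
all_strokes = [path["stroke"] for path in shot.find_all("path")]

print(via_attribute, via_find, all_strokes)
```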

Alternatively, an nth-of-type selector can also work, depending on the structure of the HTML.
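The answer's own snippet for this approach is not preserved, so the following is only a hedged sketch of how an nth-of-type selector might be used. It assumes a recent BeautifulSoup 4 release whose `select_one` supports CSS pseudo-classes, and that the wanted path is always the second path inside its g element; the index 2 and the shortened markup are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the markup in the question (placeholder "d" values).
markup = """<g class="1581 sqw_sv5" style="cursor: pointer;">
 <path d="M0,0" stroke="#ffffff"></path>
 <path d="M0,0" stroke="#f95a0b"></path>
</g>"""

soup = BeautifulSoup(markup, "html.parser")

# Select the second <path> that is a direct child of a <g> element
second_path = soup.select_one("g > path:nth-of-type(2)")
nth_stroke = second_path["stroke"]
print(nth_stroke)
```

This depends on position rather than on attribute values, so it breaks if the number or order of path elements inside the group changes.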
