How do I select nodes in HTML with lxml?

<div id="names">
  <h2>Names and Synonyms</h2>
  <div class="ds">
    <button class="toggle1Col" title="Toggle display between 1 column of wider results and multiple columns.">↔</button>
    <h3>Name of Substance</h3>
    <ul>
      <li id="ds2"><div>Acetaldehyde</div></li>
    </ul>
    <h3>MeSH Heading</h3>
    <ul>
      <li id="ds3"><div>Acetaldehyde</div></li>
    </ul>
  </div>

1 answer

User

#1 · Posted 2024-05-14 20:46:21

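You only need to check whether "Name of Substance" or "MeSH Heading" appears in the page text, and then select the content that belongs to it.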

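from lxml import html
import requests
import csv

page = requests.get('http://chem.sis.nlm.nih.gov/chemidplus/rn/75-07-0')
tree = html.fromstring(page.text)

# '..//div' collects every <div> under the heading's parent, in document order.
# In the sample markup above, [0] is the entry under "Name of Substance".
if "Name of Substance" in page.text:
    chem_name = tree.xpath('//*[text()="Name of Substance"]/..//div')[0].text_content()
else:
    chem_name = ""

# [1] is the second <div> under the same parent, i.e. the "MeSH Heading" entry.
if "MeSH Heading" in page.text:
    mesh_name = tree.xpath('//*[text()="MeSH Heading"]/..//div')[1].text_content()
else:
    mesh_name = ""

names1 = [chem_name, mesh_name]
# Python 3: csv needs a text-mode file opened with newline='' ('wb' only works on Python 2).
with open('testchem.csv', 'w', newline='') as myfile:
    wr = csv.writer(myfile)
    wr.writerow(names1)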
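
The positional indexes above depend on the order of the `<div>` elements under the shared parent. As an alternative, here is a minimal sketch that ties each value to its own heading with the `following-sibling` axis. The helper name `items_under` and the inline `SNIPPET` are assumptions made for illustration: the snippet is a trimmed copy of the markup from the question, with the outer `<div>` closed so that it parses on its own, and the live page may be structured differently.

```python
from lxml import html

# Trimmed copy of the markup from the question (illustrative only).
SNIPPET = """
<div id="names">
  <h2>Names and Synonyms</h2>
  <div class="ds">
    <h3>Name of Substance</h3>
    <ul><li id="ds2"><div>Acetaldehyde</div></li></ul>
    <h3>MeSH Heading</h3>
    <ul><li id="ds3"><div>Acetaldehyde</div></li></ul>
  </div>
</div>
"""


def items_under(tree, heading):
    """Return the text of each <li><div> in the first <ul> after the <h3> named `heading`.

    Hypothetical helper: returns an empty list when the heading is absent,
    so no separate "is the text on the page" check is needed.
    """
    nodes = tree.xpath(
        '//h3[normalize-space()=$h]/following-sibling::ul[1]/li/div',
        h=heading,  # lxml passes keyword arguments as XPath variables
    )
    return [node.text_content().strip() for node in nodes]


tree = html.fromstring(SNIPPET)
chem_names = items_under(tree, "Name of Substance")
mesh_names = items_under(tree, "MeSH Heading")
missing = items_under(tree, "No Such Heading")
```

Because a missing heading yields an empty list, the caller can write `chem_names[0] if chem_names else ""` to get the same empty-string fallback as the code above.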
