<p>Have you tried XPath on HTML?
Using the lxml library with XPath expressions is a quick way to extract HTML elements.</p>
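<pre><code>import lxml.html  # a bare "import lxml" does not load the html submodule

# html_content is assumed to hold the page's HTML as a string
doc = lxml.html.document_fromstring(html_content)
title_element = doc.xpath("//title")
website_title = title_element[0].text_content().strip()
# The description is stored in the content attribute of the meta tag named "description"
meta_description = doc.xpath("//meta[@name='description']/@content")
website_meta_description = meta_description[0].strip()
</code></pre>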
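<p>Indexing with [0] raises an IndexError when a page has no title or no description meta tag. The following is a minimal sketch of a more defensive variant, assuming you want None for anything that is missing; the function name extract_title_and_description and the sample HTML are made up for illustration.</p>

```python
import lxml.html


def extract_title_and_description(html_content):
    """Return (title, description); either value is None when it is missing."""
    doc = lxml.html.document_fromstring(html_content)
    # findtext returns None when no <title> element exists
    title = doc.findtext(".//title")
    # Selecting /@content yields a list of attribute strings (empty if absent)
    descriptions = doc.xpath("//meta[@name='description']/@content")
    return (
        title.strip() if title else None,
        descriptions[0].strip() if descriptions else None,
    )


# Hypothetical sample page, used only to exercise the function
sample_html = """
<html>
  <head>
    <title> Example page </title>
    <meta name="description" content="A short demo page.">
  </head>
  <body><p>Hello</p></body>
</html>
"""

title, description = extract_title_and_description(sample_html)
print(title)
print(description)
```

<p>Returning None instead of raising lets the caller decide how to handle pages that lack these tags.</p>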