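<p>The site loads its content dynamically with JavaScript, so <code>requests</code>, which only fetches the initial HTML and does not execute scripts, cannot see it. We can use <a href="https://selenium-python.readthedocs.io/" rel="nofollow noreferrer">Selenium</a> as an alternative way to scrape the page.</p>
<p>Install it with: <code>pip install selenium</code> (the script below also needs <code>beautifulsoup4</code>).</p>
<p>Download the ChromeDriver that matches your Chrome version from <a href="https://sites.google.com/a/chromium.org/chromedriver/downloads" rel="nofollow noreferrer">here</a>.</p>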
<pre><code>from time import sleep
from selenium import webdriver
from bs4 import BeautifulSoup
URL = "https://www.reuters.com/article/us-usa-banks-conference-jpmorgan/jpmorgan-ceo-dimon-sees-u-s-economic-expansion-continuing-idUSKCN1IX508"
# Selenium 3 style call; newer Selenium 4 releases take the path via a Service object instead
driver = webdriver.Chrome(r"c:\path\to\chromedriver.exe")
driver.get(URL)
# Wait for the page to fully render
sleep(5)
soup = BeautifulSoup(driver.page_source, "html.parser")
for tag in soup.find_all("time"):
    print(tag.get_text(strip=True))
driver.quit()
</code></pre>
<p>Output:</p>
<pre><code>June 1, 2018
9:21 AM
Updated 2 years ago
</code></pre>
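<p>The fixed <code>sleep(5)</code> in the script above can be too short on a slow connection and wasteful on a fast one. As a rough sketch, not part of the original answer, the wait can be replaced by polling until the content appears. The helper name <code>wait_until</code> and its <code>timeout</code> and <code>interval</code> parameters are hypothetical, chosen only for illustration; Selenium's own <code>WebDriverWait</code> offers the same idea built in.</p>

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without success. (Hypothetical helper for illustration.)
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        # Pause briefly so the loop does not spin at full speed
        time.sleep(interval)
```

<p>Assuming the <code>driver</code> from the script above, something like <code>tags = wait_until(lambda: BeautifulSoup(driver.page_source, "html.parser").find_all("time"))</code> would return as soon as at least one <code>time</code> tag has rendered, instead of always waiting five seconds.</p>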