<p>The table is actually loaded from <a href="https://www.state.nj.us/mvc/locations/agency.htm" rel="nofollow noreferrer">this</a> site.</p>
<p>To get only the red text, you can use the CSS selector <code>soup.select('font[color="red"]')</code>, as @Mr.Polywhill mentioned:</p>
<pre><code>import urllib.request
from bs4 import BeautifulSoup


class Scraper:
    def __init__(self, site):
        self.site = site

    def scrape(self):
        # Download the page and parse it with the built-in HTML parser.
        r = urllib.request.urlopen(self.site)
        html = r.read()
        parser = "html.parser"
        soup = BeautifulSoup(html, parser)
        # Select every red font tag, skipping the first match.
        tabledmv = soup.select('font[color="red"]')[1:]
        for tag in tabledmv:
            print(tag.get_text())


website = "https://www.state.nj.us/mvc/locations/agency.htm"
Scraper(website).scrape()
</code></pre>
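<p>If you want to try the selector without hitting the live site, here is a minimal offline sketch. The markup in <code>SAMPLE_HTML</code> and the helper name <code>red_text</code> are made up for illustration and are not taken from the real page; the <code>skip</code> parameter mirrors the <code>[1:]</code> slice above, on the assumption that the first red element is not part of the data you want.</p>

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example; the real page may differ.
SAMPLE_HTML = """
<table>
  <tr><td><font color="red">Legend: red marks a notice</font></td></tr>
  <tr><td><font color="black">Agency A</font></td><td><font color="red">Closed</font></td></tr>
  <tr><td><font color="black">Agency B</font></td><td><font color="red">Open late</font></td></tr>
</table>
"""


def red_text(html, skip=1):
    """Return the text of every <font color="red"> tag, dropping the first `skip` matches."""
    soup = BeautifulSoup(html, "html.parser")
    # Attribute selector: only <font> tags whose color attribute is exactly "red".
    tags = soup.select('font[color="red"]')[skip:]
    return [tag.get_text(strip=True) for tag in tags]


print(red_text(SAMPLE_HTML))  # ['Closed', 'Open late']
```

<p>Passing <code>skip=0</code> keeps the first match as well, which is handy for checking what the slice actually drops on your page.</p>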