Python: Craigslist scraping returns an empty list

import pandas as pd
import requests
%pylab inline

url_base = 'http://houston.craigslist.org/search/apa'
params = dict(bedrooms=2)
rsp = requests.get(url_base, params=params)
print(rsp.text[:500])

from bs4 import BeautifulSoup as bs4
html = bs4(rsp.text, 'html.parser')
print(html.prettify()[:1000])

<!DOCTYPE html>
<html class="no-js">
<head>
<title>
houston apartments / housing rentals - craigslist
</title>
<meta content="houston apartments / housing rentals - craigslist" name="description">
<meta content="IE=Edge" http-equiv="X-UA-Compatible"/>
<link href="https://houston.craigslist.org/search/apa" rel="canonical">
<link href="https://houston.craigslist.org/search/apa?format=rss&min_bedrooms=2" rel="alternate" title="RSS feed for craigslist | houston apartments / housing rentals - craigslist " type="application/rss+xml">
<link href="https://houston.craigslist.org/search/apa?s=120&min_bedrooms=2" rel="next">
<meta content="width=device-width,initial-scale=1" name="viewport">
<link href="//www.craigslist.org/styles/cl.css?v=a14d0c65f7978c2bbc0d780a3ea7b7be" media="all" rel="stylesheet" type="text/css">
<link href="//www.craigslist.org/styles/search.css?v=27e1d4246df60da5ffd1146d59a8107e" media="all" rel="stylesheet" type="

1 Answer

User

Answer #1 · Posted on 2024-06-10 22:45:48

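The page has no <p> tags with the class 'row', so a search for that class matches nothing. The <p> tags carry the class 'result-info' instead.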

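import requests
from bs4 import BeautifulSoup as bs4

url_base = 'http://houston.craigslist.org/search/apa'
params = dict(bedrooms=2)
rsp = requests.get(url_base, params=params)

# Inspect the start of the raw response
print(rsp.text[:500])

# Parse the page and look at the beginning of the tree
html = bs4(rsp.text, 'html.parser')
print(html.prettify()[:1000])

# Match on the class the <p> tags actually have ('result-info', not 'row')
apts = html.find_all('p', attrs={'class': 'result-info'})
print(len(apts))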
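
When a class-based search comes back empty, it can help to list which classes the tags really carry before guessing at a selector. The following is a minimal sketch of that check, using only the standard library so it runs without a network connection. The sample markup, the `TagClassCounter` class, and the `count_classes` helper are all hypothetical stand-ins for illustration; the real Craigslist page will differ.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical sample markup standing in for rsp.text; the real page will differ.
SAMPLE_HTML = """
<ul class="rows">
  <li class="result-row"><p class="result-info"><a href="/apa/1.html">2br apartment</a></p></li>
  <li class="result-row"><p class="result-info"><a href="/apa/2.html">2br townhouse</a></p></li>
</ul>
"""


class TagClassCounter(HTMLParser):
    """Count the class names that appear on start tags of one tag type."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        # attrs is a list of (name, value) pairs; the value may be None
        class_value = dict(attrs).get("class") or ""
        self.counts.update(class_value.split())


def count_classes(html_text, tag="p"):
    """Return a Counter of class names found on the given tag."""
    parser = TagClassCounter(tag)
    parser.feed(html_text)
    parser.close()
    return parser.counts


p_classes = count_classes(SAMPLE_HTML)
print(p_classes)
```

If the class being searched for (here 'row') does not appear in the counts, `find_all` has nothing to match, and the classes that do appear are the candidates to use instead. The same count can be built with BeautifulSoup by looping over `html.find_all('p')` and reading each tag's `class` attribute.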
