I am trying to fetch basic listing data from the following real estate website:
https://www.propertyfinder.ae/en/search?c=1&l=1&ob=pa&page=2
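The site exposes an open API, visible in the XHR tab of the browser's developer tools, and it has worked well until now. Recently, however, my requests to the API started returning a 401 Unauthorized error. I suspect this is due to the introduction of a verification cookie. To work around it, I tried requesting the public website first, copying its cookies, and then sending those cookies with the API request, but that did not work. Note also that the API can still be opened in a browser.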
import requests as rq
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:89.0) Gecko/20100101 Firefox/89.0"}
url = "https://www.propertyfinder.ae/en/api/search?"
params = [
    ("filter[category_id]", "2"),
    ("filter[furnished]", "0"),
    ("filter[locations_ids][]", "50"),
    ("filter[price_type]", "y"),
    ("include", "properties,properties.property_type,properties.property_images,properties.location_tree,properties.agent,properties.agent.languages,properties.broker,smart_ads,smart_ads.agent,smart_ads.broker,smart_ads.property_type,smart_ads.property_images,smart_ads.location_tree,direct_from_developer,direct_from_developer.property_type,direct_from_developer.property_images,direct_from_developer.location_tree,direct_from_developer.agent,direct_from_developer.broker,cts,cts.agent,cts.broker,cts.property_type,cts.property_images,cts.location_tree,similar_properties,similar_properties.agent,similar_properties.broker,similar_properties.property_type,similar_properties.property_images,similar_properties.location_tree,agent_smart_ads,agent_smart_ads.broker,agent_smart_ads.languages,agent_properties_smart_ads,agent_properties_smart_ads.agent,agent_properties_smart_ads.broker,agent_properties_smart_ads.location_tree,agent_properties_smart_ads.property_type,agent_properties_smart_ads.property_images"),
    ("page[limit]", "25"),
    ("page[number]", "2"),
    ("sort", "nd"),
]
# Request the public-facing page and collect its cookies in a dictionary
propurl = 'https://www.propertyfinder.ae/en/search?c=1&l=1&ob=pa&page=2'
propbase = rq.get(propurl, headers=headers)
propcook = propbase.cookies
dictcookies = {}
for cookie in propcook:
    dictcookies[cookie.name] = cookie.value
# Pass the cookies from the public page to the API and attempt the request
resp = rq.get(url, params=params, headers=headers, cookies=dictcookies)
resultat = []
# resp is a Response object, so decode the JSON body before indexing into it
for el in resp.json()["included"]:
    if el["type"] == "property":
        data = {
            "name": el["attributes"]["name"],
            "default_price": el["attributes"]["default_price"],
            "bathroom_value": el["attributes"]["bathroom_value"],
            "bedroom_value": el["attributes"]["bedroom_value"],
            "coordinates": el["attributes"]["coordinates"],
        }
        resultat.append(data)
print(resultat)
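Since a 401 response presumably carries no `included` array, the parsing step can be made tolerant of missing keys. This is only a minimal sketch: it assumes the payload shape implied by the code above (an `included` list whose entries have `type` and `attributes`), and both the helper name `extract_properties` and the sample payload are made up for illustration.

```python
# Attribute names taken from the parsing loop in the question.
FIELDS = ("name", "default_price", "bathroom_value", "bedroom_value", "coordinates")


def extract_properties(payload):
    """Return one dict per "property" entry in an already-decoded payload.

    Missing keys yield None instead of raising KeyError, so an error
    payload without "included" simply produces an empty list.
    """
    results = []
    for el in payload.get("included", []):
        if el.get("type") != "property":
            continue
        attrs = el.get("attributes", {})
        results.append({field: attrs.get(field) for field in FIELDS})
    return results


# Hypothetical sample data, not real API output.
sample = {
    "included": [
        {"type": "property", "attributes": {"name": "Flat A", "default_price": 1000}},
        {"type": "agent", "attributes": {"name": "Someone"}},
    ]
}
print(extract_properties(sample))
```

With the real response this would be called as `extract_properties(resp.json())`, ideally after checking `resp.status_code`.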
I would rather keep scraping the API than the website itself. Any suggestions would be greatly appreciated.
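One variation on the cookie-copying attempt is to let a single `requests.Session` carry the cookies from the public page to the API call, rather than copying them by hand. The sketch below is untested against this site and may well not remove the 401: if the verification cookie is set by JavaScript in the browser, `requests` will never receive it. The helper names `build_session` and `fetch_api`, and the extra `Accept` and `Referer` headers, are assumptions for illustration, not something the site is known to require.

```python
import requests

BASE_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:89.0) Gecko/20100101 Firefox/89.0",
    # Assumption: ask for JSON explicitly, as a browser XHR call might.
    "Accept": "application/json",
}


def build_session(headers=BASE_HEADERS):
    """Create a session that sends the same headers on every request."""
    session = requests.Session()
    session.headers.update(headers)
    return session


def fetch_api(session, page_url, api_url, params, timeout=10):
    """Visit the public page, then call the API with the same session."""
    # The session's cookie jar keeps any cookies set via Set-Cookie here.
    session.get(page_url, timeout=timeout)
    # Assumption: a Referer pointing at the search page may be expected.
    resp = session.get(
        api_url, params=params, headers={"Referer": page_url}, timeout=timeout
    )
    # Raise on 401 and other HTTP errors instead of parsing an error body.
    resp.raise_for_status()
    return resp.json()
```

It would be called as `fetch_api(build_session(), propurl, url, params)`, using the variables from the code in the question. If this still returns 401, comparing the request headers and cookies shown in the browser's XHR tab with what the session sends is probably the next step.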