Scraping a script tag whose type is application/ld+json

Posted 2024-06-16 11:04:30

The error occurs at jsn = json.loads(data.string). I want to scrape the reviewers and ratings, but I'm getting an AttributeError on the string attribute. Can you help me?

Code:

from bs4 import BeautifulSoup
import json
import requests
import pandas as pd

r= requests.get('https://www.zomato.com/beirut/divvy-ashrafieh/reviews')
soup = BeautifulSoup(r.text, "lxml")


data = soup.find('script', {"type": "application/ld+json"})
jsn = json.loads(data.string)

print(jsn)

1 answer
User
#1 · Posted 2024-06-16 11:04:30

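Try setting the User-Agent HTTP header. Without it, the server probably returns a page that lacks this script tag, so soup.find() returns None and data.string raises the AttributeError: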

from bs4 import BeautifulSoup
import json
import requests

# Send a browser-like User-Agent with the request.
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:90.0) Gecko/20100101 Firefox/90.0"
}

r = requests.get(
    "https://www.zomato.com/beirut/divvy-ashrafieh/reviews", headers=headers
)
soup = BeautifulSoup(r.text, "lxml")

# The page has more than one JSON-LD block, so collect all of them.
all_data = soup.find_all("script", {"type": "application/ld+json"})

for data in all_data:
    # data.string is the raw JSON text inside the script tag.
    jsn = json.loads(data.string)
    print(json.dumps(jsn, indent=4))

Prints:

{
    "@context": "http://schema.org",
    "@type": "WebSite",
    "name": "Zomato",
    "url": "https://www.zomato.com"
}
{
    "@context": "https://schema.org",
    "@type": "Restaurant",
    "name": "DIVVY",
    "url": "/beirut/divvy-ashrafieh/reviews",
    "openingHours": "12noon \u2013 11:30pm (Today)",
    "hasmap": "https://maps.zomato.com/php/staticmap?center=33.8882180000,35.5199140000&maptype=zomato&markers=33.8882180000,35.5199140000,pin_res32&sensor=false&scale=2&zoom=16&language=en&size=240x150&size=400x240",
    "menu": "/beirut/divvy-ashrafieh/reviews/menu",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "ABC Ashrafieh, Level 3, Furn el Hayek Street, Ashrafieh, Beirut District",
        "addressLocality": "ABC Ashrafieh, Beirut District",
        "addressRegion": "Beirut District",
        "postalCode": "",
        "addressCountry": "Lebanon"
    },

...and so on.
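
To get from the decoded JSON-LD to reviewers and ratings, something like the sketch below might work. It is only a sketch under assumptions: the output above is truncated, so I am guessing that the Restaurant object carries a "review" list in the usual schema.org shape ("author" plus "reviewRating" with a "ratingValue"). The helper names parse_json_ld and extract_reviews and the sample data are made up for illustration. The first helper also guards against the missing-tag case that caused the original AttributeError.

```python
import json


def parse_json_ld(text):
    """Return the decoded JSON-LD object, or None if the script text is missing."""
    if not text:  # e.g. the tag was not found, or it has no single string child
        return None
    return json.loads(text)


def extract_reviews(obj):
    """Collect (author, rating) pairs from a decoded JSON-LD object.

    Assumption: reviews live under a "review" key in schema.org style.
    The real page may use a different layout.
    """
    if not isinstance(obj, dict):
        return []
    reviews = obj.get("review", [])
    if isinstance(reviews, dict):  # a single review may not be wrapped in a list
        reviews = [reviews]
    pairs = []
    for review in reviews:
        author = review.get("author")
        if isinstance(author, dict):  # author may be a Person object or a plain string
            author = author.get("name")
        rating = review.get("reviewRating") or {}
        pairs.append((author, rating.get("ratingValue")))
    return pairs


# Hypothetical sample standing in for the text of one script tag.
sample = json.dumps({
    "@type": "Restaurant",
    "name": "DIVVY",
    "review": [
        {"author": "Alice", "reviewRating": {"ratingValue": 4}},
        {"author": {"name": "Bob"}, "reviewRating": {"ratingValue": 5}},
    ],
})

print(extract_reviews(parse_json_ld(sample)))
print(extract_reviews(parse_json_ld(None)))
```

In the loop from the answer you would pass data.string to parse_json_ld and skip any result that is None, then feed the rest to extract_reviews.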
