When I'm messing around on the web with BeautifulSoup, can I accept or ignore the Google privacy notice?

from bs4 import BeautifulSoup
import requests

headers = {'User-Agent': 'Mozilla/5.0'}
r = requests.get("https://www.google.com/news", headers=headers)
soup = BeautifulSoup(r.content, 'html.parser')
print(soup.prettify())

<title>
 Before you continue
</title>
<meta content="initial-scale=1, maximum-scale=5, width=device-width" name="viewport"/>
<link href="//www.google.com/favicon.ico" rel="shortcut icon"/>
</head>
<body>
 <div class="signin">
  <a class="button" href="https://accounts.google.com/ServiceLogin?hl=en-US&continue=https://news.google.com/topics/CAAqBwgKMKHQ9Qowlc7cAg&gae=cb-">
   Sign in
  </a>
 </div>
 <div class="box">
  <img alt="Google" height="28" src="//www.gstatic.com/images/branding/googlelogo/1x/googlelogo_color_68x28dp.png" srcset="//www.gstatic.com/images/branding/googlelogo/2x/googlelogo_color_68x28dp.png 2x" width="68"/>
  <div class="productLogoContainer">
   <img alt="" aria-hidden="true" class="image" height="100%" src="https://www.gstatic.com/ac/cb/scene_cookie_wall_search_v2.svg" width="100%"/>
  </div>

1 answer

User

#1 · Posted on 2024-05-26 14:21:33

You can set the CONSENT cookie so that you do not get the "Before you continue" page:

import requests
from bs4 import BeautifulSoup

headers = {"User-Agent": "Mozilla/5.0"}
cookies = {"CONSENT": "YES+cb.20210720-07-p0.en+FX+410"}
r = requests.get(
    "https://www.google.com/news", headers=headers, cookies=cookies
)
soup = BeautifulSoup(r.content, "html.parser")
print(soup.prettify())
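To check whether the cookie actually got past the consent wall, rather than reading through the prettified output, one option is to look at the page title. The following is only a minimal sketch: `TitleParser` and `is_consent_page` are hypothetical helper names, and the title string "Before you continue" is taken from the output shown in the question, so it is an assumption that may not hold for other locales or if Google changes the page. It uses only the standard library so it can be tried on a saved HTML string without making a request.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only accumulate text while we are inside <title>...</title>.
        if self._in_title:
            self.title += data


def is_consent_page(html):
    """Return True if the title matches the consent page seen in the question.

    Assumption: the consent wall is titled "Before you continue" (English).
    """
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip().lower() == "before you continue"
```

With the response from the answer's code, this could be called as `is_consent_page(r.text)`. If a `soup` object is already at hand, comparing the text of `soup.title` against the same string should serve the same purpose.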
