How to extract the right script from all scripts using BeautifulSoup

2 answers

User

Answer 1 · edited 2024-04-25 07:19:29

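This should work. It dumps the result to a JSON file, but you can print the output instead if you prefer. Remember to change the two placeholder values: 'CHOOSE A PATH' and "If theres any class add it here, or else delete this part".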

from bs4 import BeautifulSoup
import requests
import json
import os

website = requests.get("https://www.kickstarter.com/projects/louisalberry/louis-alberry-debut-album-uk-european-tour")
soup = BeautifulSoup(website.content, 'lxml')
# Replace the placeholder with a real class name, or drop the dict to match every <script> tag
mytext = soup.find_all("script", {"class": "If theres any class add it here, or else delete this part"})
save_path = 'CHOOSE A PATH'
ogname = "kickstarter_text.json"
completename = os.path.join(save_path, ogname)
with open(completename, "w") as output:
    # Tag objects are not JSON-serializable, so store each tag as a string
    json.dump([str(script) for script in mytext], output)
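
To make the effect of the class filter easier to see, here is a minimal offline sketch of the same idea. The sample markup and the class name "project-data" are invented for illustration, and a temporary directory stands in for the path you would choose; none of this is taken from the real Kickstarter page.

```python
import json
import os
import tempfile

from bs4 import BeautifulSoup

# Hypothetical markup; "project-data" is an invented class name.
SAMPLE_HTML = """
<script class="project-data">var a = 1;</script>
<script>var b = 2;</script>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# With a class filter, only the matching tags are returned.
filtered = soup.find_all("script", class_="project-data")
# Without a filter, every <script> tag is returned.
everything = soup.find_all("script")

# Tag objects are not JSON-serializable, so keep their source as plain strings.
sources = [str(tag.string or "") for tag in filtered]

save_path = tempfile.mkdtemp()  # stands in for the path you choose
completename = os.path.join(save_path, "kickstarter_text.json")
with open(completename, "w") as output:
    json.dump(sources, output)

# Read the file back to confirm what was written.
with open(completename) as saved:
    print(json.load(saved))
```

If the target script has no distinguishing class or attribute, filtering on its contents is usually the more practical route.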

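User

Answer 2 · edited 2024-04-25 07:19:29

You can collect all the script elements and loop over them, keeping the one whose text contains the variable you are looking for. Fetch the page with requests and pass the response object's content to BeautifulSoup.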

from bs4 import BeautifulSoup
import requests
res = requests.get("https://www.kickstarter.com/projects/louisalberry/louis-alberry-debut-album-uk-european-tour")
soup = BeautifulSoup(res.content, 'lxml')
# select() already returns a list of every <script> tag
scripts = soup.select('script')
for script in scripts:
    # Keep only the script that defines window.current_project
    if 'window.current_project' in script.text:
        print(script)
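
Once the right script is found, the next step is usually to pull the data out of it. The sketch below runs on an invented sample page and assumes the value assigned to window.current_project is a double-quoted, HTML-escaped JSON string; that format is an assumption, so check the real page source before relying on it. The helper names find_script and parse_assigned_json are hypothetical.

```python
import html
import json
import re

from bs4 import BeautifulSoup

# Hypothetical sample page; the real Kickstarter markup may differ.
SAMPLE_HTML = """
<html><head>
<script>var unrelated = 1;</script>
<script>window.current_project = "{&quot;id&quot;: 42, &quot;name&quot;: &quot;Demo&quot;}";</script>
</head><body></body></html>
"""


def find_script(soup, marker):
    """Return the first <script> tag whose source contains marker, or None."""
    for script in soup.find_all("script"):
        source = script.string or ""
        if marker in source:
            return script
    return None


def parse_assigned_json(source, variable):
    """Parse the quoted, HTML-escaped JSON string assigned to variable."""
    pattern = re.escape(variable) + r'\s*=\s*"(.*?)";'
    match = re.search(pattern, source, re.S)
    if match is None:
        return None
    # Undo the HTML escaping (&quot; becomes ") before decoding the JSON.
    return json.loads(html.unescape(match.group(1)))


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
script = find_script(soup, "window.current_project")
project = parse_assigned_json(script.string, "window.current_project")
print(project["name"])
```

Matching on the script contents avoids depending on the position of the tag in the page, which tends to change more often than the variable name.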
