How do I extract text from a <script> tag using bs4?

Posted 2024-04-25 03:48:43

User


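I'm trying to use BS4 to extract some text from a <script> tag, but I get a TypeError every time I run my script.

I've tried several different parsers, but they all raise the same TypeError.

My Python code is: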

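import requests
from bs4 import BeautifulSoup

s = requests.Session()
r = s.get(url, headers=headers)  # url and headers are defined elsewhere
soup = BeautifulSoup(r.content, 'html5lib')
# This is the line that raises the TypeError:
profile = soup.find('script', attrs={'name': 'window.profile'})['value']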

The HTML I want to scrape is:

<script>
// Profile helper.
window.profile = 'PROFILEIDHERE';
</script>

I expect my code to assign the value of 'window.profile' to the variable 'profile', but every time I run the script I get a TypeError.
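For reference, the error can probably be reproduced without any network access. This is a minimal sketch, assuming the page contains only the snippet above: the `html` string is a stand-in for the fetched page, and `html.parser` is used instead of `html5lib` only to avoid an extra dependency.

```python
from bs4 import BeautifulSoup

# Stand-in for the fetched page (assumption: the real page is not shown).
html = """<script>
// Profile helper.
window.profile = 'PROFILEIDHERE';
</script>"""

soup = BeautifulSoup(html, "html.parser")

# The <script> tag has no name="window.profile" attribute, so nothing matches.
tag = soup.find("script", attrs={"name": "window.profile"})
print(tag)  # None

error_message = None
try:
    tag["value"]  # indexing None is what raises the TypeError
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The `attrs` filter matches HTML attributes on the tag, not JavaScript variable names inside it, which is why the lookup comes back empty.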

Tags: code, tag, text, script, parser, TypeError, script

1 answer

User

#1 · Posted 2024-04-25 03:48:43

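The <script> tag in your HTML has no name attribute, so soup.find(...) returns None, and indexing None raises the TypeError. The value you want is part of the tag's text, which you can get with get_text():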

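# Assumes soup is the BeautifulSoup object built in the question.
all_scripts = soup.find_all("script")
for script in all_scripts:
    script_text = script.get_text()
    # Skip scripts that do not define window.profile.
    if "window.profile" not in script_text:
        continue
    # The value is the text between the first pair of single quotes.
    profile = script_text.split("'")[1]
    print(profile)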
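Splitting on quotes depends on the exact layout of the script, so a regular expression may be a bit more robust. The following is only a sketch under assumptions: `extract_profile`, `PROFILE_RE`, and the `sample` page are hypothetical names introduced for illustration, the assignment is assumed to have the form `window.profile = '...'`, and `html.parser` is used so that no extra parser needs to be installed.

```python
import re
from bs4 import BeautifulSoup

# Matches: window.profile = '...'  (single or double quotes)
PROFILE_RE = re.compile(r"window\.profile\s*=\s*(['\"])(.*?)\1")

def extract_profile(html):
    """Return the value assigned to window.profile, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    for script in soup.find_all("script"):
        text = script.string  # None for empty or external (src=...) scripts
        if not text:
            continue
        match = PROFILE_RE.search(text)
        if match:
            return match.group(2)
    return None

# Hypothetical page used only to exercise the function.
sample = """<html><head>
<script src="app.js"></script>
<script>
// Profile helper.
window.profile = 'PROFILEIDHERE';
</script>
</head></html>"""

print(extract_profile(sample))  # PROFILEIDHERE
```

With the question's code, this would be called as `profile = extract_profile(r.content)`. Returning None when nothing matches lets the caller decide how to handle a page without the assignment instead of failing with an IndexError.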
