BeautifulSoup: "TypeError: 'str' object is not callable" when using .h1.text()

Published 2024-05-23 17:42:25


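Edit: this turned out to be a concurrent.futures problem, not a BS4 problem. Concurrent futures returned an empty list after retrieving the data, which caused the NoneType error in BS4.

I am trying to scrape the H1 from a list of URLs with Beautiful Soup, but on one of the URLs I get the error TypeError: 'str' object is not callable

If I print the output, I can see that I retrieved 3 h1 values before the error occurred.

If I remove .h1.text.strip(), I get a different error, although oddly it prints the HTML 4 times instead of 3.

For example, when I changed bsh_h1 = bsh.h1.text.strip() to bsh_h1 = bsh, it ran four times and raised the error: TypeError: unhashable type: 'ResultSet'

I tried adding a try/except block, but that seems to cause errors elsewhere in the code, so I feel I am missing something basic. Maybe BS4 is returning NoneType and I need to skip it.

Here is my code: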

import concurrent.futures
from bs4 import BeautifulSoup
from urllib.request import urlopen

CONNECTIONS = 1

archive_url_list = [
    "https://web.archive.org/web/20171220015929/http://www.manueldrivingschool.co.uk:80/prices.php",
    "https://web.archive.org/web/20160313085709/http://www.manueldrivingschool.co.uk/lessons_prices.php",
    "https://web.archive.org/web/20171220002420/http://www.manueldrivingschool.co.uk:80/prices",
    "https://web.archive.org/web/20201202094502/https://www.manueldrivingschool.co.uk/success",
]

archive_h1_list = []
def get_archive_h1(h1_url):
    html = urlopen(h1_url)
    bsh = BeautifulSoup(html.read(), 'lxml')
    bsh = bsh.h1.text.strip()
    return bsh.h1.text.strip()

def concurrent_calls():
    with concurrent.futures.ThreadPoolExecutor(max_workers=CONNECTIONS) as executor:
        f1 = executor.map(get_archive_h1, archive_url_list)
        for future in concurrent.futures.as_completed(f1):
            try:
                data = future.result()
                archive_h1_list.append(data)
            except Exception:
                archive_h1_list.append("No Data Received!")
                pass

if __name__ == '__main__':
    concurrent_calls()
    print(archive_h1_list)

PS: I am using concurrent.futures for multithreading. I initially thought that was the problem, but now I am leaning toward BS4.
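On the concurrency side, one thing worth checking is that executor.map() yields results rather than Future objects, while concurrent.futures.as_completed() expects futures such as those returned by executor.submit(). The following is a minimal sketch of the submit/as_completed pattern, not a drop-in fix: fetch_title and collect_titles are hypothetical names, and fetch_title is a network-free stand-in for get_archive_h1.

```python
import concurrent.futures


def fetch_title(url):
    # Hypothetical stand-in for get_archive_h1: no network access, it only
    # derives a fake "title" from the URL and fails for one of them.
    if url.endswith("/broken"):
        raise ValueError(f"no <h1> found at {url}")
    return url.rsplit("/", 1)[-1].upper()


def collect_titles(urls, max_workers=2):
    results = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # submit() returns Future objects, which is what as_completed() expects.
        future_to_url = {executor.submit(fetch_title, url): url for url in urls}
        for future in concurrent.futures.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                # result() re-raises any exception raised inside the worker.
                results[url] = future.result()
            except Exception:
                results[url] = "No Data Received!"
    return results


if __name__ == "__main__":
    demo_urls = ["http://example.com/prices", "http://example.com/broken"]
    print(collect_titles(demo_urls))
```

Keying the results by URL keeps each title tied to its source, since as_completed() yields futures in completion order rather than submission order.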


1 Answer

Answer 1 · Posted 2024-05-23 17:42:25

You are extracting a string from bsh and then trying to access h1 on it, which fails because a string has no h1 method or attribute:

bsh = bsh.h1.text.strip()
return bsh.h1.text.strip()

Instead, just do:

return bsh.h1.text.strip()
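Since the question also suspects that BS4 may return None for pages without an h1, here is a small sketch of the same extraction with a guard. It assumes a hypothetical helper named extract_h1 that takes HTML text directly, and it uses the standard-library "html.parser" backend instead of the "lxml" parser from the question so that it runs without extra dependencies beyond bs4.

```python
from bs4 import BeautifulSoup


def extract_h1(html_text):
    # "html.parser" is the stdlib backend; the question uses "lxml" instead.
    soup = BeautifulSoup(html_text, "html.parser")
    h1 = soup.h1  # None when the document contains no <h1>
    if h1 is None:
        return None
    # .text is a property that returns a str, so it takes no parentheses;
    # writing .text() is what raises "'str' object is not callable".
    return h1.text.strip()


if __name__ == "__main__":
    print(extract_h1("<html><body><h1> Prices </h1></body></html>"))
    print(extract_h1("<html><body><p>no heading</p></body></html>"))
```

Returning None for a missing heading lets the caller decide whether to skip the page or record a placeholder, rather than relying on a broad try/except.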
