How does this forum scraping code work?

    import requests, pprint
    from bs4 import BeautifulSoup as BS

    url = "https://forums.spacebattles.com/threads/the-wizard-of-woah-and-the-impossible-methods-of-necromancy.337233/"
    r = requests.get(url)
    soup = BS(r.content, "html.parser")

    #To find all posts from a specific user everything below this is for all posts
    specific_messages = soup.findAll('li', {'data-author': 'The Wizard of Woah!'})

    #To find every post from every user
    posts = {}

    message_container = soup.find('ol', {'id':'messageList'})
    messages = message_container.findAll('li', recursive=0)
    for message in messages:
        author = message['data-author']
        #or don't encode to utf-8 simply for printing in shell
        content = message.find('div', {'class':'messageContent'}).text.strip().encode("utf-8")
        if author in posts:
            posts[author].append(content)
        else:
            posts[author] = [content]

    pprint.pprint(posts)

1 Answer

User

#1 · Posted on 2024-04-26 12:07:48

specific_messages = soup.findAll('li', {'data-author': 'The Wizard of Woah!'})

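soup is the BeautifulSoup object that holds the parsed HTML
findAll() is a method that returns every element in the HTML that matches the arguments passed to it
li is the tag name to look for
data-author is the HTML attribute to check on each of those tags
The Wizard of Woah! is the author name, i.e. the value that attribute must have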

So, basically, that line searches for every li tag whose data-author attribute is The Wizard of Woah!
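To make that concrete, here is a minimal sketch that runs the same kind of lookup against a small inline HTML string instead of the live forum page. The sample markup and the second author name are made up for illustration; only the li / data-author structure is taken from the question. It uses find_all, the current spelling of findAll in bs4.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the forum page; only the structure matters.
SAMPLE_HTML = """
<ol id="messageList">
  <li data-author="The Wizard of Woah!"><div class="messageContent">First post</div></li>
  <li data-author="Someone Else"><div class="messageContent">A reply</div></li>
  <li data-author="The Wizard of Woah!"><div class="messageContent">Second post</div></li>
</ol>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Keep only <li> tags whose data-author attribute equals the given string.
specific_messages = soup.find_all("li", {"data-author": "The Wizard of Woah!"})
match_count = len(specific_messages)

# A plain string filter is compared exactly, so different casing matches nothing.
wrong_case_count = len(soup.find_all("li", {"data-author": "the wizard of woah!"}))

print(match_count, wrong_case_count)
```

The second lookup shows why the author name has to be spelled exactly as it appears in the page source.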

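findAll returns a list of all matching tags rather than a single one, so you need to loop over the result to handle each tag in turn, for example to pull out its text and append it to a list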
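The loop described above might look roughly like this. It is a sketch under assumptions: the inline HTML is invented for the example, and the messageContent class is borrowed from the question's code rather than checked against the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML mimicking the structure the question's code expects.
SAMPLE_HTML = """
<ol id="messageList">
  <li data-author="The Wizard of Woah!"><div class="messageContent"> First post </div></li>
  <li data-author="Someone Else"><div class="messageContent">A reply</div></li>
  <li data-author="The Wizard of Woah!"><div class="messageContent">Second post</div></li>
</ol>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
specific_messages = soup.find_all("li", {"data-author": "The Wizard of Woah!"})

# Walk the matches one by one and collect the text of each post.
contents = []
for message in specific_messages:
    body = message.find("div", {"class": "messageContent"})
    contents.append(body.text.strip())  # strip() drops surrounding whitespace

print(contents)
```

Each item in the result is a tag object, so the same find and attribute lookups used on soup also work on it.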

That's all there is to it.
