BeautifulSoup: every dictionary in the list ends up holding the last result

[{'link': '/terms'}, {'link': '/terms'}, {'link': '/terms'}, {'link': '/terms'}, {'link': '/terms'}, {'link': '/terms'}, {'link': '/terms'}, {'link': '/terms'}, {'link': '/terms'}, {'link': '/terms'}, {'link': '/terms'}, {'link': '/terms'}, {'link': '/terms'}, {'link': '/terms'}, {'link': '/terms'}, {'link': '/terms'}, {'link': '/terms'}]

import json
import requests
from bs4 import BeautifulSoup

tags_dict = {}
tags_list = []
r = requests.get("http://chicosadventures.com/")
soup = BeautifulSoup(r.content, "lxml")

for link in soup.find_all('a'):
    tags_dict['link'] = link.get('href')
    tags_list.append(tags_dict)

dump = json.dumps(tags_list)
print(dump)

1 Answer

User

#1 · Posted 2024-04-25 22:30:30

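Your problem is tags_dict. You create that dictionary once, outside the loop, so every append stores another reference to the same object in the list. Each iteration overwrites its 'link' key, and because all the list entries point to that one dictionary, they all show the last value. I changed the code to create a new dict object on each iteration, and now it works as expected: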

import json
import requests
from bs4 import BeautifulSoup

tags_list = []
r = requests.get("http://chicosadventures.com/")
soup = BeautifulSoup(r.content, "lxml")

for link in soup.find_all('a'):
    tags_list.append({"link": link.get('href')})

dump = json.dumps(tags_list)
print(dump)

Output:

[{"link": "/"}, {"link": "/about_chico"}, {"link": "/about_the_author"}, {"link": "/about_the_illustrator"}, {"link": "/chico_in_the_news_"}, {"link": "/order_your_copy"}, {"link": "/contact_us"}, {"link": "/about_chico"}, {"link": "/about_the_author"}, {"link": "/about_the_illustrator"}, {"link": "/chico_in_the_news_"}, {"link": "/order_your_copy"}, {"link": "/contact_us"}, {"link": "/privacy"}, {"link": "javascript:print()"}, {"link": "http://www.ebtech.net/"}, {"link": "/terms"}]
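
The reference behavior described above can be reproduced without any network request. The following is a minimal sketch, assuming a short hand-written list named hrefs (a hypothetical stand-in for the values returned by link.get('href')); the names shared, aliased, and fresh are also illustrative and not part of the original code.

```python
# Hypothetical href values standing in for the scraped links.
hrefs = ["/", "/about_chico", "/terms"]

# Buggy pattern: one dict object is mutated and appended repeatedly.
shared = {}
aliased = []
for href in hrefs:
    shared["link"] = href
    aliased.append(shared)

# Fixed pattern: a new dict is built on every iteration.
fresh = [{"link": href} for href in hrefs]

print(aliased)  # every entry shows the last href
print(fresh)    # each entry keeps its own href

# Identity check: all entries of the buggy list are the same object.
print(all(entry is shared for entry in aliased))
```

The identity check with `is` is what distinguishes the two lists: the buggy list holds several references to one dictionary, while the list comprehension builds a separate dictionary per link.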
