Modify an HTML file (find and replace href URLs, then save)

from bs4 import BeautifulSoup

# Dict which maps each original link to the link that should replace it
links_dict = {
    original_link1: my_link1,
    original_link2: my_link2,
}  # and so on...

# Get a list of the original links to look for in the HTML file
original_links = list(links_dict)

# Open the file with the right encoding, then hand it to BeautifulSoup
# (BeautifulSoup itself takes no 'encoding' argument)
with open(html_file, encoding="utf8") as file:
    soup = BeautifulSoup(file, "html.parser")

# This part is where I am stuck: the idea is to loop through the links in the
# page and, if any of them is in 'links_dict', replace it with the one I have there
for link in soup.find_all('a', href=True):
    if link['href'] in links_dict:
        link['href'] = links_dict[link['href']]

with open("new_file.html", "w", encoding="utf8") as file:
    file.write(str(soup))

1 answer

User

#1 · Posted 2024-04-26 18:54:57

Once you have a soup to work with, look for the 'a' elements, check their 'href' attributes, and replace any that match a key in your dict.
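The steps above could be sketched roughly as follows. This is only a minimal sketch, assuming exact string matches between each 'href' and the dict keys; the function name `replace_links`, the sample HTML, and the example.* URLs are made up for illustration and do not come from the question.

```python
from bs4 import BeautifulSoup


def replace_links(html, links_dict):
    """Return html with every <a href> found in links_dict swapped for its replacement.

    Hypothetical helper: only hrefs that exactly equal a dict key are changed.
    """
    soup = BeautifulSoup(html, "html.parser")
    for link in soup.find_all("a", href=True):
        new_href = links_dict.get(link["href"])
        if new_href is not None:
            link["href"] = new_href
    return str(soup)


# Illustrative data only (assumed, not from the original post)
sample_html = (
    '<p><a href="http://old.example/a">A</a> '
    '<a href="http://keep.example/">B</a></p>'
)
sample_map = {"http://old.example/a": "http://new.example/a"}

result = replace_links(sample_html, sample_map)
print(result)
```

Working on a string rather than a file keeps the helper easy to try out; reading the file first and writing the returned string back out would cover the save step.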

I would turn 'original_link1' and so on into regular expressions, so that matching becomes easy.
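One way the regexp idea might look, as a sketch under assumptions: the dict keys are treated as regular-expression patterns and the first pattern that matches an 'href' wins. The name `replace_links_by_pattern`, the pattern, and the example.* URLs are hypothetical and only there to show the shape of it.

```python
import re

from bs4 import BeautifulSoup


def replace_links_by_pattern(html, pattern_map):
    """Rewrite each <a href> using the first regex in pattern_map that matches it.

    Hypothetical helper: keys are regex patterns, values are replacement strings.
    """
    soup = BeautifulSoup(html, "html.parser")
    compiled = [(re.compile(p), r) for p, r in pattern_map.items()]
    for link in soup.find_all("a", href=True):
        for pattern, replacement in compiled:
            if pattern.search(link["href"]):
                # Replace only the matched part; the rest of the URL is kept
                link["href"] = pattern.sub(replacement, link["href"])
                break
    return str(soup)


# Illustrative data only (assumed, not from the original post)
sample_html = (
    '<a href="http://old.example/docs/page1">1</a>'
    '<a href="https://old.example/docs/page2">2</a>'
)
sample_patterns = {r"^https?://old\.example/docs/": "https://new.example/manual/"}

result = replace_links_by_pattern(sample_html, sample_patterns)
print(result)
```

Compared with exact dict lookups, patterns let one entry cover variants such as http versus https, at the cost of having to escape literal dots and anchor the pattern carefully.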

As it happens, I believe your question has already been answered; see "BeautifulSoup - modifying all links in a piece of HTML?"
