Link is not printed

Posted 2024-04-25 04:30:54

I'm trying to scrape an RSS feed.

from bs4 import BeautifulSoup
import urllib2
import requests


url = raw_input("");
re=requests.get(url);

def rss_get_items(url):    
    request = urllib2.Request(url)
    response = urllib2.urlopen(request)
    soup = BeautifulSoup(response)

    for item_node in soup.find_all('item'):
        item = {}
        for subitem_node in item_node.findChildren():
            key = subitem_node.name
            value = subitem_node.text
            item[key] = value
        yield item

if __name__ == '__main__':
    for item in rss_get_items(url):
        print item['title']
        print item['pubdate']
        print item['link']
        print item['guid']
        print item['description']

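I got part of this script from an answer posted on this site, so credit goes to its author. I've forgotten the original post and the name of the user who posted it. Anyway, I can't print the link: item['link'] just comes out empty, and I'd like to know why.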

I can do it the way the docs show:

for link in soup.find_all('a'):
    print(link.get('href'))
# http://example.com/elsie
# http://example.com/lacie
# http://example.com/tillie

That works, but out of curiosity I'd like to know why the first approach doesn't print the link.

I'm using the aljazeera.com RSS feed.

1 answer

User

#1 · Posted 2024-04-25 04:30:54

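When scraping XML content, create the soup with an XML parser. An HTML parser treats <link> as a void element (one that cannot have content), so the URL text ends up outside the tag and item['link'] is an empty string.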

soup = BeautifulSoup(response, 'xml')
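
To see the difference without hitting the network, here is a minimal sketch that parses the same small feed with both parsers. It is written for Python 3 (the question's script is Python 2) and assumes bs4 and lxml are installed, since the 'xml' parser is backed by lxml. SAMPLE_FEED and the extra parser argument are made up for illustration; they are not part of the original script.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a real feed, so the example runs offline.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <item>
      <title>First story</title>
      <link>http://example.com/first</link>
      <pubDate>Thu, 25 Apr 2024 04:30:54 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def rss_get_items(markup, parser):
    """Yield one dict per <item>, mapping child tag name to its text."""
    soup = BeautifulSoup(markup, parser)
    for item_node in soup.find_all("item"):
        # Only direct child tags of <item>; True matches any tag name.
        children = item_node.find_all(True, recursive=False)
        yield {child.name: child.text for child in children}


if __name__ == "__main__":
    for parser in ("html.parser", "xml"):
        for item in rss_get_items(SAMPLE_FEED, parser):
            print(parser, repr(item["link"]), sorted(item))
```

With this sample, the html.parser run should show an empty link, while the xml run shows the URL. Note one side effect of switching: the XML parser keeps tag names case-sensitive, so the key is 'pubDate' rather than 'pubdate', and the question's print of item['pubdate'] would need to change accordingly.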
