Why does this Python script write only the last RSS post to the file?

Asked 2025-04-15 13:05

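I'm trying to fix a Python script that fetches the posts from a particular RSS feed, processes them, and writes them to a text file. As you can see below, there are two main groups of print statements. The first only prints to the command line when the script runs, but it shows all the posts, which is exactly what I want. The problem is with the second group. It writes only the last post from the RSS feed to the text file, instead of all the posts as the first group does. I also tried making the second part (after f = open()) match the first, using %s formatting instead of a separate print >>f line per variable.

If anyone can tell me why this script writes only one (the last) post from the RSS feed to the text file while it shows all of them on the command line, and what I need to change to fix it, I'd really appreciate it :)

Here is the code: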

import urllib
import sys
import xml.dom.minidom

#The url of the feed
address = 'http://www.vg.no/export/Alle/rdf.hbs?kat=nyheter'

#Our actual xml document
document = xml.dom.minidom.parse(urllib.urlopen(address))
for item in document.getElementsByTagName('item'):
    title = item.getElementsByTagName('title')[0].firstChild.data
    link = item.getElementsByTagName('link')[0].firstChild.data
    description = item.getElementsByTagName('description')[0].firstChild.data

    str = link.strip("http://go.vg.no/cgi-bin/go.cgi/rssart/")
    print "\n"
    print "------------------------------------------------------------------"
    print '''"%s"\n\n%s\n\n(%s)''' % (title.encode('UTF8', 'replace'),
                                            description.encode('UTF8','replace'),
                                            str.encode('UTF8','replace'))
    print "------------------------------------------------------------------"
    print "\n"

f = open('lawl.txt','w')
print >>f, "----------------------Nyeste paa VG-------------------------------"
print >>f, title.encode('UTF8','replace')
print >>f, description.encode('UTF8','replace')
print >>f, str.encode('UTF8','replace')
print >>f, "------------------------------------------------------------------"
print >>f, "\n"

2 Answers

You loop over all the posts, assign their attributes to some variables, and print them to the terminal.

Then you print those variables (which by now hold only the values from the last iteration) to the file. That is why you get just one post.
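A minimal illustration of this behavior, written for Python 3 and using a made-up list of titles instead of the real feed (the `titles` list is purely a placeholder):

```python
# Hypothetical stand-in for the feed items; not taken from the real RSS source.
titles = ["first post", "second post", "third post"]

for title in titles:
    # Runs once per item, so every title is shown.
    print(title)

# The loop has finished; `title` still refers to the last item it was given.
print("after the loop:", title)
```

Anything placed after the loop sees only that final value, which matches what ends up in the text file.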

If you want all the posts in the file, you need to loop over them again.
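One way to sketch the "loop again" approach, assuming Python 3 and assuming the posts have already been collected into a list (the `posts` data and the `write_posts` helper are hypothetical, introduced only for illustration):

```python
# Hypothetical stand-in for the parsed feed: (title, description, link id) tuples.
posts = [
    ("Title A", "Description A", "1001"),
    ("Title B", "Description B", "1002"),
]


def write_posts(posts, path):
    """Second pass: write every collected post to the file at `path`."""
    with open(path, "w", encoding="utf-8") as f:
        for title, description, link_id in posts:
            f.write("%s\n%s\n%s\n" % (title, description, link_id))
            f.write("-" * 66 + "\n")


# First pass: print each post to the terminal.
for title, description, link_id in posts:
    print('"%s"\n\n%s\n\n(%s)' % (title, description, link_id))
```

Calling `write_posts(posts, "lawl.txt")` after the first loop would then write both posts to the file, because the writing happens inside its own loop rather than once after it.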

Answered 2025-04-15 by Python大师

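Your print >>f statements come after the for loop, so they run only once and use only the data last stored in title, description, and str.

You should open the file before the for loop and move the print >>f statements inside the loop.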

import urllib
import sys
import xml.dom.minidom

#The url of the feed
address = 'http://www.vg.no/export/Alle/rdf.hbs?kat=nyheter'

f = open('lawl.txt','w')

#Our actual xml document
document = xml.dom.minidom.parse(urllib.urlopen(address))
for item in document.getElementsByTagName('item'):
    title = item.getElementsByTagName('title')[0].firstChild.data
    link = item.getElementsByTagName('link')[0].firstChild.data
    description = item.getElementsByTagName('description')[0].firstChild.data

    str = link.strip("http://go.vg.no/cgi-bin/go.cgi/rssart/")
    print "\n"
    print "------------------------------------------------------------------"
    print '''"%s"\n\n%s\n\n(%s)''' % (title.encode('UTF8', 'replace'),
                                            description.encode('UTF8','replace'),
                                            str.encode('UTF8','replace'))
    print "------------------------------------------------------------------"
    print "\n"

    print >>f, "----------------------Nyeste paa VG-------------------------------"
    print >>f, title.encode('UTF8','replace')
    print >>f, description.encode('UTF8','replace')
    print >>f, str.encode('UTF8','replace')
    print >>f, "------------------------------------------------------------------"
    print >>f, "\n"

# Close the file so buffered output is flushed to disk
f.close()
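For readers on Python 3, the same idea might look roughly like the sketch below. It is a port under stated assumptions, not the code from the answer: `write_items`, `text_of`, and `PREFIX` are names introduced here, and the prefix is removed with an explicit check because `str.strip()` removes a set of characters rather than a leading substring.

```python
import xml.dom.minidom

# Assumed link prefix, copied from the strip() call in the original script.
PREFIX = "http://go.vg.no/cgi-bin/go.cgi/rssart/"


def text_of(item, tag):
    """Return the text of the first <tag> element in item, or '' if it is empty."""
    node = item.getElementsByTagName(tag)[0].firstChild
    return node.data if node is not None else ""


def write_items(xml_source, out):
    """Write every <item> in xml_source to the open text file `out`.

    Returns the number of items written.
    """
    document = xml.dom.minidom.parseString(xml_source)
    count = 0
    for item in document.getElementsByTagName("item"):
        title = text_of(item, "title")
        description = text_of(item, "description")
        link = text_of(item, "link")
        # Cut the prefix explicitly instead of relying on str.strip().
        if link.startswith(PREFIX):
            link = link[len(PREFIX):]
        # The writes are inside the loop, so each item reaches the file.
        out.write("----------------------Nyeste paa VG-------------------------------\n")
        out.write(title + "\n")
        out.write(description + "\n")
        out.write(link + "\n")
        out.write("-" * 66 + "\n\n")
        count += 1
    return count


# Example usage (needs network access, so it is left commented out):
# import urllib.request
# address = "http://www.vg.no/export/Alle/rdf.hbs?kat=nyheter"
# with urllib.request.urlopen(address) as response, \
#         open("lawl.txt", "w", encoding="utf-8") as f:
#     write_items(response.read(), f)
```

The `with` block closes the file automatically, and passing the output file in as a parameter keeps the parsing loop easy to try against a small XML string without touching the network.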

Answered 2025-04-15 by Python大师
