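"IndexError: list index out of range" when loading data into a MySQL database
I get the error below when running my code. It does not appear right away; it occurs at random after 2 to 7 hours of running. Until then, streaming the online feed and writing to the database both work without problems.
Error message: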
Traceback (most recent call last):
  File "C:\Python27\MySQL_finalversion\RSS_common_FV.py", line 78, in <module>
    main()
  File "C:\Python27\MySQL_finalversion\RSS_common_FV.py", line 63, in main
    feed_iii = feed_load_iii(feed_url_iii)
  File "C:\Python27\MySQL_finalversion\RSS_common_FV.py", line 44, in feed_load_iii
    in feedparser.parse(feed_iii).entries]
IndexError: list index out of range
Here is my code:
import feedparser
import MySQLdb
import time
from cookielib import CookieJar

db = MySQLdb.connect(host="localhost",  # your host, usually localhost
                     user="root",  # your username - SELECT * FROM mysql.user
                     passwd="****",  # your password
                     db="sentimentanalysis_unicode",  # name of the database
                     charset="utf8")
cur = db.cursor()
cur.execute("SET NAMES utf8")
cur.execute("SET CHARACTER SET utf8")
cur.execute("SET character_set_connection=utf8")
cur.execute("DROP TABLE IF EXISTS feeddata_iii")
sql_iii = """CREATE TABLE feeddata_iii(III_ID INT NOT NULL AUTO_INCREMENT, PRIMARY KEY(III_ID),III_UnixTimesstamp integer,III_Timestamp varchar(255),III_Source varchar(255),III_Title varchar(255),III_Text TEXT,III_Link varchar(255),III_Epic varchar(255),III_CommentNr integer,III_Author varchar(255))"""
cur.execute(sql_iii)

def feed_load_iii(feed_iii):
    return [(time.time(),
             entry.published,
             'iii',
             entry.title,
             entry.summary,
             entry.link,
             (entry.link.split('=cotn:')[1]).split('.L&id=')[0],
             (entry.link.split('.L&id=')[1]).split('&display=')[0],
             entry.author)
            for entry
            in feedparser.parse(feed_iii).entries]

def main():
    feed_url_iii = "http://www.iii.co.uk/site_wide_discussions/site_wide_rss2.epl"
    feed_iii = feed_load_iii(feed_url_iii)
    print feed_iii[1][1]
    for item in feed_iii:
        cur.execute("""INSERT INTO feeddata_iii(III_UnixTimesstamp, III_Timestamp, III_Source, III_Title, III_Text, III_Link, III_Epic, III_CommentNr, III_Author) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s)""",item)
    db.commit()

if __name__ == "__main__":
    while True:
        main()
        time.sleep(240)
If you need more information, just ask. I would really appreciate your help!
Thanks and regards from London!
1 Answer
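In short, your program is not robust against malformed data.
Your code makes very specific assumptions about the structure of the data and cannot cope when the data does not have that structure. For example, entry.link.split('=cotn:')[1] raises IndexError whenever a link does not contain '=cotn:'. You need to detect the cases where the data is malformed and handle them some other way.
A fairly crude approach is to catch the error you are currently getting, along these lines: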
try:
    feed_iii = feed_load_iii(feed_url_iii)
except IndexError:
    # Report or handle the data format problem here.
    # Assuming this sits inside main(): skip this cycle and retry on the next poll.
    return
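A somewhat finer-grained option is to validate each link on its own, so that one malformed entry does not discard the whole feed. The following is only a minimal sketch of that idea, not a drop-in replacement: parse_link and split_entries are hypothetical helper names, and the sample links are made up for illustration. It works on plain strings so it can be tried without feedparser or a database.

```python
def parse_link(link):
    """Return (epic, comment_nr) from a link, or None if a marker is missing."""
    try:
        # Same splitting logic as in the question's feed_load_iii.
        epic = link.split('=cotn:')[1].split('.L&id=')[0]
        comment_nr = link.split('.L&id=')[1].split('&display=')[0]
    except IndexError:
        # The link does not contain the expected '=cotn:' or '.L&id=' marker.
        return None
    return epic, comment_nr


def split_entries(links):
    """Separate well-formed links from malformed ones instead of failing on the first bad one."""
    good, bad = [], []
    for link in links:
        parsed = parse_link(link)
        if parsed is None:
            bad.append(link)
        else:
            good.append((link, parsed[0], parsed[1]))
    return good, bad


# Hypothetical sample links, invented for this example only.
sample_links = [
    "http://example.com/view?x=cotn:ABC.L&id=123&display=discussion",
    "http://example.com/view?id=456",  # no '=cotn:' marker
]
good, bad = split_entries(sample_links)
```

In your script you could apply the same check to entry.link inside feed_load_iii, insert only the well-formed rows, and log the malformed links so you can see what the feed actually sent when it failed.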