IndexError: list index out of range when I run two Python scripts at the same time

Posted 2024-06-16 12:43:53


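I want to scrape data from a website that is organized by province. To speed things up, I tried running two or more Python scripts at the same time, each crawling a different province. The only difference between the scripts is the set of URLs they scrape. Each time, they run fine for the first 30 seconds to 1 minute. After that, every script fails with the following error, and the failures always happen at the same moment: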

Traceback (most recent call last):
  File "EOLGrades-A.py", line 157, in <module>
    dealEachCollege(url_2,tableName2,cNameList[i])
  File "EOLGrades-A.py", line 58, in dealEachCollege
    insertData(getData(sp),tableName,collegeName)
  File "EOLGrades-A.py", line 33, in getData
    if FAtd[x+j].text == '--' or FAtd[x+j].text ==' ':
IndexError: list index out of range

Traceback (most recent call last):
  File "EOLGrades-B.py", line 157, in <module>
    dealEachCollege(url_2,tableName2,cNameList[i])
  File "EOLGrades-B.py", line 58, in dealEachCollege
    insertData(getData(sp),tableName,collegeName)
  File "EOLGrades-B.py", line 33, in getData
    if FAtd[x+j].text == '--' or FAtd[x+j].text ==' ':
IndexError: list index out of range

My getData method:

(The getData code block was not preserved in this copy of the post.)

count comes from the getCount method:

def getCount(soup):
    FAtr = soup.find_all(name='tr')
    count = len(FAtr) - 1
    return count
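
One possible explanation, which the post does not confirm: count is derived from the number of `tr` elements, while getData indexes into the list of `td` cells. If the server returns a different page under concurrent load (for example an error or rate-limit page), the cell list can be shorter than count implies, and `FAtd[x+j]` raises IndexError. The sketch below shows a length check before indexing. The name `extract_rows`, the flat list of cell strings, and the six-columns-per-row layout (inferred from the six `m[i][...]` fields in insertData) are assumptions for illustration, not the author's code.

```python
def extract_rows(cell_texts, count, cols=6):
    """Group a flat list of cell strings into `count` rows of `cols` fields.

    Hypothetical helper: returns None when the page holds fewer cells
    than expected, instead of raising IndexError while indexing.
    """
    needed = count * cols
    if count <= 0 or len(cell_texts) < needed:
        return None  # incomplete or unexpected page; caller can retry or skip

    rows = []
    for i in range(count):
        start = i * cols
        row = []
        for j in range(cols):
            text = cell_texts[start + j].strip()
            # Assumption: '--' and blank cells stand for missing values.
            row.append(None if text in ('--', '') else text)
        rows.append(row)
    return rows
```

With BeautifulSoup, the input could be built as `[td.text for td in soup.find_all(name='td')]`. A None result signals that the page should be fetched again or logged, rather than parsed.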

The dealEachCollege method:

def dealEachCollege(URL,tableName,collegeName):
    page = s.get(URL,headers=headers)
    page.encoding='utf-8'
    sp = BeautifulSoup(page.text,"html.parser")
    count = getCount(sp)
    insertData(getData(sp,count),tableName,collegeName,count)
    page.close()

The insertData method:

def insertData(dataList,tableName,collegeName,count):
    try:
        m=dataList
        if m[0]=='0':
            return
        for i in range(count):
            cursor.execute("INSERT INTO " + tableName + " VALUES (%s,%s,%s,%s,%s,%s,%s)",(collegeName,m[i][0],m[i][1],m[i][2],m[i][3],m[i][4],m[i][5]))
            conn.commit() 
        print ("Successfully inserted into %s." %tableName)
    except pymysql.Error as e:
        print ("Mysql Error %d: %s" %(e.args[0], e.args[1]))

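When I run only one script, the error does not occur. Can anyone tell me how to fix it, or suggest another way to run two Python crawlers at the same time? Thanks a lot!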
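
As one alternative to launching separate scripts, the URL sets could be fetched from a single process with a small thread pool, retrying any page that comes back incomplete. This is a minimal sketch under stated assumptions, not a confirmed fix: `fetch_with_retry`, `crawl_all`, and the `fetch` and `is_complete` callables are hypothetical names. In the code above, `fetch` might wrap `s.get(URL, headers=headers)` and `is_complete` might check that the parsed page contains the expected number of cells.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_with_retry(fetch, url, is_complete, retries=3, delay=1.0):
    """Call fetch(url) until is_complete(result) is true or retries run out."""
    for attempt in range(retries):
        result = fetch(url)
        if is_complete(result):
            return result
        if attempt < retries - 1:
            # Back off a little longer after each failed attempt.
            time.sleep(delay * (attempt + 1))
    return None  # every attempt returned an incomplete page


def crawl_all(fetch, urls, is_complete, workers=2, retries=3, delay=1.0):
    """Fetch every URL with a small thread pool; map each URL to its result."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda u: fetch_with_retry(fetch, u, is_complete, retries, delay),
            urls,
        )
        return dict(zip(urls, results))
```

Keeping the database inserts in the main thread, after `crawl_all` returns, avoids sharing one pymysql connection between threads. A low `workers` value and a nonzero `delay` also reduce the chance that the site starts returning incomplete pages.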

