How do I apply a "catch-all" exception clause to a complex Python web-scraping script?

import csv, urllib2, re

def replace(variab):
    return variab.replace(",", " ")

urls = csv.reader(open('input100.txt', 'rb'))  # access list of 100 URLs
for url in urls:
    html = urllib2.urlopen(url[0]).read()  # get HTML starting with the first URL
    col7 = re.findall('td7.*?td', html)  # use regex to get data from column 7
    string = str(col7)  # stringify data
    neat = re.findall('div3.*?div', string)  # use regex to get target text
    result = map(replace, neat)  # apply function to remove ','s from elements
    string2 = ", ".join(result)  # separate list elements with ', ' for export to csv
    output = open('output.csv', 'ab')  # open file for writing
    output.write(string2 + '\n')  # append output to file and create new line
    output.close()

2 Answers

User

Answer 1 · edited 2024-04-19 18:24:04

Make the body of the for loop:

# requires "import sys" at the top of the script
for url in urls:
  try:
    ...the body you have now...
  except Exception, e:
    print>>sys.stderr, "Url %r not processed: error (%s)" % (url, e)

(Or, if you're already using the standard library's logging module, and you should ;-), use logging.error instead of the goofy print>>.)
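The logging variant mentioned above could look roughly like the sketch below. The names process_all and process_one and the logger name "scraper" are hypothetical and not part of the original script. The per-URL work is passed in as a function so the sketch stays self-contained. It uses the "except Exception as e" spelling, which works on Python 2.6+ and Python 3, rather than the older "except Exception, e" form.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("scraper")  # hypothetical logger name


def process_all(urls, process_one):
    """Call process_one(url) for each URL; log any failure and continue.

    Returns the list of URLs that raised an exception.
    """
    failed = []
    for url in urls:
        try:
            process_one(url)
        except Exception as e:  # deliberately broad: one bad URL must not stop the run
            logger.error("Url %r not processed: error (%s)", url, e)
            failed.append(url)
    return failed
```

In the script from the question, the current loop body (fetch, regex, write) would move into the function passed as process_one. The returned list makes it easy to retry or inspect the URLs that failed.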

User

Answer 2 · edited 2024-04-19 18:24:04

I'd suggest reading the "Errors and Exceptions" chapter of the Python documentation, especially section 8.3, "Handling Exceptions".
