Unable to handle DeadlineExceededError in UrlFetch

16 votes
2 answers
6274 views
Asked 2025-04-16 16:08

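I have a basic utility class that fetches (possibly) shortened URLs in parallel and returns a dictionary mapping each requested URL to its final URL. It relies on the wait_any functionality described in this blog post.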

class UrlFetcher(object):

  @classmethod
  def fetch_urls(cls,url_list):
    rpcs = []
    for url in url_list:
      rpc = urlfetch.create_rpc(deadline=5.0)
      urlfetch.make_fetch_call(rpc, url,method = urlfetch.HEAD)
      rpcs.append(rpc)

    result = {}
    while len(rpcs) > 0:
      rpc = apiproxy_stub_map.UserRPC.wait_any(rpcs)
      rpcs.remove(rpc)
      request_url = rpc.request.url()
      try:
        final_url = rpc.get_result().final_url
      except AttributeError:
        final_url = request_url
      except DeadlineExceededError:
        logging.error('Handling DeadlineExceededError for url: %s' %request_url)
        final_url  = None
      except (DownloadError,InvalidURLError):
        final_url  = None        
      except UnicodeDecodeError: #Funky url with very evil characters
        final_url = unicode(rpc.get_result().final_url,'utf-8')

      result[request_url] = final_url

    logging.info('Returning results: %s' %result)
    return result

Even though I try to handle DeadlineExceededError, the application log tells a different story:

2011-04-20 17:06:17.755
UrlFetchWorker started
E 2011-04-20 17:06:22.769
The API call urlfetch.Fetch() took too long to respond and was cancelled.
Traceback (most recent call last):
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 636, in __call__
    handler.post(*groups)
  File "/base/data/home/apps/tweethitapp/1.349863151373877476/tweethit/handlers/taskworker.py", line 80, in post
    result_dict = UrlFetcher.fetch_urls(fetch_targets)
  File "/base/data/home/apps/tweethitapp/1.349863151373877476/tweethit/utils/rpc.py", line 98, in fetch_urls
    final_url = rpc.get_result().final_url
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/apiproxy_stub_map.py", line 592, in get_result
    return self.__get_result_hook(self)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/urlfetch.py", line 345, in _get_fetch_result
    rpc.check_success()
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/apiproxy_stub_map.py", line 558, in check_success
    self.__rpc.CheckSuccess()
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/apiproxy_rpc.py", line 133, in CheckSuccess
    raise self.exception
DeadlineExceededError: The API call urlfetch.Fetch() took too long to respond and was cancelled.
W 2011-04-20 17:06:22.858
Found 1 RPC request(s) without matching response (presumably due to timeouts or other errors)

What am I missing here? Is there another way to handle DeadlineExceededError?

Or are there different kinds of DeadlineExceededError, and I am importing the wrong one? I am using: from google.appengine.runtime import DeadlineExceededError

2 Answers

11

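As you guessed, and as others have mentioned, you need a different DeadlineExceededError.

From https://developers.google.com/appengine/articles/deadlineexceedederrors, dated June 2012:

Currently, there are several errors named DeadlineExceededError for the Python runtime:

google.appengine.runtime.DeadlineExceededError: raised if the overall request times out, typically after 60 seconds, or after 10 minutes for task queue requests;

google.appengine.runtime.apiproxy_errors.DeadlineExceededError: raised if an RPC exceeds its deadline. This is typically 5 seconds, but it is settable for some APIs using the 'deadline' option;

google.appengine.api.urlfetch_errors.DeadlineExceededError: raised if the URLFetch times out.

Catching google.appengine.api.urlfetch_errors.DeadlineExceededError seems to work for me. It is also worth noting that (at least in development app server 1.7.1) urlfetch_errors.DeadlineExceededError is a subclass of DownloadError, which makes sense: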

class DeadlineExceededError(DownloadError):
  """Raised when we could not fetch the URL because the deadline was exceeded.

  This can occur with either the client-supplied 'deadline' or the system
  default, if the client does not supply a 'deadline' parameter.
  """
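To make the consequence of that subclass relationship concrete, here is a minimal sketch. It assumes nothing beyond the hierarchy described above: the three classes are hypothetical stand-ins defined locally (not the real App Engine classes), and classify is an illustrative helper name. It shows that a handler for the runtime error does not catch the URLFetch timeout, and that a handler for the timeout has to come before the one for DownloadError if the two cases should be told apart.

```python
# Stand-in classes that only mirror the relationships described above.
# They are NOT the real google.appengine classes.
class DownloadError(Exception):
    """Stand-in for google.appengine.api.urlfetch_errors.DownloadError."""


class UrlfetchDeadlineExceededError(DownloadError):
    """Stand-in for google.appengine.api.urlfetch_errors.DeadlineExceededError."""


class RuntimeDeadlineExceededError(Exception):
    """Stand-in for google.appengine.runtime.DeadlineExceededError."""


def classify(exc):
    """Raise exc and report which except clause handled it (hypothetical helper)."""
    try:
        raise exc
    except RuntimeDeadlineExceededError:
        # Unrelated class: never matches a URLFetch timeout.
        return 'request deadline'
    except UrlfetchDeadlineExceededError:
        # The more specific subclass must be listed before DownloadError.
        return 'urlfetch deadline'
    except DownloadError:
        return 'download error'


if __name__ == '__main__':
    print(classify(UrlfetchDeadlineExceededError('timed out')))
    print(classify(DownloadError('connection refused')))
```

If the DownloadError clause were listed first, it would also swallow the timeout, because except clauses are tried in order and match subclasses.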
21

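From the inline docs for google.appengine.runtime.DeadlineExceededError:

Exception raised when the request reaches its overall time limit.

Not to be confused with runtime.apiproxy_errors.DeadlineExceededError. That one is raised when individual API calls take too long.

This is also a good illustration of why you should use qualified imports (from google.appengine import runtime, then refer to runtime.DeadlineExceededError)!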
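The point about qualified imports can be illustrated without the SDK. The sketch below is only an assumption-laden stand-in: make_module and caught_by are hypothetical helpers, and the two modules are built in memory with types.ModuleType rather than imported from google.appengine. It shows that two classes sharing the name DeadlineExceededError are still distinct, so an except clause written against one never catches the other; keeping the module prefix makes it visible which one is meant.

```python
import types


def make_module(name):
    """Build a stand-in module that defines its own DeadlineExceededError."""
    module = types.ModuleType(name)
    module.DeadlineExceededError = type('DeadlineExceededError', (Exception,), {})
    return module


# Hypothetical stand-ins for google.appengine.runtime and
# google.appengine.api.urlfetch_errors.
runtime = make_module('runtime')
urlfetch_errors = make_module('urlfetch_errors')


def caught_by(handler_class, raised_class):
    """Return True if an 'except handler_class' clause catches raised_class."""
    try:
        raise raised_class('timed out')
    except handler_class:
        return True
    except Exception:
        return False


if __name__ == '__main__':
    # Same class name, different classes: the handler does not match.
    print(caught_by(runtime.DeadlineExceededError,
                    urlfetch_errors.DeadlineExceededError))
    # The qualified name makes the intended class explicit.
    print(caught_by(urlfetch_errors.DeadlineExceededError,
                    urlfetch_errors.DeadlineExceededError))
```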
