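Mocking urllib2.urlopen and lxml.etree.parse with pymox
I'm trying to test some Python code that uses urllib2 and lxml.
I've seen plenty of blog posts and Stack Overflow questions about testing the exceptions urllib2 raises, but I haven't found an example that tests a successful call.
Am I going about this the right way?
Can anyone suggest how to get this working?
Here is my code so far: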
import mox
import urllib
import urllib2
import socket
from lxml import etree
# set up the test
m = mox.Mox()
response = m.CreateMock(urllib.addinfourl)
response.fp = m.CreateMock(socket._fileobject)
response.name = None # Needed because the file name is checked.
response.fp.read().AndReturn("""<?xml version="1.0" encoding="utf-8"?>
<foo>bar</foo>""")
response.geturl().AndReturn("http://rss.slashdot.org/Slashdot/slashdot")
response.read = response.fp.read # Needed since __init__ is not called on addinfourl.
m.StubOutWithMock(urllib2, 'urlopen')
urllib2.urlopen(mox.IgnoreArg(), timeout=10).AndReturn(response)
m.ReplayAll()
# code under test
response2 = urllib2.urlopen("http://rss.slashdot.org/Slashdot/slashdot", timeout=10)
# Note: response2.fp.read() and response2.read() do not behave the same, as defined above.
# In [21]: response2.fp.read()
# Out[21]: '<?xml version="1.0" encoding="utf-8"?>\n<foo>bar</foo>'
# In [22]: response2.read()
# Out[22]: <mox.MockMethod object at 0x97f326c>
xcontent = etree.parse(response2)
# verify test
m.VerifyAll()
It fails with this error:
Traceback (most recent call last):
  File "/home/jon/mox_question.py", line 22, in <module>
    xcontent = etree.parse(response2)
  File "lxml.etree.pyx", line 2583, in lxml.etree.parse (src/lxml/lxml.etree.c:25057)
  File "parser.pxi", line 1487, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:63708)
  File "parser.pxi", line 1517, in lxml.etree._parseFilelikeDocument (src/lxml/lxml.etree.c:63999)
  File "parser.pxi", line 1400, in lxml.etree._parseDocFromFilelike (src/lxml/lxml.etree.c:62985)
  File "parser.pxi", line 990, in lxml.etree._BaseParser._parseDocFromFilelike (src/lxml/lxml.etree.c:60508)
  File "parser.pxi", line 542, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:56659)
  File "parser.pxi", line 624, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:57472)
  File "lxml.etree.pyx", line 235, in lxml.etree._ExceptionContext._raise_if_stored (src/lxml/lxml.etree.c:6222)
  File "parser.pxi", line 371, in lxml.etree.copyToBuffer (src/lxml/lxml.etree.c:55252)
TypeError: reading from file-like objects must return byte strings or unicode strings
This happens because response.read() does not return what I expected.
3 Answers
0
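It looks like your error has nothing to do with mox. The failing line reads from response2, and response2 comes from a direct call to slashdot. Maybe you could inspect that object to see what it contains?
Edit: I had missed the line m.StubOutWithMock(urllib2, 'urlopen') above, so I thought you were comparing two calls, one mocked (response) and one not (response2). An updated answer follows.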
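One way to do that inspection is to look at the type of whatever read() hands back before passing the object to a parser. The sketch below is only an illustration: it uses Python 3's standard unittest.mock rather than mox, and describe_read_result is a hypothetical helper, not something from the question.

```python
from unittest import mock


def describe_read_result(response):
    """Return the type name of whatever response.read() hands back."""
    return type(response.read()).__name__


# A mock whose read() was never configured returns another mock object,
# which a parser cannot treat as XML text.
unconfigured = mock.Mock()

# A mock whose read() is configured returns the XML payload as bytes.
configured = mock.Mock()
configured.read.return_value = (
    b'<?xml version="1.0" encoding="utf-8"?>\n<foo>bar</foo>'
)

print(describe_read_result(unconfigured))
print(describe_read_result(configured))
```

If the first kind of result shows up in a test, the parser is being fed a mock instead of a string, which matches the TypeError in the question.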
2
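As Peter said, and I'd add that you probably don't need to worry about lxml's internals any more than you need to worry about urllib2's. By mocking lxml.etree as well, you completely isolate the code you actually need to test, namely the code you wrote yourself. Below is an example that shows how to do this, and that also demonstrates using a mock object to test the response.getcode() call.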
import mox
from lxml import etree
import urllib2


class TestRssDownload(mox.MoxTestBase):

    def test_rss_download(self):
        expected_response = self.mox.CreateMockAnything()
        self.mox.StubOutWithMock(urllib2, 'urlopen')
        self.mox.StubOutWithMock(etree, 'parse')
        self.mox.StubOutWithMock(etree, 'iterwalk')
        title_elem = self.mox.CreateMock(etree._Element)
        title_elem.text = 'some title'

        # Set expectations
        urllib2.urlopen("http://rss.slashdot.org/Slashdot/slashdot", timeout=10).AndReturn(expected_response)
        expected_response.getcode().AndReturn(200)
        etree.parse(expected_response).AndReturn('some parsed content')
        etree.iterwalk('some parsed content', tag='{http://purl.org/rss/1.0/}title').AndReturn([('end', title_elem),])

        # Code under test
        self.mox.ReplayAll()
        self.production_code()

    def production_code(self):
        response = urllib2.urlopen("http://rss.slashdot.org/Slashdot/slashdot", timeout=10)
        response_code = response.getcode()
        if 200 != response_code:
            raise Exception('Houston, we have a problem ({0})'.format(response_code))
        tree = etree.parse(response)
        for ev, elem in etree.iterwalk(tree, tag='{http://purl.org/rss/1.0/}title'):
            # Do something with elem.text
            print('{0}: {1}'.format(ev, elem.text))
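For readers on Python 3, where urllib2 and mox are not available, the same isolation idea can be sketched with the standard library. This is an assumption-laden translation, not the code above: it swaps in unittest.mock for mox, urllib.request for urllib2, and xml.etree.ElementTree for lxml, and the names fetch_titles and run_with_mocks are hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET
from unittest import mock

FEED_URL = "http://rss.slashdot.org/Slashdot/slashdot"


def fetch_titles(url=FEED_URL):
    """Hypothetical production code: download a feed and list its titles."""
    response = urllib.request.urlopen(url, timeout=10)
    code = response.getcode()
    if code != 200:
        raise RuntimeError("unexpected status ({0})".format(code))
    tree = ET.parse(response)
    return [elem.text for elem in tree.iter("title")]


def run_with_mocks(status=200):
    """Run fetch_titles with both the network call and the parser mocked."""
    fake_response = mock.Mock()
    fake_response.getcode.return_value = status

    # Stand-in for the parsed tree: iter() yields one fake title element.
    title = mock.Mock()
    title.text = "some title"
    fake_tree = mock.Mock()
    fake_tree.iter.return_value = [title]

    with mock.patch("urllib.request.urlopen",
                    return_value=fake_response) as fake_urlopen, \
            mock.patch.object(ET, "parse",
                              return_value=fake_tree) as fake_parse:
        titles = fetch_titles()

    # Only reached when no exception was raised by fetch_titles.
    fake_urlopen.assert_called_once_with(FEED_URL, timeout=10)
    fake_parse.assert_called_once_with(fake_response)
    return titles
```

Calling run_with_mocks() exercises the success path, and run_with_mocks(status=500) exercises the error branch, all without touching the network or a real parser.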
4
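I don't think you need to dig into urllib2's internals; they probably aren't what you care about. Here is a simple approach that uses StringIO. The key point is that whatever you parse only needs to behave like a file in the duck-typing sense; it doesn't have to be a real addinfourl instance.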
import StringIO
import mox
import urllib2
from lxml import etree
# set up the test
m = mox.Mox()
response = StringIO.StringIO("""<?xml version="1.0" encoding="utf-8"?>
<foo>bar</foo>""")
m.StubOutWithMock(urllib2, 'urlopen')
urllib2.urlopen(mox.IgnoreArg(), timeout=10).AndReturn(response)
m.ReplayAll()
# code under test
response2 = urllib2.urlopen("http://rss.slashdot.org/Slashdot/slashdot", timeout=10)
xcontent = etree.parse(response2)
# verify test
m.VerifyAll()
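The same duck-typing trick should carry over to Python 3. The following is a rough sketch under stated assumptions rather than a port of the code above: it uses io.BytesIO in place of StringIO, unittest.mock in place of mox, urllib.request in place of urllib2, and the standard library's xml.etree.ElementTree in place of lxml. parse_feed and run_example are hypothetical names.

```python
import io
import urllib.request
import xml.etree.ElementTree as ET
from unittest import mock

URL = "http://rss.slashdot.org/Slashdot/slashdot"
XML_PAYLOAD = b'<?xml version="1.0" encoding="utf-8"?>\n<foo>bar</foo>'


def parse_feed(url):
    """Hypothetical code under test: fetch a URL and parse the body as XML."""
    response = urllib.request.urlopen(url, timeout=10)
    return ET.parse(response)


def run_example():
    # BytesIO has a read() method that returns bytes, which is all the
    # parser needs from the "response" object.
    fake_response = io.BytesIO(XML_PAYLOAD)
    with mock.patch("urllib.request.urlopen",
                    return_value=fake_response) as fake_urlopen:
        tree = parse_feed(URL)
    fake_urlopen.assert_called_once_with(URL, timeout=10)
    root = tree.getroot()
    return root.tag, root.text
```

Because the fake response is a real in-memory file object, nothing about the response class itself has to be mocked.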