Yahoo BOSS Python library: ExpatError

0 votes
2 answers
1335 views
Asked 2025-04-15 14:19

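I tried to install the Yahoo BOSS mashup framework, but ran into problems running the bundled examples. Examples 1, 2, 5 and 6 work fine, but examples 3 and 4 fail with an Expat error. Here is the output of ex3.py: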

gpython examples/ex3.py
examples/ex3.py:33: Warning: 'as' will become a reserved keyword in Python 2.6
Traceback (most recent call last):
  File "examples/ex3.py", line 27, in <module>
    digg = db.select(name="dg", udf=titlef, url="http://digg.com/rss_search?search=google+android&area=dig&type=both&section=news")
  File "/usr/lib/python2.5/site-packages/yos/yql/db.py", line 214, in select
    tb = create(name, data=data, url=url, keep_standards_prefix=keep_standards_prefix)
  File "/usr/lib/python2.5/site-packages/yos/yql/db.py", line 201, in create
    return WebTable(name, d=rest.load(url), keep_standards_prefix=keep_standards_prefix)
  File "/usr/lib/python2.5/site-packages/yos/crawl/rest.py", line 38, in load
    return xml2dict.fromstring(dl)
  File "/usr/lib/python2.5/site-packages/yos/crawl/xml2dict.py", line 41, in fromstring
    t = ET.fromstring(s)
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 963, in XML
    parser.feed(text)
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 1245, in feed
    self._parser.Parse(data, 0)
xml.parsers.expat.ExpatError: syntax error: line 1, column 0

It looks like both examples fail when they try to query Digg.com. Here is the query that the ex3.py code builds:

diggf = lambda r: {"title": r["title"]["value"], "diggs": int(r["diggCount"]["value"])}
digg = db.select(name="dg", udf=diggf, url="http://digg.com/rss_search?search=google+android&area=dig&type=both&section=news")

2 Answers

0

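I think there must be a bug in this example: it gets back a JSON result. Indeed, if you paste that URL into a browser, you download a file named search.json whose contents begin with: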

{"results":[{"profile_image_url":
"http://a3.twimg.com/profile_images/255524395/KEN_OMALLEY_REVISED_normal.jpg",
"created_at":"Mon, 14 Sep 2009 14:52:07 +0000","from_user":"twilightlords",

This is perfectly ordinary JSON. But then, instead of parsing it with a module such as json or simplejson, the code tries to parse it as XML, and that obviously fails.

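I think the fix (which probably means telling whoever maintains this code so they can repair it) is either to request XML output instead of JSON, or to parse the returned JSON properly rather than treating it as XML. I am not sure how best to implement either change, since I am not familiar with that code.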
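To illustrate the second option, here is a minimal sketch of parsing a response according to what it actually contains. It is not the yos library's code: parse_payload and the two sample strings are hypothetical, and it is written for Python 3's standard library (on Python 2.5, which has no json module, simplejson would stand in for it). It also shows that feeding JSON text to ElementTree fails, as in the traceback above.

```python
import json
import xml.etree.ElementTree as ET


def parse_payload(text):
    """Hypothetical helper: return ("json", obj) or ("xml", element).

    The format is guessed from the first non-blank character, which is
    a rough heuristic and not how the yos library decides.
    """
    stripped = text.lstrip()
    if stripped.startswith(("{", "[")):
        return "json", json.loads(stripped)
    return "xml", ET.fromstring(stripped)


# Made-up sample payloads, loosely modelled on the snippets in this thread.
sample_json = '{"results": [{"from_user": "twilightlords"}]}'
sample_xml = "<rss><channel><title>demo</title></channel></rss>"

json_kind, json_data = parse_payload(sample_json)
xml_kind, xml_root = parse_payload(sample_xml)

# Handing the JSON text straight to the XML parser raises a parse error.
xml_parse_failed = False
try:
    ET.fromstring(sample_json)
except ET.ParseError:
    xml_parse_failed = True

print(json_kind, json_data["results"][0]["from_user"])
print(xml_kind, xml_root.find("channel/title").text)
print("XML parser rejected JSON:", xml_parse_failed)
```

In the real framework the decision would more likely belong where the URL is loaded (the traceback points at yos/crawl/rest.py), but that is a guess from the traceback alone.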

1

The problem is in the search string. The query parameter should be "s=", not "search=".
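As a sketch of that change, the snippet below renames the query parameter in the URL from the question without touching the rest of it. rename_query_param is a hypothetical helper, the code uses Python 3's urllib.parse (not the Python 2.5 from the traceback), and it only rewrites the string; whether the Digg endpoint then returns XML is not verified here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def rename_query_param(url, old, new):
    """Hypothetical helper: rename one query parameter, keeping order and values."""
    parts = urlsplit(url)
    pairs = [
        (new if key == old else key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(pairs)))


original_url = (
    "http://digg.com/rss_search"
    "?search=google+android&area=dig&type=both&section=news"
)
fixed_url = rename_query_param(original_url, "search", "s")
print(fixed_url)
```

The resulting string could then be passed as the url argument of db.select in ex3.py.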
