How to use ResultSet in PyES

Asked 2025-04-17 17:59

I am using PyES to work with ElasticSearch from Python. I usually build my queries like this:

from pyes import *

# Create connection to server.
conn = ES('127.0.0.1:9200')

# Create a filter to select documents with 'stuff' in the title.
myFilter = TermFilter("title", "stuff")

# Create query.
q = FilteredQuery(MatchAllQuery(), myFilter).search()

# Execute the query.
results = conn.search(query=q, indices=['my-index'])

print type(results)
# > <class 'pyes.es.ResultSet'>

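This works fine. The trouble starts when a query returns a large number of documents: converting the results into a list of dicts is computationally expensive, so I would like the query results returned as dicts directly. I found some documentation: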

http://pyes.readthedocs.org/en/latest/faq.html#id3
http://pyes.readthedocs.org/en/latest/references/pyes.es.html#pyes.es.ResultSet
https://github.com/aparo/pyes/blob/master/pyes/es.py (line 1304)

But I can't work out exactly how to do it. Based on the links above, I tried this:

from pyes import *
from pyes.query import *
from pyes.es import ResultSet
from pyes.connection import connect

# Create connection to server.
c = connect(servers=['127.0.0.1:9200'])

# Create a filter to select documents with 'stuff' in the title.
myFilter = TermFilter("title", "stuff")

# Create query / Search object.
q = FilteredQuery(MatchAllQuery(), myFilter).search()

# (How to) create the model ?
mymodel = lambda x, y: y

# Execute the query.
# class pyes.es.ResultSet(connection, search, indices=None, doc_types=None,
# query_params=None, auto_fix_keys=False, auto_clean_highlight=False, model=None)

resSet = ResultSet(connection=c, search=q, indices=['my-index'], model=mymodel)
# > resSet = ResultSet(connection=c, search=q, indices=['my-index'], model=mymodel)
# > TypeError: __init__() got an unexpected keyword argument 'search'

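Has anyone managed to get dicts out of a ResultSet? Any good suggestions for efficiently converting a ResultSet into a (list of) dicts would also be much appreciated.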
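
The TypeError above says that the installed ResultSet constructor has no parameter named `search`, so the installed version may not match the documented signature. One way to check which keywords a constructor actually accepts is to inspect it. The sketch below is only an illustration: `StandInResultSet` and its parameter names are hypothetical and are not the real pyes class. It uses `inspect.signature` from Python 3; on Python 2, `inspect.getargspec` plays a similar role.

```python
import inspect


class StandInResultSet(object):
    """Hypothetical stand-in, NOT the real pyes.es.ResultSet."""

    def __init__(self, connection, query, indices=None, model=None):
        self.connection = connection
        self.query = query
        self.indices = indices
        self.model = model


def accepted_keywords(cls):
    """Return the parameter names that cls.__init__ accepts, excluding self."""
    params = inspect.signature(cls.__init__).parameters
    return [name for name in params if name != "self"]


names = accepted_keywords(StandInResultSet)
print(names)
print("search" in names)
```

Running the same helper against the class that is actually installed would show whether the keyword needs a different name in that version.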

2 Answers

0

It's not that complicated: just iterate over the result set. For example, with a for loop:

for item in results:
   print item
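
Since the question is about large result sets, the loop above can also be written lazily, so that hits are turned into dicts one batch at a time instead of all at once. This is a minimal sketch under two assumptions: each hit can be passed to `dict()`, and a plain list of dicts stands in for a real ResultSet. The helper names `iter_dicts` and `in_batches` are made up for illustration.

```python
from itertools import islice


def iter_dicts(hits):
    """Lazily yield each hit as a plain dict, one at a time."""
    for hit in hits:
        yield dict(hit)


def in_batches(iterable, size):
    """Yield lists of at most `size` items without loading everything at once."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Stand-in data: a plain list of dicts in place of a real ResultSet.
fake_results = [{"title": "stuff %d" % i} for i in range(5)]

batches = list(in_batches(iter_dicts(fake_results), 2))
print([len(b) for b in batches])
```

Wrapping the batches in `list(...)` here is only for the demonstration; in practice each batch would be processed inside a for loop so that only one batch is held in memory.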

1

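I tried many ways to convert the ResultSet directly into a dict, but none of them worked. The best approach I have found recently is to add each item in the result set to another list or dict. The ResultSet itself behaves like a dict that contains each item.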

Here is the approach I use:

import json

# Create a response dictionary.
response = {"status_code": 200, "message": "Successful", "content": []}

# Copy every hit from the result set into the response content.
response["content"] = [result for result in resultset]

# Return a JSON string (this line belongs inside a function or view).
return json.dumps(response)
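
The same idea can be packaged as a function so that the `return` has somewhere to live. This is a sketch, not the answerer's exact code: `build_response` is a hypothetical name, a list of dicts stands in for a real ResultSet, and `default=str` is an added assumption to cope with values that `json` cannot encode natively.

```python
import json


def build_response(hits):
    """Wrap an iterable of dict-like hits in a JSON response string."""
    response = {
        "status_code": 200,
        "message": "Successful",
        "content": [dict(hit) for hit in hits],
    }
    # default=str is a fallback for values json cannot encode natively
    # (for example datetime objects).
    return json.dumps(response, default=str)


# Stand-in data in place of a real ResultSet.
fake_resultset = [{"title": "stuff"}, {"title": "more stuff"}]
payload = build_response(fake_resultset)
print(payload)
```

Copying each hit with `dict(hit)` keeps the response independent of whatever class the hits originally had, at the cost of one extra copy per document.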
