Elasticsearch unlimited size

9 votes

4 answers

21778 views

Asked 2025-04-18 14:44

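I am capturing network traffic and continuously inserting the traffic data into Elasticsearch. I then want to search that data from my Python script.

Here is a small part of my Python code: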

    test = es.search(
        index="argus_data",
        body=dict(query=search_body["query"],
                  size=1000),  # I want to make this "unlimited"
    )

    pprint(test)

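I do not know how large my data is, because new data keeps arriving. Please help me solve this. Thanks!

data-stream data-insertion elasticsearch network-monitoring real-time-search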

4 Answers


# The solution is to count the documents you have, then query with that count as the size

test = es.search(index=['indexname'])
size = test['hits']['total']

# size['value'] is the number of documents (Elasticsearch 7+ returns a dict here)

res = es.search(index="indexname", body={'size': size['value'], "query": {"match_all": {}}})

# You can loop over the returned hits like this

for hit in res['hits']['hits']:
    print(hit['_source']['fieldname'])

Answered 2025-04-18 by Python大师


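If you have more than 10,000 documents, you will get an error: by default, from + size cannot exceed the index.max_result_window setting, which is 10,000.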
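One way around the 10,000-hit window is the scroll API, which elasticsearch-py exposes as search(scroll=...), scroll(...) and clear_scroll(...). The following is a minimal sketch rather than a definitive implementation. The names iter_all_hits and InMemoryClient are made up for this example, and InMemoryClient is only a stand-in so the code runs without a cluster; with a real cluster you would pass your Elasticsearch client instead.

```python
def iter_all_hits(client, index, query, page_size=1000, keep_alive="2m"):
    """Yield every hit for `query`, one scroll page at a time."""
    response = client.search(
        index=index,
        scroll=keep_alive,  # how long the server keeps the scroll context alive
        body={"query": query, "size": page_size},
    )
    scroll_id = response["_scroll_id"]
    try:
        while response["hits"]["hits"]:
            for hit in response["hits"]["hits"]:
                yield hit
            response = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = response["_scroll_id"]
    finally:
        # Free the server-side scroll context once we are done.
        client.clear_scroll(scroll_id=scroll_id)


class InMemoryClient:
    """Hypothetical stand-in that mimics the three client calls used above."""

    def __init__(self, docs):
        self._docs = docs
        self._pos = 0
        self._size = 0
        self.cleared = False

    def _page(self):
        page = self._docs[self._pos:self._pos + self._size]
        self._pos += len(page)
        return {"_scroll_id": "demo", "hits": {"hits": page}}

    def search(self, index, scroll, body):
        self._pos, self._size = 0, body["size"]
        return self._page()

    def scroll(self, scroll_id, scroll):
        return self._page()

    def clear_scroll(self, scroll_id):
        self.cleared = True


# 25,000 fake documents: more than a single from/size window allows by default.
docs = [{"_source": {"n": n}} for n in range(25000)]
client = InMemoryClient(docs)
hits = list(iter_all_hits(client, "argus_data", {"match_all": {}}))
print(len(hits))
```

A scroll reflects the index as it was when the first search ran, so documents inserted afterwards only show up in a later run. The elasticsearch.helpers.scan helper wraps the same pattern if you prefer not to manage the scroll ID yourself.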

Answered 2025-04-18 by Python大师


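First, get the number of hits from test['hits']['total'], store it in a variable, and then pass it as size.

You need to run the query twice. The first time, use it to get the number of hits (do not pass the size parameter this time).

test=es.search(index=['test'],doc_type=['test'])
size=test['hits']['total']

Then run the query a second time, this time with size.

test=es.search(index=['test'],doc_type=['test'],size=size)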
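The shape of hits.total depends on the Elasticsearch version: before 7.0 it is a plain integer, and from 7.0 on it is an object with value and relation keys. A small helper can hide that difference. This is only a sketch: the name total_hits is made up here, and the two response dictionaries are hand-written samples, not real server output.

```python
def total_hits(response):
    """Return the hit count from a search response, whichever shape it uses."""
    total = response["hits"]["total"]
    if isinstance(total, dict):
        # Elasticsearch 7+: {"value": <count>, "relation": "eq" or "gte"}
        return total["value"]
    # Older versions: a plain integer
    return total


# Hand-written sample responses (assumed shapes, for illustration only).
old_style = {"hits": {"total": 42, "hits": []}}
new_style = {"hits": {"total": {"value": 42, "relation": "eq"}, "hits": []}}

print(total_hits(old_style), total_hits(new_style))
```

Note that on Elasticsearch 7+ the count is only tracked exactly up to 10,000 by default; a relation of "gte" means the value is a lower bound unless the request sets track_total_hits to true.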

Answered 2025-04-18 by Python大师


You can do it like this:

test=es.search(index=['test'],doc_type=['test'],size=1000, from_=0)

Then increase from_ step by step until you have fetched all the data.

from_ – starting offset (default: 0)
Elasticsearch-Py API documentation
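The paging idea above could be sketched as a loop that raises the offset by one page each time and stops at the first short page. This is an illustration under assumptions, not the library's own method: fetch_pages and FakeClient are made-up names, FakeClient only imitates a client so the example runs without a cluster, and the offset is passed as "from" inside the request body rather than as the from_ keyword.

```python
def fetch_pages(client, index, query, page_size=1000, max_window=10000):
    """Collect hits by increasing the offset one page at a time."""
    hits = []
    offset = 0
    # from + size must stay within index.max_result_window (10,000 by default).
    while offset + page_size <= max_window:
        response = client.search(
            index=index,
            body={"query": query, "from": offset, "size": page_size},
        )
        page = response["hits"]["hits"]
        hits.extend(page)
        if len(page) < page_size:
            break  # a short page means there is nothing left
        offset += page_size
    return hits


class FakeClient:
    """Hypothetical stand-in for an Elasticsearch client, for demonstration."""

    def __init__(self, docs):
        self._docs = docs

    def search(self, index, body):
        start = body["from"]
        return {"hits": {"hits": self._docs[start:start + body["size"]]}}


docs = [{"_source": {"n": n}} for n in range(2500)]
results = fetch_pages(FakeClient(docs), "test", {"match_all": {}})
print(len(results))
```

This approach stops at the result window, so it only returns everything when the index holds no more than max_window documents. Because new documents keep arriving, adding an explicit sort to the query is probably wise so that pages do not shift between requests.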

Answered 2025-04-18 by Python大师

