How to get unique, scored search results from Elasticsearch

Posted 2024-04-18 12:15:54


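I am trying to show only unique "description" rows from Elasticsearch. Among multiple duplicate rows that share the same description, I want to get just one. I don't want to aggregate, because I also need the information from the other columns. The code below is my attempt, but it does not work.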

  res = esconnection.search(index='data', body={
        # "query": {
        #     "match": {"description": query_input}
        # },
        # "size": 30

        "query": {
            "multi_match": {
                "description": query_input
            }
        },
        "aggs": {
            "top-descriptions": {
                "terms": {
                    "field": "description"
                },
                "aggs": {
                    "top_description_hits": {
                        "top_hits": {
                            "sort": [
                                {
                                    "_score": {
                                        "order": "desc"
                                    }
                                }
                            ],
                            "size": 1
                        }
                    }
                }
            }

        }
    })
    return res["hits"]["hits"]

1 Answer
User
#1 · Posted 2024-04-18 12:15:54

Field collapsing can be used to group documents by the value of a field. From the Elasticsearch documentation:

Allows to collapse search results based on field values. The collapsing is done by selecting only the top sorted document per collapse key. For instance the query below retrieves the best tweet for each user and sorts them by number of likes.
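
To make the quoted behaviour concrete, here is a minimal pure-Python sketch of what "selecting only the top sorted document per collapse key" means. It only illustrates the semantics and is not how Elasticsearch implements collapsing internally; the function name `collapse_hits` and the sample hits are made up for this example.

```python
def collapse_hits(hits, field):
    """Keep only the first (top-sorted) hit for each distinct value of `field`.

    `hits` is assumed to be sorted already, the way the search would sort it
    (by _score descending unless another sort is requested).
    """
    seen = set()
    collapsed = []
    for hit in hits:
        key = hit["_source"].get(field)
        if key in seen:
            continue  # a better-ranked hit with this key was already kept
        seen.add(key)
        collapsed.append(hit)
    return collapsed


# Hypothetical, already-sorted hits (not taken from the thread).
hits = [
    {"_id": "a", "_score": 2.0, "_source": {"description": "brown fox"}},
    {"_id": "b", "_score": 1.5, "_source": {"description": "brown dog"}},
    {"_id": "c", "_score": 1.0, "_source": {"description": "brown fox"}},
]

print([h["_id"] for h in collapse_hits(hits, "description")])  # ['a', 'b']
```

Each surviving hit is still a full document, so the other columns stay available, which is what the question asks for.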

Data:

[
      {
        "_index" : "index4",
        "_type" : "_doc",
        "_id" : "P1lTjHEBF99yL6wF31iA",
        "_score" : 1.0,
        "_source" : {
          "description" : "brown fox"
        }
      },
      {
        "_index" : "index4",
        "_type" : "_doc",
        "_id" : "QFlTjHEBF99yL6wF8liO",
        "_score" : 1.0,
        "_source" : {
          "description" : "brown fox"
        }
      },
      {
        "_index" : "index4",
        "_type" : "_doc",
        "_id" : "QVlTjHEBF99yL6wF91gU",
        "_score" : 1.0,
        "_source" : {
          "description" : "brown fox"
        }
      },
      {
        "_index" : "index4",
        "_type" : "_doc",
        "_id" : "QllUjHEBF99yL6wFFFh5",
        "_score" : 1.0,
        "_source" : {
          "description" : "brown dog"
        }
      },
      {
        "_index" : "index4",
        "_type" : "_doc",
        "_id" : "Q1lUjHEBF99yL6wFGFhQ",
        "_score" : 1.0,
        "_source" : {
          "description" : "brown dog"
        }
      }
    ]

I have three documents with the description "brown fox" and two documents with the description "brown dog".

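Query (note that the collapse uses the "description.keyword" sub-field, not the analyzed "description" text field; the collapse field must be a single-valued keyword or numeric field with doc_values enabled):

{
  "query": {
    "match": {
      "description": {
        "query": "brown"
      }
    }
  },
  "collapse": {
    "field": "description.keyword"
  }
}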
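
For the Python client used in the question, the same query might be sent roughly as follows. This is a sketch under a few assumptions: the index is named 'data' as in the question, `esconnection` is an already-configured Elasticsearch client that accepts the `body=` argument as the question's code does, and the mapping provides a "description.keyword" sub-field (the default for dynamically mapped strings). The helper names `build_collapse_body` and `search_unique_descriptions` are made up for illustration.

```python
def build_collapse_body(query_input, text_field="description", size=30):
    """Build a search body that returns one hit per distinct description."""
    return {
        "query": {"match": {text_field: {"query": query_input}}},
        # Assumption: a keyword sub-field exists next to the text field.
        "collapse": {"field": f"{text_field}.keyword"},
        "size": size,
    }


def search_unique_descriptions(esconnection, query_input, index="data"):
    """Run the collapsed search and return the deduplicated hits."""
    body = build_collapse_body(query_input)
    res = esconnection.search(index=index, body=body)
    # With collapse, the unique rows are in the regular hits list,
    # so no aggregation section needs to be parsed.
    return res["hits"]["hits"]
```

Because the collapsed documents come back in res["hits"]["hits"], the `return` line from the question can stay as it is.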

Result:

"hits" : [
      {
        "_index" : "index4",
        "_type" : "_doc",
        "_id" : "P1lTjHEBF99yL6wF31iA",
        "_score" : 0.087011375,
        "_source" : {
          "description" : "brown fox"
        },
        "fields" : {
          "description.keyword" : [
            "brown fox"
          ]
        }
      },
      {
        "_index" : "index4",
        "_type" : "_doc",
        "_id" : "QllUjHEBF99yL6wFFFh5",
        "_score" : 0.087011375,
        "_source" : {
          "description" : "brown dog"
        },
        "fields" : {
          "description.keyword" : [
            "brown dog"
          ]
        }
      }
    ]

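Only 2 documents are returned. Field collapsing also offers features such as "inner_hits", for when you want to see the documents inside each group. Use "sort" to decide which document is shown for each group.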
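
As a rough sketch of those two options, the body below adds an "inner_hits" section to the collapse and an explicit "sort". The inner-hits name "same_description", the size of 3, and both helper functions are assumptions chosen for this example, not part of the original answer.

```python
def build_collapse_with_inner_hits(query_input, inner_size=3):
    """Collapse on the description and also return members of each group."""
    return {
        "query": {"match": {"description": {"query": query_input}}},
        "collapse": {
            "field": "description.keyword",
            "inner_hits": {
                "name": "same_description",  # hypothetical label for the group
                "size": inner_size,          # documents to return per group
                "sort": [{"_score": "desc"}],
            },
        },
        # The top-level sort decides which document represents each group.
        "sort": [{"_score": "desc"}],
    }


def group_members(hit, name="same_description"):
    """Return the inner hits attached to one collapsed hit, or [] if absent."""
    inner = hit.get("inner_hits", {}).get(name, {})
    return inner.get("hits", {}).get("hits", [])
```

After running the search, calling `group_members(hit)` on each entry of res["hits"]["hits"] would give the duplicates that were folded into that entry.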
