Azure表存储多行查询性能

Time elapsed with multi threaded implementation: 0.10776209831237793 Time elapsed with single query: 0.2323908805847168 table item count since x: 5 Both queries returned same amount of results Press enter to retry test! Time elapsed with multi threaded implementation: 0.0897986888885498 Time elapsed with single query: 0.21547174453735352 table item count since x: 5 Both queries returned same amount of results Press enter to retry test! Time elapsed with multi threaded implementation: 0.08280491828918457 Time elapsed with single query: 3.2932426929473877 table item count since x: 5 Both queries returned same amount of results Press enter to retry test! Time elapsed with multi threaded implementation: 0.07794523239135742 Time elapsed with single query: 1.4898555278778076 table item count since x: 5 Both queries returned same amount of results Press enter to retry test! Time elapsed with multi threaded implementation: 0.07962584495544434 Time elapsed with single query: 0.20011520385742188 table item count since x: 5 Both queries returned same amount of results Press enter to retry test!

TableOperation getOperation = TableOperation.Retrieve<HqrScreenshotItemTableEntity>(partitionKey, id.ToString()); TableResult result = await table.ExecuteAsync(getOperation);

private IEnumerable<MyTableEntity> GetBatchedItemsTableResult(Guid[] ids, string applicationLink) { var table = InitializeTableStorage(); TableQuery<MyTableEntity> itemsQuery= new TableQuery<MyTableEntity>().Where(TableQueryConstructor(ids, applicationLink)); IEnumerable<MyTableEntity> result = table.ExecuteQuery(itemsQuery); return result; } public string TableQueryConstructor(Guid[] ids, string applicationLink) { var fullQuery = new StringBuilder(); foreach (var id in ids) { // Encode link before setting to partition key as REST GET requests // do not accept non encoded URL params by default) partitionKey = HttpUtility.UrlEncode(applicationLink); // Create query for single row in a requested partition string queryForRow = TableQuery.CombineFilters( TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, partitionKey), TableOperators.And, TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.Equal, id.ToString())); if (fullQuery.Length == 0) { // append query for first row fullQuery.Append(queryForRow); } else { // Append query for subsequent rows with or operator to make queries independent of each other. fullQuery.Append($" {TableOperators.Or} "); fullQuery.Append(queryForRow); } } return fullQuery.ToString(); }

3条回答

网友

1楼 · 编辑于 2024-05-13 20:17:46

张贴作为答复，因为它越来越大的评论。在

您是否可以尝试将查询更改为以下内容：

(PartitionKey eq 'http%3a%2f%2fsome_website.azurewebsites.net%2fApiName_ed6d31b0' and RowKey eq 'ed6d31b0-d2a3-4f18-9d16-7f72cbc88cb3') or (PartitionKey eq 'http%3a%2f%2fsome_website.azurewebsites.net%2fApiName_9be86f34' and RowKey eq '9be86f34-865b-4c0f-8ab0-decf928dc4fc') or (PartitionKey eq 'http%3a%2f%2fsome_website.azurewebsites.net%2fApiName_97af3bdc' and RowKey eq '97af3bdc-b827-4451-9cc4-a8e7c1190d17') or (PartitionKey eq 'http%3a%2f%2fsome_website.azurewebsites.net%2fApiName_9d557b56' and RowKey eq '9d557b56-279e-47fa-a104-c3ccbcc9b023') or (PartitionKey eq 'http%3a%2f%2fsome_website.azurewebsites.net%2fApiName_e251a31a' and RowKey eq 'e251a31a-1aaa-40a8-8cde-45134550235c')

网友

2楼 · 编辑于 2024-05-13 20:17:46

我认为您应该使用多线程实现，因为它包含多个点查询。在单个查询中执行all-in-single查询可能会导致表扫描。正如official doc提到的：

Using an "or" to specify a filter based on RowKey values results in a partition scan and is not treated as a range query. Therefore, you should avoid queries that use filters such as: $filter=PartitionKey eq 'Sales' and (RowKey eq '121' or RowKey eq '322')

您可能认为上面的示例是两个点查询，但实际上它会导致一个分区扫描。在

网友

3楼 · 编辑于 2024-05-13 20:17:46

对我来说，这里的答案似乎是，在表存储上执行查询并没有像您预期的那样优化为使用OR运算符。当查询将点查询与OR运算符组合时，它不作为点查询处理。在

这可以在python、C和azurestorageexplorer中重现，在这些浏览器中，如果您将点查询与OR结合起来，则会比只返回一行的单独点查询慢10倍（甚至更多）。在

因此，获得具有分区键和行键的行数的最有效方法是使用TableOperation.Retrieve（在C中）进行单独的异步查询。使用TableQuery是非常低效的，并且不会产生任何接近Azure表存储的性能可伸缩性目标的结果。可伸缩性目标例如：“单个表分区（1kib实体）的目标吞吐量高达每秒2000个实体”。在这里，我甚至不能以每秒5行的速度提供服务，尽管所有的行都在不同的分区中。在

在任何文档或性能优化指南中，查询性能的这一限制都没有明确说明，但可以从Azure storage performance checklist中的这些行中了解到：

Querying
This section describes proven practices for querying the table service.
Query scope
There are several ways to specify the range of entities to query. The following is a discussion of the uses of each.
In general, avoid scans (queries larger than a single entity), but if you must scan, try to organize your data so that your scans retrieve the data you need without scanning or returning significant amounts of entities you don't need.
Point queries
A point query retrieves exactly one entity. It does this by specifying both the partition key and row key of the entity to retrieve. These queries are efficient, and you should use them wherever possible.
Partition queries
A partition query is a query that retrieves a set of data that shares a common partition key. Typically, the query specifies a range of row key values or a range of values for some entity property in addition to a partition key. These are less efficient than point queries, and should be used sparingly.
Table queries
A table query is a query that retrieves a set of entities that does not share a common partition key. These queries are not efficient and you should avoid them if possible.

所以“一个点查询只检索一个实体”和“尽可能使用点查询”。因为我已经将数据拆分为分区，所以它可能被当作表查询来处理：“表查询是检索不共享公共分区键的一组实体的查询”。尽管查询组合了一组点查询，因为它列出了所有预期实体的分区键和行键。但是，由于组合查询不是只检索一个查询，所以不能期望它作为点查询（或点查询集）执行。在

相关问题更多 >

编程相关推荐

热门问题

热门文章