找到k个最接近给定数的数问题的回答

找到k个最接近给定数的数

回答此问题可获得 20 贡献值，回答如果被采纳可获得 50 分。

0 条评论
分类：Python问答

默认排序时间排序

1 个回答

匿名 1天前

　擅长：python、mysql、java

<h2>简短的回答</h2> <a href="https://docs.python.org/2.7/library/heapq.html#heapq.nsmallest" rel="nofollow noreferrer">heapq.nsmallest()</a>函数将灵活高效地执行此操作： <pre><code>>>> from heapq import nsmallest >>> s = [1,2,3,4,5,6,7] >>> nsmallest(3, s, key=lambda x: abs(x-6.5)) [6, 7, 5] </code></pre> 从本质上说，“给我三个输入值，它们与数字的绝对差值最小。”。 <h2>算法及其运行时间</h2> nsmallest的算法对数据进行单次传递，在任何时候都不会在内存中保留超过最佳值的值（这意味着它可以与任何输入迭代器一起工作，具有缓存效率和空间效率）。 算法只在找到新的“最佳”值时向堆中添加新值。因此，它将所做比较的次数减至最少。例如，如果要在1000000个随机输入中寻找100个最佳值，则通常会进行少于1008000个比较（比使用<a href="https://docs.python.org/2.7/library/functions.html#min" rel="nofollow noreferrer">min()</a>查找单个最佳值多出约0.8%的比较）。 min（）、nsmalest（）和sorted（）的<a href="https://docs.python.org/2.7/glossary.html#term-key-function" rel="nofollow noreferrer">key functions</a>都保证键函数在输入iterable的每个值中只调用一次。这意味着这项技术对于更复杂、更有趣的n-最近值问题（即<a href="http://en.wikipedia.org/wiki/Soundex" rel="nofollow noreferrer">sound the most alike</a>、最近<a href="http://en.wikipedia.org/wiki/HSL_and_HSV" rel="nofollow noreferrer">colors</a>、<a href="http://en.wikipedia.org/wiki/Levenshtein_distance" rel="nofollow noreferrer">smallest diffs</a>、最少的遗传突变、欧氏距离等）的例子将是有效的。 nsmallest（）和sorted（）都将返回一个按接近度排序的列表秩（关系由最先看到的值确定）。 对于那些感兴趣的人来说，有一个对期望的比较数量<a href="http://hg.python.org/cpython/file/38a325c84564/Lib/heapq.py#l40" rel="nofollow noreferrer">here</a>和<a href="http://code.activestate.com/recipes/577573-compare-algorithms-for-heapqsmallest" rel="nofollow noreferrer">here</a>的分析。快速总结： <ul> <li>随机输入的平均情况：<code>n + k * (log(k, 2) * log(n/k) + log(k, 2) + log(n/k))</code></li> <li>升序输入的最佳情况：<code>n + k * log(k, 2)</code></li> <li>降序输入的最坏情况：<code>n * log(k, 2)</code></li> </ul> <h2>优化重复查找</h2> 在评论中，@Phylliida询问如何优化具有不同起点的重复查找。关键是对数据进行预排序，然后使用<a href="https://docs.python.org/3/library/bisect.html#bisect.bisect" rel="nofollow noreferrer">bisect</a>定位小搜索段的中心： <pre><code>from bisect import bisect def k_nearest(k, center, sorted_data): 'Return *k* members of *sorted_data* nearest to *center*' i = bisect(sorted_data, center) segment = sorted_data[max(i-k, 0) : i+k] return nsmallest(k, segment, key=lambda x: abs(x - center)) </code></pre> 例如： <pre><code>>>> s.sort() >>> k_nearest(3, 6.5, s) [6, 7, 5] >>> k_nearest(3, 0.5, s) [1, 2, 3] >>> k_nearest(3, 4.5, s) [4, 5, 3] >>> k_nearest(3, 5.0, s) [5, 4, 6] </code></pre> bisect（）和nsmalest（）都利用排序后的数据。前者运行O（log2 k）时间，后者运行O（n）时间。

找到k个最接近给定数的数

1 个回答

相关Python问题