Answers to the question: Searching for word patterns in Python data


1 answer

Anonymous, 1 day ago


You can use <a href="http://docs.python.org/library/collections.html#collections.Counter" rel="nofollow"><code>collections.Counter</code></a> to do this:
<pre><code>>>> from collections import Counter
>>> a = ( ('309','308','308'), ('309','308','307'), ('308', '309','306', '304'))
>>> Counter((x, y) for (x, y, *z) in a)
Counter({('309', '308'): 2, ('308', '309'): 1})
>>> Counter((x, z) for (x, y, z, *w) in a)
Counter({('308', '306'): 1, ('309', '308'): 1, ('309', '307'): 1})
</code></pre>
I am also using extended tuple unpacking here, which does not exist before Python 3.x and is only needed when the tuples have an unknown length. In Python 2.x, you can index into each tuple instead, e.g. <code>(item[0], item[1])</code>. I can't say how efficient this is, though I don't think it should be bad.

<code>Counter</code> has <code>dict</code>-like syntax:
<pre><code>>>> count = Counter((x, y) for (x, y, *z) in a)
>>> count['309', '308']
2
</code></pre>
Edit: You mentioned that the tuples can have any length of one or more. In that case you may run into problems, because a tuple shorter than the required length fails to unpack. The solution is to change the generator expression so that it ignores any tuple that is too short:
<pre><code>Counter((item[0], item[1]) for item in a if len(item) >= 2)
</code></pre>
For example:
<pre><code>>>> a = ( ('309',), ('309','308','308'), ('309','308','307'), ('308', '309','306', '304'))
>>> Counter((x, y) for (x, y, *z) in a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.2/collections.py", line 460, in __init__
    self.update(iterable, **kwds)
  File "/usr/lib/python3.2/collections.py", line 540, in update
    _count_elements(self, iterable)
  File "<stdin>", line 1, in <genexpr>
ValueError: need more than 1 value to unpack
>>> Counter((item[0], item[1]) for item in a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.2/collections.py", line 460, in __init__
    self.update(iterable, **kwds)
  File "/usr/lib/python3.2/collections.py", line 540, in update
    _count_elements(self, iterable)
  File "<stdin>", line 1, in <genexpr>
IndexError: tuple index out of range
>>> Counter((item[0], item[1]) for item in a if len(item) >= 2)
Counter({('309', '308'): 2, ('308', '309'): 1})
</code></pre>
If you need counts over a variable number of columns, the simplest way is to use slicing:
<pre><code>start = 0
end = 2
# A slice item[start:end] is complete only if the tuple has at least `end` items.
Counter(item[start:end] for item in a if len(item) >= end)
</code></pre>
Of course, this only works for contiguous runs. If you want to pick out individual columns, you have to do a little more work:
<pre><code>def pick(seq, indices):
    # Build a tuple from the items of seq at the given positions.
    return tuple(seq[i] for i in indices)

columns = [1, 3]
maximum = max(columns)
# Skip tuples that are too short to contain the highest requested column.
Counter(pick(item, columns) for item in a if len(item) > maximum)
</code></pre>
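The pieces above (index-based selection, the length guard, and arbitrary column picking) can be folded into one small function. This is only a sketch: `count_columns` is a hypothetical helper name that does not appear in the answer, and it assumes the column indices are non-negative.

```python
from collections import Counter


def count_columns(rows, columns):
    """Count how often each combination of values at `columns` occurs.

    Hypothetical helper (not from the answer above). Assumes every index
    in `columns` is non-negative. Rows too short to contain all requested
    columns are skipped instead of raising IndexError.
    """
    if not columns:
        return Counter()
    needed = max(columns) + 1  # shortest row length that has every column
    return Counter(
        tuple(row[i] for i in columns)
        for row in rows
        if len(row) >= needed
    )


a = (
    ('309',),
    ('309', '308', '308'),
    ('309', '308', '307'),
    ('308', '309', '306', '304'),
)

first_two = count_columns(a, [0, 1])        # first and second column
first_and_third = count_columns(a, [0, 2])  # first and third column

# Counter.most_common(n) lists the n most frequent combinations.
print(first_two.most_common(1))
```

Because it indexes rather than unpacks with `*rest`, this form should also run on Python 2.7, where `Counter` is available but extended unpacking is not.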
