Count the occurrences of a character in each element of a list

User

Answer 1 · edited 2024-04-26 04:12:38

"Whitespace" ordinarily includes the characters '\t\n\x0b\x0c\r ', plus any Unicode whitespace character, e.g. u'\u3000' (IDEOGRAPHIC SPACE).

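A regex is one of the better solutions because it easily supports any Unicode whitespace code point in addition to the usual ASCII ones. Just use re.findall() and set the re.UNICODE flag (in Python 3 this flag is already the default for str patterns):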

import re

def count_whitespace(s):
    return len(re.findall(r'\s', s, re.UNICODE))

l = ['this is a sentence',
     'this is one more sentence',
     '',
     u'\u3000\u2029    abcd\t\tefghi\0xb  \n\r\nj k  l\tm    \n\n',
     'nowhitespaceinthisstring']

for s in l:
    print(count_whitespace(s))

Output:

3
4
0
23
0

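A simple, non-regex way is to use str.split(), which naturally splits on any whitespace character and is an effective way to remove all whitespace from a string. This also works for Unicode whitespace characters (in Python 2, only when the string is a unicode object):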

def count_whitespace(s):
    return len(s) - len(''.join(s.split()))

for s in l:
    print(count_whitespace(s))

Output:

3
4
0
23
0
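
A third variant, not taken from the answer above, is to test each character with str.isspace(). This is only a sketch: it assumes Python 3, where every str is Unicode, and the name count_whitespace_isspace is made up for illustration.

```python
def count_whitespace_isspace(s):
    # str.isspace() is True for ASCII and Unicode whitespace alike
    return sum(1 for ch in s if ch.isspace())

samples = ['this is a sentence', '', '\u3000\u2029 a\tb\n']
counts = [count_whitespace_isspace(s) for s in samples]
print(counts)  # [3, 0, 5]
```

It avoids both the regex and the intermediate joined string, at the cost of a Python-level loop over the characters.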

Finally, to pick the sentence with the most whitespace characters:

>>> max((count_whitespace(s), s) for s in l)[1]
u'\u3000\u2029    abcd\t\tefghi\x00xb  \n\r\nj k  l\tm    \n\n'
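
If only the sentence itself is needed, max() also accepts a key function. That avoids building (count, sentence) tuples and means a tie is not broken by comparing the strings. A minimal sketch, assuming Python 3 and a shortened sample list:

```python
import re

def count_whitespace(s):
    # In Python 3, \s already matches Unicode whitespace for str patterns
    return len(re.findall(r'\s', s))

sentences = ['this is a sentence', 'this is one more sentence', '', 'nowhitespace']
# key= makes max() compare the counts, then return the original string
longest = max(sentences, key=count_whitespace)
print(longest)  # this is one more sentence
```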

User

Answer 2 · edited 2024-04-26 04:12:38

Use a simple list comprehension with str.count()

>>> lst = ['this is a sentence', 'this is one more sentence']
>>> [i.count(' ') for i in lst]
[3, 4]

Other ways include using map()

>>> list(map(lambda x: x.count(' '), lst))
[3, 4]

If you want a callable function (since, as you mentioned, you are using a function that iterates over the list), it can be implemented as

>>> def countspace(x):
...     return x.count(' ')
...

and called as

>>> for i in lst:
...     print(countspace(i))
... 
3
4

This can also be solved with a regex, using the re module, as Grijesh mentioned

>>> import re
>>> [len(re.findall(r"\s", i)) for i in lst]
[3, 4]
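
Keep in mind that the two approaches answer slightly different questions: str.count(' ') counts only the space character, while \s also matches tabs, newlines, and other whitespace. A short sketch, using a made-up sample string, where the results differ:

```python
import re

sample = 'a\tb c\n'
spaces_only = sample.count(' ')                   # only U+0020
all_whitespace = len(re.findall(r'\s', sample))   # tab, space and newline
print(spaces_only, all_whitespace)  # 1 3
```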

Later edit

Since you said you also need to find the max element, you can do it like this

>>> vals = [i.count(' ') for i in lst] 
>>> lst[vals.index(max(vals))]
'this is one more sentence'

This can be implemented as a function using

>>> def getmax(lst):
...     vals = [i.count(' ') for i in lst]
...     maxel = lst[vals.index(max(vals))]
...     return (vals,maxel)

and called as

>>> getmax(lst)
([3, 4], 'this is one more sentence')

Edit after comments

>>> s = 'this is a sentence. this is one more sentence'
>>> lst = s.split('. ')
>>> [i.count(' ') for i in lst]
[3, 4]
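
The same pattern works for any character, not just a space. A small sketch, where count_char is a hypothetical helper name introduced only for illustration:

```python
def count_char(items, ch):
    # str.count returns the number of non-overlapping occurrences of ch
    return [item.count(ch) for item in items]

lst = ['this is a sentence', 'this is one more sentence']
print(count_char(lst, ' '))  # [3, 4]
print(count_char(lst, 'e'))  # [3, 5]
```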

User

Answer 3 · edited 2024-04-26 04:12:38

You can use collections.Counter. I don't know whether it is more time-consuming than .count()

>>> from collections import Counter
>>> lst = ['this is a sentence', 'this is one more sentence']
>>> [Counter(i)[' '] for i in lst]
[3, 4]
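
One way to settle the timing question is the timeit module. The sketch below is a rough micro-benchmark, not a definitive measurement: the repeat count of 10000 is arbitrary, and no timings are quoted because they depend on the machine and Python version. Counter tallies every character in the string, whereas str.count scans for just one, so str.count does less work for this task.

```python
import timeit
from collections import Counter

lst = ['this is a sentence', 'this is one more sentence']

def with_count():
    return [s.count(' ') for s in lst]

def with_counter():
    return [Counter(s)[' '] for s in lst]

# Both approaches must agree before timing them
assert with_count() == with_counter()

# number=10000 is an arbitrary choice for this sketch
t_count = timeit.timeit(with_count, number=10000)
t_counter = timeit.timeit(with_counter, number=10000)
print('str.count: %.4fs, Counter: %.4fs' % (t_count, t_counter))
```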
