How do I split a text file into multiple lists in Python based on whitespace delimiters?

3 answers

User

#1 · edited 2024-05-15 02:05:12

First split the whole text on double spaces, then pass each item to a regex, as follows:

>>> import re
>>> text = "What's did the little boy tell the game warden?  His dad was in the kitchen poaching eggs!"
>>> file = text.split('  ')
>>> file
["What's did the little boy tell the game warden?", 'His dad was in the kitchen poaching eggs!']
>>> res = []
>>> for sen in file:
...    res.append(re.findall(r'\w+', sen))
... 
>>> res
[['What', 's', 'did', 'the', 'little', 'boy', 'tell', 'the', 'game', 'warden'], ['His', 'dad', 'was', 'in', 'the', 'kitchen', 'poaching', 'eggs']]
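Note that the output above breaks "What's" into 'What' and 's', because \w+ does not match the apostrophe. If that is not wanted, one possible variation is to allow apostrophes inside a word. This is only a sketch: the function name split_sentences and the pattern [\w']+ are my own assumptions, not part of the answer above.

```python
import re

text = "What's did the little boy tell the game warden?  His dad was in the kitchen poaching eggs!"

def split_sentences(text):
    # Hypothetical helper: split on two spaces, then collect runs of
    # word characters and apostrophes so contractions stay in one piece.
    return [re.findall(r"[\w']+", part) for part in text.split('  ')]

result = split_sentences(text)
print(result)
```

The trade-off is that quoted words such as 'hello' would keep their surrounding quote marks, so this only suits text where apostrophes appear inside words.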

User

#2 · edited 2024-05-15 02:05:12

Here is a reasonable all-regex approach:

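import re

def tokenize(document):
    # Read the whole file at the path passed in
    with open(document) as f:
        text = f.read()
    # Split into blocks on runs of two or more whitespace characters
    blocks = re.split(r'\s\s+', text)
    # Extract the words from each block
    return [re.findall(r'\w+', b) for b in blocks]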
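One thing to watch with this approach: if the text starts or ends with a run of whitespace (for example, trailing blank lines in a file), re.split produces an empty block, which becomes an empty list in the result. A minimal sketch of one way to avoid that, working on a string rather than a file; the name tokenize_text and the sample string are assumptions for illustration only.

```python
import re

def tokenize_text(text):
    # Hypothetical variant: trim the ends first so leading/trailing
    # whitespace does not create empty blocks.
    blocks = re.split(r'\s{2,}', text.strip())
    # Skip any block that is still empty (e.g. when the input is blank).
    return [re.findall(r'\w+', b) for b in blocks if b]

sample = "hello world.  How are you\n\n"
tokens = tokenize_text(sample)
print(tokens)
```

Here \s{2,} is just another way of writing \s\s+; both match two or more whitespace characters, including newlines.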

User

#3 · edited 2024-05-15 02:05:12

The built-in split function accepts a multi-character separator, so it can split on a run of two spaces.

This:

a = "hello world.  How are you"
b = a.split('  ')
c = [ x.split(' ') for x in b ]

yields:

[['hello', 'world.'], ['How', 'are', 'you']]

If you also want to remove punctuation, apply a regular expression to the elements of "b" or to "x" in the third statement.
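As a rough sketch of that last suggestion, the regular expression can be applied to each "x" with re.sub. The pattern [^\w] (anything that is not a word character) is an assumption about what counts as punctuation here, not something the answer specifies.

```python
import re

a = "hello world.  How are you"
b = a.split('  ')

# For each block, split on single spaces and strip non-word
# characters from every resulting token.
c = [[re.sub(r'[^\w]', '', x) for x in part.split(' ')] for part in b]
print(c)
```

Because the substitution removes every non-word character, a token such as "don't" would become "dont"; adjust the pattern if apostrophes or hyphens should be kept.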
