Parsing a text file without splitting multi-word names

2 answers

User

Answer 1 · edited 2024-05-13 23:59:58

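Regular expressions are very powerful and useful, but it takes a lot of time to get used to using them with any authority. I suggest you stick with split. Here is the help text for split, which describes how to use the maxsplit value to limit the number of splits.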

Help on built-in function split:

split(...)
S.split([sep [,maxsplit]]) -> list of strings

Return a list of the words in the string S, using sep as the
delimiter string.  If maxsplit is given, at most maxsplit
splits are done. If sep is not specified or is None, any
whitespace string is a separator and empty strings are removed
from the result.

So for your code, assuming data holds the lines you want to split:

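mytest = dict()
for line in data:
    # Split once on the first run of whitespace so multi-word names stay whole;
    # strip() drops the trailing newline that would otherwise stay on the name.
    number, name = line.strip().split(None, 1)
    mytest[number] = name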

This gives a result like:

mytest
{'27': 'anjou pear', '7645': 'langsat', 'number': 'name', '36': 'asian pear', '14': 'apple'}

To access the help, suppose you have some string mystring; then just type

help(mystring.split)

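The difference between my first attempt and this one comes from the comment below. In my first attempt, the leading whitespace on the name value was preserved. By passing None as the separator, the whole run of whitespace is consumed at the first split, which gives you exactly the value you are looking for.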
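To make that difference concrete, here is a minimal sketch. It assumes the first attempt split on a single space character (the original first attempt is not shown here), and the sample line is taken from the data in this thread.

```python
line = "27        anjou pear"

# Splitting once on a single space leaves the rest of the padding on the name.
number_a, name_a = line.split(" ", 1)
print(repr(name_a))

# Splitting once on None treats the whole run of whitespace as one separator.
number_b, name_b = line.split(None, 1)
print(repr(name_b))
```

Note that split(None, 1) only removes the whitespace around the first split point; a trailing newline on a line read from a file stays on the name unless you strip it.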

User

Answer 2 · edited 2024-05-13 23:59:58

You can use re.findall() for this.

import re

text = "number    name\n14        apple\n27        anjou pear\n36        asian pear\n7645      langsat\n"
# Capture the first word, then everything after the whitespace run, on each line.
print(re.findall(r"(\w+)\s+(.+)", text))

Output:

[('number', 'name'), ('14', 'apple'), ('27', 'anjou pear'), ('36', 'asian pear'), ('7645', 'langsat')]
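If the header row is not wanted, one possible variation is to match only numeric IDs and turn the pairs into a dict. This is a sketch, not part of the original answer: the name fruit_by_number and the choice of \d+ with re.MULTILINE are assumptions made for illustration.

```python
import re

text = "number    name\n14        apple\n27        anjou pear\n36        asian pear\n7645      langsat\n"

# (\d+) only matches numeric IDs, so the "number    name" header line is skipped.
# re.MULTILINE makes ^ and $ anchor at the start and end of every line.
pairs = re.findall(r"^(\d+)\s+(.+)$", text, flags=re.MULTILINE)

# Each (number, name) tuple becomes one key/value entry.
fruit_by_number = dict(pairs)
print(fruit_by_number)
```

Because .+ does not match newlines, each name stops at the end of its own line, so multi-word names such as "anjou pear" stay intact.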
