How to create a list of strings, using a special character to know where to split

Posted 2024-05-21 08:52:35


I have a text file containing the songs from every Pink Floyd album. It looks like this:

#The Piper At The Gates Of Dawn::1967
*Lucifer Sam::Syd Barrett::03:07::Lucifer Sam, Siam cat
Always sitting by your side
Always by your side
... ( The lyrics of the song )
*Matilda mother::Syd Barrett::03:07::There was a king who ruled the land
His majesty was in command
With silver eyes the scarlet eagle
... ( The lyrics of the song )
#Another album
*another song
song's lyrics

I want to build a list of strings from it, with each album (marked by #) as one string, followed by all of that album's songs as another string, and so on:

["album\n", "*song's name\nlyrics\n*song's name\nlyrics ..."]

Thanks a lot! :D

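Edit: I noticed that my explanation was a bit clumsy, so I'll rephrase it.

What I want to do is turn the given text into a list in which each album's name and that album's data are separate elements, so I would end up with something like this:

["album's name", "(Everything between the album's name and the next one)", "album's name", ...]

And so on.

Each album line starts with #, and I need to use that to separate it from the songs.

I tried finding each # and the first # after it in order to build the list, but it got me nowhere :(

Important! A clearer explanation: suppose you have a string that looks like this: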

#Hello
Whatever
#Hello
More Whatever

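I want to separate each "Hello" from its "Whatever", so I would get something like this:

["Hello", "Whatever", "Hello", "More Whatever"]

I'm really sorry that I'm not good at explaining. This is the simplest way I could think of to put it :D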


2 answers

Not super efficient, but it works:

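f = "filepath"  # path to the lyrics file

# Append a "#" after every album line, so that one split on "#"
# separates each album title from the songs that follow it.
with open(f) as fh:
    txt = "".join(line + "#" if line.startswith("#") else line for line in fh)
data = txt.split("#")[1:]  # drop the empty string before the first "#"
data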

['The Piper At The Gates Of Dawn::1967\n',
 '*Lucifer Sam::Syd Barrett::03:07::Lucifer Sam, Siam cat\nAlways sitting by your side\nAlways by your side\n... ( The lyrics of the song )\n*Matilda mother::Syd Barrett::03:07::There was a king who ruled the land\nHis majesty was in command\nWith silver eyes the scarlet eagle\n... ( The lyrics of the song )\n',
 'Another album\n',
 "*another song\nsong's lyrics\n"]
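Note that splitting on "#" also splits wherever a "#" appears inside a lyric line. As a hedged alternative, here is a minimal sketch that walks the file line by line and only treats a "#" at the start of a line as an album marker. The function name `split_albums` and the sample string are made up for illustration; they are not part of the answer above.

```python
def split_albums(lines):
    """Return [album_title, album_body, album_title, album_body, ...]."""
    result = []
    body = []  # lines collected for the current album
    for line in lines:
        if line.startswith("#"):
            if result:                   # close the previous album first
                result.append("".join(body))
            result.append(line[1:])      # title without the leading "#"
            body = []
        elif result:                     # skip anything before the first album
            body.append(line)
    if result:                           # close the last album
        result.append("".join(body))
    return result


# Hypothetical sample data, taken from the question's "#Hello" example
sample = "#Hello\nWhatever\n#Hello\nMore Whatever\n"
print(split_albums(sample.splitlines(keepends=True)))
```

With a real file you could pass the open file object directly, for example `split_albums(open(f))`, since iterating over a file yields its lines with the trailing newline kept.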

You can do this with regular expressions (the re module). Consider the following example, assuming you have a file songs.txt that looks like this:

#Song 1
First line
Second line
#Song 2
First line of second
Last line

You can do:

import re

with open('songs.txt', 'r') as f:
    data = f.read()

# Each match is a 2-tuple: (title line, song body)
songs = re.findall(r'(#.+?\n)([^#]+)', data)
# Flatten the list of 2-tuples into one flat list
songs = list(sum(songs, ()))
print(songs)  # ['#Song 1\n', 'First line\nSecond line\n', '#Song 2\n', 'First line of second\nLast line\n']

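The pattern (the first argument of re.findall) contains two groups, each delimited by parentheses (()): the first matches the title and the second matches the lyrics. The first group must have the form: #, followed by one or more non-newline characters, and ending with a newline (\n). The second group simply matches one or more characters that are not #.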
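Because the second group is [^#]+, the pattern above stops at any "#" inside the lyrics, not just at one that starts a line. If that matters for your data, one possible variant (a sketch, not the answerer's method) is re.split with the re.MULTILINE flag, so that ^ anchors the "#" to the start of a line. The helper name `split_on_headers` is hypothetical.

```python
import re


def split_on_headers(text):
    """Split text into [title, body, title, body, ...] on lines starting with '#'."""
    # With a capturing group in the pattern, re.split keeps the captured
    # title text in the result list, between the pieces it splits apart.
    parts = re.split(r"^#(.*)\n", text, flags=re.MULTILINE)
    return parts[1:]  # parts[0] is whatever precedes the first header line


print(split_on_headers("#Hello\nWhatever\n#Hello\nMore Whatever\n"))
```

Here the titles come back without the leading "#" and without the trailing newline, because both sit outside the capturing group. A header on the very last line with no newline after it would not match this pattern.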
