How to split lines with mixed parts

Posted 2024-06-16 11:22:54

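I have lines of text built from two templates:

  1. Song name (artist) (song year)
  2. Song name (artist (song year))

The difference between the templates is whether the song year sits outside or inside the artist's parentheses.

I want to split each line into three parts:

  1. Song name
  2. Artist
  3. Song year

A small example: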

Ring Ring (ABBA (1973))
Waterloo (ABBA) (1974)
If I Don’t Write This Song Someone I Love Will Die (Hello Saferide) (2005)
My Best Friend (Hello Saferide (2005))

I tried using a regular expression with a logical OR:

import re

the_lines = ("Ring Ring (ABBA (1973))",
             "Waterloo (ABBA) (1974)",
             "If I Don’t Write This Song Someone I Love Will Die (Hello Saferide) (2005)",
             "My Best Friend (Hello Saferide (2005))",
             )
pattern = r"((.*) \((.*)\) \((\d*)\))|((.*) \((.*\((\d*)\))\))"

for line in the_lines:
    title, artist, year = re.split(pattern, line)
    print(title, artist, year)

But the result is redundant: the pattern yields 8 groups instead of the three parts I want.


3 answers

Pure Python:

text = """Ring Ring (ABBA (1973))
Waterloo (ABBA) (1974)
If I Don’t Write This Song Someone I Love Will Die (Hello Saferide) (2005)
My Best Friend (Hello Saferide (2005))"""
text = text.split("\n")
songs = {}
for song in text:
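    # Splitting on "(" gives [title, artist, year] for both templates.
    name, band, year = song.split("(")
    name = name.strip()
    # Remove closing parentheses and stray spaces; this also keeps
    # multi-word artists such as "Hello Saferide" intact.
    band = band.replace(")", "").strip()
    year = year.replace(")", "").strip()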
    print("band",band,"year",year,"song",name)
    songs[name] = {"year":year,"band":band}
print(songs)

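Your specification doesn't really need REs. For each line, it looks like you can use artist_song_year = line.split("("), followed by an extra cleanup step such as artist_song_year = [item.strip(" )") for item in artist_song_year], which trims both spaces and closing parentheses from each part.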
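
The split-and-clean idea above can be sketched roughly as follows. The helper name split_song_line is made up for illustration, and the sketch assumes that neither the title nor the artist contains parentheses of its own:

```python
def split_song_line(line):
    """Split 'Title (Artist) (Year)' or 'Title (Artist (Year))' into 3 parts.

    Hypothetical helper; assumes the title and artist contain no parentheses.
    """
    parts = line.split("(")
    # strip(" )") removes spaces and closing parentheses from both ends.
    return [item.strip(" )") for item in parts]


for line in ("Ring Ring (ABBA (1973))", "Waterloo (ABBA) (1974)"):
    title, artist, year = split_song_line(line)
    print(title, artist, year)
```

If a title could itself contain parentheses, this would return more than three parts, so a regular expression anchored on the year is probably the safer choice in that case.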

You can try this:

import re

s = '''
Ring Ring (ABBA (1973))
Waterloo (ABBA) (1974)
If I Don’t Write This Song Someone I Love Will Die (Hello Saferide) (2005)
My Best Friend (Hello Saferide (2005))
'''


f = re.findall(r"(.*)\s\((.*?)\)?\s\((\d{4})\)",s)
print(*f,sep='\n')
('Ring Ring', 'ABBA', '1973')
('Waterloo', 'ABBA', '1974')
('If I Don’t Write This Song Someone I Love Will Die', 'Hello Saferide', '2005')
('My Best Friend', 'Hello Saferide', '2005')
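
To get the per-line title, artist, year unpacking from the question, the same pattern can probably be applied one line at a time. This is only a sketch: the names SONG_RE and parse_song are invented here, and the optional trailing \)? is an addition so that re.fullmatch also accepts the nested template:

```python
import re

# Same pattern as above, plus an optional trailing ")" so that fullmatch
# also accepts the nested "Title (Artist (Year))" template.
SONG_RE = re.compile(r"(.*)\s\((.*?)\)?\s\((\d{4})\)\)?")


def parse_song(line):
    """Return (title, artist, year) for one line; hypothetical helper."""
    match = SONG_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    return match.groups()


for line in ("Ring Ring (ABBA (1973))", "Waterloo (ABBA) (1974)"):
    title, artist, year = parse_song(line)
    print(title, artist, year)
```

Because the pattern has only three capturing groups, the result unpacks directly into three names, which avoids the 8-group redundancy of the OR-based pattern.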
