How to convert a file into a dictionary in Python

Published 2024-05-23 14:08:06


I have a file containing the following text:

1. Beatles - Revolver (1966)
2. Nirvana - Nevermind (1991)
3. Beatles - Sgt Pepper's Lonely Hearts Club Band (1967)
4. U2 - The Joshua Tree (1987)
5. Beatles - The Beatles (The White Album) (1968)
6. Beatles - Abbey Road (1969)
7. Guns N' Roses - Appetite For Destruction (1987)
8. Radiohead - Ok Computer (1997)
9. Led Zeppelin - Led Zeppelin 4 (1971)
10. U2 - Achtung Baby (1991)
11. Pink Floyd - Dark Side Of The Moon (1973)
12. Michael Jackson -Thriller (1982)
13. Rolling Stones - Exile On Main Street (1972)
14. Clash - London Calling (1979)
15. U2 - All That You Can't Leave Behind (2000)
16. Weezer - Pinkerton (1996)
17. Radiohead - The Bends (1995)
18. Smashing Pumpkins - Mellon Collie And The Infinite Sadness (1995)
19. Pearl Jam - Ten (1991)
20. Beach Boys - Pet Sounds (1966)
21. Weezer - Weezer (1994)
22. Nirvana - In Utero (1993)
23. Beatles - Rubber Soul (1965)
24. Eminem -The Eminem Show (2002)
25. R.E.M. - Automatic For The People (1992)
26. Radiohead - Kid A (2000)
27. Tool - Aenima (1996)
28. Smashing Pumpkins - Siamese Dream (1993)
29. Madonna - Ray Of Light (1998)
30. Rolling Stones - Sticky Fingers (1971)
...till line 99.

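So I have to convert this file into a dictionary, with the band name as the key and a list of that band's albums as the value. Each entry in this list is a tuple of two fields: the album name and the year of release. I also have to get rid of the punctuation and the parentheses. Can anybody help?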


3 answers

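What I would do is read each line from the file, treat it as a string, split it at the ".", and then use the first piece as the key and the second piece as the value. E.g.: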

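albumDict = {}
with open("/path/to/file", "r") as file:
    for line in file:
        # Split only at the first "." so dots inside names (e.g. "R.E.M.") survive
        splitLine = line.split(".", 1)
        if len(splitLine) == 2:  # skip lines without a "."
            albumDict[splitLine[0]] = splitLine[1].strip()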

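Edit: Note that this does not check for duplicate entries and should not be used as-is in a professional setting. If you want multiple users to be able to use it, add a check to make sure the key does not already exist.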
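To illustrate the duplicate check mentioned in the note, here is a minimal sketch. The function name build_album_dict and the in-memory sample list are my own, not part of the answer, and raising ValueError on a repeated key is just one possible way to handle it.

```python
def build_album_dict(lines):
    """Map the leading number (as a string) to the rest of the line.

    Hypothetical helper for illustration; raises ValueError on a repeated key.
    """
    album_dict = {}
    for line in lines:
        # Split only at the first "." so later dots stay in the value
        parts = line.split(".", 1)
        if len(parts) != 2:
            continue  # no "." on this line, nothing to store
        key, value = parts[0].strip(), parts[1].strip()
        if key in album_dict:
            raise ValueError("duplicate key: %s" % key)
        album_dict[key] = value
    return album_dict


# Two lines taken from the question, used instead of a real file
sample = ["1. Beatles - Revolver (1966)", "2. Nirvana - Nevermind (1991)"]
result = build_album_dict(sample)
print(result)
```

Depending on your needs, you might prefer to skip or overwrite duplicates instead of raising.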

Here is a solution that may suit you better:

import re
from collections import defaultdict

# Missing bands start out with an empty list automatically
band_dict = defaultdict(list)
# "<number>. <band> - <album> (<year>)"; the space after "-" is optional
pattern   = re.compile(r"\d+\. (?P<band>.+?) -\s?(?P<album>.+?) \((?P<year>\d+)\)")
with open("musiclist") as f:
    for line in f:
        match = pattern.match(line)
        if match:
            groupdict = match.groupdict()
            band_dict[groupdict['band']].append((groupdict['album'], groupdict['year']))
        else:
            print("Error, no match for line %s" % line)

# Print each band followed by its albums
for band in band_dict:
    print(band)
    for album, year in band_dict[band]:
        print("\t%s: %s" % (album, year))

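Running this against the data you provided (saved as musiclist) prints each band name followed by a tab-indented list of its albums and their years.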
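If it helps to see what the pattern captures, here is a small sketch that applies the same regular expression to three of the trickier lines from the question (nested parentheses, a missing space after the dash, and dots in the band name). The names samples and parsed are mine, purely for illustration.

```python
import re

# Same pattern as in the answer above
pattern = re.compile(r"\d+\. (?P<band>.+?) -\s?(?P<album>.+?) \((?P<year>\d+)\)")

# Lines copied from the question
samples = [
    "5. Beatles - The Beatles (The White Album) (1968)",
    "12. Michael Jackson -Thriller (1982)",
    "25. R.E.M. - Automatic For The People (1992)",
]

# groupdict() returns the named groups as a plain dict
parsed = [pattern.match(s).groupdict() for s in samples]
for fields in parsed:
    print(fields)
```

Note that the year comes back as a string; wrap it in int() if you need a number.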

Try this first. It is unlikely to be perfect; you will need to take it from here and adapt it to your needs.

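import re

# Assumption: the album list is saved in a file named "musiclist"
with open("musiclist") as f:
    songs = f.readlines()

my_dict = {}
for record in songs:
    # Split once at the first "-" into "<n>. <band>" and "<album> (<year>)"
    parts = record.split('-', 1)
    if len(parts) != 2:
        continue
    year = re.findall(r'\(([0-9]{4})\)', record)
    band = re.findall(r'[0-9]+\. (.*)', parts[0].strip())
    song = re.findall(r'(.*) \(', parts[1].strip())

    if song and band and year:
        # findall returns lists, so pick the matched strings out of them
        entry = (song[0], year[-1])
        if band[0] in my_dict:  # already present, append
            my_dict[band[0]].append(entry)
        else:  # create new entry
            my_dict[band[0]] = [entry]

print(my_dict)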
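As a side note, the "already present / create new entry" branch can be folded into a single call to dict.setdefault. A minimal sketch, assuming the band, album and year have already been extracted (the records list below is made-up input built from lines in the question):

```python
my_dict = {}

# Hypothetical pre-parsed (band, album, year) triples
records = [
    ("Beatles", "Revolver", "1966"),
    ("Nirvana", "Nevermind", "1991"),
    ("Beatles", "Abbey Road", "1969"),
]

for band, album, year in records:
    # setdefault returns the existing list, or stores and returns a new one
    my_dict.setdefault(band, []).append((album, year))

print(my_dict)
```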
