Extracting data from strings in a certain format

Posted 2024-04-25 14:49:57

I have some strings in this format:

GETMOVIE#genre:Action&year:1990-2007&country:USA
GETMOVIE#genre:Animation&year:2000-2010&country:Russia
GETMOVIE#genre:X&year:Y&country:Z

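I would like to know how to extract X, Y and Z from these strings into separate strings or a list. I tried slicing, but it is awkward to use. Any hints?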


3 answers

You can use str.split() to do this, as follows:

Code:

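def process_data(some_data):
    """Group the parsed records by the command name that precedes '#'."""
    return_data = {}
    for datum in some_data:
        # Split on the first '#' only: command name on the left, fields on the right.
        main_key, values = datum.split('#', 1)
        # Split each 'key:value' field on its first ':' so values may contain colons.
        return_data.setdefault(main_key, []).append(dict(
            v.split(':', 1) for v in values.split('&')
        ))
    return return_data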

Test code:

data = [x.strip() for x in """
    GETMOVIE#genre:Action&year:1990-2007&country:USA
    GETMOVIE#genre:Animation&year:2000-2010&country:Russia
    GETMOVIE#genre:X&year:Y&country:Z
""".split('\n')[1:-1]]

print(data)
print(process_data(data))

Result:

['GETMOVIE#genre:Action&year:1990-2007&country:USA', 
 'GETMOVIE#genre:Animation&year:2000-2010&country:Russia', 
 'GETMOVIE#genre:X&year:Y&country:Z']

{'GETMOVIE': [
    {'genre': 'Action', 'year': '1990-2007', 'country': 'USA'}, 
    {'genre': 'Animation', 'year': '2000-2010', 'country': 'Russia'}, 
    {'genre': 'X', 'year': 'Y', 'country': 'Z'}
]}
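
Since the question asks for the values as separate strings or a list rather than a dict, here is a minimal sketch of that variant. The helper name extract_values is made up for illustration, and it assumes every line has the NAME#key:value&key:value shape shown in the question.

```python
def extract_values(line):
    """Return the field values of one 'NAME#key:value&key:value' line as a list."""
    # partition() never raises: if '#' is missing, the fields part is empty.
    _, _, fields = line.partition('#')
    values = []
    for field in fields.split('&'):
        # Keep only the text after the first ':'; skip fields without one.
        _key, sep, value = field.partition(':')
        if sep:
            values.append(value)
    return values


genre, year, country = extract_values(
    'GETMOVIE#genre:Action&year:1990-2007&country:USA'
)
print(genre, year, country)  # Action 1990-2007 USA
```

Unpacking into three names only works when the line really has three fields; for lines of varying length, keep the returned list as it is.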

Why wouldn't split work?

Here is a nice one-liner:

s = "GETMOVIE#genre:Animation&year:2000-2010&country:Russia"
d = dict(p.split(':', 1) for p in s.partition("#")[2].split("&"))
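print(d)  # {'genre': 'Animation', 'year': '2000-2010', 'country': 'Russia'}

Another option is a regular expression:

import re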

line = 'GETMOVIE#genre:Action&year:1990-2007&country:USA'
pattern = r'^GETMOVIE#genre:(.+)&year:(.+)&country:(.+)$'
genre, year, country = re.match(pattern, line).groups()
print(genre, year, country)  # Action 1990-2007 USA
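
One caveat with a regex approach like this: re.match returns None for a line that does not fit the pattern, so calling .groups() on the result raises AttributeError. The sketch below is one possible way to guard against that. The names MOVIE_RE and parse_movie are hypothetical, and the pattern uses [^&]+ instead of .+ on the assumption that values never contain '&'.

```python
import re

# Hypothetical pattern: [^&]+ stops each value at the next '&'.
MOVIE_RE = re.compile(
    r'^GETMOVIE#genre:(?P<genre>[^&]+)'
    r'&year:(?P<year>[^&]+)'
    r'&country:(?P<country>[^&]+)$'
)


def parse_movie(line):
    """Return a dict of the three fields, or None if the line does not match."""
    match = MOVIE_RE.match(line)
    return match.groupdict() if match else None


print(parse_movie('GETMOVIE#genre:X&year:Y&country:Z'))
# {'genre': 'X', 'year': 'Y', 'country': 'Z'}
print(parse_movie('not a movie line'))  # None
```

Named groups keep the result readable if the field order in the pattern ever changes, at the cost of a longer pattern.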
