Python dictionary question: grouping by elements of a tuple key

Published 2024-03-28 13:28:19

So I have a dictionary like the one below, with 4-element tuples as keys and lists of lists as the corresponding values. (yay indexing)

{('A002', 'R051', '02-00-00', 'LEXINGTON AVE'): [[datetime.datetime(2015, 6, 20, 0, 0),
                                                  750],
                                                 [datetime.datetime(2015, 6, 21, 0, 0),
                                                  576],
                                                 [datetime.datetime(2015, 6, 22, 0, 0),
                                                  1486],
                                                 [datetime.datetime(2015, 6, 23, 0, 0),
                                                  595],
                                                 [datetime.datetime(2015, 6, 24, 0, 0),
                                                  841],
                                                 [datetime.datetime(2015, 6, 25, 0, 0),
                                                  1072],
                                                 [datetime.datetime(2015, 6, 26, 0, 0),
                                                  1049]],
 ('A002', 'R051', '02-00-01', 'LEXINGTON AVE'): [[datetime.datetime(2015, 6, 20, 0, 0),
                                                  670],
                                                 [datetime.datetime(2015, 6, 21, 0, 0),
                                                  457],
                                                 [datetime.datetime(2015, 6, 22, 0, 0),
                                                  1189],
                                                 [datetime.datetime(2015, 6, 23, 0, 0),
                                                  505],
                                                 [datetime.datetime(2015, 6, 24, 0, 0),
                                                  665],
                                                 [datetime.datetime(2015, 6, 25, 0, 0),
                                                  354],
                                                 [datetime.datetime(2015, 6, 26, 0, 0),
                                                  651]]}

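I want to modify this dictionary so that the values of all keys sharing the same first, second, and fourth tuple elements are merged (like the two keys above). I want to combine those key tuples into a single key tuple (so my combined key would be ('A002', 'R051', 'LEXINGTON AVE')) and merge the values. Is this possible in Python?

For example, the first value would be [datetime.datetime(2015, 6, 20, 0, 0), 1420], which in this case is 670 + 750.

Thanks in advance.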


3 Answers

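Yes, this is very much possible using groupby and a dict comprehension, from Python 2.7 onwards. Note that groupby only merges adjacent items, so the items must be sorted by the grouping key first.

Example code: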

>>> from itertools import groupby
>>> import datetime
>>> d = {('A002', 'R051', '02-00-00', 'LEXINGTON AVE'): [[datetime.datetime(2015, 6, 20, 0, 0),
...                                                   750],
...                                                  [datetime.datetime(2015, 6, 21, 0, 0),
...                                                   576],
...                                                  [datetime.datetime(2015, 6, 22, 0, 0),
...                                                   1486],
...                                                  [datetime.datetime(2015, 6, 23, 0, 0),
...                                                   595],
...                                                  [datetime.datetime(2015, 6, 24, 0, 0),
...                                                   841],
...                                                  [datetime.datetime(2015, 6, 25, 0, 0),
...                                                   1072],
...                                                  [datetime.datetime(2015, 6, 26, 0, 0),
...                                                   1049]],
...  ('A002', 'R051', '02-00-01', 'LEXINGTON AVE'): [[datetime.datetime(2015, 6, 20, 0, 0),
...                                                   670],
...                                                  [datetime.datetime(2015, 6, 21, 0, 0),
...                                                   457],
...                                                  [datetime.datetime(2015, 6, 22, 0, 0),
...                                                   1189],
...                                                  [datetime.datetime(2015, 6, 23, 0, 0),
...                                                   505],
...                                                  [datetime.datetime(2015, 6, 24, 0, 0),
...                                                   665],
...                                                  [datetime.datetime(2015, 6, 25, 0, 0),
...                                                   354],
...                                                  [datetime.datetime(2015, 6, 26, 0, 0),
...                                                   651]]}
>>>
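>>> key3 = lambda item: (item[0][0], item[0][1], item[0][3])
>>> # groupby only merges adjacent items, so sort by the same key first
>>> newd = {k: [z for _, v in g for z in v] for k, g in groupby(sorted(d.items(), key=key3), key=key3)}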
>>> newd
{('A002', 'R051', 'LEXINGTON AVE'): [[datetime.datetime(2015, 6, 20, 0, 0), 750], [datetime.datetime(2015, 6, 21, 0, 0), 576], [datetime.datetime(2015, 6, 22, 0, 0), 1486], [datetime.datetime(2015, 6, 23, 0, 0), 595], [datetime.datetime(2015, 6, 24, 0, 0), 841], [datetime.datetime(2015, 6, 25, 0, 0), 1072], [datetime.datetime(2015, 6, 26, 0, 0), 1049], [datetime.datetime(2015, 6, 20, 0, 0), 670],
[datetime.datetime(2015, 6, 21, 0, 0), 457], [datetime.datetime(2015, 6, 22, 0, 0), 1189], [datetime.datetime(2015, 6, 23, 0, 0), 505], [datetime.datetime(2015, 6, 24, 0, 0), 665], [datetime.datetime(2015, 6, 25, 0, 0), 354], [datetime.datetime(2015, 6, 26, 0, 0), 651]]}

I added one more key to your dictionary, just to make the solution a bit clearer. Here is my input.

t = {('A002', 'R051', '02-00-00', 'LEXINGTON AVE'): [[datetime.datetime(2015, 6, 20, 0, 0),
                                                      750],
                                                     [datetime.datetime(2015, 6, 21, 0, 0),
                                                      576],
                                                     [datetime.datetime(2015, 6, 22, 0, 0),
                                                      1486],
                                                     [datetime.datetime(2015, 6, 23, 0, 0),
                                                      595],
                                                     [datetime.datetime(2015, 6, 24, 0, 0),
                                                      841],
                                                     [datetime.datetime(2015, 6, 25, 0, 0),
                                                      1072],
                                                     [datetime.datetime(2015, 6, 26, 0, 0),
                                                      1049]],
     ('A002', 'R051', '02-00-01', 'LEXINGTON AVE'): [[datetime.datetime(2015, 6, 20, 0, 0),
                                                      670],
                                                     [datetime.datetime(2015, 6, 21, 0, 0),
                                                      457],
                                                     [datetime.datetime(2015, 6, 22, 0, 0),
                                                      1189],
                                                     [datetime.datetime(2015, 6, 23, 0, 0),
                                                      505],
                                                     [datetime.datetime(2015, 6, 24, 0, 0),
                                                      665],
                                                     [datetime.datetime(2015, 6, 25, 0, 0),
                                                      354],
                                                     [datetime.datetime(2015, 6, 26, 0, 0),
                                                      651]],
     ('A002', 'R051', '02-00-01', 'LEXINGTON LANE'): [[datetime.datetime(2015, 6, 20, 0, 0),
                                                      670],
                                                     [datetime.datetime(2015, 6, 21, 0, 0),
                                                      457],
                                                     [datetime.datetime(2015, 6, 22, 0, 0),
                                                      1189],
                                                     [datetime.datetime(2015, 6, 23, 0, 0),
                                                      505],
                                                     [datetime.datetime(2015, 6, 24, 0, 0),
                                                      665],
                                                     [datetime.datetime(2015, 6, 25, 0, 0),
                                                      354],
                                                     [datetime.datetime(2015, 6, 26, 0, 0),
                                                      651]]}

Now, you can do this.

^{pr2}$
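A minimal sketch of what this grouping step could look like. The helper `short_key`, the name `grouped_keys`, and the shortened data with string dates are assumptions made for illustration, not taken from the answer.

```python
import itertools

# Hypothetical, shortened stand-in for the dictionary `t` above
# (string dates instead of datetime objects, one entry per key).
t = {
    ('A002', 'R051', '02-00-00', 'LEXINGTON AVE'): [['2015-06-20', 750]],
    ('A002', 'R051', '02-00-01', 'LEXINGTON AVE'): [['2015-06-20', 670]],
    ('A002', 'R051', '02-00-01', 'LEXINGTON LANE'): [['2015-06-20', 670]],
}

def short_key(k):
    """Drop the third element of a 4-tuple key."""
    return (k[0], k[1], k[3])

# groupby only merges adjacent items, so the keys are sorted
# with the same function that is used for grouping.
groups = itertools.groupby(sorted(t, key=short_key), key=short_key)

# Materialize each group immediately: the inner iterator is no longer
# usable once groupby has advanced to the next group.
grouped_keys = {k1: list(v) for k1, v in groups}
```

Here `grouped_keys` maps each 3-tuple to the list of original 4-tuple keys that collapse onto it.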

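This sorts the dictionary's keys and groups them, producing a sequence of pairs. The first item of each pair is the new unique key (a 3-tuple), and the second is an iterator over all the original keys that fall into that group. Now you can "compress" the dictionary like this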

^{3}$
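As a hedged sketch, the compression step might be written as follows. The names `k1`, `k2`, `v`, `t`, and `compressed` follow the surrounding text; `short_key` and the shortened string-dated data are assumptions for illustration.

```python
import itertools

# Hypothetical, shortened stand-in for the dictionary `t` above.
t = {
    ('A002', 'R051', '02-00-00', 'LEXINGTON AVE'): [['2015-06-20', 750], ['2015-06-21', 576]],
    ('A002', 'R051', '02-00-01', 'LEXINGTON AVE'): [['2015-06-20', 670], ['2015-06-21', 457]],
    ('A002', 'R051', '02-00-01', 'LEXINGTON LANE'): [['2015-06-20', 670]],
}

def short_key(k):
    return (k[0], k[1], k[3])

groups = itertools.groupby(sorted(t, key=short_key), key=short_key)

# sum(..., []) starts from an empty list and applies + to each value list,
# concatenating the lists of every original key k2 in the group under k1.
compressed = {k1: sum([t[k2] for k2 in v], []) for k1, v in groups}
```

Concatenating with `sum` copies the accumulated list at every step, so for many long lists `itertools.chain.from_iterable` is usually the cheaper way to flatten.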

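This essentially takes each pair from the list of groups. For each pair, it uses the first element as the key (k1) and uses sum to combine into a single list all the entries of t whose keys map to k1. That is what t[k2] for k2 in v collects; sum simply merges all of those into one list.

Here is the result.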

{('A002', 'R051', 'LEXINGTON AVE'): [[datetime.datetime(2015, 6, 20, 0, 0),
                                      750],
                                     [datetime.datetime(2015, 6, 21, 0, 0),
                                      576],
                                     [datetime.datetime(2015, 6, 22, 0, 0),
                                      1486],
                                     [datetime.datetime(2015, 6, 23, 0, 0),
                                      595],
                                     [datetime.datetime(2015, 6, 24, 0, 0),
                                      841],
                                     [datetime.datetime(2015, 6, 25, 0, 0),
                                      1072],
                                     [datetime.datetime(2015, 6, 26, 0, 0),
                                      1049],
                                     [datetime.datetime(2015, 6, 20, 0, 0),
                                      670],
                                     [datetime.datetime(2015, 6, 21, 0, 0),
                                      457],
                                     [datetime.datetime(2015, 6, 22, 0, 0),
                                      1189],
                                     [datetime.datetime(2015, 6, 23, 0, 0),
                                      505],
                                     [datetime.datetime(2015, 6, 24, 0, 0),
                                      665],
                                     [datetime.datetime(2015, 6, 25, 0, 0),
                                      354],
                                     [datetime.datetime(2015, 6, 26, 0, 0),
                                      651]],
 ('A002', 'R051', 'LEXINGTON LANE'): [[datetime.datetime(2015, 6, 20, 0, 0),
                                       670],
                                      [datetime.datetime(2015, 6, 21, 0, 0),
                                       457],
                                      [datetime.datetime(2015, 6, 22, 0, 0),
                                       1189],
                                      [datetime.datetime(2015, 6, 23, 0, 0),
                                       505],
                                      [datetime.datetime(2015, 6, 24, 0, 0),
                                       665],
                                      [datetime.datetime(2015, 6, 25, 0, 0),
                                       354],
                                      [datetime.datetime(2015, 6, 26, 0, 0),
                                       651]]}

Now we need to combine these values by date. We can write a simple function combine, as follows

import itertools

def combine(l):
    # Sort by date first: groupby only merges adjacent items.
    t = itertools.groupby(sorted(l, key=lambda v:v[0]), lambda v:v[0])
    return [[k,sum(m[1] for m in v)] for k,v in t]

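This repeats the process above on a list of 2-element pairs. It groups by the first element, then sums the second elements of each subgroup, and collects the results into a list.

Finally, to get the final result, you can simply apply combine to every value of the compressed dictionary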

final = {k:combine(v) for k,v in compressed.items()}

Here is the result

pprint.pprint(final)

{('A002', 'R051', 'LEXINGTON AVE'): [[datetime.datetime(2015, 6, 20, 0, 0),
                                      1420],
                                     [datetime.datetime(2015, 6, 21, 0, 0),
                                      1033],
                                     [datetime.datetime(2015, 6, 22, 0, 0),
                                      2675],
                                     [datetime.datetime(2015, 6, 23, 0, 0),
                                      1100],
                                     [datetime.datetime(2015, 6, 24, 0, 0),
                                      1506],
                                     [datetime.datetime(2015, 6, 25, 0, 0),
                                      1426],
                                     [datetime.datetime(2015, 6, 26, 0, 0),
                                      1700]],
 ('A002', 'R051', 'LEXINGTON LANE'): [[datetime.datetime(2015, 6, 20, 0, 0),
                                       670],
                                      [datetime.datetime(2015, 6, 21, 0, 0),
                                       457],
                                      [datetime.datetime(2015, 6, 22, 0, 0),
                                       1189],
                                      [datetime.datetime(2015, 6, 23, 0, 0),
                                       505],
                                      [datetime.datetime(2015, 6, 24, 0, 0),
                                       665],
                                      [datetime.datetime(2015, 6, 25, 0, 0),
                                       354],
                                      [datetime.datetime(2015, 6, 26, 0, 0),
                                       651]]}

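Although I love how concise itertools is, non-trivial expressions usually exceed the limits of my limited brain. I often break things down into multiple expressions like this, which makes them easier to read, understand, and debug.

So, finally, you can accomplish the whole task with the code below.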

^{8}$
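Putting the steps together, a sketch under stated assumptions might look like this. `combine`, `compressed`, and `final` follow the text above; the wrapper `merge_by_short_key`, the helper `short_key`, and the two-day sample data are hypothetical.

```python
import datetime
import itertools

def short_key(k):
    """Keep the first, second, and fourth elements of a 4-tuple key."""
    return (k[0], k[1], k[3])

def combine(pairs):
    """Sum the counts of all [date, count] pairs that share a date."""
    by_date = lambda p: p[0]
    grouped = itertools.groupby(sorted(pairs, key=by_date), key=by_date)
    return [[date, sum(p[1] for p in group)] for date, group in grouped]

def merge_by_short_key(t):
    # Step 1: group the original keys by their shortened form.
    groups = itertools.groupby(sorted(t, key=short_key), key=short_key)
    # Step 2: concatenate the value lists of every key in a group.
    compressed = {k1: sum([t[k2] for k2 in v], []) for k1, v in groups}
    # Step 3: add up the counts per date inside each merged list.
    return {k: combine(v) for k, v in compressed.items()}

d20 = datetime.datetime(2015, 6, 20)
d21 = datetime.datetime(2015, 6, 21)
t = {
    ('A002', 'R051', '02-00-00', 'LEXINGTON AVE'): [[d20, 750], [d21, 576]],
    ('A002', 'R051', '02-00-01', 'LEXINGTON AVE'): [[d20, 670], [d21, 457]],
    ('A002', 'R051', '02-00-01', 'LEXINGTON LANE'): [[d20, 670], [d21, 457]],
}
final = merge_by_short_key(t)
```

With this sample, the LEXINGTON AVE entry for 2015-06-20 comes out as 750 + 670 = 1420, matching the figure asked for in the question.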

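From an efficiency standpoint, I don't like this kind of solution. It iterates over the keys and values several times. Perhaps you could keep the various elements in a more suitable data structure as you obtain them; e.g., the list of datetime objects and values could be a collections.Counter, with the datetime as the key and the number as the value.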
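To illustrate the Counter idea, here is a small sketch. It assumes the per-key values are stored as Counter objects from the start; the variable names and figures are only for illustration.

```python
import datetime
from collections import Counter

d20 = datetime.datetime(2015, 6, 20)
d21 = datetime.datetime(2015, 6, 21)

# Each value is a Counter keyed by datetime instead of a list of pairs.
a = Counter({d20: 750, d21: 576})
b = Counter({d20: 670, d21: 457})

# Adding two Counters sums the counts of matching keys.
total = a + b
```

One caveat of this design: `+` on Counters drops entries whose resulting count is zero or negative, so `Counter.update` is the safer choice if such values can occur.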

Yes, just go ahead and build a new dictionary. Assuming the data above is stored in data, we will create a dictionary called short_data:

short_data = {}
for key, value in data.items():
    short_key = (key[0], key[1], key[3])
    if short_key in short_data:
        short_data[short_key].extend(value)
    else:
        short_data[short_key] = list(value)  # copy, so extend() never mutates the lists in data

Or, if you don't mind using defaultdict, you can shorten it:

^{pr2}$
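One way the defaultdict version could be written, as a sketch. `data` and `short_data` follow the text above; the shortened sample data is an assumption for illustration.

```python
import datetime
from collections import defaultdict

d20 = datetime.datetime(2015, 6, 20)
d21 = datetime.datetime(2015, 6, 21)
# Hypothetical, shortened stand-in for `data`.
data = {
    ('A002', 'R051', '02-00-00', 'LEXINGTON AVE'): [[d20, 750], [d21, 576]],
    ('A002', 'R051', '02-00-01', 'LEXINGTON AVE'): [[d20, 670], [d21, 457]],
}

# defaultdict(list) creates a fresh empty list the first time a key is seen,
# so the if/else branch is no longer needed.
short_data = defaultdict(list)
for key, value in data.items():
    short_data[(key[0], key[1], key[3])].extend(value)
```

Because every merged list starts out as a new empty list, the lists inside `data` are left untouched.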

If you want to combine the values by adding them up, I suggest using Counter

^{3}$
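A sketch of how Counter could be used for the summing variant. The name `summed` and the shortened sample data are assumptions for illustration; `data` follows the text above.

```python
import datetime
from collections import Counter, defaultdict

d20 = datetime.datetime(2015, 6, 20)
d21 = datetime.datetime(2015, 6, 21)
# Hypothetical, shortened stand-in for `data`.
data = {
    ('A002', 'R051', '02-00-00', 'LEXINGTON AVE'): [[d20, 750], [d21, 576]],
    ('A002', 'R051', '02-00-01', 'LEXINGTON AVE'): [[d20, 670], [d21, 457]],
}

# One Counter per shortened key; missing dates start at 0.
summed = defaultdict(Counter)
for key, value in data.items():
    counter = summed[(key[0], key[1], key[3])]
    for date, count in value:
        counter[date] += count
```

This does the key merge and the per-date addition in a single pass over `data`, without any sorting.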
