Is there any functional difference between zip and izip when iterating?

Posted 2024-04-20 06:36:24


I have code like the following:

for v1, v2 in zip(iter1, iter2):
   print len(v1) # prints 0

But when I replace zip with itertools.izip, it prints 1:

for v1, v2 in izip(iter1, iter2):
   print len(v1) # prints 1

The rest of the code is identical. I only replaced zip with izip, and it worked. The output with izip is correct.

Edit: adding the entire code:

#!/bin/python

"""
How to use:
>>> from color_assign import Bag, assign_colors
>>> from pprint import pprint
>>> old_topics = set([
... Bag(name='T1', group=0, color=1, count=16000),
... Bag(name='T2', group=0, color=1, count=16000),
... Bag(name='T3', group=1, color=2, count=16000),
... Bag(name='T4', group=2, color=3, count=16000),
... ])
>>> new_topics = set([
... Bag(name='T1', group=0, color=None, count=16000),
... Bag(name='T2', group=4, color=None, count=16000),
... Bag(name='T3', group=1, color=None, count=16000),
... Bag(name='T4', group=1, color=None, count=16000),
... ])
>>> color_ranges = [ [1,10] ]
>>> assign_colors(old_topics, new_topics, color_ranges)
>>> pprint(sorted(new_topics, key=attrgetter('name')))
[Bag(name=T1, group=0, color=1, count=16000),
 Bag(name=T2, group=4, color=3, count=16000),
 Bag(name=T3, group=1, color=2, count=16000),
 Bag(name=T4, group=1, color=2, count=16000)]
>>> 
"""

from itertools import groupby, izip
from operator import attrgetter

class Bag:
  def __init__(self, name, group, color=None, count=None):
    self.name  = name 
    self.group = group
    self.color    = color   
    self.count  = count 
  def __repr__(self):
    return "Bag(name={self.name}, group={self.group}, color={self.color}, count={self.count})".format(self=self)
  def __key(self):
    return self.name
  def __hash__(self):
    return hash(self.__key())
  def __eq__(self, other):
    return type(self) is type(other) and self.__key() == other.__key()

def color_range_gen(color_ranges, used_colors):
  color_ranges = sorted(color_ranges)
  color_iter = iter(sorted(used_colors))
  next_used = next(color_iter, None)
  for start_color, end_color in color_ranges:
    cur_color = start_color
    end_color = end_color
    while cur_color <= end_color:
      if cur_color == next_used:
        next_used = next(color_iter, None)
      else:
        yield cur_color
      cur_color = cur_color + 1


def assign_colors(old_topics, new_topics, color_ranges):
  old_topics -= (old_topics-new_topics) #Remove topics from old_topics which are no longer present in new_topics
  used_colors = set()

  def group_topics(topics):
    by_group = attrgetter('group')
    for _, tgrp in groupby(sorted(topics, key=by_group), by_group):
      yield tgrp

  for topic_group in group_topics(old_topics):
    oldtset = frozenset(topic_group)
    peek = next(iter(oldtset))
    try:
      new_group = next(topic.group for topic in new_topics if topic.name == peek.name and not topic.color)
    except StopIteration:
      continue
    newtset = frozenset(topic for topic in new_topics if topic.group == new_group)
    if oldtset <= newtset:
      for topic in newtset:
        topic.color = peek.color
      used_colors.add(peek.color)

  free_colors = color_range_gen(color_ranges, used_colors)
  unassigned_topics = (t for t in new_topics if not t.color)
  for tset, color in zip(group_topics(unassigned_topics), free_colors):
    for topic in tset:
      topic.color = color

if __name__ == '__main__':
  import doctest
  doctest.testmod()

Usage:

my_host:my_dir$ /tmp/color_assign.py
**********************************************************************
File "/tmp/color_assign.py", line 21, in __main__
Failed example:
    pprint(sorted(new_topics, key=attrgetter('name')))
Expected:
    [Bag(name=T1, group=0, color=1, count=16000),
     Bag(name=T2, group=4, color=3, count=16000),
     Bag(name=T3, group=1, color=2, count=16000),
     Bag(name=T4, group=1, color=2, count=16000)]
Got:
    [Bag(name=T1, group=0, color=None, count=16000),
     Bag(name=T2, group=4, color=3, count=16000),
     Bag(name=T3, group=1, color=2, count=16000),
     Bag(name=T4, group=1, color=2, count=16000)]
**********************************************************************
1 items had failures:
   1 of   7 in __main__
***Test Failed*** 1 failures.
my_host:my_dir$ sed -i 's/zip(/izip(/g' /tmp/color_assign.py
my_host:my_dir$ /tmp/color_assign.py
my_host:my_dir$

Update: the problem is that, when zip is used, groupby invalidates the earlier group iterators.


2 Answers

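The problem you are running into is caused by two factors working together. First, zip (in Python 2) consumes all of its arguments immediately to build a list, whereas izip only fetches the underlying items as they are needed. Second, when a groupby object is advanced, the previous iterators are no longer valid. As the itertools documentation puts it: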

The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list:
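
The effect described in that quote can be reproduced in isolation. The following is a minimal sketch, not taken from the question's code: the `data` list and the variable names are made up for illustration. It compares reading each group while it is current against collecting all the group iterators first, which is what an eager zip effectively does.

```python
from itertools import groupby

# Illustrative input only; any pre-sorted sequence would do.
data = [1, 1, 2, 2, 3]

# Lazy use: each group is read while it is still the current group.
lazy_sizes = [len(list(group)) for _, group in groupby(data)]

# Eager use: storing the group iterators first advances groupby() to the
# end, so the earlier groups have nothing left to yield when read later.
stale_groups = [group for _, group in groupby(data)]
eager_sizes = [len(list(group)) for group in stale_groups]

print(lazy_sizes)   # [2, 2, 1]
print(eager_sizes)  # the earlier groups report a size of 0
```

What the final stored group yields may differ between interpreter versions, so the sketch only relies on the earlier groups being empty.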

As a simple fix, you can change group_topics to call list on its groups before yielding them.
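
A sketch of that fix, under some assumptions: `Topic` is a hypothetical stand-in for the question's `Bag` class, and `group_topics` is written as a standalone function rather than nested inside `assign_colors`. The only change to the function is `yield list(tgrp)`.

```python
from collections import namedtuple
from itertools import groupby
from operator import attrgetter

# Hypothetical stand-in for the question's Bag class, for illustration only.
Topic = namedtuple('Topic', ['name', 'group'])

def group_topics(topics):
    by_group = attrgetter('group')
    for _, tgrp in groupby(sorted(topics, key=by_group), by_group):
        # Copy the group into a list before groupby() moves on,
        # so it stays valid no matter when the caller reads it.
        yield list(tgrp)

topics = [Topic('T1', 0), Topic('T2', 4), Topic('T3', 1), Topic('T4', 1)]

# Consuming the generator eagerly (as Python 2's zip does) is now safe.
groups = list(group_topics(topics))
names = [[t.name for t in g] for g in groups]
print(names)  # [['T1'], ['T3', 'T4'], ['T2']]
```

The cost is that each group is held in memory as a list, which is usually acceptable when the groups are small.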

Yes, they produce the same output. The only difference is that zip builds a list in memory, whereas izip returns an iterator.

>>> from itertools import izip

>>> zip(range(5), 'abcde')
[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (4, 'e')]

>>> it = izip(range(5), 'abcde')
>>> it
<itertools.izip object at 0xa660fcc>
>>> next(it)
(0, 'a')
>>> next(it)
(1, 'b')

Note that izip was removed in Python 3, where zip returns an iterator.
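
For code that has to run on both versions, one common pattern is to fall back to the built-in zip when izip is unavailable. This is a sketch of that idiom, not something from the original answer:

```python
try:
    from itertools import izip  # Python 2: lazy variant of zip
except ImportError:
    izip = zip  # Python 3: the built-in zip is already lazy

pairs = izip(range(5), 'abcde')
first = next(pairs)
print(first)  # (0, 'a')
```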
