Comparing equal-length lists to find the positions of identical elements

1 vote
4 answers
1925 views
Asked 2025-04-16 15:49

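I want to compare several lists that have the same length but different contents. My script should return only the positions at which the element is identical in every list.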

For example:

l = [[1,2,3,4,5,6,7,8],[9,8,8,4,3,4,5,7,8],[5,6,7,4,9,9,9,8],[0,0,1,4,7,6,3,8]]

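In the end I get the list of positions p = [3,7], because every list has the element '4' at position 3 and the element '8' at position 7.

The elements could also be strings; I am only using integers as an example. Thanks for your help!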

4 Answers

0
li = [[1,2,3,4,5,6,7,8],[9,8,8,4,3,6,5,8],[5,6,7,4,9,9,9,8],[0,0,1,4,7,6,3,8]]

first = li[0]
r = range(len(first))
for current in li[1:]:
    r = [ i for i in r if current[i]==first[i]]

print([first[i] for i in r])

Result

[4, 8]
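The snippet above prints the matching values, while the question asks for positions; in that code the positions are what remains in r after the loop. As a minimal sketch of the same filtering idea written for Python 3, the helper below (the name common_positions is hypothetical, not from the answer) returns the indices and assumes a non-empty list of equal-length lists:

```python
def common_positions(lists):
    """Return the indices at which every list holds the same element."""
    first = lists[0]
    # Start with every index, then keep only those that still match.
    candidates = range(len(first))
    for current in lists[1:]:
        candidates = [i for i in candidates if current[i] == first[i]]
    return list(candidates)


li = [[1, 2, 3, 4, 5, 6, 7, 8],
      [9, 8, 8, 4, 3, 6, 5, 8],
      [5, 6, 7, 4, 9, 9, 9, 8],
      [0, 0, 1, 4, 7, 6, 3, 8]]

positions = common_positions(li)
print(positions)                      # [3, 7]
print([li[0][i] for i in positions])  # [4, 8]
```

Because each pass only re-checks indices that survived the previous pass, the work shrinks quickly when few positions match.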

.

Comparing execution times:

from time import clock

li = [[1,2,3,4,5,6,7,8,9,10],
      [9,8,8,4,5,6,5,8,9,13],
      [5,6,7,4,9,9,9,8,9,12],
      [0,0,1,4,7,6,3,8,9,5]]

n = 10000

te = clock()
for turn in xrange(n):
    first = li[0]
    r = range(len(first))
    for current in li[1:]:
        r = [ i for i in r if current[i]==first[i]]
    x = [first[i] for i in r]
t1 = clock()-te
print 't1 =',t1
print x


te = clock()
for turn in xrange(n):
    y = [j[0] for i, j in enumerate(zip(*li)) if all(j[0]==k for k in j[1:])] 
t2 = clock()-te
print 't2 =',t2
print y

print 't2/t1 =',t2/t1
print

Result

t1 = 0.176347273187
[4, 8, 9]
t2 = 0.579408755442
[4, 8, 9]
t2/t1 = 3.28561221827

.

li = [[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,2,22,26,24,25],
      [9,8,8,4,5,6,5,8,9,13,18,12,15,14,15,15,4,16,19,20,2,158,35,24,13],
      [5,6,7,4,9,9,9,8,9,12,45,12,4,19,15,20,24,18,19,20,2,58,23,24,25],
      [0,0,1,4,7,6,3,8,9,5,12,12,12,15,15,15,5,3,14,20,9,18,28,24,14]]

Result

t1 = 0.343173188632
[4, 8, 9, 12, 15, 20, 24]
t2 = 1.21259110432
[4, 8, 9, 12, 15, 20, 24]
t2/t1 = 3.53346690385
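The timing script above relies on Python 2 constructs (print statements, xrange, time.clock), and time.clock no longer exists in recent Python 3 releases. For anyone who wants to repeat the comparison today, here is a rough sketch using the standard timeit module. The function names by_filtering and by_zip_all are made up for this example, and the timings will differ from the figures quoted above, so no expected numbers are given.

```python
import timeit

li = [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
      [9, 8, 8, 4, 5, 6, 5, 8, 9, 13],
      [5, 6, 7, 4, 9, 9, 9, 8, 9, 12],
      [0, 0, 1, 4, 7, 6, 3, 8, 9, 5]]


def by_filtering(lists):
    # Narrow the list of candidate indices one list at a time.
    first = lists[0]
    r = range(len(first))
    for current in lists[1:]:
        r = [i for i in r if current[i] == first[i]]
    return [first[i] for i in r]


def by_zip_all(lists):
    # Transpose with zip and test each column with all().
    return [j[0] for j in zip(*lists) if all(j[0] == k for k in j[1:])]


# Both approaches must agree before timing them.
assert by_filtering(li) == by_zip_all(li)

n = 10000
t1 = timeit.timeit(lambda: by_filtering(li), number=n)
t2 = timeit.timeit(lambda: by_zip_all(li), number=n)
print('t1 =', t1)
print('t2 =', t2)
print('t2/t1 =', t2 / t1)
```

The ratio depends on the interpreter version and on how many positions match, so it is worth measuring on data that resembles the real input.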
2

I like eumiro's solution, but I use a set instead.

p = [i for i, j in enumerate(zip(*l)) if len(set(j)) == 1]
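The set-based test works for any hashable elements, including the strings mentioned in the question. A small sketch with made-up string data, assuming Python 3:

```python
l = [['a', 'b', 'c', 'd'],
     ['x', 'b', 'y', 'd'],
     ['z', 'b', 'c', 'd']]

# A column whose set has exactly one member holds the same value in every list.
p = [i for i, column in enumerate(zip(*l)) if len(set(column)) == 1]
print(p)  # [1, 3]
```

One caveat: set() raises TypeError for unhashable elements such as nested lists, in which case the all()-based comparison is the safer choice.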
4
l = [[1,2,3,4,5,6,7,8],[9,8,8,4,3,4,5,7,8],[5,6,7,4,9,9,9,8],[0,0,1,4,7,6,3,8]]

p = [i for i, j in enumerate(zip(*l)) if all(j[0]==k for k in j[1:])]

# p == [3] - because of some typo in your original list, probably too many elements in the second list.

This is a shorthand (a list comprehension) for the more verbose version:

p = []
for i, j in enumerate(zip(*l)):
    if all(j[0]==k for k in j[1:]):
        p.append(i)

zip(*l) gives you:

[(1, 9, 5, 0),
 (2, 8, 6, 0),
 (3, 8, 7, 1),
 (4, 4, 4, 4),
 (5, 3, 9, 7),
 (6, 4, 9, 6),
 (7, 5, 9, 3),
 (8, 7, 8, 8)]
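Note that the output has only eight tuples even though the second list in the question has nine elements: zip stops at the shortest input. The sketch below, assuming Python 3 (where zip returns an iterator, hence the list() call), shows the truncation and one possible way to detect unequal lengths up front:

```python
l = [[1, 2, 3, 4, 5, 6, 7, 8],
     [9, 8, 8, 4, 3, 4, 5, 7, 8],   # nine elements, as in the question
     [5, 6, 7, 4, 9, 9, 9, 8],
     [0, 0, 1, 4, 7, 6, 3, 8]]

columns = list(zip(*l))   # materialise the iterator to inspect it
print(len(columns))       # 8: the ninth element of the second list is dropped

# Collect the distinct lengths; more than one means the input is ragged.
lengths = {len(row) for row in l}
if len(lengths) != 1:
    print('lists differ in length:', sorted(lengths))
```

Checking the lengths first makes a typo like this one visible instead of silently shifting or dropping elements.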

enumerate() numbers the tuples in that list, starting from 0: 0, 1, 2, and so on.

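all(j[0]==k for k in j[1:]) compares the first element of the tuple with each of the remaining elements. It returns True if they are all equal and False if any one differs. Because it returns False as soon as it finds the first differing element, it skips the remaining comparisons for that tuple.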
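The early exit of all() can be observed directly. In this illustrative sketch the helper same_as_first and the list checked are invented for the demonstration; they record which elements were actually compared for one of the tuples shown above:

```python
column = (5, 3, 9, 7)
checked = []


def same_as_first(value):
    checked.append(value)  # record every element that gets compared
    return value == column[0]


result = all(same_as_first(k) for k in column[1:])
print(result)   # False
print(checked)  # [3]
```

Only the first mismatch (3) is examined; 9 and 7 are never compared, because the generator expression is consumed lazily.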
