查找数组中恰好出现一次的元素

2 投票

4 回答

2629 浏览

提问于 2025-04-17 12:55

我有一个二维数组。在这个数组中，每一行都代表一个我关心的数量。我想做的是把所有出现恰好一次的行放到一个数组里，把所有出现超过一次的行放到另一个数组里。

举个例子，如果这个数组是：

a=[[1,1,1,0], [1,1,1,0], [5,1,6,0], [3,2,1,0], [4,4,1,0], [5,1,6,0]]

我想返回两个数组：

nonsingles=[[1,1,1,0], [1,1,1,0], [5,1,6,0], [5,1,6,0]]
singles= [[3,2,1,0], [4,4,1,0]]

保持顺序很重要。我写的代码是这样的：

def singles_nonsingles(array):
#returns the elements that occur only once, and the elements
#that occur more than once in the array
singles=[]
nonsingles=[]
arrayhash=map(tuple, array)

for x in arrayhash:
    if (arrayhash.count(x)==1):
        singles.append(x)

    if (arrayhash.count(x)>1):
        nonsingles.append(x)

nonsingles=array(nonsingles)
singles=array(singles)

return {'singles':singles, 'nonsingles':nonsingles}

现在，我很高兴地说这个代码可以工作，但也很不高兴地说它非常慢，因为我通常处理的数组是30000行，每行10个元素，总共是300000个元素。有没有人能给我一些加速的建议？如果这个问题很简单，我很抱歉，我刚开始学Python。另外，我在使用Python 2.7的Numpy/Scipy，如果这有帮助的话。

性能优化数据结构 numpy 二维数组数据分析数组处理元素计数行排序

4 个回答

第一个返回的是没有重复的单个数组或没有单个数组的列表，第二个则是包含重复的数组。

def comp (multi):
    from collections import defaultdict

    res = defaultdict(int)

    for vect in multi:
        res[tuple(vect)] += 1

    singles = []
    no_singles = []

    for k in res:
        if res[k] > 1:
            no_singles.append(list(k))
        elif res[k] == 1:
            singles.append(list(k))

    return singles, no_singles

def count_w_repetitions(multi):
    from collections import defaultdict

    res = defaultdict(int)

    for vect in multi:
        res[tuple(vect)] += 1

    singles = []
    no_singles = []

    for k in res:
        if res[k] == 1:
            singles.append(list(k))
        else:
            for i in xrange(res[k]):
                no_singles.append(list(k))


    return singles, no_singles

回答于 2025-04-17 由 Python大师

分享举报

我觉得你的问题在于你在一个 list 上做了 in 测试。这种方法的性能是 O(n)，也就是说，随着列表变大，速度会变得很慢。

更快的方法是先建立一个 dict，然后用这个字典来处理每一行的数据。

补充一下：代码里有一个不必要的 enumerate()，我把它去掉了。

from collections import defaultdict

def singles_nonsingles(array):
    #returns the elements that occur only once, and the elements
    #that occur more than once in the array
    singles=[]
    nonsingles=[]
    d = defaultdict(int)

    t = [tuple(row) for row in array]

    for row in t:
        d[row] += 1

    for row in t:
        if d[row] == 1:
            singles.append(row)
        else:
            nonsingles.append(row)

    return {'singles':singles, 'nonsingles':nonsingles}

这里有一个只返回唯一行的版本：

from collections import defaultdict

def singles_nonsingles(array):
    #returns the elements that occur only once, and the elements
    #that occur more than once in the array
    singles=[]
    nonsingles=[]
    d = defaultdict(int)
    already_seen = set()

    t = [tuple(row) for row in array]

    for row in t:
        d[row] += 1

    for row in t:
        if row in already_seen:
            continue
        if d[row] == 1:
            singles.append(row)
        else:
            nonsingles.append(row)
        already_seen.add(row)

    return {'singles':singles, 'nonsingles':nonsingles}


a=[[1,1,1,0], [1,1,1,0], [5,1,6,0], [3,2,1,0], [4,4,1,0], [5,1,6,0]]

x = singles_nonsingles(a)
print("Array: " + str(a))
print(x)

回答于 2025-04-17 由 Python大师

分享举报

在Python 2.7或更高版本中，你可以使用 collections.Counter 来统计某个东西出现的次数：

def unique_items(iterable):
    tuples = map(tuple, iterable)
    counts = collections.Counter(tuples)
    unique = []
    non_unique = []
    for t in tuples:
        if counts[t] == 1:
            unique.append(t)
        else:
            non_unique.append(t)
    return unique, non_unique

回答于 2025-04-17 由 Python大师

分享举报

查找数组中恰好出现一次的元素

4 个回答

撰写回答