Python - Average similar values in a list and create a new list of averages

3 votes
4 answers
1054 views
Asked 2025-04-18 15:43

I have some lists containing many numbers, which may be decimals. For example:

A = ['1', '1.01', '1.1', '2', '3', '3.2', '4', '5']

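Suppose I want to average the numbers that differ by less than 0.5, and build a new list from those averages together with the numbers that were left unaffected.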

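In my example, the numbers 1, 1.01 and 1.1 are all within 0.5 of each other, so the new list will contain their average, 1.04. Likewise, 3 and 3.2 average to 3.1, which also goes into the new list.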

So the final output would be:

B = [1.04, 2, 3.1, 4, 5]

There are also some special cases, such as this list:

C = [1.1, 1.2, 1.3, 1.4, 1.5, 1.6]

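Cases like this raise a question: do we average the first 5 elements, or the last 5? If possible, I would prefer left-to-right priority, that is, group the first 5 elements and keep the sixth as it is. That said, this situation rarely comes up in my data, because similar values are usually very close together. The code does not need to handle these cases unless that is necessary for it to work correctly.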
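To make the left-to-right rule concrete, here is a minimal sketch, assuming each group is anchored at its first element and a value joins the group only while it is less than the width away from that anchor. The function name group_left_to_right is made up for this illustration.

```python
from statistics import mean

def group_left_to_right(values, width=0.5):
    """Group sorted values; each group is anchored at its first element."""
    groups = []
    for value in sorted(values):
        # Join the current group only while the value is less than
        # `width` away from that group's first element.
        if groups and value - groups[-1][0] < width:
            groups[-1].append(value)
        else:
            groups.append([value])
    return groups

C = [1.1, 1.2, 1.3, 1.4, 1.5, 1.6]
groups = group_left_to_right(C)
averages = [round(mean(g), 2) for g in groups]
print(groups)    # [[1.1, 1.2, 1.3, 1.4, 1.5], [1.6]]
print(averages)  # [1.3, 1.6]
```

Under this assumption the first five elements are grouped and the sixth stays on its own, which matches the preferred behaviour described above.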

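So, what is the most efficient way to do this? I actually plan to use this to build light curves for different supernovae. If the time difference between two observations is below some threshold, I can treat them as a single observation made at the average of the two times.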
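For the light-curve use case, the same idea could be applied to paired data. The sketch below is only an illustration under stated assumptions: observations are (time, value) tuples, the sample numbers are made up, and the measured values are averaged directly, which may be a simplification for real photometry. The names merge_close_observations and max_gap are hypothetical.

```python
from statistics import mean

def _average(group):
    """Average the times and the values of one group of observations."""
    times, values = zip(*group)
    return (mean(times), mean(values))

def merge_close_observations(observations, max_gap=0.5):
    """Merge observations whose time is within max_gap of the group's first time."""
    merged, current = [], []
    for time, value in sorted(observations):
        # Close the current group once this time is max_gap or more
        # past the first time in the group.
        if current and time - current[0][0] >= max_gap:
            merged.append(_average(current))
            current = []
        current.append((time, value))
    if current:
        merged.append(_average(current))
    return merged

# Made-up sample data: (time, measured value).
observations = [(1.0, 15.0), (1.25, 15.5), (3.0, 14.0)]
merged = merge_close_observations(observations)
print(merged)  # [(1.125, 15.25), (3.0, 14.0)]
```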

I am fairly new to Python, and my attempts to solve this so far have not worked... Apologies if this question is too basic.

Thanks in advance!

4 Answers

0

I think @thefourtheye's answer is better than this one.

A = ['1', '1.01', '1.1', '2', '3', '3.2', '4', '5']
# change str to float and sort it.
a = sorted([float(v) for v in A])

averaged_start = a[0]

averaged_dict = {}
for value in a:
    if value - averaged_start < 0.5:
        averaged_dict.setdefault(averaged_start, []).append(value)
    else:
        averaged_start = value
        averaged_dict[averaged_start] = [averaged_start]

result = [round(sum(v)/len(v), 2) for k, v in averaged_dict.items()]
print(result)

Output:

[1.04, 2.0, 3.1, 4.0, 5.0]

0

Right after I posted the question, someone replied with the following code:

from collections import OrderedDict

A = [1, 1.01, 1.02, 2, 3, 4, 4, 4.1]

d = OrderedDict()

# Bucket each value by its integer multiple of 0.25.
for item in A:
    d.setdefault(int(item / 0.25), []).append(item)

# Average each bucket once, after the loop has finished.
A = [sum(item) / len(item) for item in d.values()]

print(A)
# [1.01, 2.0, 3.0, 4.033333333333333]

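So far this code has worked very well, although I have not tested every detail of it. Thanks to the person who posted it, even though they deleted it afterwards!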
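One detail worth checking in that snippet: it assigns each value to a fixed bucket, int(item / 0.25), rather than comparing a value with its neighbours. As a hedged illustration (the helper name bucket_average is made up here and is not part of the original reply), two values only 0.02 apart can land in different buckets when they straddle a bucket boundary:

```python
from collections import OrderedDict

def bucket_average(values, width=0.25):
    """Average values that fall into the same fixed-width bucket."""
    buckets = OrderedDict()
    for v in values:
        # The bucket index depends only on the value itself,
        # not on how close it is to other values.
        buckets.setdefault(int(v / width), []).append(v)
    return [sum(b) / len(b) for b in buckets.values()]

same_bucket = bucket_average([1.0, 1.125])  # both fall in bucket 4
split = bucket_average([0.24, 0.26])        # buckets 0 and 1
print(same_bucket)  # [1.0625]
print(split)        # [0.24, 0.26]
```

Whether this matters depends on the data; if similar values are always tightly clustered away from bucket edges, the fixed-bucket approach may be good enough.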

4

A = [1, 1.01, 1.1, 2, 3, 3.2, 4, 5]
groups, current_group, first = [], [], A[0]
for item in A:
    # Check if this element falls under the current group
    if item - first <= 0.5:
        current_group.append(item)
    else:
        # If it doesn't, create a new group and add old to the result
        groups.append(current_group[:])
        current_group, first = [item], item
# Add the last group which was being gathered to the result
groups.append(current_group[:])
print([sum(item) / len(item) for item in groups])
# [1.0366666666666666, 2.0, 3.1, 4.0, 5.0]

Once the groups are built, computing the averages is straightforward, as the final print statement above shows.

0

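Obviously, there are many ways to solve this. I would like to share a different approach that uses a function called grouper and the standard library. I also define a convenience function, average_similar. Here is a usage example: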

# Convert, sort and group.  Print generated groups.
A = ['1', '1.01', '1.1', '2', '3', '3.2', '4', '5']
a1 = sorted(float(f) for f in A)
g1 = grouper(a1)

print("Grouped A:", g1)
# Grouped A: [[1.0, 1.01, 1.1], [2.0], [3.0, 3.2], [4.0], [5.0]]


# Generate new list as average of each group.
g2 = (mean(g) for g in grouper(a1))
a2 = list(g2)

print("Averaged grouped A:", a2)
# Averaged grouped A: [1.0366666666666668, 2.0, 3.1, 4.0, 5.0]

print("Averaged grouped A:", average_similar(A, width=0.5))
# Averaged grouped A: [1.0366666666666668, 2.0, 3.1, 4.0, 5.0]


# Generate new list as rounded averages of each group.
g3 = (round(mean(g), 2) for g in grouper(a1))
a3 = list(g3)

print("Averaged grouped and rounded A:", a3)
# Averaged grouped and rounded A: [1.04, 2.0, 3.1, 4.0, 5.0]

print("Averaged grouped and rounded A:", average_similar(A, 0.5, 2))
# Averaged grouped and rounded A: [1.04, 2.0, 3.1, 4.0, 5.0]


# A more compact example given a list of numbers.
C = [1.1, 1.2, 1.3, 1.5, 1.6, 1.4]
# In-place sort.
C.sort()
lc = list(round(mean(g), 2) for g in grouper(C))
print("Average C", lc)
# Average C [1.3, 1.6]
print("Average C", average_similar(C, precision=2))
# Average C [1.3, 1.6]


# Another example as a one-liner.
D = ['1', '1.01', '1.1', '2', '3', '3.2', '4', '5', '5.1', '6', '2.5']
ld = list(round(mean(g), 2)
          for g in grouper(
                  sorted(float(f) for f in D)))
print("Average D", ld)
# Average D [1.04, 2.0, 2.5, 3.1, 4.0, 5.05, 6.0]
print("Average D", average_similar(D, width=0.5, precision=2))
# Average D [1.04, 2.0, 2.5, 3.1, 4.0, 5.05, 6.0]

These examples use the following code:

import itertools
from statistics import mean

def make_keyfcn(width=0.5):
    "Grouping criteria."
    key = None
    def keyfcn(x):
        """As long as x is < key, keep returning key.

        Update when x >= key.
        """
        nonlocal key
        if key is None:  # When called the first time.
            key = x + width
        elif x >= key:
            key = x + width
        return key
    return keyfcn

def grouper(iterable, criteria=None):
    if criteria is None:
        criteria = make_keyfcn()
    result = []
    for k, g in itertools.groupby(iterable, criteria):
        result.append(list(g))
    return result

# Defined after make_keyfcn, which is used as a default argument below.
def average_similar(iterable, width=0.5, precision=None, criteria=make_keyfcn):
    """Return a list where similar numbers have been averaged.

    Items are grouped using the supplied width and criteria and the
    result is rounded to precision if it is supplied.  Otherwise
    averages are not rounded.

    """
    lst = sorted(float(f) for f in iterable)
    g1 = (mean(g) for g in grouper(lst, criteria(width)))
    if precision is not None:
        g1 = (round(g, precision) for g in g1)
    return list(g1)
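A note on the design: the key function passed to itertools.groupby here is stateful, since it remembers the upper bound of the current window between calls. That is presumably why grouper builds a fresh key function for each call. The sketch below illustrates the effect with a stripped-down key factory; the name make_window_key is made up for this example and only mirrors the idea of make_keyfcn.

```python
import itertools

def make_window_key(width=0.5):
    """Return a stateful key: the upper bound of the current window."""
    upper = None
    def key(x):
        nonlocal upper
        # Open a new window on the first call or when x reaches the bound.
        if upper is None or x >= upper:
            upper = x + width
        return upper
    return key

data = [1.0, 1.1, 2.0]

key = make_window_key()
first_pass = [list(g) for _, g in itertools.groupby(data, key)]
print(first_pass)   # [[1.0, 1.1], [2.0]]

# Reusing the same key function: it still remembers the bound 2.5
# from the first pass, so every value now maps to the same key.
second_pass = [list(g) for _, g in itertools.groupby(data, key)]
print(second_pass)  # [[1.0, 1.1, 2.0]]
```

Creating a new key function per grouping pass avoids this carry-over of state.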
