Splitting intervals between two 2D numpy arrays that hold ranges

1 vote

1 answer

28 views

数据工程师

Asked 2025-04-14 15:58

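I have been struggling to write a function that works on two numpy arrays of intervals (a1 and a2), where the intervals lie in the range 0 to 6000. The intervals in a1 and a2 must not overlap: if a range appears in a1, it cannot appear in a2, and vice versa.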

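I need a function called shuffle that takes a given interval. Its job is to move the part of that interval that is currently in a1 over to a2, and the part that is currently in a2 over to a1.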
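
To make the "part that is in a1" idea concrete, here is a minimal sketch of how one array could be split against the given interval with plain numpy operations. The name split_by_interval and its return layout are assumptions made for illustration, not part of the original question.

```python
import numpy as np


def _nonempty(rows):
    # Keep only rows whose end lies strictly after their start.
    return rows[rows[:, 1] > rows[:, 0]]


def split_by_interval(a, lo, hi):
    """Hypothetical helper: split each [start, end] row of `a` against [lo, hi].

    Returns (inside, outside): the pieces that fall within [lo, hi] and the
    pieces that fall beyond it.
    """
    a = np.asarray(a, dtype=float).reshape(-1, 2)
    # Clipping both endpoints to [lo, hi] leaves exactly the overlapping part.
    inside = np.clip(a, lo, hi)
    # What remains to the left of lo and to the right of hi.
    left = np.column_stack([a[:, 0], np.minimum(a[:, 1], lo)])
    right = np.column_stack([np.maximum(a[:, 0], hi), a[:, 1]])
    outside = _nonempty(np.vstack([left, right]))
    outside = outside[np.argsort(outside[:, 0], kind="stable")]
    return _nonempty(inside), outside


inside, outside = split_by_interval([[1, 300], [500, 600], [5000, 6000]], 0.5, 1.5)
print(inside.tolist())   # [[1.0, 1.5]]
print(outside.tolist())  # [[1.5, 300.0], [500.0, 600.0], [5000.0, 6000.0]]
```

Under this sketch, the new a1 would be the outside pieces of a1 together with the inside pieces of a2, and the other way round for a2.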

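Finally, both 2D arrays need to be sorted and compressed so that they do not grow unnecessarily when intervals are adjacent. For example, the array [[0,1],[1,3]] should be compressed to [[0,3]].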
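
The compression step on its own can probably be vectorized. The following is a rough sketch, assuming the rows are [start, end] pairs with start <= end; the name compress is made up for this example.

```python
import numpy as np


def compress(intervals):
    """Hypothetical helper: sort rows by start and merge rows that touch or overlap."""
    a = np.asarray(intervals, dtype=float).reshape(-1, 2)
    if a.size == 0:
        return a
    a = a[np.argsort(a[:, 0], kind="stable")]
    # Furthest end seen so far, so that nested intervals are absorbed too.
    reach = np.maximum.accumulate(a[:, 1])
    # A new group starts wherever a start lies beyond everything before it.
    is_first = np.concatenate([[True], a[1:, 0] > reach[:-1]])
    # A group ends right before the next group starts (or at the last row).
    is_last = np.concatenate([is_first[1:], [True]])
    return np.column_stack([a[is_first, 0], reach[is_last]])


print(compress([[1, 3], [0, 1]]).tolist())  # [[0.0, 3.0]]
```

Because the comparison is strict (start > reach), intervals that merely touch, such as [0,1] and [1,3], are merged, which matches the example in the question.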

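I have tried different approaches but have not been able to find a vectorized solution. I have, however, written a unit test case that shows what the output should look like.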



import unittest

import numpy as np


class TestCases(unittest.TestCase):

    def test_invertRange(self):
        def shuffle(interval, a1, a2):
            # implement here
            return a1, a2

        a1 = np.array([[1, 300], [500, 600], [5000, 6000]])
        a2 = np.array([[0, 1], [300, 500], [600, 5000]])
        a1, a2 = shuffle(np.array([[0.5, 1.5]]), a1, a2)
        # assertEqual cannot compare arrays element-wise, so use numpy's own helper
        np.testing.assert_array_equal(a1, np.array([[0.5, 1], [1.5, 300], [500, 600], [5000, 6000]]))
        np.testing.assert_array_equal(a2, np.array([[0, 0.5], [1, 1.5], [300, 500], [600, 5000]]))
        suma1 = np.sum(a1[:, 1] - a1[:, 0])
        suma2 = np.sum(a2[:, 1] - a2[:, 0])
        total = suma1 + suma2
        # the interval lengths should always add up to the length of the range 0..6000
        self.assertEqual(total, 6000)
        # shuffling the same interval again should restore the original arrays
        a1, a2 = shuffle(np.array([[0.5, 1.5]]), a1, a2)
        np.testing.assert_array_equal(a1, np.array([[1, 300], [500, 600], [5000, 6000]]))
        np.testing.assert_array_equal(a2, np.array([[0, 1], [300, 500], [600, 5000]]))
        suma1 = np.sum(a1[:, 1] - a1[:, 0])
        suma2 = np.sum(a2[:, 1] - a2[:, 0])
        total = suma1 + suma2
        # the interval lengths should always add up to the length of the range 0..6000
        self.assertEqual(total, 6000)

numpy unit testing 2d arrays interval manipulation array compression vectorization data sorting function design

1 Answer

I'm not sure whether this is (easily) doable with pure numpy vectorization. Here is a plain Python version:

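def invert_interval(i1, i2, out_, in_):
    # i1 is an existing interval, i2 is the interval being shuffled.
    # Pieces of i1 that stay where they are go to out_,
    # pieces covered by i2 go to in_ (they move to the other list).
    a1, b1 = i1
    a2, b2 = i2

    if a2 < a1 and a1 <= b2 <= b1:
        # i2 overlaps the left end of i1
        out_.append([b2, b1])
        in_.append([a1, b2])
    elif b2 > b1 and a1 <= a2 <= b1:
        # i2 overlaps the right end of i1
        out_.append([a1, a2])
        in_.append([a2, b1])
    elif a2 <= a1 and b2 >= b1:
        # i2 covers i1 entirely
        in_.append([a1, b1])
    else:
        # no overlap; note that an i2 lying inside i1 without covering it
        # also ends up here and leaves i1 unchanged
        out_.append([a1, b1])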


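def shuffle(interval, a1, a2):
    # a1 and a2 are plain lists of [start, end] pairs and are updated in place
    to_a2 = []
    new_a1 = []

    to_a1 = []
    new_a2 = []

    for i in a1:
        invert_interval(i, interval, new_a1, to_a2)

    for i in a2:
        invert_interval(i, interval, new_a2, to_a1)

    # rebuild each list from what stayed plus what came over from the other one;
    # the result is sorted, but adjacent intervals are not merged
    a1[:] = sorted(new_a1 + to_a1)
    a2[:] = sorted(new_a2 + to_a2)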


a1 = [[1, 300], [500, 600], [5000, 6000]]
a2 = [[0, 1], [300, 500], [600, 5000]]

shuffle([0.5, 1.5], a1, a2)

print(a1)
print(a2)

Output:

[[0.5, 1], [1.5, 300], [500, 600], [5000, 6000]]
[[0, 0.5], [1, 1.5], [300, 500], [600, 5000]]
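
For comparison, a numpy-only variant seems possible if one assumes that a1 and a2 together cover the whole range without gaps, as they do in the question's example. The idea in this sketch is to cut the range at every endpoint, decide for each elementary segment which array owns it, flip the owner for segments inside the given interval, and then join neighbouring segments that have the same owner. The name shuffle_np is made up here, and it returns new arrays rather than updating in place.

```python
import numpy as np


def shuffle_np(interval, a1, a2):
    """Hypothetical sketch; assumes a1 and a2 together tile the range with no gaps."""
    a1 = np.asarray(a1, dtype=float).reshape(-1, 2)
    a2 = np.asarray(a2, dtype=float).reshape(-1, 2)
    lo, hi = np.ravel(interval)
    # Every endpoint, plus the two new cut points, sorted and de-duplicated.
    cuts = np.unique(np.concatenate([a1.ravel(), a2.ravel(), [lo, hi]]))
    starts, ends = cuts[:-1], cuts[1:]
    mids = (starts + ends) / 2
    # A segment belongs to a1 if its midpoint lies inside some a1 interval;
    # under the tiling assumption, everything else belongs to a2.
    in_a1 = ((mids[:, None] > a1[:, 0]) & (mids[:, None] < a1[:, 1])).any(axis=1)
    # Segments inside the given interval change owner.
    owner = in_a1 ^ ((mids > lo) & (mids < hi))
    # Join runs of consecutive segments that share an owner.
    change = np.flatnonzero(owner[1:] != owner[:-1]) + 1
    run_first = np.concatenate([[0], change])
    run_last = np.concatenate([change, [len(owner)]]) - 1
    runs = np.column_stack([starts[run_first], ends[run_last]])
    is_a1 = owner[run_first]
    return runs[is_a1], runs[~is_a1]


a1 = np.array([[1, 300], [500, 600], [5000, 6000]])
a2 = np.array([[0, 1], [300, 500], [600, 5000]])

a1, a2 = shuffle_np([0.5, 1.5], a1, a2)
print(a1.tolist())  # [[0.5, 1.0], [1.5, 300.0], [500.0, 600.0], [5000.0, 6000.0]]

a1, a2 = shuffle_np([0.5, 1.5], a1, a2)
print(a1.tolist())  # [[1.0, 300.0], [500.0, 600.0], [5000.0, 6000.0]]
```

Since neighbouring segments with the same owner are joined at the end, shuffling the same interval twice gets back to the original arrays, which is what the second half of the question's test expects. If the two arrays leave gaps in the range, the gaps would wrongly be assigned to a2, so that case would need extra handling.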

Answered 2025-04-14 by Python大师
