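Numpy function to find indices for overlapping vectors
It seems like there should be a numpy function for finding the overlap of two vectors, but I can't find one. Maybe one of you knows of it?
The problem is best described with some simple code (see below). I have two data sets, (x1, y1) and (x2, y2), where each x and y has a few hundred elements. I need to truncate them so that their domains are the same (i.e. x1 equals x2), with y1 trimmed to match the new x1 and y2 trimmed to match the new x2.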
from numpy import array, sqrt

# x1 and y1 are abscissa and ordinate from some measurement.
x1 = array([1,2,3,4,5,6,7,8,9,10])
y1 = x1**2 # I'm just making some numbers for the ordinate.
# x2 and y2 are abscissa and ordinate from a different measurement,
# but not over the same exact range.
x2 = array([5,6,7,8,9,10,11,12,13])
y2 = sqrt(x2) # And some more numbers that aren't the same.
# And I need to do some math on just the portion where the two measurements overlap.
x3 = array([5,6,7,8,9,10])
y3 = y1[4:10] + y2[:6]
# Is there a simple function that would give me these indices,
# or do I have to do loops and compare values?
print(x1[4:10])
print(x2[:6])
# ------------ THE FOLLOWING IS WHAT I WANT TO REPLACE -------------
# Doing loops is really clumsy...
# Check which vector starts lower.
if x1[0] <= x2[0]:
    # Loop through it until you find an index that matches the start of the other.
    for i in range(len(x1)):
        # Here it is.
        if x1[i] == x2[0]:
            # Note the offsets for the new starts of both vectors.
            x1off = i
            x2off = 0
            break
else:
    for i in range(len(x2)):
        if x2[i] == x1[0]:
            x1off = 0
            x2off = i
            break
# Cut off the beginnings of the vectors as appropriate.
x1 = x1[x1off:]
y1 = y1[x1off:]
x2 = x2[x2off:]
y2 = y2[x2off:]
# Now make the lengths of the vectors be the same.
# See which is longer.
if len(x1) > len(x2):
    # Cut off the longer one to be the same length as the shorter.
    x1 = x1[:len(x2)]
    y1 = y1[:len(x2)]
elif len(x2) > len(x1):
    x2 = x2[:len(x1)]
    y2 = y2[:len(x1)]
# OK, now the domains and ranges for the two (x,y) sets are identical.
print(x1, y1)
print(x2, y2)
Thanks!
1 Answer
4
For a simple intersection, you can use np.intersect1d:
In [20]: x1 = array([1,2,3,4,5,6,7,8,9,10])
In [21]: x2 = array([5,6,7,8,9,10,11,12,13])
In [22]: x3 = np.intersect1d(x1, x2)
In [23]: x3
Out[23]: array([ 5, 6, 7, 8, 9, 10])
But it looks like you need something different. As @JoranBeasley mentioned in a comment, you can use np.in1d, but you need to use it twice.
Here is the data:
In [57]: x1
Out[57]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
In [58]: y1
Out[58]: array([ 1, 4, 9, 16, 25, 36, 49, 64, 81, 100])
In [59]: x2
Out[59]: array([ 5, 6, 7, 8, 9, 10, 11, 12, 13])
In [60]: y2
Out[60]:
array([ 2.23606798, 2.44948974, 2.64575131, 2.82842712, 3. ,
3.16227766, 3.31662479, 3.46410162, 3.60555128])
Get the subset of the (x1, y1) data:
In [61]: mask1 = np.in1d(x1, x2)
In [62]: xx1 = x1[mask1]
In [63]: yy1 = y1[mask1]
In [64]: xx1, yy1
Out[64]: (array([ 5, 6, 7, 8, 9, 10]), array([ 25, 36, 49, 64, 81, 100]))
Get the subset of the (x2, y2) data. Note that this time the arguments to np.in1d are given in the order x2, x1:
In [65]: mask2 = np.in1d(x2, x1)
In [66]: xx2 = x2[mask2]
In [67]: yy2 = y2[mask2]
In [68]: xx2, yy2
Out[68]:
(array([ 5, 6, 7, 8, 9, 10]),
array([ 2.23606798, 2.44948974, 2.64575131, 2.82842712, 3. ,
3.16227766]))
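The two masking steps can also be bundled into one helper. The following is only a sketch, assuming that x1 and x2 each contain unique values and list the shared values in the same order. The function name overlap_subsets is hypothetical, and it uses np.isin, the newer spelling of np.in1d (np.in1d is deprecated as of NumPy 2.0):

```python
import numpy as np

def overlap_subsets(x1, y1, x2, y2):
    """Return (x, yy1, yy2) restricted to the x values found in both x1 and x2.

    Assumption: x1 and x2 hold unique values, with shared values in the same order.
    """
    mask1 = np.isin(x1, x2)  # True where an x1 value also occurs in x2
    mask2 = np.isin(x2, x1)  # True where an x2 value also occurs in x1
    return x1[mask1], y1[mask1], y2[mask2]

x1 = np.arange(1, 11)
y1 = x1**2
x2 = np.arange(5, 14)
y2 = np.sqrt(x2)

x, yy1, yy2 = overlap_subsets(x1, y1, x2, y2)
print(x)          # [ 5  6  7  8  9 10]
print(yy1 + yy2)  # elementwise sum over the shared x values
```

Because the masks are boolean, this works even when the shared values do not form a contiguous slice of either array.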
We don't actually need to form xx2 separately, since it is the same as xx1. Now we can operate on yy1 and yy2. For example:
In [69]: yy1 + yy2
Out[69]:
array([ 27.23606798, 38.44948974, 51.64575131, 66.82842712,
84. , 103.16227766])
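If the integer indices themselves are wanted, as the question asks, np.intersect1d accepts return_indices=True in NumPy 1.15 and later. The sketch below reuses the question's data. Note that np.intersect1d returns the common values sorted and, for repeated values, the index of the first occurrence, so it assumes the x values are unique:

```python
import numpy as np

x1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y1 = x1**2
x2 = np.array([5, 6, 7, 8, 9, 10, 11, 12, 13])
y2 = np.sqrt(x2)

# x3: sorted common values; i1, i2: positions of those values in x1 and x2
x3, i1, i2 = np.intersect1d(x1, x2, return_indices=True)

# Index both ordinates with their own index arrays, then combine.
y3 = y1[i1] + y2[i2]
print(i1)  # [4 5 6 7 8 9]
print(i2)  # [0 1 2 3 4 5]
```

Here i1 and i2 correspond to the slices 4:10 and :6 written by hand in the question.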