Outer product as strings?

2 votes
5 answers
1537 views
Asked 2025-04-21 09:48

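I want to do the following. The outer product of an array [a,b; c,d] with itself can be represented as a 4x4 array whose entries are "strings" of length 2. For example, in the upper-left corner of that 4x4 matrix the values would be aa, ab, ac, ad. What is the best way to generate these strings in numpy/python or matlab?

That is the case of a single outer product. My goal is to handle k successive outer products; that is, the 4x4 matrix can be multiplied with [a,b; c,d] again, and so on.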

5 Answers

0

Are you after the Kronecker product of two character arrays?

Here is a simple adaptation of np.kron (from numpy/lib/shape_base.py):

import numpy as np

def outer(a,b):
    # custom 'outer' for this issue
    # a,b must be np.char.array for '+' to be defined
    return a.ravel()[:, np.newaxis]+b.ravel()[np.newaxis,:]

def kron(a,b):
    # assume a,b are 2d char array
    # functionally same as np.kron, but using custom outer()
    result = outer(a, b).reshape(a.shape+b.shape)
    result = np.hstack(np.hstack(result))
    result = np.char.array(result)
    return result

A  = np.char.array(list('abcd')).reshape(2,2)

This produces:

A =>

[['a' 'b']
 ['c' 'd']]

outer(A,A) =>

[['aa' 'ab' 'ac' 'ad']
 ['ba' 'bb' 'bc' 'bd']
 ['ca' 'cb' 'cc' 'cd']
 ['da' 'db' 'dc' 'dd']]

kron(A,A) =>

[['aa' 'ab' 'ba' 'bb']
 ['ac' 'ad' 'bc' 'bd']
 ['ca' 'cb' 'da' 'db']
 ['cc' 'cd' 'dc' 'dd']]

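kron reshapes the result of outer() to shape (2,2,2,2) and then concatenates twice along axis=1 (the two np.hstack calls), which interleaves the blocks into the 4x4 Kronecker layout.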

kron(kron(A,A),A) =>

[['aaa' 'aab' 'aba' 'abb' 'baa' 'bab' 'bba' 'bbb']
 ['aac' 'aad' 'abc' 'abd' 'bac' 'bad' 'bbc' 'bbd']
 ['aca' 'acb' 'ada' 'adb' 'bca' 'bcb' 'bda' 'bdb']
 ['acc' 'acd' 'adc' 'add' 'bcc' 'bcd' 'bdc' 'bdd']
 ['caa' 'cab' 'cba' 'cbb' 'daa' 'dab' 'dba' 'dbb']
 ['cac' 'cad' 'cbc' 'cbd' 'dac' 'dad' 'dbc' 'dbd']
 ['cca' 'ccb' 'cda' 'cdb' 'dca' 'dcb' 'dda' 'ddb']
 ['ccc' 'ccd' 'cdc' 'cdd' 'dcc' 'dcd' 'ddc' 'ddd']]

kron(kron(kron(A,A),A),A) =>

# (16,16)
[['aaaa' 'aaab' 'aaba' 'aabb'...]
 ['aaac' 'aaad' 'aabc' 'aabd'...]
 ['aaca' 'aacb' 'aada' 'aadb'...]
 ['aacc' 'aacd' 'aadc' 'aadd'...]
 ...]
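For the "k successive products" part of the question, here is a minimal sketch that folds a string Kronecker product over k copies of A. It swaps in np.char.add on plain string arrays in place of the np.char.array `+` used above, and the names str_kron and kron_power are made up for this illustration, not part of numpy.

```python
import numpy as np
from functools import reduce

def str_kron(a, b):
    # Hypothetical helper: Kronecker-style product where "multiplication"
    # is string concatenation. a and b are 2-D arrays of strings.
    a = np.asarray(a)
    b = np.asarray(b)
    # Broadcast to 4-D so that out4[i, k, j, l] == a[i, j] + b[k, l]
    out4 = np.char.add(a[:, None, :, None], b[None, :, None, :])
    # Merge (i, k) into rows and (j, l) into columns
    return out4.reshape(a.shape[0] * b.shape[0], a.shape[1] * b.shape[1])

def kron_power(a, k):
    # Left fold: str_kron(str_kron(a, a), a) ... with k copies of a (k >= 1)
    return reduce(str_kron, [a] * k)

A = np.array(list('abcd')).reshape(2, 2)
print(kron_power(A, 2))
print(kron_power(A, 3).shape)
```

Putting the axes in (i, k, j, l) order before the reshape plays the same role as the two np.hstack calls above, so kron_power(A, 2) should reproduce the kron(A,A) layout shown earlier.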
0

Building on Jose Varz's answer:

def foo(A,B):
    flatA = [x for row in A for x in row]
    flatB = [x for row in B for x in row]
    outer = [[y+x for x in flatA] for y in flatB]
    return outer

In [265]: foo(A,A)
Out[265]: 
[['aa', 'ab', 'ac', 'ad'],
 ['ba', 'bb', 'bc', 'bd'],
 ['ca', 'cb', 'cc', 'cd'],
 ['da', 'db', 'dc', 'dd']]

In [268]: A3=np.array(foo(foo(A,A),A))
In [269]: A3
Out[269]: 
array([['aaa', 'aab', 'aac', 'aad', 'aba', 'abb', 'abc', 'abd', 'aca',
        'acb', 'acc', 'acd', 'ada', 'adb', 'adc', 'add'],
       ['baa', 'bab', 'bac', 'bad', 'bba', 'bbb', 'bbc', 'bbd', 'bca',
        'bcb', 'bcc', 'bcd', 'bda', 'bdb', 'bdc', 'bdd'],
       ['caa', 'cab', 'cac', 'cad', 'cba', 'cbb', 'cbc', 'cbd', 'cca',
        'ccb', 'ccc', 'ccd', 'cda', 'cdb', 'cdc', 'cdd'],
       ['daa', 'dab', 'dac', 'dad', 'dba', 'dbb', 'dbc', 'dbd', 'dca',
        'dcb', 'dcc', 'dcd', 'dda', 'ddb', 'ddc', 'ddd']], 
      dtype='|S3')

In [270]: A3.reshape(4,4,4)
Out[270]: 
array([[['aaa', 'aab', 'aac', 'aad'],
        ['aba', 'abb', 'abc', 'abd'],
        ['aca', 'acb', 'acc', 'acd'],
        ['ada', 'adb', 'adc', 'add']],

       [['baa', 'bab', 'bac', 'bad'],
        ['bba', 'bbb', 'bbc', 'bbd'],
        ['bca', 'bcb', 'bcc', 'bcd'],
        ['bda', 'bdb', 'bdc', 'bdd']],

       [['caa', 'cab', 'cac', 'cad'],
        ['cba', 'cbb', 'cbc', 'cbd'],
        ['cca', 'ccb', 'ccc', 'ccd'],
        ['cda', 'cdb', 'cdc', 'cdd']],

       [['daa', 'dab', 'dac', 'dad'],
        ['dba', 'dbb', 'dbc', 'dbd'],
        ['dca', 'dcb', 'dcc', 'dcd'],
        ['dda', 'ddb', 'ddc', 'ddd']]], 
      dtype='|S3')

With this definition, np.array(foo(A,foo(A,A))).reshape(4,4,4) produces the same array.

In [285]: A3.reshape(8,8)
Out[285]: 
array([['aaa', 'aab', 'aac', 'aad', 'aba', 'abb', 'abc', 'abd'],
       ['aca', 'acb', 'acc', 'acd', 'ada', 'adb', 'adc', 'add'],
       ['baa', 'bab', 'bac', 'bad', 'bba', 'bbb', 'bbc', 'bbd'],
       ['bca', 'bcb', 'bcc', 'bcd', 'bda', 'bdb', 'bdc', 'bdd'],
       ['caa', 'cab', 'cac', 'cad', 'cba', 'cbb', 'cbc', 'cbd'],
       ['cca', 'ccb', 'ccc', 'ccd', 'cda', 'cdb', 'cdc', 'cdd'],
       ['daa', 'dab', 'dac', 'dad', 'dba', 'dbb', 'dbc', 'dbd'],
       ['dca', 'dcb', 'dcc', 'dcd', 'dda', 'ddb', 'ddc', 'ddd']], 
      dtype='|S3')
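The claim that both groupings give the same array after reshaping can be checked with a short sketch. It restates foo from above; the flatten helper is a name introduced here for illustration only.

```python
def foo(A, B):
    # Flatten both inputs, then build one row per element of B
    flatA = [x for row in A for x in row]
    flatB = [x for row in B for x in row]
    return [[y + x for x in flatA] for y in flatB]

def flatten(M):
    # Hypothetical helper: row-major flattening of a list of lists
    return [x for row in M for x in row]

A = [['a', 'b'], ['c', 'd']]
left = foo(foo(A, A), A)    # 4 rows of 16 strings
right = foo(A, foo(A, A))   # 16 rows of 4 strings

# The nested shapes differ, but the row-major order of the strings is identical
print(flatten(left) == flatten(right))
```

Since only the row-major order matters for reshape, either grouping can be reshaped to (4,4,4) or (8,8) with the same result.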
1

You can use list comprehensions in Python:

array = [['a', 'b'], ['c', 'd']]
flatarray = [ x for row in array for x in row]
outerproduct = [[y+x for x in flatarray] for y in flatarray]
Output: [['aa', 'ab', 'ac', 'ad'], ['ba', 'bb', 'bc', 'bd'], ['ca', 'cb', 'cc', 'cd'], ['da', 'db', 'dc', 'dd']]
2

You can combine itertools and numpy in a fun way:

>>> import numpy as np
>>> from itertools import product
>>> s = 'abcd' # s = ['a', 'b', 'c', 'd'] works the same
>>> np.fromiter((a+b for a, b in product(s, s)), dtype='S2',
                count=len(s)*len(s)).reshape(len(s), len(s))
array([['aa', 'ab', 'ac', 'ad'],
       ['ba', 'bb', 'bc', 'bd'],
       ['ca', 'cb', 'cc', 'cd'],
       ['da', 'db', 'dc', 'dd']],
      dtype='|S2')

If you would rather not use numpy, you can get a little creative and do it with itertools alone:

>>> from itertools import product, islice
>>> it = (a+b for a, b in product(s, s))
>>> [list(islice(it, len(s))) for j in range(len(s))]
[['aa', 'ab', 'ac', 'ad'],
 ['ba', 'bb', 'bc', 'bd'],
 ['ca', 'cb', 'cc', 'cd'],
 ['da', 'db', 'dc', 'dd']]
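The same idea extends to k factors via the repeat argument of itertools.product. This is a rough sketch, assuming a k-dimensional array with one axis per factor is an acceptable layout; outer_strings is a hypothetical name, not something from the answer above.

```python
from itertools import product
import numpy as np

def outer_strings(s, k):
    # All length-k strings over the symbols in s, in product() order
    words = [''.join(p) for p in product(s, repeat=k)]
    n = len(s)
    # One axis per factor: result[i1, ..., ik] == s[i1] + ... + s[ik]
    return np.array(words).reshape((n,) * k)

print(outer_strings('abcd', 2))
print(outer_strings('abcd', 3)[1, 2, 3])
```

From there, a reshape to whatever 2-D shape is needed (for example (8, 8) when k is 3) gives a matrix view of the same strings.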
5

You can get @Jaime's result more simply with np.char.array():

a  = np.char.array(list('abcd'))
print(a[:,None]+a)

This gives:

chararray([['aa', 'ab', 'ac', 'ad'],
       ['ba', 'bb', 'bc', 'bd'],
       ['ca', 'cb', 'cc', 'cd'],
       ['da', 'db', 'dc', 'dd']],
      dtype='|S2')
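The NumPy docs describe chararray as kept for backwards compatibility and not recommended for new code, so here is a sketch of the same broadcasted concatenation on an ordinary string array using np.char.add. The variable name out is just for illustration.

```python
import numpy as np

a = np.array(list('abcd'))        # plain 1-D array of one-character strings
# a[:, None] has shape (4, 1); broadcasting against a (shape (4,)) gives (4, 4)
out = np.char.add(a[:, None], a)  # out[i, j] == a[i] + a[j]
print(out)
```

On Python 3 the result is a unicode string array rather than the bytes dtype shown above, but the entries are the same.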
