Outer product as strings?
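I would like to do the following. The outer product of an array [a,b; c,d] with itself can be described as a 4x4 array of "strings" of length 2. For example, in the top row of the 4x4 matrix the values are aa, ab, ac, ad. What is the best way to generate these strings in numpy/python or matlab?
This is an example of a single outer product. My goal is to handle k successive outer products, that is, the 4x4 matrix can be multiplied by [a,b; c,d] again, and so on.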
5 Answers
0
Do you want the Kronecker product of two character arrays?
Here is a simple adaptation of np.kron (from numpy/lib/shape_base.py):
import numpy as np

def outer(a,b):
    # custom 'outer' for this issue
    # a,b must be np.char.array for '+' to be defined
    return a.ravel()[:, np.newaxis]+b.ravel()[np.newaxis,:]

def kron(a,b):
    # assume a,b are 2d char array
    # functionally same as np.kron, but using custom outer()
    result = outer(a, b).reshape(a.shape+b.shape)
    # two hstack calls interleave the block axes into a 2d layout
    result = np.hstack(np.hstack(result))
    result = np.char.array(result)
    return result

A = np.char.array(list('abcd')).reshape(2,2)
This produces:
A =>
[['a' 'b']
['c' 'd']]
outer(A,A) =>
[['aa' 'ab' 'ac' 'ad']
['ba' 'bb' 'bc' 'bd']
['ca' 'cb' 'cc' 'cd']
['da' 'db' 'dc' 'dd']]
kron(A,A) =>
[['aa' 'ab' 'ba' 'bb']
['ac' 'ad' 'bc' 'bd']
['ca' 'cb' 'da' 'db']
['cc' 'cd' 'dc' 'dd']]
kron rearranges the outer result into shape (2,2,2,2) and then concatenates twice along axis=1.
kron(kron(A,A),A) =>
[['aaa' 'aab' 'aba' 'abb' 'baa' 'bab' 'bba' 'bbb']
['aac' 'aad' 'abc' 'abd' 'bac' 'bad' 'bbc' 'bbd']
['aca' 'acb' 'ada' 'adb' 'bca' 'bcb' 'bda' 'bdb']
['acc' 'acd' 'adc' 'add' 'bcc' 'bcd' 'bdc' 'bdd']
['caa' 'cab' 'cba' 'cbb' 'daa' 'dab' 'dba' 'dbb']
['cac' 'cad' 'cbc' 'cbd' 'dac' 'dad' 'dbc' 'dbd']
['cca' 'ccb' 'cda' 'cdb' 'dca' 'dcb' 'dda' 'ddb']
['ccc' 'ccd' 'cdc' 'cdd' 'dcc' 'dcd' 'ddc' 'ddd']]
kron(kron(kron(A,A),A),A) =>
# (16,16)
[['aaaa' 'aaab' 'aaba' 'aabb'...]
['aaac' 'aaad' 'aabc' 'aabd'...]
['aaca' 'aacb' 'aada' 'aadb'...]
['aacc' 'aacd' 'aadc' 'aadd'...]
...]
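The reshape-and-concatenate step described above can also be written as a broadcast followed by a transpose, which makes the k-fold case a one-line fold. The following is only a sketch under some assumptions: the names `str_kron` and `str_kron_power` are hypothetical, it uses plain string arrays with `np.char.add` instead of `np.char.array`, and it expects k >= 1.

```python
from functools import reduce

import numpy as np


def str_kron(a, b):
    """Kronecker-style product of two 2-D string arrays, concatenating instead of multiplying."""
    a = np.asarray(a, dtype=str)
    b = np.asarray(b, dtype=str)
    # blocks[i, j, k, l] == a[i, j] + b[k, l]
    blocks = np.char.add(a[:, :, None, None], b[None, None, :, :])
    # Reorder axes to (i, k, j, l) so rows are indexed by (i, k) and columns by (j, l).
    rows = a.shape[0] * b.shape[0]
    cols = a.shape[1] * b.shape[1]
    return blocks.transpose(0, 2, 1, 3).reshape(rows, cols)


def str_kron_power(a, k):
    """Fold str_kron over k copies of a, e.g. kron(kron(a, a), a) for k = 3."""
    return reduce(str_kron, [a] * k)


A = np.array([['a', 'b'], ['c', 'd']])
print(str_kron_power(A, 2))
print(str_kron_power(A, 3).shape)
```

For k = 2 this should reproduce the kron(A,A) layout shown earlier, and each further factor multiplies both dimensions by 2.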
0
Building on Jose Varz's answer:
def foo(A,B):
    # flatten both inputs, then pair every element of B with every element of A
    flatA = [x for row in A for x in row]
    flatB = [x for row in B for x in row]
    outer = [[y+x for x in flatA] for y in flatB]
    return outer
In [265]: foo(A,A)
Out[265]:
[['aa', 'ab', 'ac', 'ad'],
['ba', 'bb', 'bc', 'bd'],
['ca', 'cb', 'cc', 'cd'],
['da', 'db', 'dc', 'dd']]
In [268]: A3=np.array(foo(foo(A,A),A))
In [269]: A3
Out[269]:
array([['aaa', 'aab', 'aac', 'aad', 'aba', 'abb', 'abc', 'abd', 'aca',
'acb', 'acc', 'acd', 'ada', 'adb', 'adc', 'add'],
['baa', 'bab', 'bac', 'bad', 'bba', 'bbb', 'bbc', 'bbd', 'bca',
'bcb', 'bcc', 'bcd', 'bda', 'bdb', 'bdc', 'bdd'],
['caa', 'cab', 'cac', 'cad', 'cba', 'cbb', 'cbc', 'cbd', 'cca',
'ccb', 'ccc', 'ccd', 'cda', 'cdb', 'cdc', 'cdd'],
['daa', 'dab', 'dac', 'dad', 'dba', 'dbb', 'dbc', 'dbd', 'dca',
'dcb', 'dcc', 'dcd', 'dda', 'ddb', 'ddc', 'ddd']],
dtype='|S3')
In [270]: A3.reshape(4,4,4)
Out[270]:
array([[['aaa', 'aab', 'aac', 'aad'],
['aba', 'abb', 'abc', 'abd'],
['aca', 'acb', 'acc', 'acd'],
['ada', 'adb', 'adc', 'add']],
[['baa', 'bab', 'bac', 'bad'],
['bba', 'bbb', 'bbc', 'bbd'],
['bca', 'bcb', 'bcc', 'bcd'],
['bda', 'bdb', 'bdc', 'bdd']],
[['caa', 'cab', 'cac', 'cad'],
['cba', 'cbb', 'cbc', 'cbd'],
['cca', 'ccb', 'ccc', 'ccd'],
['cda', 'cdb', 'cdc', 'cdd']],
[['daa', 'dab', 'dac', 'dad'],
['dba', 'dbb', 'dbc', 'dbd'],
['dca', 'dcb', 'dcc', 'dcd'],
['dda', 'ddb', 'ddc', 'ddd']]],
dtype='|S3')
With this definition, np.array(foo(A,foo(A,A))).reshape(4,4,4) produces the same array.
In [285]: A3.reshape(8,8)
Out[285]:
array([['aaa', 'aab', 'aac', 'aad', 'aba', 'abb', 'abc', 'abd'],
['aca', 'acb', 'acc', 'acd', 'ada', 'adb', 'adc', 'add'],
['baa', 'bab', 'bac', 'bad', 'bba', 'bbb', 'bbc', 'bbd'],
['bca', 'bcb', 'bcc', 'bcd', 'bda', 'bdb', 'bdc', 'bdd'],
['caa', 'cab', 'cac', 'cad', 'cba', 'cbb', 'cbc', 'cbd'],
['cca', 'ccb', 'ccc', 'ccd', 'cda', 'cdb', 'cdc', 'cdd'],
['daa', 'dab', 'dac', 'dad', 'dba', 'dbb', 'dbc', 'dbd'],
['dca', 'dcb', 'dcc', 'dcd', 'dda', 'ddb', 'ddc', 'ddd']],
dtype='|S3')
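Since the question asks for k successive products, the nesting foo(foo(A,A),A) can be generalized with a fold. This is a minimal sketch, not part of the answer above: `flat_outer` is a renamed copy of `foo`, `repeated_outer` is a hypothetical helper, and k is assumed to be at least 1.

```python
from functools import reduce


def flat_outer(A, B):
    """String 'outer product' of two nested lists: one row per element of B, one column per element of A."""
    flat_a = [x for row in A for x in row]
    flat_b = [x for row in B for x in row]
    return [[y + x for x in flat_a] for y in flat_b]


def repeated_outer(A, k):
    """Fold flat_outer over k copies of A, as in foo(foo(A, A), A) for k = 3."""
    return reduce(lambda acc, _: flat_outer(acc, A), range(k - 1), A)


A = [['a', 'b'], ['c', 'd']]
result = repeated_outer(A, 3)
print(len(result), len(result[0]))
print(result[0][:5])
```

The result keeps one row per element of A and 4**(k-1) columns, so it can be reshaped afterwards in the same way as A3.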
1
You can use list comprehensions in Python:
array = [['a', 'b'], ['c', 'd']]
flatarray = [x for row in array for x in row]
outerproduct = [[y+x for x in flatarray] for y in flatarray]
Output:
[['aa', 'ab', 'ac', 'ad'],
 ['ba', 'bb', 'bc', 'bd'],
 ['ca', 'cb', 'cc', 'cd'],
 ['da', 'db', 'dc', 'dd']]
2
You can combine itertools and numpy in an interesting way:
>>> import numpy as np
>>> from itertools import product
>>> s = 'abcd' # s = ['a', 'b', 'c', 'd'] works the same
>>> np.fromiter((a+b for a, b in product(s, s)), dtype='S2',
...             count=len(s)*len(s)).reshape(len(s), len(s))
array([['aa', 'ab', 'ac', 'ad'],
['ba', 'bb', 'bc', 'bd'],
['ca', 'cb', 'cc', 'cd'],
['da', 'db', 'dc', 'dd']],
dtype='|S2')
If you would rather not use numpy, you can get a little creative and do it with itertools alone:
>>> from itertools import product, islice
>>> it = (a+b for a, b in product(s, s))
>>> [list(islice(it, len(s))) for j in range(len(s))]
[['aa', 'ab', 'ac', 'ad'],
['ba', 'bb', 'bc', 'bd'],
['ca', 'cb', 'cc', 'cd'],
['da', 'db', 'dc', 'dd']]
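The same itertools idea extends to k factors through the `repeat` argument of `product`. The sketch below is an illustration only: `all_strings` and `as_rows` are hypothetical helper names, and the row width is an arbitrary choice made to match the 4x16 layout used elsewhere in this thread.

```python
from itertools import product


def all_strings(symbols, k):
    """Return every length-k string over symbols, in the order product yields them."""
    return [''.join(combo) for combo in product(symbols, repeat=k)]


def as_rows(items, width):
    """Split a flat list into consecutive rows of the given width."""
    return [items[i:i + width] for i in range(0, len(items), width)]


s = 'abcd'
table = as_rows(all_strings(s, 3), len(s) ** 2)
print(len(table), len(table[0]))
print(table[0][:5])
```

This yields len(s)**k strings in total; how they are grouped into rows is up to the caller.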
5
You can get @Jaime's result more simply using np.char.array():
a = np.char.array(list('abcd'))
print(a[:,None]+a)
This gives:
chararray([['aa', 'ab', 'ac', 'ad'],
['ba', 'bb', 'bc', 'bd'],
['ca', 'cb', 'cc', 'cd'],
['da', 'db', 'dc', 'dd']],
dtype='|S2')
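The NumPy documentation describes chararray as kept for backwards compatibility and not recommended for new development, so the same broadcast can be done on ordinary arrays. The following is a sketch of two alternatives, assuming Python 3 and a reasonably recent NumPy; the variable names are only illustrative.

```python
import numpy as np

# Plain unicode array, not a chararray.
a = np.array(list('abcd'))
# np.char.add concatenates elementwise and broadcasts (4, 1) against (4,).
pairs = np.char.add(a[:, None], a)
print(pairs)

# Object arrays fall back to Python's own str + str, so '+' works directly.
a_obj = np.array(list('abcd'), dtype=object)
pairs_obj = a_obj[:, None] + a_obj
print(pairs_obj.shape)
```

The object-dtype variant avoids fixed-width string dtypes, at the cost of storing Python objects.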