Merging two-dimensional arrays
Suppose I have two arrays:
arrayOne = [["james", 35], ["michael", 28], ["steven", 23],
["jack", 18], ["robert", 12]]
arrayTwo = [["charles", 45], ["james", 36], ["trevor", 24],
["michael", 17], ["steven", 4]]
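I'd like to merge them into a single two-dimensional array in which the first element of each inner array is the name (james, charles, and so on). The second element is the corresponding value from arrayOne, or 0 if there is none, and the third element is the same for arrayTwo. The order doesn't really matter, as long as the numbers line up with the names. In other words, I'd like to end up with something like this: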
arrayResult = [["james", 35, 36], ["michael", 28, 17], ["steven", 23, 4],
["jack", 18, 0], ["robert", 12, 0], ["charles", 0, 45],
["trevor", 0, 24]]
I'd also like to be able to add more "columns" to this result array if I supply another array.
2 Answers
4
>>> dict1 = dict(arrayOne)
>>> dict2 = dict(arrayTwo)
>>> keyset = set(dict1) | set(dict2)
>>> [[key, dict1.get(key, 0), dict2.get(key, 0)] for key in keyset]
[['james', 35, 36], ['robert', 12, 0], ['charles', 0, 45],
['michael', 28, 17], ['trevor', 0, 24], ['jack', 18, 0],
['steven', 23, 4]]
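Set iteration order is arbitrary, so the rows above can come out in any order. If the order from the question matters, one possible sketch for Python 3.7+ (where dicts preserve insertion order) uses `dict.fromkeys` in place of a set. `merge_two` is a hypothetical helper name, not part of the original answer.

```python
def merge_two(array_one, array_two):
    """Merge two [name, value] lists into [name, value1, value2] rows."""
    dict1 = dict(array_one)
    dict2 = dict(array_two)
    # dict.fromkeys de-duplicates while keeping first-seen order (Python 3.7+).
    names = dict.fromkeys(list(dict1) + list(dict2))
    return [[name, dict1.get(name, 0), dict2.get(name, 0)] for name in names]


arrayOne = [["james", 35], ["michael", 28], ["steven", 23],
            ["jack", 18], ["robert", 12]]
arrayTwo = [["charles", 45], ["james", 36], ["trevor", 24],
            ["michael", 17], ["steven", 4]]

for row in merge_two(arrayOne, arrayTwo):
    print(row)
```

With the arrays from the question, the names come out in first-seen order: everything from arrayOne first, then the names that appear only in arrayTwo.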
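Adding more columns is a bit more involved, and a dictionary is the better fit here. Putting the 0s in the right places becomes the tricky part, though: whenever a new name is added to the "master dictionary", it must first get a list of 0s of the right length. I'm tempted to write a new class for this, but first, here is a basic function-based solution:
def add_column(masterdict, arr):
    # Number of columns already stored per name (assumes masterdict is non-empty).
    mdlen = len(next(iter(masterdict.values())))
    newdict = dict(arr)
    keyset = set(masterdict) | set(newdict)
    for key in keyset:
        if key not in masterdict:
            # Pad names seen for the first time with 0s for the earlier columns.
            masterdict[key] = [0] * mdlen
        masterdict[key].append(newdict.get(key, 0))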
arrayOne = [["james", 35],
["michael", 28],
["steven", 23],
["jack", 18],
["robert", 12]]
arrayTwo = [["charles", 45],
["james", 36],
["trevor", 24],
["michael", 17],
["steven", 4]]
arrayThree = [["olliver", 11],
["james", 39],
["john", 22],
["michael", 13],
["steven", 6]]
masterdict = {name: [value] for name, value in arrayOne}
add_column(masterdict, arrayTwo)
print(masterdict)
add_column(masterdict, arrayThree)
print(masterdict)
Output (the key order may differ):
{'james': [35, 36], 'robert': [12, 0], 'charles': [0, 45],
'michael': [28, 17], 'trevor': [0, 24], 'jack': [18, 0],
'steven': [23, 4]}
{'james': [35, 36, 39], 'robert': [12, 0, 0], 'charles': [0, 45, 0],
'michael': [28, 17, 13], 'trevor': [0, 24, 0], 'olliver': [0, 0, 11],
'jack': [18, 0, 0], 'steven': [23, 4, 6], 'john': [0, 0, 22]}
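The class mentioned above might look something like the following. This is only a sketch of one possible design, not the answer author's code: `ColumnTable`, `rows`, `width`, and `to_rows` are all hypothetical names. Tracking the column count explicitly means the table can start out empty.

```python
class ColumnTable:
    """Hypothetical helper: maps each name to a list of column values."""

    def __init__(self):
        self.rows = {}   # name -> list of values, one per column
        self.width = 0   # number of columns added so far

    def add_column(self, pairs):
        new = dict(pairs)
        for name in new:
            # Names seen for the first time get 0s for all earlier columns.
            self.rows.setdefault(name, [0] * self.width)
        for name, values in self.rows.items():
            # Every known name gets a value for the new column, 0 if absent.
            values.append(new.get(name, 0))
        self.width += 1

    def to_rows(self):
        # Flatten into the [name, value1, value2, ...] shape from the question.
        return [[name] + values for name, values in self.rows.items()]


table = ColumnTable()
table.add_column([["james", 35], ["jack", 18]])
table.add_column([["charles", 45], ["james", 36]])
print(table.to_rows())
```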
4
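It looks like what you really need is a dictionary rather than an array. With dictionaries this problem becomes much simpler, and converting the data to dictionaries is trivial: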
dictOne = dict(arrayOne)
dictTwo = dict(arrayTwo)
Next, you can combine them like this:
combined = dict()
for name in set(dictOne) | set(dictTwo):
    combined[name] = [dictOne.get(name, 0), dictTwo.get(name, 0)]
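This creates a new dictionary called combined, which will hold the final data. We then build a set of the keys from both original dictionaries; using a set guarantees that no name is processed twice. Finally, we loop over those keys and add each pair of values to combined, telling the .get method to return 0 when a name has no value. If you need to turn the combined dictionary back into an array, that is easy too:
arrayResult = []
for name in combined:
    arrayResult.append([name] + combined[name])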
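To add another column to the resulting dictionary (with dictThree built the same way as the other two), you only need to change the middle part of the code to this:
combined = dict()
for name in set(dictOne) | set(dictTwo) | set(dictThree):
    combined[name] = [dictOne.get(name, 0), dictTwo.get(name, 0), dictThree.get(name, 0)]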
If you want to wrap this logic in a function (which I recommend), you could write it like this:
def combine(*args):
    # Create a list of dictionaries from the arrays we passed in, since we are
    # going to use dictionaries to solve the problem.
    dicts = [dict(a) for a in args]
    # Create a list of names by looping through all dictionaries, and through all
    # the names in each dictionary, adding to a master list of names
    names = []
    for d in dicts:
        for name in d.keys():
            names.append(name)
    # Remove duplicates in our list of names by making it a set
    names = set(names)
    # Create a result dict to store results in
    result = dict()
    # Loop through all the names, and add a row for each name, pulling data from
    # each dict we created in the beginning
    for name in names:
        result[name] = [d.get(name, 0) for d in dicts]
    # Return, secure in the knowledge of a job well done. :-)
    return result
# Use the function:
resultDict = combine(arrayOne, arrayTwo, arrayThree)
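To get back the list-of-lists shape asked for in the question, the dictionary returned by `combine` can be flattened into rows. A minimal Python 3 sketch follows. It repeats a compact `combine` so that it runs on its own, and `to_rows` is a hypothetical helper name. The rows are sorted by name only to make the output predictable.

```python
def combine(*arrays):
    """Compact variant of combine: name -> one value per input array."""
    dicts = [dict(a) for a in arrays]
    names = set().union(*dicts)  # every name that appears in any array
    return {name: [d.get(name, 0) for d in dicts] for name in names}


def to_rows(result):
    """Flatten {name: [v1, v2, ...]} into [[name, v1, v2, ...], ...]."""
    return [[name] + values for name, values in sorted(result.items())]


rows = to_rows(combine([["james", 35], ["jack", 18]],
                       [["charles", 45], ["james", 36]],
                       [["john", 22], ["james", 39]]))
print(rows)
```

Because `combine` accepts any number of arrays, adding a fourth or fifth column is just a matter of passing another argument.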