Matrix file to dictionary in Python

Posted 2024-04-25 22:47:30


I have a file, matrix.txt, that contains:

   A  B  C
A  1  2  3
B  4  5  6
C  7  8  9

I want to read the contents of the file and store them in a dictionary keyed by (row label, column label), like this:

{('A', 'A') : 1, ('A', 'B') : 2, ('A', 'C') : 3,
 ('B', 'A') : 4, ('B', 'B') : 5, ('B', 'C') : 6,
 ('C', 'A') : 7, ('C', 'B') : 8, ('C', 'C') : 9}

Tags: file, txt, content, dictionary, matrix
3 answers

pandas makes this very neat.

import pandas as pd

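Method 1

# Split on runs of whitespace. The header has one fewer field than the
# data rows, so pandas uses the first column as the index.
df = pd.read_table('matrix.txt', sep=r'\s+')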
>>> df
   A  B  C
A  1  2  3
B  4  5  6
C  7  8  9

d = df.to_dict()
>>> d
{'A': {'A': 1, 'B': 4, 'C': 7},
 'B': {'A': 2, 'B': 5, 'C': 8},
 'C': {'A': 3, 'B': 6, 'C': 9}}

# d is keyed column-first, so look up d[c][r] while walking the rows in order
new_d = {(r, c): d[c][r] for r in df.index for c in df.columns}

>>> new_d
{('A', 'A'): 1,
 ('A', 'B'): 2,
 ('A', 'C'): 3,
 ('B', 'A'): 4,
 ('B', 'B'): 5,
 ('B', 'C'): 6,
 ('C', 'A'): 7,
 ('C', 'B'): 8,
 ('C', 'C'): 9}

Method 2

# Split on runs of whitespace; the first column becomes the index.
df = pd.read_table('matrix.txt', sep=r'\s+')
>>> df
   A  B  C
A  1  2  3
B  4  5  6
C  7  8  9

new_d = {}
for r, v in df.iterrows():
    for c, v1 in v.items():       # Series.iteritems() was removed in pandas 2.0
        new_d[(r, c)] = int(v1)   # convert the NumPy integer to a plain int

>>> new_d
{('A', 'A'): 1,
 ('A', 'B'): 2,
 ('A', 'C'): 3,
 ('B', 'A'): 4,
 ('B', 'B'): 5,
 ('B', 'C'): 6,
 ('C', 'A'): 7,
 ('C', 'B'): 8,
 ('C', 'C'): 9}

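You can use itertools.product to build the keys from the row labels (the first column, obtained by transposing the data rows) and the file header. Then zip the remaining transposed rows again to restore their original orientation, and chain the split substrings into a single iterable of cell values. To preserve the order, we also need an OrderedDict.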

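# Python 2 only: izip and imap no longer exist in itertools on Python 3
from collections import OrderedDict
from itertools import izip, product, imap, chain

with open("matrix.txt") as f:
    # the header gives the column labels; transpose the remaining rows
    head, zipped = next(f).split(), izip(*imap(str.split, f))
    rows = next(zipped)  # first transposed column holds the row labels
    # transpose back, flatten row by row, and pair cells with (row, column) keys
    od = OrderedDict(zip(product(rows, head), chain.from_iterable(izip(*zipped))))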

Output:

  OrderedDict([(('A', 'A'), '1'), (('A', 'B'), '2'), (('A', 'C'), '3'),
  (('B', 'A'), '4'), (('B', 'B'), '5'), (('B', 'C'), '6'), (('C', 'A'), '7'),
  (('C', 'B'), '8'), (('C', 'C'), '9')])

For Python 3, just use the built-in map and zip.
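As a rough illustration of that substitution, here is a minimal Python 3 sketch of the same transpose-based idea. The helper name matrix_to_dict and the in-memory SAMPLE string (standing in for matrix.txt) are assumptions made for the example, not part of the original answer. A plain dict is used because dicts keep insertion order from Python 3.7 on.

```python
from io import StringIO
from itertools import chain, product

# Assumed stand-in for the contents of matrix.txt, so the sketch runs on its own.
SAMPLE = """\
   A  B  C
A  1  2  3
B  4  5  6
C  7  8  9
"""

def matrix_to_dict(f):
    """Hypothetical helper: map (row label, column label) -> cell string."""
    head = next(f).split()                       # column labels from the header
    columns = zip(*map(str.split, f))            # transpose the remaining rows
    row_labels = next(columns)                   # first column holds the row labels
    cells = chain.from_iterable(zip(*columns))   # transpose back, flatten row by row
    return dict(zip(product(row_labels, head), cells))

od = matrix_to_dict(StringIO(SAMPLE))
print(od[('B', 'C')])  # cell at row B, column C
```

The values stay strings, as in the output shown above; wrap each cell in int() if numbers are wanted.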

Or, without transposing, use the csv library:

# Python 2 only: izip no longer exists in itertools on Python 3
from collections import OrderedDict
from itertools import izip
import csv

with open("matrix.txt") as f:
    # skipinitialspace collapses the runs of spaces between fields
    r = csv.reader(f, delimiter=" ", skipinitialspace=True)
    head = next(r)  # column labels
    od = OrderedDict(((row[0], k), v) for row in r
                     for k, v in izip(head, row[1:]))

The output is the same.
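A Python 3 version of the csv approach might look like the sketch below. The helper name csv_matrix_to_dict and the in-memory SAMPLE string (standing in for matrix.txt) are assumptions for the example. It relies on csv.reader with delimiter=" " and skipinitialspace=True treating each run of spaces as a single separator.

```python
import csv
from io import StringIO

# Assumed stand-in for the contents of matrix.txt, so the sketch runs on its own.
SAMPLE = """\
   A  B  C
A  1  2  3
B  4  5  6
C  7  8  9
"""

def csv_matrix_to_dict(f):
    """Hypothetical helper: map (row label, column label) -> cell string."""
    reader = csv.reader(f, delimiter=" ", skipinitialspace=True)
    head = next(reader)  # column labels from the header line
    return {
        (row[0], col): value          # row[0] is the row label
        for row in reader
        for col, value in zip(head, row[1:])
    }

result = csv_matrix_to_dict(StringIO(SAMPLE))
print(result[('A', 'C')])  # cell at row A, column C
```

When reading from a real file, the csv documentation recommends opening it with newline=''.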

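The following Python 3 function yields every matrix cell together with its (row, column) index pair, in a form the dict constructor accepts: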

def read_mx_cells(file, parse_cell = lambda x:x):
  rows = (line.rstrip().split() for line in file)
  header = next(rows)
  for row in rows:
    row_id = row[0]
    for col_id,cell in zip(header, row[1:]):
      yield ((row_id, col_id), parse_cell(cell))

with open('matrix.txt') as f:
  for x in read_mx_cells(f, int):
    print(x)
# (('A', 'A'), 1)
# (('A', 'B'), 2)
# (('A', 'C'), 3) ...

with open('matrix.txt') as f:
  print(dict(read_mx_cells(f, int)))
# {('A', 'A'): 1, ('A', 'B'): 2, ('A', 'C'): 3, ...}
# Note: dicts keep insertion order from Python 3.7 on; on older versions use collections.OrderedDict
