Converting non-tabular/chunked data into a nested dictionary in Python

Posted 2024-05-15 12:30:42


I have a chunk of data that looks like this:

>Head1
foo 0 1.10699e-05 2.73049e-05
bar 0.939121 0.0173732 0.0119144
qux 0 2.34787e-05 0.0136463

>Head2
foo 0 0.00118929 0.00136993
bar 0.0610655 0.980495 0.997179
qux 0.060879 0.982591 0.974276

Each block is separated by a blank line. What I want to do is convert the blocks into a nested dictionary like this:

{ 'Head1': {'foo': '0 1.10699e-05 2.73049e-05',
            'bar': '0.939121 0.0173732 0.0119144',
            'qux': '0 2.34787e-05 0.0136463'},
  'Head2': {'foo': '0 0.00118929 0.00136993',
             'bar': '0.0610655 0.980495 0.997179',
             'qux': '0.060879 0.982591 0.974276'}
}

How can I do this in Python? This is as far as I got, and I don't know how to continue:

def parse():
    caprout = "tmp.txt"
    with open(caprout, 'r') as file:
        datalines = (ln.strip() for ln in file)
        for line in datalines:
            if line.startswith(">Head"):
                # header line, e.g. ">Head1"
                print(line)
            elif not line.strip():
                # blank line between blocks
                print(line)
            else:
                # data line, e.g. "foo 0 1.10699e-05 2.73049e-05"
                print(line)
    return

def main():
    parse()
    return

if __name__ == '__main__':
    main()

2 answers

The file:

[sgeorge@sgeorge-ld1 tmp]$ cat tmp.txt 
>Head1
foo 0 1.10699e-05 2.73049e-05
bar 0.939121 0.0173732 0.0119144
qux 0 2.34787e-05 0.0136463

>Head2
foo 0 0.00118929 0.00136993
bar 0.0610655 0.980495 0.997179
qux 0.060879 0.982591 0.974276

The script:

[sgeorge@sgeorge-ld1 tmp]$ cat a.py 
import json

dict_ = {}

def parse():
    caprout = "tmp.txt"
    with open(caprout, 'r') as file:
        datalines = (ln.strip() for ln in file)
        for line in datalines:
            if line != '':
                if line.startswith(">Head"):
                    # Header line: start a new inner dictionary.
                    key = line.replace('>', '')
                    dict_[key] = {}
                else:
                    # Data line: first token is the key, the rest is the value.
                    nested_key, value = line.split(' ', 1)
                    dict_[key][nested_key] = value
    print(json.dumps(dict_))

parse()

Running it:

[sgeorge@sgeorge-ld1 tmp]$ python a.py  | python -m json.tool
{
"Head1": {
    "bar": "0.939121 0.0173732 0.0119144", 
    "foo": "0 1.10699e-05 2.73049e-05", 
    "qux": "0 2.34787e-05 0.0136463"
}, 
"Head2": {
    "bar": "0.0610655 0.980495 0.997179", 
    "foo": "0 0.00118929 0.00136993", 
    "qux": "0.060879 0.982591 0.974276"
}
}
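
The same idea can also be sketched as a function that returns the dictionary instead of filling the module-level dict_. This is only an illustration under some assumptions: the name parse_blocks is made up here, any line starting with ">" is treated as a header (not only ">Head..."), and data lines that appear before the first header are silently skipped.

```python
def parse_blocks(lines):
    """Build {header: {key: rest_of_line}} from an iterable of text lines."""
    result = {}
    current = None  # inner dict of the most recent ">" header, if any
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines only separate blocks
        if line.startswith(">"):
            # Header line: drop the ">" and start (or reuse) an inner dict.
            current = result.setdefault(line[1:], {})
        elif current is not None:
            # Data line: text before the first space is the key.
            key, _, value = line.partition(" ")
            current[key] = value.strip()
    return result


sample = """\
>Head1
foo 0 1.10699e-05 2.73049e-05
bar 0.939121 0.0173732 0.0119144
qux 0 2.34787e-05 0.0136463

>Head2
foo 0 0.00118929 0.00136993
bar 0.0610655 0.980495 0.997179
qux 0.060879 0.982591 0.974276
"""

parsed = parse_blocks(sample.splitlines())
print(parsed["Head1"]["foo"])
```

Because the function accepts any iterable of lines, an open file object can be passed in directly, for example parse_blocks(open("tmp.txt")).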

This is the simplest solution I can think of:

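mainDict = dict()
filename = "tmp.txt"  # the input file from the question
with open(filename, 'r') as file:
    for line in file:
        line = line.strip()
        if line == "":
            continue
        if line.startswith(">"):
            # Header line: drop the leading ">" and start a new inner dict.
            lastBlock = line[1:]
            mainDict[lastBlock] = dict()
            continue
        # Data line: text before the first space is the key, the rest is the value.
        splitLine = line.partition(" ")
        mainDict[lastBlock][splitLine[0]] = splitLine[2]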
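
Since the blocks are separated by blank lines, another possible variant is to split the whole text into blocks first and then handle each block on its own. This is a rough sketch, not a definitive implementation: the name parse_by_blocks is hypothetical, and it assumes the file is small enough to read at once and that every block begins with its ">" header line.

```python
import re


def parse_by_blocks(text):
    """Split text on blank lines, then turn each block into an inner dict."""
    result = {}
    # A blank line (possibly containing whitespace) separates two blocks.
    for block in re.split(r"\n\s*\n", text.strip()):
        header, *rows = block.splitlines()
        result[header.lstrip(">").strip()] = {
            key: value.strip()
            for key, _, value in (row.strip().partition(" ") for row in rows)
        }
    return result


text = """>Head1
foo 0 1.10699e-05 2.73049e-05
bar 0.939121 0.0173732 0.0119144

>Head2
foo 0 0.00118929 0.00136993
"""

nested = parse_by_blocks(text)
print(nested["Head2"]["foo"])
```

The line-by-line loop above needs less memory for large files; the block-splitting version mainly trades that for keeping the "one block, one inner dictionary" structure visible in the code.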
