Converting a .txt file to a Python dictionary

Posted 2024-06-09 07:58:24


I am trying to convert a .txt file like this one into a Python dictionary:

18.10.2021       List Display                                                    
-----------------------------
 Selected Documents:        3
-----------------------------
|  Document|Description |Lng|
-----------------------------
|  VLX82304|Unit 523435 |EN |
|  VLX82340|Self  339304|EN |
|  VLX98234|Can  522018 |EN |
-----------------------------

I would like to create a dictionary like this:

MyDict = {
    "Document": ["VLX82304", "VLX82340", "VLX98234"],
    "Description": ["Unit 523435", "Self  339304", "Can  522018"],
[...] }

Here is what I have so far:

fileInfo = {"Document", "Description", "Lng"}

# > CLEANING UP .txt FILE 

LocalFile_LINES = []      # list to store file lines
# Read file
with open(".txt", 'r') as fp:
    # read and store all lines in a list
    LocalFile_LINES = fp.readlines()
    NumLines = len(LocalFile_LINES)

# Write file
with open("CLEANED.txt", 'w') as fp:
    # iterate each line
    for number, line in enumerate(LocalFile_LINES):
        # delete line 5 and 8. or pass any Nth line you want to remove
        if number not in [0,1,2,3,4,5,NumLines-1, NumLines]:
            # The "NumLines-1" removes the actual "------", whereas NuMLines removes a space at the end
            fp.write(line)

# Getting num lines of newly CLEANED .txt file
txtCLEANED = open("CLEANED.txt", "r")
NumLines_CLEANED = txtCLEANED.readlines()
CLEANED_len = len(NumLines_CLEANED)
listIndex = list( range(0,CLEANED_len-1) )    # Creates a series of numbers 


# > CONVERTED.CLEANED.txt FILE TO PY DICT

Delimited = []
with open("CLEANED.txt", 'r') as fp:
    for line in fp:
        Delimited = line.split("|")
        newItem = str( Delimited[1] )
        fileInfo["Document"].append( newItem )

But I get an error on the last line: it says "TypeError: 'set' object is not subscriptable", when it should be a list.

Could anyone advise on how to fix this?


1 answer

Answer 1 · posted 2024-06-09 07:58:24

This (or something similar) should work for this use case. Note that I use a string instead of reading from a file, because it is easier to test:

from pprint import pprint


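file_contents = """
18.10.2021       List Display
-----------------------------
 Selected Documents:        3
-----------------------------
|  Document|Description |Lng|
-----------------------------
|  VLX82304|Unit 523435 |EN |
|  VLX82340|Self  339304|EN |
|  VLX98234|Can  522018 |EN |
-----------------------------
""".strip()

_, col_headers, cols, _ = file_contents.rsplit('-----------------------------', 3)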
col_headers = [h.strip() for h in col_headers.strip('\n|').split('|')]
cols = [line.strip(' |').split('|') for line in cols.strip().split('\n')]

my_dict = dict(zip(col_headers, zip(*cols)))

pprint(my_dict)

Output:

{'Description': ('Unit 523435 ', 'Self  339304', 'Can  522018 '),
 'Document': ('VLX82304', 'VLX82340', 'VLX98234'),
 'Lng': ('EN', 'EN', 'EN')}
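
The `zip(*cols)` call is what turns the list of rows into columns. Here is a minimal sketch of that step on its own, with the rows hard-coded for illustration. It adds two optional tweaks that are not part of the answer above: each cell is stripped of its padding, and each column is stored as a list, since the question asked for lists rather than tuples.

```python
# Rows roughly as they look after splitting each table line on "|"
# (hard-coded here for illustration).
rows = [
    ["VLX82304", "Unit 523435 ", "EN"],
    ["VLX82340", "Self  339304", "EN"],
    ["VLX98234", "Can  522018 ", "EN"],
]
headers = ["Document", "Description", "Lng"]

# zip(*rows) transposes the table: it yields one tuple per column.
columns = zip(*rows)

# Pair each header with its column, strip padding, and store a list.
my_dict = {
    header: [cell.strip() for cell in column]
    for header, column in zip(headers, columns)
}

print(my_dict["Document"])     # ['VLX82304', 'VLX82340', 'VLX98234']
print(my_dict["Description"])  # ['Unit 523435', 'Self  339304', 'Can  522018']
```

Only leading and trailing whitespace is removed, so the double space inside `'Self  339304'` is kept as it appears in the file.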

NB: If you have a text file and want to read its contents as a string, you can do it as follows:

with open('my_file.txt') as in_file:
    file_contents = in_file.read()

# file_contents should now be a string with the contents of the file
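
As for the `TypeError` in the question: `{"Document", "Description", "Lng"}` is a set literal, and a set cannot be indexed, so `fileInfo["Document"]` fails. Below is a sketch of the question's line-by-line approach with `fileInfo` replaced by a dict of empty lists. The helper name `parse_table_lines` is hypothetical, and the code assumes that every table row starts with `|` and that the first such row is the header.

```python
def parse_table_lines(lines):
    """Build {column header: [values]} from the lines of the report."""
    # Keep only table rows; the title, the document counter and the
    # dashed separator lines do not start with "|" and are skipped.
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in lines
        if line.strip().startswith("|")
    ]
    if not rows:
        return {}

    headers, data = rows[0], rows[1:]
    # A dict mapping each header to a list (not a set of header names).
    file_info = {header: [] for header in headers}
    for row in data:
        for header, cell in zip(headers, row):
            file_info[header].append(cell)
    return file_info


sample = """\
18.10.2021       List Display
-----------------------------
 Selected Documents:        3
-----------------------------
|  Document|Description |Lng|
-----------------------------
|  VLX82304|Unit 523435 |EN |
|  VLX82340|Self  339304|EN |
|  VLX98234|Can  522018 |EN |
-----------------------------
"""

result = parse_table_lines(sample.splitlines())
print(result["Document"])  # ['VLX82304', 'VLX82340', 'VLX98234']
```

With a real file, the open file object can be passed directly, since iterating over it yields lines: `with open('my_file.txt') as in_file: result = parse_table_lines(in_file)`. This also removes the need for the intermediate CLEANED.txt file and the hard-coded line numbers.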
