Parse a file and store the data in lists

Posted 2024-03-29 12:28:21


I have a file named "as.txt" that looks like this:

Sr.No.      Name        Enrollment Number   CGPA        Year        
1.          XYZ     1101111             7.1     2014        
2.          ZYX     1101113             8.2     2014        
3.          Abc     1010101             9.1     2014        

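I want to parse this file and store the data in lists. I want to extract each row and check its enrollment number: if the enrollment number starts with 11, the row should be saved in secondyearlist, otherwise in firstyearlist.

This is what I have tried, but I think it is wrong.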

import struct

with open("as.txt") as f:
    # skip first two lines (containing header) and split on whitespace
    # this creates a nested list like: [[val1, i1, i2], [val2, i1, i2]]
    lines = [x.split() for x in f.readlines()[2:]
    # use the list to create the dict, using first item as key, last as values
    dict((x[0], x[1:])for x in lines)
f.close()

Please help me with this.


3 Answers

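It is not clear from the question how you want to store the values.

This reads the file, skips the first two lines, and stores the data in two dictionaries: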

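fir_y = {}  # first-year students, keyed by name
sec_y = {}  # second-year students, keyed by name
with open("as.txt") as f:  # the file is closed automatically when the block ends
    raw = f.read().split("\n")[2:]  # split into lines and skip the first two
    for v in raw:
        var = v.split()  # split on any run of whitespace, not on single spaces
        if len(var) < 5:  # skip blank or incomplete lines
            continue
        if var[2][0:2] == "11":  # enrollment number starts with 11
            sec_y[var[1]] = [var[2], var[3], var[4]]  # dict[key] = value
        else:
            fir_y[var[1]] = [var[2], var[3], var[4]]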


{'Abc': ['1010101', '9.1', '2014']}
{'XYZ': ['1101111', '7.1', '2014'], 'ZYX': ['1101113', '8.2', '2014']}

Alternatively, you can store the records in lists. The code is almost the same; you just use .append():

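fir_y = []
sec_y = []
with open("as.txt") as f:
    raw = f.read().split("\n")[2:]  # skip the first two lines
    for v in raw:
        var = v.split()  # split on any run of whitespace
        if len(var) < 5:  # skip blank or incomplete lines
            continue
        if var[2][0:2] == "11":
            sec_y.append([var[1], var[2], var[3], var[4]])
        else:
            fir_y.append([var[1], var[2], var[3], var[4]])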

[['Abc', '1010101', '9.1', '2014']]
[['XYZ', '1101111', '7.1', '2014'], ['ZYX', '1101113', '8.2', '2014']]

Also, when you open a file with "with open(...) as x", you do not need to close it afterwards. It is closed automatically.

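Possible solution

There are many possible solutions; this one illustrates a few typical constructs. The code is written for Python 2.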

fname = "as.txt"
with open(fname) as f:
    # skip first line (containing header)
    header = f.next() #this has just read one line (header)
    print "header", header # just to show, we have read the header line, not really necessary
    # this creates a list of records with each record being: [srno, name, enrolment, cgpa, year]
    records = [line.split() for line in f]
    # initialize resulting lists
    y_11 = []
    y_others = []
    # loop over records
    # we use value unpacking, each element of record is assigned to one variable
    for srno, name, enrolment, cgpa, year in records:
        if enrolment.startswith("11"):
            y_11.append([srno, name, enrolment, float(cgpa), int(year)])
        else:
            y_others.append([srno, name, enrolment, float(cgpa), int(year)])
# note, as we have left the `with` block, the `f.close()` was done automatically
assert f.closed # this assert would raise an exception if the `f.closed` would not be True

# print the results
print "y_11", y_11
print "y_other", y_others

Running it:

$ python file2lst.py 
header Sr.No.      Name        Enrollment Number   CGPA        Year        

y_11 [['1.', 'XYZ', '1101111', 7.1, 2014], ['2.', 'ZYX', '1101113', 8.2, 2014]]
y_other [['3.', 'Abc', '1010101', 9.1, 2014]]

A few comments

f.next(): reading the next line

An open file object can be iterated over in a loop, so you do not have to call

lines = f.readlines()

but can also write:

lines = list(f)

In both cases you get a list of lines.
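As a quick check of that claim, the sketch below compares three ways of collecting lines. It is written for Python 3 and uses io.StringIO with made-up sample text as a stand-in for an open file, so it runs without as.txt. The sample text and the open_sample helper are assumptions for illustration only.

```python
import io

# Hypothetical sample data standing in for the contents of as.txt.
sample = (
    "Sr.No. Name Enrollment CGPA Year\n"
    "1. XYZ 1101111 7.1 2014\n"
    "2. ZYX 1101113 8.2 2014\n"
)


def open_sample():
    """Return a fresh file-like object; each one can be iterated only once."""
    return io.StringIO(sample)


via_readlines = open_sample().readlines()
via_list = list(open_sample())
via_loop = [line for line in open_sample()]

# All three approaches give the same list of lines, newlines included.
print(via_readlines == via_list == via_loop)
print(len(via_list))
```

Each approach needs its own freshly opened object, because a file-like object that has been read to the end yields no further lines.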

When you iterate in a for loop, the iterable's next() method is called behind the scenes:

lines = []
for line in f:
    lines.append(line)

Again, we have filled the list of lines.

We can get the same result by calling next() ourselves on the iterable, which in our case is the open file object.

with open(fname) as f:
    lines = []
    line = f.next()
    lines.append(line)
    line = f.next()
    lines.append(line)
    line = f.next()
    lines.append(line)
    line = f.next()
    lines.append(line)

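We were smart enough to stop in time; otherwise, once the file ran out of lines, a StopIteration exception would be raised. A for loop catches this exception automatically and stops iterating.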
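A small sketch of what happens when the calls do not stop in time. It is written for Python 3, where the built-in next(f) takes the place of Python 2's f.next(), and it uses io.StringIO with a made-up two-line input instead of a real file. The variable names are illustrative.

```python
import io

# Hypothetical two-line input standing in for an open file.
f = io.StringIO("first line\nsecond line\n")

lines = []
exhausted = False
try:
    while True:
        # Keep asking for the next line until the iterator gives up.
        lines.append(next(f))
except StopIteration:
    # Raised when no lines remain; a for loop handles this for us.
    exhausted = True

print(lines)
print(exhausted)
```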

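By now it should be clear that calling header = f.next() reads the first line. The next time f is used in an iteration, it does not start over: it continues with the following lines and never returns the header again.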
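The header-skipping behaviour can be sketched as follows. This is a Python 3 version (next(f) instead of f.next()), and the io.StringIO sample text is an assumption standing in for as.txt with a single header line.

```python
import io

# Hypothetical sample data: one header line followed by two records.
f = io.StringIO(
    "Sr.No. Name Enrollment CGPA Year\n"
    "1. XYZ 1101111 7.1 2014\n"
    "2. ZYX 1101113 8.2 2014\n"
)

header = next(f)  # consumes only the header line
records = [line.split() for line in f]  # iteration resumes after the header

print(header.strip())
print(records)
```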

Unpacking values into variables

We assume that line.split() returns 5 elements.

We can assign all 5 elements to separate variables in a single step.

record = ["a11", "b22", "c33", "d44", "e55"]
a, b, c, d, e, = record
print a
print b
# etc.

In our solution, we use this in the for loop.
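A minimal sketch of unpacking inside a for loop, written for Python 3. The records list is hypothetical data shaped like the result of calling line.split() on each row, and the first_year and second_year names are illustrative.

```python
# Hypothetical records, shaped like the output of line.split() for each row.
records = [
    ["1.", "XYZ", "1101111", "7.1", "2014"],
    ["2.", "ZYX", "1101113", "8.2", "2014"],
    ["3.", "Abc", "1010101", "9.1", "2014"],
]

second_year = []
first_year = []
# Each 5-element record is unpacked into five names on every iteration.
for srno, name, enrolment, cgpa, year in records:
    target = second_year if enrolment.startswith("11") else first_year
    target.append(name)

print(second_year)
print(first_year)
```

If a record has more or fewer than 5 elements, the unpacking raises ValueError, so blank lines have to be filtered out before the loop.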

The context manager automatically calls close() on the created variable

A typical idiom for working with files is:

fname = "something.txt"
with open(fname) as f:
    pass  # process the file here

# do not call `f.close()`: the file is closed the moment the `with` block is left.

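This with construct uses a so-called context manager, which can provide a value when the block is entered (on the with line) and perform some action when the block is left. In our case, it calls close().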
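To make the enter and exit steps visible, here is a sketch of a hand-written context manager in Python 3. The Tracked class is hypothetical and exists only for illustration. A real file object follows the same protocol and calls close() in its exit step.

```python
class Tracked:
    """Hypothetical resource that records what the with statement does to it."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self  # this is the value bound by `as`

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")  # a file object calls close() at this point
        return False  # do not suppress exceptions raised in the block


with Tracked() as t:
    t.events.append("body")

print(t.events)
```

The exit step runs even if the block raises an exception, which is why the with form is preferred over calling close() by hand.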

It looks like you also want the Sr. No., so I have kept it, and I have included both lists and dicts.

y_10=[]
y_11=[]
with open("as.txt",'r') as f: # no need for f.close() when you use "with open" as the file is automatically closed
   lines = [x.split() for x in f.readlines()[2:]]
   for line in lines: 
       if line[2].startswith("10"): # check if the 3rd element starts with "10"
           y_10.append(line) # if so add to year 10 list
       else:
           y_11.append(line) # else it starts with "11" so add to year eleven list
print  y_10,y_11
[['3.', 'Abc', '1010101', '9.1', '2014']] [['1.', 'XYZ', '1101111', '7.1', '2014'], ['2.', 'ZYX', '1101113', '8.2', '2014']]

# make dicts using zip, where the first element of each list is the key and the rest are the values
y_10_dict = dict(zip([x[0] for x in y_10], [y[1:] for y in y_10]))
y_11_dict = dict(zip([x[0] for x in y_11], [y[1:] for y in y_11]))

print  y_10_dict,y_11_dict
{'3.': ['Abc', '1010101', '9.1', '2014']} {'2.': ['ZYX', '1101113', '8.2', '2014'], '1.': ['XYZ', '1101111', '7.1', '2014']}
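As an alternative to dict(zip(...)), the same mapping can be built with a dict comprehension. This is a sketch for Python 2.7 or later and Python 3; the y_11 list is hypothetical data shaped like the list produced above.

```python
# Hypothetical list shaped like y_11 above.
y_11 = [
    ["1.", "XYZ", "1101111", "7.1", "2014"],
    ["2.", "ZYX", "1101113", "8.2", "2014"],
]

# Key each record by its first element (Sr. No.); the rest becomes the value.
y_11_dict = {row[0]: row[1:] for row in y_11}

print(y_11_dict)
```

This walks the list once rather than building two intermediate lists first.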
