Python script to read from a CSV file

Asked 2025-04-16 00:22
           "Type","Name","Description","Designation","First-term assessment","Second-term assessment","Total"
           "Subject","Nick","D1234","F4321",10,19,29
           "Unit","HTML","D1234-1","F4321",18,,
           "Topic","Tags","First Term","F4321",18,,
           "Subtopic","Review of representation of HTML",,,,,

All of the above comes from an Excel sheet that was converted to CSV, so what you see is the content of the CSV file.

You'll notice that the header has seven columns, while the rows below differ in which columns they fill in.

I have a Python script to produce this content; here it is:

from django.db import transaction
import sys
import csv
import StringIO

file = sys.argv[1]
no_cols_flag=0
flag=0
header_arr=[]

print file
f = open(file, 'r')

while (f.readline() != ""):
  for i in [line.split(',') for line in open(file)]: # split on the separator
    print "==========================================================="
    row_flag=0
    row_d=""
    for j in i: # for each token in the split string
      row_flag=1
      print j

      if j:
        no_cols_flag=no_cols_flag+1
        data=j.strip()
        print j

    break

How can I modify the script above so that it tells me which column header each piece of data belongs to?

Thanks!

1 Answer


You're importing the csv module but never using it. Why?

If you do this:

import csv
reader = csv.reader(open(file, "rb"), dialect="excel") # Python 2.x
# Python 3: reader = csv.reader(open(file, newline=""), dialect="excel")

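you get a reader object that gives you everything you need: iterating over it yields each row as a list of strings, with the header as the first row and the corresponding data in the rows after it.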
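As a minimal sketch of that idea in Python 3, the header row can be pulled off first with `next()` and then paired with each data row using `zip()`. The helper name `label_rows` and the inline sample data (copied from the question) are illustrative assumptions, not part of the original script:

```python
import csv
import io

# Sample data copied from the question; a real script would open the file instead.
SAMPLE = '''"Type","Name","Description","Designation","First-term assessment","Second-term assessment","Total"
"Subject","Nick","D1234","F4321",10,19,29
"Unit","HTML","D1234-1","F4321",18,,
'''

def label_rows(fileobj):
    """Yield one list of (header, value) pairs per data row (hypothetical helper)."""
    reader = csv.reader(fileobj, dialect="excel")
    header = next(reader)  # the first row holds the column titles
    for row in reader:
        yield list(zip(header, row))

for pairs in label_rows(io.StringIO(SAMPLE)):
    print(pairs)
```

With a real file, `io.StringIO(SAMPLE)` would be replaced by `open(file, newline="")`.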

Even better might be (if I understand you correctly):

import csv
reader = csv.DictReader(open(file, "rb"), dialect="excel") # Python 2.x
# Python 3: reader = csv.DictReader(open(file, newline=""), dialect="excel")

A DictReader can be iterated over, and it yields one dict per data row, with the column names as keys and that row's data as values, so

for row in reader:
    print(row)

will output

{'Name': 'Nick', 'Designation': 'F4321', 'Type': 'Subject', 'Total': '29', 'First-term assessment': '10', 'Second-term assessment': '19', 'Description': 'D1234'}
{'Name': 'HTML', 'Designation': 'F4321', 'Type': 'Unit', 'Total': '', 'First-term assessment': '18', 'Second-term assessment': '', 'Description': 'D1234-1'}
{'Name': 'Tags', 'Designation': 'F4321', 'Type': 'Topic', 'Total': '', 'First-term assessment': '18', 'Second-term assessment': '', 'Description': 'First Term'}
{'Name': 'Review of representation of HTML', 'Designation': '', 'Type': 'Subtopic', 'Total': '', 'First-term assessment': '', 'Second-term assessment': '', 'Description': ''}
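To address the original question directly, each value can be printed next to its column header while skipping the empty cells. The following is only a sketch under assumptions: Python 3, a hypothetical helper named `describe_row`, and inline sample data standing in for the real file:

```python
import csv
import io

# Two rows from the question's data, used here in place of the real file.
SAMPLE = '''"Type","Name","Description","Designation","First-term assessment","Second-term assessment","Total"
"Unit","HTML","D1234-1","F4321",18,,
"Subtopic","Review of representation of HTML",,,,,
'''

def describe_row(row):
    """Return 'header: value' strings for the non-empty cells of one row."""
    return ["%s: %s" % (column, value) for column, value in row.items() if value]

reader = csv.DictReader(io.StringIO(SAMPLE), dialect="excel")
for row in reader:
    print("=" * 40)  # separator between rows, as in the original script
    for line in describe_row(row):
        print(line)
```

For a file on disk, `io.StringIO(SAMPLE)` would become `open(file, newline="")`.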
