Python: Reading specific columns and rows from an Excel CSV file

2 votes
3 answers
45424 views
Asked 2025-04-17 18:22

I can read the whole CSV file, but how do I print only specific rows and columns?

Imagine the data laid out as it would be in Excel:

A            B                    C                   D                    E
State      | Heart Disease Rate | Stroke Death Rate | HIV Diagnosis Rate | Teen Birth Rate
Alabama      235.5                54.5                16.7                 18.01
Alaska       147.9                44.3                3.2                  N/A
Arizona      152.5                32.7                11.9                 N/A
Arkansas     221.8                57.4                10.2                 N/A
California   177.9                42.2                N/A                  N/A
Colorado     145.3                39                  8.4                  9.25

Here is what I have so far:

import csv

try:
    risk = open('riskfactors.csv', 'r', encoding="windows-1252").read() #find the file

except:
    while risk != "riskfactors.csv":  # if the file cant be found if there is an error
        print("Could not open", risk, "file")
        risk = input("\nPlease try to open file again: ")
else:
    with open("riskfactors.csv") as f:
        reader = csv.reader(f, delimiter=' ', quotechar='|')

        data = []
        for row in reader:# Number of rows including the death rates 
            for col in (2,4): # The columns I want read   B and D
                data.append(row)
                data.append(col)
        for item in data:
            print(item) #print the rows and columns

I only need to read all the data in columns B and D, like this:

A            B                    D
State      | Heart Disease Rate | HIV Diagnosis Rate
Alabama      235.5                16.7
Alaska       147.9                3.2
Arizona      152.5                11.9
Arkansas     221.8                10.2
California   177.9                N/A
Colorado     145.3                8.4

Edited

There are no errors.

Is there a good way to tackle this? Nothing I have tried works. Any help or advice is much appreciated.

3 Answers

2

Try this:

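data = []
# `reader` is the csv.reader object created in the question's code
for row in reader:  # Number of rows including the death rates
    data.append([row[1], row[3]])  # The columns I want read: B and D (indexes are zero-based)
for item in data:
    print(item)  # print the selected columns of each row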
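
The question also asks about specific rows. The same index-based idea can be combined with `itertools.islice` to limit the rows that are read. The following is only a minimal sketch, not the answerer's code: it assumes the file is comma-separated, uses an in-memory copy of a few rows from the question instead of the real file, and `select_cells` is a hypothetical helper name.

```python
import csv
import io
from itertools import islice

# Assumption: comma-separated data; this sample mirrors the question's table.
SAMPLE = """State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate
Alabama,235.5,54.5,16.7,18.01
Alaska,147.9,44.3,3.2,N/A
Arizona,152.5,32.7,11.9,N/A
Arkansas,221.8,57.4,10.2,N/A
"""


def select_cells(f, columns, first_row=0, last_row=None):
    """Return the chosen columns (0-based) for rows first_row..last_row-1."""
    reader = csv.reader(f)
    return [[row[i] for i in columns]
            for row in islice(reader, first_row, last_row)]


# Columns A, B and D for the header plus the first two states.
for cells in select_cells(io.StringIO(SAMPLE), columns=(0, 1, 3), last_row=3):
    print(cells)
```

With a real file, the `io.StringIO(SAMPLE)` argument would be replaced by something like `open('riskfactors.csv', newline='')`, ideally inside a `with` block.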
11

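I hope you have heard of Pandas, a library for data analysis.

The code below will help you read the columns. As for reading specific rows, you would need to explain in more detail what you want.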

import pandas
# usecols is zero-based: 0, 1 and 3 select the 1st, 2nd and 4th columns
io = pandas.read_csv('test.csv', sep=",", usecols=[0, 1, 3])
print(io)
3

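If you are still stuck, you don't necessarily have to use the csv module to read the file, because a simple CSV file is just lines of comma-separated strings. So if you want to keep things simple, you could try the approach below, which gives you a list of tuples of the form (state, heart disease rate, HIV diagnosis rate).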

output = []

# plain 'r' replaces the old 'rU' (universal newlines) mode, which was removed in Python 3.11
with open('riskfactors.csv', 'r') as f:
    for line in f:
        cells = line.rstrip('\n').split(',')
        # we want the first, second and fourth columns
        output.append((cells[0], cells[1], cells[3]))

print(output)

Note, however, that if you want to do data analysis, you will have to skip the header row yourself.
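
One possible way to skip the header and get numbers ready for analysis is `csv.DictReader`, which consumes the header row and lets you pick columns by name. This is a sketch under assumptions rather than part of the answer above: the column names are taken from the question's table, missing values are assumed to be written as `N/A`, the data is an in-memory sample, and `to_float` and `read_rates` are hypothetical helper names.

```python
import csv
import io

# Assumption: comma-separated data with the header names shown in the question.
SAMPLE = """State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate
Alabama,235.5,54.5,16.7,18.01
California,177.9,42.2,N/A,N/A
Colorado,145.3,39,8.4,9.25
"""


def to_float(text):
    """Convert a cell to float; treat 'N/A' (assumed missing marker) as None."""
    text = text.strip()
    return None if text == "N/A" else float(text)


def read_rates(f):
    """Return (state, heart disease rate, HIV diagnosis rate) tuples, header excluded."""
    reader = csv.DictReader(f)  # the first line is used as the field names
    return [(row["State"],
             to_float(row["Heart Disease Rate"]),
             to_float(row["HIV Diagnosis Rate"]))
            for row in reader]


for record in read_rates(io.StringIO(SAMPLE)):
    print(record)
```

Selecting by name rather than by position keeps the code working if the column order in the file changes, at the cost of depending on the exact header spelling.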
