How can I read every column of a CSV file in Python with pandas or csv when the same headers repeat after every 10-15 rows?

Posted 2024-05-23 14:32:49


I have a CSV file that contains data like the following:

Column 1| Car Make  |Mazda      |Car Model  |123456|
Customer| fname     |lname      | phone     |      |        
1       |abcdef     |iiiiii     |123456454                  
2       |eddfga     |jjjjjj     |123434325                  
3       |phadfa     |kkkk       |124141414                  
4       |adfkld     |llllll     |567575575                  
5       |dafadf     |mmmm       |898979778                  
                                
Column 2| Car Make| Nissan| Car Model| 789471|
Customer| fname   |lname  |phone                    
1       |gjjgjd   |pwldd  |7123097109                   
2       |rbnhf    |ggggg  |9827394829                   
3       |plgrd    |hhhhh  |9210313813                   
4       |ghurf    |jjjjj  |9876548311                   
5       |mjydd    |kkkkk  |8887654625                   
6       |kiure    |llllll |7775559994                   
7       |wriok    |mmmmm  |9993338881                   
8       |stije    |nnnnnn |1110002223                   
9       |ahlkg    |bbbbb  |7778889995                   
10      |cdegf    |vvvvv    |9993331999     

        

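I need to read the CSV file (I am using pandas) and restructure it so that all the blocks are combined, with each customer's name and phone number shown alongside the car make and model of their block. The output should be a file like this: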

Column   |Customer| fname|  lname|  phone|  Car Make|   Car Model|
Column 1|   1|  abcdef| iiiiii| 123456454|  Mazda|  123456|
Column 1|   2|  eddfga| jjjjjj| 123434325|  Mazda|  123456|
Column 1|   3|  phadfa| kkkk|   124141414|  Mazda|  123456|
Column 1|   4|  adfkld| llllll| 567575575|  Mazda|  123456|
Column 1|   5|  dafadf| mmmm|   898979778|  Mazda|  123456|
Column 2|   1|  gjjgjd| pwldd|  7123097109| Nissan| 789471|
Column 2|   2|  rbnhf|  ggggg|  9827394829| Nissan| 789471|
Column 2|   3|  plgrd|  hhhhh|  9210313813| Nissan| 789471|
Column 2|   4|  ghurf|  jjjjj|  9876548311| Nissan| 789471|
Column 2|   5|  mjydd|  kkkkk|  8887654625| Nissan| 789471|
Column 2|   6|  kiure|  llllll| 7775559994| Nissan| 789471|
Column 2|   7|  wriok|  mmmmm|  9993338881| Nissan| 789471|
Column 2|   8|  stije|  nnnnnn| 1110002223| Nissan| 789471|
Column 2|   9|  ahlkg|  bbbbb|  7778889995| Nissan| 789471|
Column 2|   10| cdegf|  vvvvv|  9993331999| Nissan| 789471|

Below is the code I have tried. I cannot work out how to loop on to Column 2 and pick up its details after parsing Column 1.

import pandas as pd

hdf = pd.read_csv("file.csv", nrows=0)
first = list(hdf)
df = pd.read_csv("file.csv", skiprows=1) 
df.columns = ['A','B','C','D']
out_df = pd.DataFrame(columns=['Column', 'Customer', 'fname', 'lname', 'phone', 'make', 'model'])
out_df.Customer = i.A
out_df.fname = i.B
out_df.lname = i.C
out_df.phone = i.D
out_df.Column = first[0]
out_df.make = first[2]
out_df.model = first[4]
out_df.to_csv('new_file.csv')

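This is where I am stuck. Are there any similar questions that could give me some ideas on how to get the desired result?

Any help is greatly appreciated.

Thanks, everyone!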


Tags: file, csv, df, phone, column, customer, out, car
1 answer
User
#1 · Posted 2024-05-23 14:32:49

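I would use csv.DictReader() to read the rows from the file and check each one: whether its first column is "Customer" or "Column x". A "Column x" row supplies the block label, make and model, a "Customer" row is skipped, and every other row is appended to a list together with the current block's values. Finally, convert the list into the desired DataFrame.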
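As a minimal sketch of that row check, the snippet below runs csv.DictReader over a short in-memory sample instead of the real file. It assumes the values are comma-separated; the sample text and the helper name classify are made up for illustration. With explicit fieldnames, every row is mapped onto the same keys, so the first field is enough to tell the row types apart.

```python
import csv
import io

# Hypothetical in-memory sample; assumes the real file is comma-separated.
sample = (
    "Column 1,Car Make,Mazda,Car Model,123456\n"
    "Customer,fname,lname,phone\n"
    "1,abcdef,iiiiii,123456454\n"
)


def classify(row):
    """Label a row by looking at its first field (illustrative helper)."""
    if row['A'].startswith('Column'):
        return 'block header'
    if row['A'] == 'Customer':
        return 'column header'
    return 'data'


# Passing fieldnames means the first line is treated as data, not as a header.
reader = csv.DictReader(io.StringIO(sample), fieldnames=['A', 'B', 'C', 'D', 'E'])
parsed = list(reader)
kinds = [classify(row) for row in parsed]
print(kinds)

# Fields missing from a short row are filled with None (the default restval).
print(parsed[2]['E'])
```

Because short rows are padded with None rather than raising an error, the four-field customer rows and the five-field block rows can share one set of field names.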

Input CSV file test.csv:

[Image: contents of test.csv]

Code:

import csv

import pandas as pd

rows = []
column = make = model = ''
with open('test.csv', 'rt', newline='') as f:
    for row in csv.DictReader(f, fieldnames=['A', 'B', 'C', 'D', 'E']):
        if row['A'].startswith('Column') and row['B'] == 'Car Make' and row['D'] == 'Car Model':
            # Block header row: remember the block label, make and model
            column = row['A']
            make = row['C']
            model = row['E']
        elif row['A'] != 'Customer' and row['B']:
            # Data row: not the repeated "Customer" header and fname is present
            rows.append([column, row['A'], row['B'], row['C'], row['D'], make, model])

df = pd.DataFrame(rows, columns=['Column', 'Customer', 'fname', 'lname', 'phone', 'Car Make', 'Car Model'])
print(df)

Result:

      Column Customer   fname   lname       phone Car Make Car Model
0   Column 1        1  abcdef  iiiiii   123456454    Mazda    123456
1   Column 1        2  eddfga  jjjjjj   123434325    Mazda    123456
2   Column 1        3  phadfa    kkkk   124141414    Mazda    123456
3   Column 1        4  adfkld  llllll   567575575    Mazda    123456
4   Column 1        5  dafadf    mmmm   898979778    Mazda    123456
5   Column 2        1  gjjgjd   pwldd  7123097109   Nissan    789471
6   Column 2        2   rbnhf   ggggg  9827394829   Nissan    789471
7   Column 2        3   plgrd   hhhhh  9210313813   Nissan    789471
8   Column 2        4   ghurf   jjjjj  9876548311   Nissan    789471
9   Column 2        5   mjydd   kkkkk  8887654625   Nissan    789471
10  Column 2        6   kiure  llllll  7775559994   Nissan    789471
11  Column 2        7   wriok   mmmmm  9993338881   Nissan    789471
12  Column 2        8   stije  nnnnnn  1110002223   Nissan    789471
13  Column 2        9   ahlkg   bbbbb  7778889995   Nissan    789471
14  Column 2       10   cdegf   vvvvv  9993331999   Nissan    789471
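
Since the question mentions pandas, the same idea can also be sketched without an explicit loop: mark the block header rows, forward-fill their values down onto the customer rows, then drop the header rows. This is only a sketch under assumptions: it uses a shortened, comma-separated in-memory sample rather than test.csv, and the names sample, raw, is_block and out are chosen for illustration.

```python
import io

import pandas as pd

# Hypothetical in-memory sample; assumes comma-separated values like test.csv.
sample = """Column 1,Car Make,Mazda,Car Model,123456
Customer,fname,lname,phone
1,abcdef,iiiiii,123456454
2,eddfga,jjjjjj,123434325

Column 2,Car Make,Nissan,Car Model,789471
Customer,fname,lname,phone
1,gjjgjd,pwldd,7123097109
"""

# Read everything as strings so phone numbers and models keep their digits.
raw = pd.read_csv(io.StringIO(sample), header=None,
                  names=['A', 'B', 'C', 'D', 'E'], dtype=str)

# Rows that start a block carry the block label, make and model.
is_block = raw['A'].str.startswith('Column', na=False) & (raw['B'] == 'Car Make')

# Keep block values on the block rows only, then forward-fill them downwards.
raw['Column'] = raw['A'].where(is_block).ffill()
raw['Car Make'] = raw['C'].where(is_block).ffill()
raw['Car Model'] = raw['E'].where(is_block).ffill()

# Keep data rows: drop block headers and the repeated "Customer" header.
data = raw[~is_block & (raw['A'] != 'Customer')]
out = data.rename(columns={'A': 'Customer', 'B': 'fname', 'C': 'lname', 'D': 'phone'})
out = out[['Column', 'Customer', 'fname', 'lname', 'phone', 'Car Make', 'Car Model']]
out = out.reset_index(drop=True)
print(out)
```

Reading with dtype=str keeps the phone and model values as text, so out.to_csv('new_file.csv', index=False) would write them back unchanged. The loop-based version above is easier to adapt if the blocks are less regular.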
