How can I read a comma-corrupted CSV file in pandas when MS Excel reads it fine (working macro code provided)?

Posted 2024-05-14 14:02:02


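I have a comma-corrupted CSV file. Microsoft Excel reads it correctly, but pandas does not load it cleanly into a DataFrame. I have a macro solution, but I would also like one that works in Python. Here are the first 5 rows of my data: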

mathematicians,occupation,country of citizenship,place of birth,date of death,educated at,employer,place of death,member of,employer,doctoral advisor,"languages spoken, written or signed",academic degree,doctoral student,manner of death,position held,field of work,award received,Erdős number,instance of,sex or gender,approx. date of birth,day of birth,month of birth,year of birth,approx. date of death,day of death,month of death,year of death
Roger Joseph Boscovich,"['physicist', 'astronomer', 'mathematician', 'philosopher', 'diplomat', 'poet', 'theologian', 'priest', 'polymath', 'historian', 'scientist', 'writer', 'cleric', 'university teacher']",['Republic of Ragusa'],"Dubrovnik, Republic of Ragusa",13 February 1787,['Pontifical Gregorian University'],['Pontifical Gregorian University'],"['Milan', 'Habsburg Empire']","['Royal Society', 'Russian Academy of Sciences', 'Russian Academy of Sciences']",['Pontifical Gregorian University'],,['Latin'],,,,,,['Fellow of the Royal Society'],,['human'],['male'],False,18,May,1711,False,13,February,1787
Emma Previato,['mathematician'],"['United States of America', 'Italy']",Badia Polesine,,"['Harvard University', 'University of Padua']","['Boston University', 'University of Padua']",,['American Mathematical Society'],"['Boston University', 'University of Padua']",['David Mumford'],,,,,,,,,['human'],['female'],False,,,1952,False,,,
Feodor Deahna,['mathematician'],,,1844,,,,,,,,,,,,['differential geometry'],,,['human'],['male'],False,,,1815,False,,,1844
Denis Henrion,"['publisher', 'mathematician']",['France'],,1640,,,,,,,['French'],,,,,,,,['human'],['male'],True,,,1500,False,,,1640

This is the macro solution that works:

For i1 = 1 To 9000
dump = ""
For i2 = 1 To 29
dump = dump & "," & Cells(i1, i2).Value
Next i2
Cells(i1, 30).Value = Mid(dump, 2, 1000)
Next i1

How can I convert the macro solution into a pandas solution?


1 answer
User
#1 · Posted 2024-05-14 14:02:02

To convert the list representation used in some of the cells, you can do the following:

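import ast
import csv

import pandas as pd

data = []

# Python 3: open in text mode with newline='' as the csv module expects.
# The utf-8 encoding is an assumption; adjust it to match your file.
with open('input.csv', newline='', encoding='utf-8') as f_input:
    for row in csv.reader(f_input):
        for index, v in enumerate(row):
            # Cells such as "['a', 'b']" hold a Python list literal.
            if v.startswith('['):
                row[index] = ', '.join(ast.literal_eval(v))
        data.append(row)

# The header line is kept as row 0 of the DataFrame.
print(pd.DataFrame(data))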

For example, a cell containing the text:

['United States of America', 'Italy']

would become:

United States of America, Italy
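
The per-cell conversion described above can be tried on its own. The following is only a minimal sketch: `flatten_cell` is a hypothetical helper name, not part of the original answer, and the fallback for cells that start with `[` but are not valid list literals is an added assumption about how such cells should be treated.

```python
import ast


def flatten_cell(value):
    """Turn "['a', 'b']" into "a, b"; leave any other cell unchanged."""
    if not value.startswith('['):
        return value
    try:
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        # Not a valid Python literal after all, so keep the raw text.
        return value
    if not isinstance(parsed, list):
        return value
    return ', '.join(str(item) for item in parsed)


print(flatten_cell("['United States of America', 'Italy']"))
print(flatten_cell("Badia Polesine"))
```

Using `ast.literal_eval` rather than `eval` means only Python literals are parsed, so arbitrary text in a cell cannot run code.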

The DataFrame printed by the script looks like this:

                       0                                                  1                                2                              3                 4                                        5                                       6                       7                                                  8                                       9                 10                                   11               12                13               14             15                     16                           17             18           19             20                     21            22              23             24                     25            26              27             28
0          mathematicians                                         occupation           country of citizenship                 place of birth     date of death                              educated at                                employer          place of death                                          member of                                employer  doctoral advisor  languages spoken, written or signed  academic degree  doctoral student  manner of death  position held          field of work               award received  Erdős number  instance of  sex or gender  approx. date of birth  day of birth  month of birth  year of birth  approx. date of death  day of death  month of death  year of death
1  Roger Joseph Boscovich  physicist, astronomer, mathematician, philosop...               Republic of Ragusa  Dubrovnik, Republic of Ragusa  13 February 1787          Pontifical Gregorian University         Pontifical Gregorian University  Milan, Habsburg Empire  Royal Society, Russian Academy of Sciences, Ru...         Pontifical Gregorian University                                                  Latin                                                                                            Fellow of the Royal Society                       human           male                  False            18             May           1711                  False            13        February           1787
2           Emma Previato                                      mathematician  United States of America, Italy                 Badia Polesine                    Harvard University, University of Padua  Boston University, University of Padua                                              American Mathematical Society  Boston University, University of Padua     David Mumford                                                                                                                                                                                   human         female                  False                                         1952                  False                                             
3           Feodor Deahna                                      mathematician                                                                              1844                                                                                                                                                                                                                                                                                                                                differential geometry                                                    human           male                  False                                         1815                  False                                         1844
4           Denis Henrion                           publisher, mathematician                           France                                             1640                                                                                                                                                                                                                                                     French                                                                                                                                              human           male                   True                                         1500                  False                                         1640
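
In the output above, the header line ends up as row 0 and the columns are numbered. If the column names are wanted instead, one possible variation is to split the first parsed row off as the header. This is a sketch under assumptions: `read_rows` and `SAMPLE` are hypothetical names, and the sample is two lines from the question's data cut down to their first four columns so that the block runs without an input file.

```python
import ast
import csv
import io

# Two lines from the question's data, shortened to four columns for illustration.
SAMPLE = '''mathematicians,occupation,country of citizenship,place of birth
Emma Previato,['mathematician'],"['United States of America', 'Italy']",Badia Polesine
'''


def read_rows(f_input):
    """Parse CSV text and flatten list-like cells into plain strings."""
    rows = []
    for row in csv.reader(f_input):
        rows.append([
            ', '.join(ast.literal_eval(v)) if v.startswith('[') else v
            for v in row
        ])
    return rows


# The first parsed row is the header; everything after it is data.
header, *records = read_rows(io.StringIO(SAMPLE))
print(header)
print(records)
```

With pandas installed, `pd.DataFrame(records, columns=header)` should then build a frame labelled with the original column names. Note that the full header in the question contains `employer` twice, so that label would appear on two columns.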
