pandas DataFrame puts all data columns into a single column

Posted 2024-04-26 10:34:09

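I am trying to load some data into a pandas DataFrame, but the .txt file is a bit odd. The first two headers are wrapped in quotes and the rest are not. When I read the file into a pandas DataFrame, all of the data and column names end up in the first column, separated by "\t". I believe that means a tab in Python, but why is the file being read this way?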

Here are a few lines of data copied from the .txt file:

"Notes" "Cancer Sites"  "Cancer Sites Code" Mortality-Incidence Age-Adjusted Rate Ratio Death Counts    Mortality Population    Mortality Age-Adjusted Rate Incidence Counts    Incidence Population    Incidence Age-Adjusted Rate
    "All Cancer Sites Combined" "0" 0.385   176256  96127579    181.476 469603  96127579    470.919
    "Oral Cavity and Pharynx"   "20010-20100"   0.242   2521    96127579    2.527   10717   96127579    10.437
    "Lip"   "20010" 0.046   16  96127579    0.016   352 96127579    0.358

Here is my code so far (FYI, it does the same thing whether or not I use the headers argument):

df = pd.read_fwf("United States and Puerto Rico Cancer Statistics.txt", headers = None)

When I print df, I get this as the header:

"Notes" "Cancer Sites"  "Cancer Sites Code" Mortality-Incidence Age-Adjusted Rate Ratio Death Counts    Mortality   Population  Mortality   Age-Adjusted    Rate    Incidence   Counts  Incidence.1 Population.1    Incidence.2 Age-Adjusted.1  Rate.1

Here are the first two rows of data when I display df:

0   "All Cancer Sites Combined"\t"0"\t0.385\t17625...   NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1   "Oral Cavity and Pharynx"\t"20010-20100"\t0.24...   NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
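One likely explanation, offered as a sketch rather than a confirmed diagnosis: `pd.read_fwf` expects fixed-width columns and infers column boundaries from runs of whitespace, so a tab-delimited file with spaces inside its field values gets split in the wrong places. The literal "\t" in the output above suggests the file is tab-separated, in which case `pd.read_csv` with `sep="\t"` may be the better fit. Note also that the documented parameter name is `header`, not `headers`. The example below uses a small in-memory copy of the sample rows, and the tab positions in it are an assumption based on the pasted data.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for the .txt file.
# Assumption: fields are separated by single tab characters and each data
# row starts with an empty "Notes" field.
sample = (
    '"Notes"\t"Cancer Sites"\t"Cancer Sites Code"\t'
    'Mortality-Incidence Age-Adjusted Rate Ratio\tDeath Counts\t'
    'Mortality Population\tMortality Age-Adjusted Rate\t'
    'Incidence Counts\tIncidence Population\tIncidence Age-Adjusted Rate\n'
    '\t"All Cancer Sites Combined"\t"0"\t0.385\t176256\t96127579\t181.476\t469603\t96127579\t470.919\n'
    '\t"Oral Cavity and Pharynx"\t"20010-20100"\t0.242\t2521\t96127579\t2.527\t10717\t96127579\t10.437\n'
    '\t"Lip"\t"20010"\t0.046\t16\t96127579\t0.016\t352\t96127579\t0.358\n'
)

# sep="\t" splits on tabs only, so spaces inside a field are preserved.
# The default quotechar ('"') strips the quotes around the quoted fields.
df = pd.read_csv(io.StringIO(sample), sep="\t")

print(df.shape)  # (3, 10)
print(df.columns.tolist())
```

For the real file, the same call with the file path in place of the `io.StringIO` object, for example `pd.read_csv("United States and Puerto Rico Cancer Statistics.txt", sep="\t")`, should behave the same way if the file really is tab-delimited throughout.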
