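I'm trying to load some data into a pandas DataFrame, but the .txt file is a bit odd. The first few column headers are wrapped in quotes and the rest are not. When I read the file into a DataFrame, all of the data and the column names end up in the first column, separated by "\t", which I believe means a tab in Python. Why is it being read this way?
Here are a few lines of data copied from the .txt file: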
"Notes" "Cancer Sites" "Cancer Sites Code" Mortality-Incidence Age-Adjusted Rate Ratio Death Counts Mortality Population Mortality Age-Adjusted Rate Incidence Counts Incidence Population Incidence Age-Adjusted Rate
"All Cancer Sites Combined" "0" 0.385 176256 96127579 181.476 469603 96127579 470.919
"Oral Cavity and Pharynx" "20010-20100" 0.242 2521 96127579 2.527 10717 96127579 10.437
"Lip" "20010" 0.046 16 96127579 0.016 352 96127579 0.358
This is my code so far (FYI, it does the same thing whether or not I pass the headers argument):
df = pd.read_fwf("United States and Puerto Rico Cancer Statistics.txt", headers = None)
When I print df, I get this as the header:
"Notes" "Cancer Sites" "Cancer Sites Code" Mortality-Incidence Age-Adjusted Rate Ratio Death Counts Mortality Population Mortality Age-Adjusted Rate Incidence Counts Incidence.1 Population.1 Incidence.2 Age-Adjusted.1 Rate.1
These are the first two rows of data when I display df:
0 "All Cancer Sites Combined"\t"0"\t0.385\t17625... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 "Oral Cavity and Pharynx"\t"20010-20100"\t0.24... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
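The "\t" sequences in the output suggest the file is tab-delimited rather than fixed-width, and pd.read_fwf is intended for fixed-width columns. Below is a minimal sketch of reading such data with pd.read_csv and sep="\t" instead. It rests on two assumptions that the question does not confirm: that every field is separated by a single tab, and that each data row starts with an empty "Notes" field (the header lists ten names while the rows show nine values). The in-memory sample is a stand-in for the real file.

```python
import io

import pandas as pd

# Stand-in for the real .txt file. ASSUMPTIONS: fields are tab-separated and
# every data row begins with an empty "Notes" field (a leading tab).
header = [
    '"Notes"', '"Cancer Sites"', '"Cancer Sites Code"',
    "Mortality-Incidence Age-Adjusted Rate Ratio", "Death Counts",
    "Mortality Population", "Mortality Age-Adjusted Rate",
    "Incidence Counts", "Incidence Population", "Incidence Age-Adjusted Rate",
]
rows = [
    ["", '"All Cancer Sites Combined"', '"0"', "0.385", "176256",
     "96127579", "181.476", "469603", "96127579", "470.919"],
    ["", '"Oral Cavity and Pharynx"', '"20010-20100"', "0.242", "2521",
     "96127579", "2.527", "10717", "96127579", "10.437"],
    ["", '"Lip"', '"20010"', "0.046", "16",
     "96127579", "0.016", "352", "96127579", "0.358"],
]
sample = "\n".join("\t".join(fields) for fields in [header] + rows) + "\n"

# sep="\t" splits on tabs; the default quotechar ('"') removes the quotes
# around the first few headers and values.
df = pd.read_csv(io.StringIO(sample), sep="\t")

print(df.columns.tolist())
print(df.head())
```

If those assumptions hold for the real file, the same call with the file name from the question, pd.read_csv("United States and Puerto Rico Cancer Statistics.txt", sep="\t"), should produce one column per header.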