Pandas: read_csv with "space-delimited" data

SICcode  Catcode  Category                                SICname           MultSIC
0111     A1500    Wheat, corn, soybeans and cash grain    Wheat             X
0112     A1600    Other commodities (incl rice, peanuts)  Rice              X
0115     A1500    Wheat, corn, soybeans and cash grain    Corn              X
0116     A1500    Wheat, corn, soybeans and cash grain    Soybeans          X
0119     A1500    Wheat, corn, soybeans and cash grain    Cash grains, NEC  X
0131     A1100    Cotton                                  Cotton            X
0132     A1300    Tobacco & Tobacco products              Tobacco           X

2 answers

User

Answer #1 · Edited 2024-05-14 01:25:02

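If df = pd.read_csv('file.txt', sep='\t') returns a DataFrame with only one column, then file.txt evidently does not use tabs as the delimiter. Your data may be delimited by spaces only. In that case you could try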

df = pd.read_csv('data', sep=r'\s{2,}')

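This uses the regex pattern \s{2,} as the separator. The pattern matches two or more whitespace characters, so single spaces inside a field (for example "Cash grains, NEC") do not split it.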
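As a rough illustration of the pattern above, here is a minimal sketch that parses an in-memory sample rather than a real file. The sample text is made up to mirror the question's layout, and the `engine="python"` and `dtype` arguments are additions that do not appear in the original answer: regex separators are handled by pandas' Python parsing engine, and reading `SICcode` as a string is one way to keep leading zeros.

```python
import io
import pandas as pd

# Hypothetical sample mirroring the question's layout: columns are separated
# by two or more spaces, while single spaces occur inside field values.
sample = (
    "SICcode  Catcode  Category                              SICname           MultSIC\n"
    "0111     A1500    Wheat, corn, soybeans and cash grain  Wheat             X\n"
    "0119     A1500    Wheat, corn, soybeans and cash grain  Cash grains, NEC  X\n"
    "0131     A1100    Cotton                                Cotton            X\n"
)

df = pd.read_csv(
    io.StringIO(sample),
    sep=r"\s{2,}",            # split only on runs of 2+ whitespace characters
    engine="python",          # regex separators are handled by the Python engine
    dtype={"SICcode": str},   # assumption: keep codes as text so "0111" survives
)

print(df.shape)
print(df.columns.tolist())
print(df["SICname"].tolist())
```

If the real file separates some columns with only a single space, this pattern would merge them, so it is worth checking the resulting column count.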

If this does not work, please post the output of print(repr(open('file.txt', 'rb').read(100))). This will show us an unambiguous representation of the first 100 bytes of file.txt.
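To show what that diagnostic reveals, the sketch below writes a small stand-in file (hypothetical content, created in a temporary location) and inspects its first 100 bytes. Tabs appear as `\t` in the `repr` output, which makes the real delimiter easy to spot. The counting step at the end is an extra convenience, not part of the original suggestion.

```python
import os
import tempfile

# Hypothetical stand-in for file.txt: a tiny tab-delimited file.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"SICcode\tCatcode\tCategory\n0111\tA1500\tWheat, corn\n")
    path = tmp.name

# Read the raw bytes so nothing is decoded or normalised on the way in.
with open(path, "rb") as f:
    head = f.read(100)
os.remove(path)

print(repr(head))  # tabs show up as \t, newlines as \n (or \r\n)
print("tabs:", head.count(b"\t"), "double spaces:", head.count(b"  "))
```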

User
Answer #2 · Edited 2024-05-14 01:25:02

If the data in the CSV file is separated by tabs, you can try adding sep="\t" to the read_csv call.
import pandas as pd

df = pd.read_csv('test/a.csv', sep="\t")
print(df)

   SICcode  Catcode                                Category           SICname  \
0      111    A1500    Wheat, corn, soybeans and cash grain             Wheat
1      112    A1600  Other commodities (incl rice, peanuts)              Rice
2      115    A1500    Wheat, corn, soybeans and cash grain              Corn
3      116    A1500    Wheat, corn, soybeans and cash grain          Soybeans
4      119    A1500    Wheat, corn, soybeans and cash grain  Cash grains, NEC
5      131    A1100                                  Cotton            Cotton
6      132    A1300              Tobacco & Tobacco products           Tobacco

  MultSIC
0       X
1       X
2       X
3       X
4       X
5       X
6       X
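The two answers can be combined into a simple fallback: try tabs first and, if only one column comes back, retry with the \s{2,} pattern. The sketch below is one possible way to do that. `read_delimited` is a hypothetical helper name, and the two sample strings are invented for illustration.

```python
import io
import pandas as pd

def read_delimited(text):
    """Hypothetical helper: try tabs first, then runs of 2+ whitespace."""
    df = pd.read_csv(io.StringIO(text), sep="\t")
    if df.shape[1] == 1:
        # A single column suggests that tab was not the real delimiter.
        df = pd.read_csv(io.StringIO(text), sep=r"\s{2,}", engine="python")
    return df

# Invented samples: one tab-delimited, one delimited by double spaces.
tabbed = "SICcode\tSICname\n0131\tCotton\n"
spaced = "SICcode  SICname\n0119  Cash grains, NEC\n"

print(read_delimited(tabbed).columns.tolist())
print(read_delimited(spaced).columns.tolist())
```

A file that genuinely has a single column would also trigger the fallback, so this heuristic suits only data known to have several columns.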
