I have a comma-separated CSV file exported from Numbers on a Mac. I am trying to read it into a DataFrame, but I get an error:
df = pd.read_csv('game.csv', dtype={"rating": str}, error_bad_lines='ignore', encoding='utf8', sep=',')
The error message is:
Traceback (most recent call last):
File "/Users/congminmin/nlp/data_collection/crawler/data/game/test.py", line 5, in <module>
df = pd.read_csv('game_app_apple.missing.url.csv', dtype={"rating": str}, error_bad_lines='ignore', encoding='utf8', sep=',')
File "/Users/congminmin/.venv/data_collection/lib/python3.7/site-packages/pandas/io/parsers.py", line 676, in parser_f
return _read(filepath_or_buffer, kwds)
File "/Users/congminmin/.venv/data_collection/lib/python3.7/site-packages/pandas/io/parsers.py", line 448, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/Users/congminmin/.venv/data_collection/lib/python3.7/site-packages/pandas/io/parsers.py", line 880, in __init__
self._make_engine(self.engine)
File "/Users/congminmin/.venv/data_collection/lib/python3.7/site-packages/pandas/io/parsers.py", line 1114, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/Users/congminmin/.venv/data_collection/lib/python3.7/site-packages/pandas/io/parsers.py", line 1891, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 426, in pandas._libs.parsers.TextReader.__cinit__
ValueError: invalid literal for int() with base 10: 'ignore'
Is my CSV invalid? It was generated by Numbers. The same error occurs even if I remove the dtype argument. If I remove error_bad_lines='ignore', I get the following error instead:
File "pandas/_libs/parsers.pyx", line 860, in pandas._libs.parsers.TextReader.read
File "pandas/_libs/parsers.pyx", line 875, in pandas._libs.parsers.TextReader._read_low_memory
File "pandas/_libs/parsers.pyx", line 929, in pandas._libs.parsers.TextReader._read_rows
File "pandas/_libs/parsers.pyx", line 916, in pandas._libs.parsers.TextReader._tokenize_rows
File "pandas/_libs/parsers.pyx", line 2071, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 3, saw 2
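One way to investigate a tokenizing error like this is to count how many fields each line of the file actually contains. The following is only a minimal sketch using the standard-library `csv` module. The helper name `field_counts` and the sample text are hypothetical; the sample merely assumes a file that starts with a one-field title line, which may or may not be what the real export looks like.

```python
import csv
import io
from collections import Counter


def field_counts(text, delimiter=","):
    """Return a Counter mapping 'number of fields' -> 'number of rows'."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return Counter(len(row) for row in reader)


# Hypothetical sample: a single-field title line, a blank line,
# then ordinary two-column data.
sample = "Table 1\n\nname,rating\nfoo,4.5\nbar,3.0\n"

counts = field_counts(sample)
print(counts)
```

If the counts show a few leading rows with fewer fields than the rest, passing `skiprows` to `pd.read_csv` to skip those rows may be enough for the parser to see a consistent number of columns.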
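The CSV exported by Numbers is comma-separated. I want to read it into a DataFrame and write it back out as tab-separated, but I run into the problems above.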
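For reference, the intended round trip (read comma-separated, write tab-separated) can be sketched as below. This is an illustration under assumptions, not a confirmed fix: the inline `csv_text` is a hypothetical stand-in for the real file, and the column names are made up. Note that in the pandas version shown in the traceback, `error_bad_lines` expects a boolean such as `False`, which is why the string `'ignore'` fails in `int()`. From pandas 1.3 onward the equivalent option is `on_bad_lines='skip'`. Neither is used here, so the sketch does not depend on the pandas version.

```python
import io

import pandas as pd

# Hypothetical stand-in for the exported file; the real column names differ.
csv_text = "name,rating\nfoo,4.5\nbar,3.0\n"

# Read the comma-separated data, keeping "rating" as a string column.
df = pd.read_csv(io.StringIO(csv_text), dtype={"rating": str}, sep=",")

# Write it back out tab-separated. With no path, to_csv returns a string;
# pass a filename instead to write a file.
tsv_text = df.to_csv(sep="\t", index=False)
print(tsv_text)
```

With a real file, `io.StringIO(csv_text)` would be replaced by the file path and `to_csv` would be given an output path.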
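Additional information: the original data is in Chinese, and the "rating" column in the code above is actually named 评分 in the real file. A translation of the actual data is shown below.
I had to post it as a screenshot because Stack Overflow flagged it as spam: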