CSV to bytes to DataFrame: how to get around "UnicodeDecodeError: 'utf8' codec can't decode byte 0xff in position 0: invalid start byte"?

2 answers

User

#1 · Edited 2024-04-25 20:54:50

Try the following code to find the correct encoding:

# chardet guesses a file's encoding from its raw bytes
import chardet

your_file = 'your_file.csv'  # placeholder: replace with the path to your CSV

# Open in binary mode ('rb') so Python does not try to decode the bytes
with open(your_file, 'rb') as file:
    print(chardet.detect(file.read()))

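Detection is not guaranteed to succeed, because the content may mix several encodings or languages. If the file uses a single encoding, though, the output above will usually show it.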

If chardet is not installed, install it with pip (or pip3):

pip install chardet
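Byte 0xff at position 0 is often the first byte of a UTF-16 little-endian byte order mark (FF FE), so it can be worth checking the leading bytes before reaching for a detector. The following is a minimal sketch using only the standard library; `sniff_bom` is a hypothetical helper name, not part of pandas or chardet, and it only recognizes files that actually start with a BOM.

```python
import codecs

# Known byte order marks. UTF-32 is checked before UTF-16 because the
# UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(raw):
    """Return a codec name if raw starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    return None


# Illustrative data only: a tiny CSV encoded as UTF-16 with a BOM.
sample = "id,name\n1,Zoë\n".encode("utf-16")
print(sniff_bom(sample))
print(sample.decode(sniff_bom(sample)))
```

If the helper returns a name, it can be passed as the `encoding` argument of `pd.read_csv`. If it returns None, the file has no BOM and a detector such as chardet is the next thing to try.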

Edit 1: Here is another way to find the correct encoding. If the approach above does not solve the problem, this may help:

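import pandas as pd
from encodings.aliases import aliases

# Every codec name Python knows about (some are not text encodings)
alias_values = set(aliases.values())

for value in alias_values:
    try:
        df = pd.read_csv(your_file, encoding=value)
        print(value)  # this codec read the file without raising
    except Exception:
        continue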
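Note that a codec that decodes without raising is only a candidate: several codecs can accept the same bytes and produce different text. The sketch below illustrates this on raw bytes with a short, hand-picked candidate list; `decodable_with` and the default candidates are assumptions made for the example, not something from the answer above.

```python
def decodable_with(raw, candidates=("utf-8", "cp1252", "latin1")):
    """Return the candidate codecs that decode raw without raising."""
    ok = []
    for name in candidates:
        try:
            raw.decode(name)
        except (UnicodeError, LookupError):
            # UnicodeError: bytes invalid for this codec.
            # LookupError: codec name unknown to Python.
            continue
        ok.append(name)
    return ok


# Illustrative data only: text with accented letters, encoded as cp1252.
raw = "naïve,café\n".encode("cp1252")
print(decodable_with(raw))
```

When more than one name comes back, inspect the decoded text (for example `df.head()` after `pd.read_csv`) to see which one gives readable output.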

User

#2 · Edited 2024-04-25 20:54:50

This fixed it. It reads the CSV into a DataFrame without a Unicode error:

df = pd.read_csv(r'\\blah\blah2\csv.csv', encoding='latin1')
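One caveat worth checking: latin1 maps every possible byte to a character, so it never raises, even when the file is really in another encoding. The sketch below assumes, purely for illustration, that the data is UTF-16 with a little-endian BOM (which would explain byte 0xff at position 0) and compares the two decodings.

```python
import codecs

# Illustrative data only: a CSV header encoded as UTF-16 LE with a BOM,
# so the first two bytes are 0xff 0xfe.
raw = codecs.BOM_UTF16_LE + "id,name\n".encode("utf-16-le")

as_latin1 = raw.decode("latin1")  # never raises: one character per byte
as_utf16 = raw.decode("utf-16")   # consumes the BOM and pairs the bytes

print(repr(as_latin1))
print(repr(as_utf16))
```

If the latin1 result shows stray characters such as `ÿþ` at the start or NUL characters between letters, the file is probably UTF-16 and `encoding='utf-16'` is the better choice.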
