Unwanted UnicodeDecodeError exception for one entry in a Python list comprehension

#! /usr/bin/python
# -*- coding: utf8 -*-
import csv
import re

with open('test.csv', 'wb') as output_file:
    wr = csv.writer(output_file, delimiter=',', quoting=csv.QUOTE_NONE)
    # the following corresponds to reading from a shift_jis encoded csv files "日付,直流電流計測①,直流電流計測②"
    # 直流電流計測① is throwing an exception when decoded but it is a valid character according to
    # http://www.rikai.com/library/kanjitables/kanji_codes.sjis.shtml
    list_of_row_values = ['\x93\xfa\x95t', '\x92\xbc\x97\xac\x93d\x97\xac\x8cv\x91\xaa\x87@', '\x92\xbc\x97\xac\x93d\x97\xac\x8cv\x91\xaa\x87A']
    # take away the last character in entry two, and three, and it would work
    # but that means I know all the bad characters before hand
    #list_of_row_values = ['\x93\xfa\x95t', '\x92\xbc\x97\xac\x93d\x97\xac\x8cv\x91\xaa', '\x92\xbc\x97\xac\x93d\x97\xac\x8cv\x91\xaa']
    try:
        list_of_unicode_row_values = [str.decode('shift_jis') for str in list_of_row_values]
    except UnicodeDecodeError:
        # Question: what if I want to just ignore the character that cannot be decoded and still get the list
        # of "日付,直流電流計測,直流電流計測" as unicode?
        # right now, list_of_unicode_row_values would remain undefined, and the next line will
        # have a NameError
        print 'UnicodeDecodeError'
        pass
    # do a regex explanation to translate one column heading value
    list_of_translated_unicode_row_values = \
        [re.sub('日付'.decode('utf-8'), 'Date Time', str) for str in list_of_unicode_row_values]
    list_of_translated_row_values = [unicode_str.encode('shift_jis') for unicode_str in
                                     list_of_translated_unicode_row_values]
    wr.writerow(list_of_translated_row_values)

1 answer

User

#1 · Posted 2024-05-14 00:36:49

Normally, you can skip invalid characters by passing errors='ignore':

list_of_unicode_row_values = [str.decode('shift_jis', errors='ignore') for str in list_of_row_values]

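This leaves list_of_unicode_row_values holding the entries with the undecodable characters dropped, that is, the headings without the trailing ① and ②.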
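To see which entries actually trip the strict decoder and what errors='ignore' leaves behind, a minimal sketch follows. It is written for Python 3, so the byte strings from the question become bytes literals. The helper names decode_lenient and strict_failures are made up for this illustration. How many bytes the codec discards for an invalid sequence is an implementation detail not verified here, so only the decodable prefix is relied on.

```python
# Python 3 sketch: the byte strings from the question, as bytes literals.
rows = [
    b'\x93\xfa\x95t',
    b'\x92\xbc\x97\xac\x93d\x97\xac\x8cv\x91\xaa\x87@',
    b'\x92\xbc\x97\xac\x93d\x97\xac\x8cv\x91\xaa\x87A',
]


def decode_lenient(raw, encoding='shift_jis'):
    """Decode raw bytes, dropping whatever the codec cannot decode."""
    return raw.decode(encoding, errors='ignore')


def strict_failures(values, encoding='shift_jis'):
    """Return the indexes of the entries that fail a strict decode."""
    failed = []
    for index, raw in enumerate(values):
        try:
            raw.decode(encoding)
        except UnicodeDecodeError:
            failed.append(index)
    return failed


# Decoding entry by entry means one bad value cannot leave the whole list undefined.
lenient = [decode_lenient(raw) for raw in rows]

print(strict_failures(rows))  # indexes of the entries that shift_jis rejects
print(lenient)
```

Checking the failures separately keeps the lenient decode from silently hiding how much data was lost.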

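In your particular case, however, you are using the wrong encoding. Python's shift_jis codec follows the JIS X 0208 standard, whereas the character ① exists in the newer JIS X 0213 standard. To use the latter, simply switch to the shift_jisx0213 encoding: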

list_of_unicode_row_values = [str.decode('shift_jisx0213') for str in list_of_row_values]

You will then get the following entries:

日付
直流電流計測①
直流電流計測②

as expected.
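Putting the pieces together, the whole flow from the question (decode, translate one heading, write the row back out) could look roughly like the sketch below. It assumes Python 3, where csv.writer works on text rather than bytes, so the row is written to an in-memory text buffer and encoded once at the end. The names ENCODING, raw_header and translate_header are hypothetical and not part of the original code.

```python
import csv
import io
import re

ENCODING = 'shift_jisx0213'  # the codec suggested in the answer

# The header row from the question, as Python 3 bytes literals.
raw_header = [
    b'\x93\xfa\x95t',
    b'\x92\xbc\x97\xac\x93d\x97\xac\x8cv\x91\xaa\x87@',
    b'\x92\xbc\x97\xac\x93d\x97\xac\x8cv\x91\xaa\x87A',
]


def translate_header(raw_values, encoding=ENCODING):
    """Decode each heading, then translate the date column heading."""
    decoded = [raw.decode(encoding) for raw in raw_values]
    return [re.sub('日付', 'Date Time', text) for text in decoded]


translated = translate_header(raw_header)

# Write the row as text, then encode the finished line in one step.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer, delimiter=',', quoting=csv.QUOTE_NONE)
writer.writerow(translated)
encoded_line = buffer.getvalue().encode(ENCODING)

print(translated)
```

When writing to a real file instead of a buffer, opening it in text mode with open('test.csv', 'w', newline='', encoding=ENCODING) should have the same effect.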
