A loop operation on a list fails only on the first item

import codecs

list1 = []
list2 = []
with codecs.open('FILE.txt', "r", encoding="utf-8") as inputfile:
    list1 = [line.strip() for line in inputfile]
list1 = [x.encode('utf-8') for x in list1]
for item in list1:
    list2.append(item[45:])
z = open('NEWFILE.txt', 'w')
z.write("\n".join(list2))
z.close()

1 answer

User

#1 · Posted 2024-05-23 15:31:25

UTF-8 plus a 3-byte shift on the first line only looks very much like an extra BOM header:

>>> from codecs import BOM_UTF8
>>> len(BOM_UTF8)
3

Most text editors detect the BOM header and hide it, so it is not directly visible (unless you open the file in a hex editor).
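To see where the shift comes from, here is a minimal sketch, assuming Python 3 and a made-up sample line (`"header line"` is not from the question). It suggests that decoding with plain `utf-8` keeps the BOM as the single character U+FEFF, and that re-encoding turns it back into three bytes:

```python
import codecs

# Hypothetical sample: the bytes a BOM-prefixed UTF-8 file would hold for its first line.
payload = "header line".encode("utf-8")
raw = codecs.BOM_UTF8 + payload

# Decoding with plain "utf-8" does not drop the BOM; it becomes the character U+FEFF.
text = raw.decode("utf-8")
has_bom_char = text.startswith("\ufeff")

# Re-encoding restores the three BOM bytes, so every byte offset on this line moves by 3.
reencoded = text.encode("utf-8")
shift = len(reencoded) - len(payload)

print(has_bom_char, shift)
```

Only the first line of a file can carry the BOM, which would explain why the slice is off on the first item and correct on all the others.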

I suggest you change your inner loop like this:

for item in list1:
    list2.append(item[45+len(codecs.BOM_UTF8) if item.startswith(codecs.BOM_UTF8) else 45:])

That way, if a line (the first one) starts with the BOM header, the slice offset is increased by the 3 extra bytes.

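Or strip it before encoding the full string. At that point the lines are still decoded text, so the BOM appears as the single character U+FEFF rather than as the 3 bytes of codecs.BOM_UTF8:

list1 = [(x[1:] if x.startswith(u'\ufeff') else x).encode('utf-8') for x in list1]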

The BOM-stripping code is taken from this Q/A: Python load json file with UTF-8 BOM header
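An alternative that is not part of the answer itself: Python's standard `utf-8-sig` codec skips a leading BOM while decoding and otherwise behaves like `utf-8`, so the BOM never reaches the list. The sketch below assumes Python 3 (`io.open` is also available on Python 2); the helper name `read_lines_without_bom`, the temporary demo file, and the offset of 2 (standing in for the 45 used in the question) are illustrative only.

```python
import codecs
import io
import os
import tempfile


def read_lines_without_bom(path):
    """Read a UTF-8 text file as stripped lines, dropping a leading BOM if present."""
    # "utf-8-sig" consumes the BOM on decode; files without a BOM are read unchanged.
    with io.open(path, "r", encoding="utf-8-sig") as inputfile:
        return [line.strip() for line in inputfile]


# Hypothetical demo file: two lines, written with a BOM at the very start.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(demo_path, "wb") as f:
    f.write(codecs.BOM_UTF8 + "abcdef\nghijkl\n".encode("utf-8"))

lines = read_lines_without_bom(demo_path)

# The same fixed offset now lands in the same place on every line, including the first.
sliced = [line[2:] for line in lines]
print(sliced)
```

With this approach the loop in the question can keep its plain fixed slice, since no line needs special treatment.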
