How do I correctly read strings from a malformed CSV file?

1 answer

User

#1 · Posted 2024-04-29 08:25:01

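This is almost an embarrassing hack, but it seems to work at least on the sample input shown in your question. It post-processes each row read by csv.reader, tries to detect fields that were misread because of the bad formatting, and then corrects them.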

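import csv

def read_csv(filename):
    # Python 3: the csv module expects a text-mode file opened with newline=''.
    with open(filename, newline='') as file:
        # Turn quote handling off so the reader splits on every comma.
        reader = csv.reader(file, skipinitialspace=True,
                            quotechar=None, quoting=csv.QUOTE_NONE)
        for row in reader:
            newrow = []
            i = 0
            while i < len(row):
                a = row[i]
                b = row[i + 1] if i + 1 < len(row) else None
                # Detect bad formatting: a quoted field split in two by a comma.
                if (b is not None
                        and a.startswith('"') and not a.endswith('"')
                        and not b.startswith('"') and b.endswith('"')):
                    # Join the misread fields back together and skip both halves.
                    newrow.append(', '.join((a, b)))
                    i += 2
                else:
                    # Ordinary field: keep it as-is.
                    newrow.append(a)
                    i += 1
            # Unescape doubled quotes and drop the enclosing ones.
            yield [field.replace('""', '"').strip('"') for field in newrow]

for row in read_csv('fmt_test2.csv'):
    print(row)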

Output:

['1', '2', 'text1', 'Sample text "present" in csv, as this', '5']
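
The same detect-and-rejoin idea can be tried without a file on disk. The following is only a minimal sketch under stated assumptions: the input line is a guess reconstructed from the output above (the question's actual sample is not shown here), and `repair_row` and `read_csv_text` are hypothetical helper names. Unlike the pairwise check, it keeps collecting fragments until one ends with a quote, so a quoted field containing several commas is also rejoined.

```python
import csv
import io


def repair_row(row):
    """Merge fragments of a quoted field that was split on embedded commas."""
    repaired = []
    pending = None  # fragments of a quoted field still waiting for its closing quote
    for field in row:
        if pending is not None:
            pending.append(field)
            if field.endswith('"'):
                # Closing quote found: rejoin the fragments into one field.
                repaired.append(', '.join(pending))
                pending = None
        elif field.startswith('"') and not (len(field) > 1 and field.endswith('"')):
            # Opening quote without a closing one: start collecting fragments.
            pending = [field]
        else:
            repaired.append(field)
    if pending is not None:
        # The closing quote never arrived: keep the fragments as separate fields.
        repaired.extend(pending)
    # Unescape doubled quotes and drop the enclosing ones.
    return [f.replace('""', '"').strip('"') for f in repaired]


def read_csv_text(text):
    """Yield repaired rows from CSV text held in memory."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True,
                        quoting=csv.QUOTE_NONE)
    for row in reader:
        yield repair_row(row)


# Assumed input line, reconstructed from the output shown above.
sample = '1, 2, text1, "Sample text ""present"" in csv, as this", 5\n'
rows = list(read_csv_text(sample))
print(rows[0])
```

This is still a heuristic. It assumes a single space followed each embedded comma (the reader strips it, so `', '` is put back on joining), and a fragment that happens to end in an escaped `""` would be mistaken for the end of the field.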
