How to escape every lone double quote with Python, specifically in one .csv column?

Published 2024-06-07 03:13:29


  • Using Python 2.7.6
  • Need a solution that does not use the pandas library

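I have a .csv file with a particular (text) column whose cells occasionally contain a single double quote ("). When the file is converted to a shapefile in ArcMap, these lone double quotes cause the conversion to go wrong. They have to be "escaped".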

I need a script that edits the .csv so that it:

  1. Replaces every instance of " with "".
  2. Wraps each cell in double quotes.
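
As a minimal sketch of those two requirements applied to a single cell, independent of any file handling (the helper name quote_cell is hypothetical, not part of the csv module):

```python
def quote_cell(text):
    """Double every embedded double quote, then wrap the cell in double quotes."""
    return '"' + text.replace('"', '""') + '"'


print(quote_cell('5 feet 2" inches'))  # "5 feet 2"" inches"
print(quote_cell('5 foot 7'))          # "5 foot 7"
```

This is the same convention that csv.writer applies on its own when its doublequote option is left at the default.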

My script:

import csv

with open(Source_CSV, 'r') as file1, open('OUTPUT2.csv','w') as file2:
    reader = csv.reader(file1)  

    # Write column headers without quotes
    headers = reader.next()
    str1 = ''.join(headers)
    writer = csv.writer(file2)
    writer.writerow(headers)

    # Write all other rows with quotes
    writer = csv.writer(file2, quoting=csv.QUOTE_ALL)
    for row in reader:
        writer.writerow(row)

This script successfully performs both tasks above across all columns.

For example, the original .csv:

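Column 1, Column 2, Column 3, Column 4
Fred, Flintstone, 5'10", black hair
Wilma, Flintstone, five feet seven inches, red hair
Barney, Rubble, 5 feet 2" inches, blond hair
Betty, Rubble, 5 foot 7, black hair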

becomes this:

Column 1, Column 2, Column 3, Column 4
"Fred"," Flintstone"," 5'10"""," black hair"
"Wilma"," Flintstone"," five feet seven inches"," red hair"
"Barney"," Rubble"," 5 feet 2"" inches"," blond hair"
"Betty"," Rubble"," 5 foot 7"," black hair"

But what if I only want to do this in column 3 (the column that actually contains the occasional double quote)?

In other words, how can I end up with this?

Column 1, Column 2, Column 3, Column 4
Fred, Flintstone," 5'10""", black hair
Wilma, Flintstone," five feet seven inches", red hair
Barney, Rubble," 5 feet 2"" inches", blond hair
Betty, Rubble," 5 foot 7", black hair

2 Answers

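Is it enough to quote only the fields that contain a double quote? If so, the csv module's default behavior will do the job, although I added skipinitialspace=True when parsing the input file so that the spaces after the commas are not treated as significant.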

Also, per the csv module documentation, I opened the files in binary mode.

import csv

with open('input.csv','rb') as file1, open('output.csv','wb') as file2:
    reader = csv.reader(file1,skipinitialspace=True)  
    writer = csv.writer(file2)

    for row in reader:
        writer.writerow(row)

Input:

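Column 1, Column 2, Column 3, Column 4
Fred, Flintstone, 5'10", black hair
Wilma, Flintstone, five feet seven inches, red hair
Barney, Rubble, 5 feet 2" inches, blond hair
Betty, Rubble, 5 foot 7, black hair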

Output:

Column 1,Column 2,Column 3,Column 4
Fred,Flintstone,"5'10""",black hair
Wilma,Flintstone,five feet seven inches,red hair
Barney,Rubble,"5 feet 2"" inches",blond hair
Betty,Rubble,5 foot 7,black hair
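
The default behavior described above can be checked without touching any files. The following is only a sketch and assumes Python 3, where io.StringIO works as an in-memory text file; on Python 2.7 the csv module expects byte streams, as in the script above. The sample rows are taken from the question.

```python
import csv
import io

# In-memory stand-ins for the input and output files (Python 3 assumption).
source = io.StringIO(
    'Fred, Flintstone, 5\'10", black hair\n'
    'Betty, Rubble, 5 foot 7, black hair\n'
)
target = io.StringIO()

# skipinitialspace=True drops the blank that follows each comma.
reader = csv.reader(source, skipinitialspace=True)
# Default quoting is csv.QUOTE_MINIMAL: only fields that need quotes get them.
writer = csv.writer(target, lineterminator='\n')
for row in reader:
    writer.writerow(row)

result = target.getvalue()
print(result)
```

Only the field containing a double quote comes out quoted and doubled; the row without one is written bare.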

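If every row of column 3 needs to be quoted, you can do the quoting manually. I set the csv module to quote nothing and set the quote character to a non-printable control character that should not appear in the input: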

import csv

with open('input.csv','rb') as file1, open('output.csv','wb') as file2:
    reader = csv.reader(file1,skipinitialspace=True)
    writer = csv.writer(file2,quoting=csv.QUOTE_NONE,quotechar='\x01')

    # Write column headers without quotes
    headers = reader.next()
    writer.writerow(headers)

    # Write 3rd column with quotes
    for row in reader:
        row[2] = '"' + row[2].replace('"','""') + '"'
        writer.writerow(row)

Output:

Column 1,Column 2,Column 3,Column 4
Fred,Flintstone,"5'10""",black hair
Wilma,Flintstone,"five feet seven inches",red hair
Barney,Rubble,"5 feet 2"" inches",blond hair
Betty,Rubble,"5 foot 7",black hair
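
The manual-quoting idea can be wrapped in a small function that takes the column index as a parameter. This is a sketch, not a definitive implementation: the name quote_column and its parameters are hypothetical, the demo uses io.StringIO and therefore assumes Python 3, and it assumes no field contains a comma or a line break (with csv.QUOTE_NONE and no escapechar the writer raises csv.Error for such fields).

```python
import csv
import io


def quote_column(src, dst, col, has_header=True):
    """Copy CSV rows from src to dst, force-quoting only the field at index col.

    src and dst are already-open file objects; col is zero-based.
    """
    reader = csv.reader(src, skipinitialspace=True)
    # QUOTE_NONE plus an unused control character as quotechar means the
    # writer never adds or escapes quotes itself; quoting is done by hand.
    writer = csv.writer(dst, quoting=csv.QUOTE_NONE, quotechar='\x01',
                        lineterminator='\n')
    if has_header:
        writer.writerow(next(reader))  # header row is copied unquoted
    for row in reader:
        row[col] = '"' + row[col].replace('"', '""') + '"'
        writer.writerow(row)


src = io.StringIO(
    'Column 1, Column 2, Column 3, Column 4\n'
    'Fred, Flintstone, 5\'10", black hair\n'
    'Wilma, Flintstone, five feet seven inches, red hair\n'
)
dst = io.StringIO()
quote_column(src, dst, 2)
print(dst.getvalue())
```

On Python 2.7 the same function should work with real files opened in 'rb' and 'wb' mode in place of the StringIO objects.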

You can try this:

import csv
with open("file.csv", "rU") as fin:
    words = fin.readlines()

with open("cleaned.csv", "w") as fout:
    writer = csv.writer(fout, quoting=csv.QUOTE_ALL, quotechar = '"', doublequote = True)
    for row in words:
        row = row.replace("\n", "")
        newrow = []
        for word in row.split(","): 
            newrow.append(word.strip())
        writer.writerow(newrow)

First, use open to read the file as plain text, which sidesteps the malformed csv. Then write it out normally as a csv file.
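
A sketch of the same read-as-text approach on in-memory data, assuming Python 3 (io.StringIO stands in for the output file) and assuming no field contains a comma, since the split is naive. Note that csv.QUOTE_ALL quotes every column, not just column 3.

```python
import csv
import io

# Raw lines as readlines() would return them; sample rows from the question.
raw_lines = [
    'Fred, Flintstone, 5\'10", black hair\n',
    'Barney, Rubble, 5 feet 2" inches, blond hair\n',
]

out = io.StringIO()
writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator='\n')
for line in raw_lines:
    # Naive split: only valid while no field contains a comma.
    fields = [field.strip() for field in line.rstrip('\n').split(',')]
    writer.writerow(fields)

cleaned = out.getvalue()
print(cleaned)
```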
