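I have a .csv file with a particular (text) column whose cells occasionally contain a single double quote ("). When the file is converted to a shapefile in ArcMap, these lone double quotes cause the conversion to go wrong. They need to be "escaped".
I need a script that edits the .csv so that it:
My script: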
import csv

with open(Source_CSV, 'r') as file1, open('OUTPUT2.csv', 'w') as file2:
    reader = csv.reader(file1)
    # Write column headers without quotes
    headers = next(reader)
    writer = csv.writer(file2)
    writer.writerow(headers)
    # Write all other rows with quotes
    writer = csv.writer(file2, quoting=csv.QUOTE_ALL)
    for row in reader:
        writer.writerow(row)
This script successfully accomplishes both of the tasks above, across all columns.
For example, the original .csv becomes this:
Column 1, Column 2, Column 3, Column 4
"Fred"," Flintstone"," 5'10"""," black hair"
"Wilma"," Flintstone"," five feet seven inches"," red hair"
"Barney"," Rubble"," 5 feet 2"" inches"," blond hair"
"Betty"," Rubble"," 5 foot 7"," black hair"
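But what if I only want this done in column 3 (the one column that actually contains the occasional double quote)?
In other words, how can I end up with this?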
Column 1, Column 2, Column 3, Column 4
Fred, Flintstone," 5'10""", black hair
Wilma, Flintstone," five feet seven inches", red hair
Barney, Rubble," 5 feet 2"" inches", blond hair
Betty, Rubble," 5 foot 7", black hair
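Is it enough to quote only the fields that contain a double quote? If so, the csv module's default behavior will work, although I added skipinitialspace=True when parsing the input file so that the space after each comma is not treated as significant. Also, following the Python 2 csv module documentation, I opened the files in binary mode.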
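A minimal sketch of that default-quoting approach, written for Python 3 and using in-memory text instead of real files. The function name requote_minimal and the SAMPLE rows are illustrative assumptions modelled on the question's tables, not the answerer's own code.

```python
import csv
import io

# Hypothetical sample rows, modelled on the tables in the question.
SAMPLE = (
    "Column 1, Column 2, Column 3, Column 4\n"
    "Fred, Flintstone, 5'10\", black hair\n"
    "Wilma, Flintstone, five feet seven inches, red hair\n"
    "Barney, Rubble, 5 feet 2\" inches, blond hair\n"
)


def requote_minimal(src, dst):
    """Copy CSV rows, quoting only the fields that need it."""
    # skipinitialspace=True drops the blank that follows each comma.
    reader = csv.reader(src, skipinitialspace=True)
    # QUOTE_MINIMAL is the writer's default quoting mode; it is spelled
    # out here only to make the behavior explicit.
    writer = csv.writer(dst, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for row in reader:
        writer.writerow(row)


out = io.StringIO()
requote_minimal(io.StringIO(SAMPLE), out)
print(out.getvalue())
```

With real files under Python 3, open both with newline='' rather than in the binary mode that Python 2 needs. Note that the leading spaces are dropped, so the result is slightly tighter than the layout shown in the question.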
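If every row of column 3 needs to be quoted, you can do the quoting by hand. I set the csv module to quote nothing, and set the quote character to a non-printable control character that should never appear in the input.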
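The manual-quoting idea could look roughly like the sketch below (Python 3). The function name quote_column, the column index 2 for "column 3", and the choice of the BEL character (\x07) as the unused quote character are all assumptions made for illustration.

```python
import csv
import io


def quote_column(src, dst, column=2):
    """Quote every data cell in one column by hand; leave the rest bare."""
    reader = csv.reader(src, skipinitialspace=True)
    # QUOTE_NONE stops the writer from adding quotes of its own. BEL is an
    # assumed stand-in for "a character that never occurs in the data".
    writer = csv.writer(dst, quoting=csv.QUOTE_NONE, quotechar="\x07",
                        lineterminator="\n")
    writer.writerow(next(reader))  # header row passes through unquoted
    for row in reader:
        # Double any embedded quotes, then wrap the cell in quotes.
        row[column] = '"' + row[column].replace('"', '""') + '"'
        writer.writerow(row)


source = io.StringIO(
    "Column 1, Column 2, Column 3, Column 4\n"
    "Fred, Flintstone, 5'10\", black hair\n"
    "Betty, Rubble, 5 foot 7, black hair\n"
)
target = io.StringIO()
quote_column(source, target)
print(target.getvalue())
```

Because no escapechar is set, the writer raises csv.Error if any cell contains a comma, so this sketch only suits data whose fields are comma-free.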
You could try this:
First, use open to read the file as plain text, which sidesteps the malformed csv. Then write it out normally as a csv file.
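One possible reading of that suggestion, as a hedged Python 3 sketch: split each raw line on commas by hand, then let csv.writer apply its normal quoting. The function name rewrite_plain_text is hypothetical, and the sketch assumes no field contains a comma of its own.

```python
import csv
import io


def rewrite_plain_text(lines, dst):
    """Split raw text lines on commas, then write them as proper CSV."""
    writer = csv.writer(dst, lineterminator="\n")
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        # A plain split treats a lone double quote as ordinary text.
        fields = [field.strip() for field in line.split(",")]
        writer.writerow(fields)


# An in-memory stand-in for a file opened with open(path) in text mode.
raw = io.StringIO(
    "Column 1, Column 2, Column 3, Column 4\n"
    "Barney, Rubble, 5 feet 2\" inches, blond hair\n"
)
cleaned = io.StringIO()
rewrite_plain_text(raw, cleaned)
print(cleaned.getvalue())
```

The writer's default quoting then wraps and doubles only the cells that contain a double quote.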