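How do I merge fields in a CSV string using Python?
I am trying to merge three fields in each line of a CSV file using Python. This would be simple, except that some of the fields are surrounded by double quotes and contain commas. Here is an example: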
,,Joe,Smith,New Haven,CT,"Moved from Portland, CT",,goo,
Is there a simple way to merge fields 7 through 9 on each line? Not every line contains commas inside double quotes.
Thanks.
4 Answers
1
Python has a built-in module for parsing CSV files:
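The module meant here is presumably the standard-library `csv` module. As a minimal sketch, assuming Python 3 and using the sample line from the question, this is how it copes with the quoted comma:

```python
import csv
import io

# Sample line from the question; the quoted field contains a comma.
line = ',,Joe,Smith,New Haven,CT,"Moved from Portland, CT",,goo,\n'

# csv.reader accepts any iterable of lines, including a file-like object.
row = next(csv.reader(io.StringIO(line)))

print(len(row))   # number of parsed fields
print(row[6])     # the quoted field, parsed as a single value
```

The quoted field comes back as one list element, so the embedded comma does not shift the later columns.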
3
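You can use the csv module to do the heavy lifting: http://docs.python.org/library/csv.html
You didn't say exactly how you want to merge the columns; presumably you don't want the merged field to be "Moved from Portland, CTgoo". The code below lets you specify a separator string (for example ", ") and handles empty or blank fields.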
[transcript of session]
prompt>type merge.py
import csv

def merge_csv_cols(infile, outfile, startcol, numcols, sep=", "):
    reader = csv.reader(open(infile, "rb"))
    writer = csv.writer(open(outfile, "wb"))
    endcol = startcol + numcols
    for row in reader:
        merged = sep.join(x for x in row[startcol:endcol] if x.strip())
        row[startcol:endcol] = [merged]
        writer.writerow(row)

if __name__ == "__main__":
    import sys
    args = sys.argv[1:6]
    args[2:4] = map(int, args[2:4])
    merge_csv_cols(*args)
prompt>type input.csv
1,2,3,4,5,6,7,8,9,a,b,c
1,2,3,4,5,6,,,,a,b,c
1,2,3,4,5,6,7,8,,a,b,c
1,2,3,4,5,6,7,,9,a,b,c
prompt>\python26\python merge.py input.csv output.csv 6 3 ", "
prompt>type output.csv
1,2,3,4,5,6,"7, 8, 9",a,b,c
1,2,3,4,5,6,,a,b,c
1,2,3,4,5,6,"7, 8",a,b,c
1,2,3,4,5,6,"7, 9",a,b,c
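The transcript above was run under Python 2.6, where csv files are opened in binary mode. A rough Python 3 adaptation might look like the sketch below. The split into a `merge_cols` generator and the in-memory demo are my own assumptions for illustration, not part of the original answer:

```python
import csv
import io

def merge_cols(rows, startcol, numcols, sep=", "):
    """Yield each row with numcols fields, starting at startcol, merged into one."""
    endcol = startcol + numcols
    for row in rows:
        row = list(row)
        # Skip empty or blank fields so no stray separators appear.
        merged = sep.join(x for x in row[startcol:endcol] if x.strip())
        row[startcol:endcol] = [merged]
        yield row

def merge_csv_cols(infile, outfile, startcol, numcols, sep=", "):
    # In Python 3 the csv module wants text-mode files opened with newline="".
    with open(infile, newline="") as src, open(outfile, "w", newline="") as dst:
        csv.writer(dst).writerows(merge_cols(csv.reader(src), startcol, numcols, sep))

# In-memory demo using the first two rows of input.csv from the transcript.
sample = "1,2,3,4,5,6,7,8,9,a,b,c\n1,2,3,4,5,6,,,,a,b,c\n"
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerows(merge_cols(csv.reader(io.StringIO(sample)), 6, 3))
print(out.getvalue())
```

The `with` block also closes both files, which the original leaves to the interpreter.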
10
Something like this?
import csv
source = csv.reader(open("some file", "rb"))
dest = csv.writer(open("another file", "wb"))
for row in source:
    result = row[:6] + [row[6] + row[7] + row[8]] + row[9:]
    dest.writerow(result)
Example
>>> data=''',,Joe,Smith,New Haven,CT,"Moved from Portland, CT",,goo,
... '''.splitlines()
>>> rdr= csv.reader( data )
>>> row= rdr.next()
>>> row
['', '', 'Joe', 'Smith', 'New Haven', 'CT', 'Moved from Portland, CT', '', 'goo', '']
>>> row[:6] + [ row[6]+row[7]+row[8] ] + row[9:]
['', '', 'Joe', 'Smith', 'New Haven', 'CT', 'Moved from Portland, CTgoo', '']
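The session above uses Python 2's `rdr.next()`. Assuming Python 3, the same steps could be written with the built-in `next()`, roughly like this:

```python
import csv

# Same sample line as in the session above.
data = ',,Joe,Smith,New Haven,CT,"Moved from Portland, CT",,goo,\n'.splitlines()

rdr = csv.reader(data)
row = next(rdr)  # Python 3 spelling of rdr.next()

# Concatenate fields 7 to 9 (indices 6 to 8) with no separator.
merged = row[:6] + [row[6] + row[7] + row[8]] + row[9:]
print(merged)
```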