Double quotes and newlines in CSV in Python

Published 2024-05-23 21:21:27


I have a CSV file in the following format:

"4931286","Lotion","New York","Bright color, yellow with 5" long
20% nylon"
"931286","Shampoo","New York","Dark, yellow with 10" long
20% nylon"
"3931286","Conditioner","LA","Bright color, yellow with 5" long
50% nylon"

The data above should be 3 rows and 4 columns: ID, productname, location, and description. As you can see, the description in every row contains a newline.

I have searched other related Stack Overflow questions, but none of the solutions seems to handle this case.

Here is what I tried:

^{pr2}$

The result looks like this:

['4931286"', 'Lotion', 'New York', 'Bright color, yellow with 5 long']
   ['20% nylon']

However, what I want is:

['4931286"', 'Lotion', 'New York', 'Bright color, yellow with 5 long 20% nylon']

How can I do this? Surely there is a way in Python?


2 answers

How about iterating over the parsed rows two at a time?

import csv
from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO


def pairwise(iterable):
    "s -> (s0, s1), (s2, s3), (s4, s5), ..."
    a = iter(iterable)
    # zip pulls twice from the same iterator, so it yields consecutive pairs
    return zip(a, a)


data = StringIO(""""4931286","Lotion","New York","Bright color, yellow with 5" long
20% nylon"
"931286","Shampoo","New York","Dark, yellow with 10" long
20% nylon"
"3931286","Conditioner","LA","Bright color, yellow with 5" long
50% nylon"
""")

# The unescaped " ends the quoted field early, so each record arrives as two rows
reader = csv.reader(data, quotechar='"', delimiter=',', quoting=csv.QUOTE_ALL, skipinitialspace=True)
for row, row2 in pairwise(reader):
    # Glue the continuation row back onto the description field
    row[-1] = ' '.join([row[-1], row2[0]])
    print(row)

# Output
['4931286', 'Lotion', 'New York', 'Bright color, yellow with 5 long 20% nylon"']
['931286', 'Shampoo', 'New York', 'Dark, yellow with 10 long 20% nylon"']
['3931286', 'Conditioner', 'LA', 'Bright color, yellow with 5 long 50% nylon"']
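
The pairing above assumes every record spans exactly two lines. As a rough sketch of a looser variant (not part of the original answer), short rows can instead be merged into the previous record based on field count. The helper name merge_continuations, the constant EXPECTED_FIELDS, and the extra "with a soft finish" line are made up for illustration. The sketch also assumes continuation lines contain fewer than four comma-separated pieces; any commas in them are replaced by spaces.

```python
import csv
from io import StringIO

EXPECTED_FIELDS = 4  # assumption: ID, productname, location, description


def merge_continuations(rows, width=EXPECTED_FIELDS):
    """Yield records, gluing short continuation rows onto the previous record."""
    current = None
    for row in rows:
        if not row:
            continue  # csv.reader returns [] for blank lines
        if current is None or len(row) >= width:
            # A full-width row starts a new record
            if current is not None:
                yield current
            current = list(row)
        else:
            # A short row continues the last field of the current record
            current[-1] = ' '.join([current[-1]] + row)
    if current is not None:
        yield current


# Hypothetical input: the second record spans three lines instead of two
raw = '''"4931286","Lotion","New York","Bright color, yellow with 5" long
20% nylon"
"931286","Shampoo","New York","Dark, yellow with 10" long
with a soft finish
20% nylon"
'''

records = list(merge_continuations(csv.reader(StringIO(raw))))
for record in records:
    print(record)
```

Like the pairwise version, this still loses the inch mark and leaves a stray trailing quote, because the input itself is malformed.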

The data is not valid CSV.

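A " inside a quoted CSV field must be escaped, for example with a backslash, as in "Bright color, yellow\n with 5\" long 20% nylon". (The more common CSV convention is to double the quote instead: 5"" long.)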
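
To see what a correctly escaped version of such a row looks like, one option is to let the csv module write it. This is only an illustrative sketch: the row contents are taken from the question, and the variable names are arbitrary. By default csv.writer escapes an embedded quote by doubling it, and the reader then recovers the field, including the embedded newline.

```python
import csv
from io import StringIO

# One record from the question, with the inch mark and newline kept in the field
row = ['4931286', 'Lotion', 'New York',
       'Bright color, yellow with 5" long\n20% nylon']

buf = StringIO()
csv.writer(buf, quoting=csv.QUOTE_ALL).writerow(row)
text = buf.getvalue()
print(text)  # the embedded quote is written as ""

# Reading the well-formed text back returns the original four fields
parsed = next(csv.reader(StringIO(text, newline='')))
print(parsed == row)
```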

If " is only used as an inch mark (that is, it directly follows a digit), try the following:

import re
# data holds the raw file contents as a single string
data = re.sub(r'([0-9])"(?![,\n])', r'\1\\"', data)

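This regex replaces every " that directly follows a digit with \", unless the " is itself followed by a comma or a newline (in which case it closes a field).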

Then parse the data with csv.reader, passing escapechar='\\' so that \" is read as a literal quote.
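
Put together, the two steps might look like the sketch below. It uses the sample data from the question. The final replace of the embedded newline with a space is an assumption, added to match the single-line description the question asks for. The regex would also misfire on a digit followed by a closing quote at the very end of the input with no trailing newline, which does not occur in this data.

```python
import csv
import re
from io import StringIO

raw = '''"4931286","Lotion","New York","Bright color, yellow with 5" long
20% nylon"
"931286","Shampoo","New York","Dark, yellow with 10" long
20% nylon"
"3931286","Conditioner","LA","Bright color, yellow with 5" long
50% nylon"
'''

# Step 1: backslash-escape quotes used as inch marks (digit before, no , or \n after)
escaped = re.sub(r'([0-9])"(?![,\n])', r'\1\\"', raw)

# Step 2: parse, telling the reader that backslash is the escape character
rows = list(csv.reader(StringIO(escaped, newline=''), escapechar='\\'))
for row in rows:
    # Assumption: flatten the newline inside the description into a space
    row[-1] = row[-1].replace('\n', ' ')
    print(row)
```

Unlike the pairwise approach, this keeps the inch mark in the description and leaves no stray trailing quote.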

Edit: changed the regex following MaxU's suggestion.
