Python CSV help

Asked 2025-04-15 15:03

Sometimes I need to parse a string in CSV format, but I am having trouble with commas inside quotes. The code below is an example. I am using Python 2.4.

import csv
for row in csv.reader(['one",f",two,three']):
    print row

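I get 4 elements, ['one"', 'f"', 'two', 'three'], but I would like to get ['one", f"', 'two', 'three'], or at least 3 elements. Even when I set the option quotechar = '"' (which is the default according to the documentation), the result is the same. How can I ignore commas inside quotes?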

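Edit: Thanks for the answers, everyone. Apparently I mistook my input for CSV; what I am actually parsing is a string of key-value pairs (such as NAME, DESCR...).

This is the input: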

NAME: "2801 chassis", DESCR: "2801 chassis, Hw Serial#: xxxxxxx, Hw Revision: 6.0", PID: CISCO2801 , VID: V03 , SN: xxxxxxxxx

3 Answers

Your input string is not really standard CSV. Instead, every line of your input carries the column names. If your input looks like this:

NAME: "2801 chassis", DESCR: "2801 chassis, Hw Serial#: xxxxxxx, Hw Revision: 6.0",PID: CISCO2801 , VID: V03 , SN: xxxxxxxxx
NAME: "2802 wroomer", DESCR: "2802 wroomer, Hw Serial#: xxxxxxx, Hw Revision: 6.0",PID: CISCO2801 , VID: V03 , SN: xxxxxxxxx
NAME: "2803 foobars", DESCR: "2803 foobars, Hw Serial#: xxxxxxx, Hw Revision: 6.0",PID: CISCO2801 , VID: V03 , SN: xxxxxxxxx

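the simplest approach is probably to filter out all the column names first, which leaves you with a CSV file that can be parsed. However, this assumes that every line has the same columns, in the same order.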
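
A minimal sketch of that filtering step, assuming Python 3 (the question targets Python 2.4), that the labels are known in advance, and that no quoted value contains a comma directly followed by one of those labels. The names `COLUMNS`, `strip_labels`, and `parse_labelled_lines` are made up for illustration:

```python
import csv
import re

# Hypothetical: the known column labels, identical on every line.
COLUMNS = ["NAME", "DESCR", "PID", "VID", "SN"]

# Match a label only at the start of the line or right after a comma,
# so text such as "Hw Serial#:" inside a quoted value is left alone.
_LABEL = re.compile(r"(^|,)\s*(?:%s):\s*" % "|".join(COLUMNS))


def strip_labels(line):
    """Remove the 'KEY:' labels so the line becomes plain CSV."""
    return _LABEL.sub(r"\1", line)


def parse_labelled_lines(lines):
    """Return one list of whitespace-stripped values per input line."""
    cleaned = [strip_labels(line) for line in lines]
    return [[value.strip() for value in row] for row in csv.reader(cleaned)]
```

Because the labels are stripped, the values come back by position only, which is why this only works when every line has the same columns in the same order.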

If the data is less consistent, however, you may need to parse by column name. Perhaps it looks like this:

NAME: "2801 chassis", PID: CISCO2801 , VID: V03 , SN: xxxxxxxxx, DESCR: "2801 chassis, Hw Serial#: xxxxxxx, Hw Revision: 6.0"
NAME: "2802 wroomer", DESCR: "2802 wroomer, Hw Serial#: xxxxxxx, Hw Revision: 6.0",PID: CISCO2801 , VID: V03 , SN: xxxxxxxxx
NAME: "2803 foobars",  VID: V03 ,PID: CISCO2801 ,SN: xxxxxxxxx

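or something else entirely. In that case, I would parse each line by looking for the first ':' in it, splitting off the column name, then parsing the value (including looking for quotes), and then continuing with the rest of the line. Something like the following (this code is completely untested):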

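def parseline(line):
    """Parse 'KEY: value, KEY: "quoted, value"' pairs into a dict."""
    result = {}
    while ':' in line:
        # Everything before the first ':' is the column name.
        column, rest = line.split(':', 1)
        column = column.strip()
        rest = rest.strip()
        if rest[:1] in ('"', "'"):  # It's quoted.
            quotechar = rest[0]
            end = rest.find(quotechar, 1)  # Find the end of the quote
            if end == -1:  # No closing quote: take the rest of the line
                end = len(rest)
            value = rest[1:end]
            end = rest.find(',', end)  # Find the next comma
        else:  # Not quoted, just find the next comma:
            end = rest.find(',')
            if end == -1:  # Last value on the line
                value = rest
            else:
                value = rest[:end].strip()
        result[column] = value
        if end == -1:  # No comma left, so nothing more to parse
            break
        line = rest[end + 1:].strip()
    return result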
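
For comparison, the same idea can be sketched with a regular expression instead of manual string searching. This is not the answer's method, only an alternative under stated assumptions: Python 3, keys made of word characters, and values that are either double-quoted (with no embedded double quotes) or run up to the next comma. `parse_pairs` is a hypothetical name:

```python
import re

# Assumption: a pair is KEY, ':', then either "quoted text" or bare text
# up to the next comma, followed by a comma or the end of the line.
_PAIR = re.compile(r'(\w+)\s*:\s*(?:"([^"]*)"|([^,]*))\s*(?:,|$)')


def parse_pairs(line):
    """Return a dict of KEY -> value for one line of key-value pairs."""
    result = {}
    for match in _PAIR.finditer(line):
        key, quoted, bare = match.groups()
        if quoted is not None:
            result[key] = quoted          # keep quoted text verbatim
        else:
            result[key] = bare.strip()    # trim padding around bare values
    return result
```

Since each quoted value is consumed as a whole, a colon inside it (as in "Hw Serial#: xxxxxxx") is not mistaken for a new key.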

Answered 2025-04-15 by Python大师

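Actually, the result you are getting is correct: your CSV syntax is wrong.

If you want to quote commas or other characters inside a CSV value, you have to wrap the whole value in quotes, not just part of it. If a value does not start with a quote character, Python's csv module will not treat it as quoted.

So instead of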

one",f",two,three

you should use

"one,f",two,three

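The difference between the two inputs can be checked directly. A small sketch, assuming Python 3 (so `next()` and `print()` rather than the Python 2.4 forms used in the question); the variable names are illustrative:

```python
import csv

# The quote appears mid-field, so it is kept as literal text and
# every comma still acts as a separator.
partly_quoted = next(csv.reader(['one",f",two,three']))

# The whole first field is quoted, so the comma inside it is preserved.
fully_quoted = next(csv.reader(['"one,f",two,three']))

print(partly_quoted)  # ['one"', 'f"', 'two', 'three']
print(fully_quoted)   # ['one,f', 'two', 'three']
```
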
Answered 2025-04-15 by Python大师

You can have the csv module tell you: just pass the output you want to the writer.

In [1]: import sys,csv

In [2]: csv.writer(sys.stdout).writerow(['one", f"', 'two', 'three'])  
"one"", f""",two,three

In [3]: csv.reader(['"one"", f""",two,three']).next()  
Out[3]: ['one", f"', 'two', 'three']
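
The same round trip on Python 3, as a hedged sketch: `sys.stdout` is swapped for an `io.StringIO` buffer so the encoded line can be captured, and the built-in `next()` replaces the Python 2 `.next()` method. The variable names are illustrative only:

```python
import csv
import io

# The row we would like to get back from the reader.
wanted = ['one", f"', 'two', 'three']

# Let the writer produce the correctly quoted CSV text.
buffer = io.StringIO()
csv.writer(buffer).writerow(wanted)
encoded = buffer.getvalue()  # '"one"", f""",two,three\r\n'

# Feeding that text back to the reader recovers the original row.
decoded = next(csv.reader([encoded]))
print(decoded == wanted)  # True
```

The writer doubles each embedded quote and wraps the whole field in quotes, which is the form the reader expects.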

Answered 2025-04-15 by Python大师
