Python - Comparing strings from a file

1 vote
2 answers
1535 views
Asked 2025-04-17 15:39

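I'm learning Python and want to write an extension for Inkscape, but I've run into a problem comparing strings loaded from a file. What I'm trying to do is load a polygon that I define in a text file: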

    polygon
    r:255
    g:0
    b:0
    50;50
    50;100
    100;50

My parsing method looks like this:

    def load_file(filepath, parent, log):
        file = open(filepath)
        x = []
        y = []
        r = 0
        g = 0
        b = 0
        index = 0

        for line in file:
            fline = line.lstrip("\xef\xbb\xbf").rstrip("\n")
            log.write("Input string: " + repr(line) + "\n")
            log.write("Formatted: " + repr(fline) + "\n")
            if fline == "":
                continue
            elif fline is "polygon": ## Where the first line should be going
                log.write("\tDetected string as polygon start delimiter\n")
                if index > 0:
                    draw_shape(x, y, r, g, b, "Polygon", parent)
                    del x[0, len(x)]
                    del y[0, len(y)]
                    r = g = b = index = 0
                continue
            elif fline[:2] is "r:":
                log.write("\tDetected string as polygon red value delimiter\n")
                r = int(fline[2:])
                continue
            elif fline[:2] is "g:":
                log.write("\tDetected string as polygon green value delimiter\n")
                g = int(fline[2:])
                continue
            elif fline[:2] is "b:":
                log.write("\tDetected string as polygon blue value delimiter\n")
                b = int(fline[2:])
                continue
            else: ## Where the first line actually is going
                log.write("\tDelimiter failed previous detections; assumed to be polygon coordinates\n")
                spl = fline.split(";")
                x[index] = float(spl[0]) ## Error gets thrown here
                y[index] = float(spl[1])
                index += 1
                continue

        draw_shape(x, y, r, g, b, parent)

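This method fails on the first line. It reads "polygon" and then falls through to the final else block, where coordinates are parsed. The log file I have been keeping looks like this: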

    Process Started
    Input string: '\xef\xbb\xbfpolygon\n'
    Formatted: 'polygon'
        Delimiter failed previous detections; assumed to be polygon coordinates

I stepped through this in the shell, where line is "polygon" evaluates to true, so I have no idea what the problem is. Can anyone help?

2 Answers

1

Once you have the Unicode file opened successfully, I think the following approach is somewhat simpler than what you are doing now:

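    import re

    # Sample input: two polygon blocks separated by a blank line
    elements = '''polygon
    r:255
    g:0
    b:0
    50;50
    50;100
    100;50

    polygon
    r:155
    g:22
    b:55
    55;60
    66;100
    120;150
    155;167'''

    # Split the text into blocks at every blank line
    for element in re.split(r'^\s*\n', elements, flags=re.MULTILINE):
        if element.startswith('polygon'):
            el = element.splitlines()
            # Lines 1-3 hold the "key:value" colour components
            poly = {k: v for k, v in [s.split(':') for s in el[1:4]]}
            # The remaining lines hold "x;y" pairs; zip(*...) separates the columns
            x, y = zip(*[s.split(';') for s in el[4:]])
            poly.update({'x': x, 'y': y})
            print(poly)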

Output (dictionary key order may vary between Python versions):

    {'y': ('50', '100', '50'), 'x': ('50', '50', '100'), 'r': '255', 'b': '0', 'g': '0'}
    {'y': ('60', '100', '150', '167'), 'x': ('55', '66', '120', '155'), 'r': '155', 'b': '55', 'g': '22'}
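The values in the dictionaries above are still strings. As a rough sketch of the same block-splitting idea with the colours converted to int and the coordinates to float, something like the following might work. The function name parse_blocks and the "points" key are made up for this illustration and are not part of the answer.

```python
import re


def parse_blocks(text):
    """Split text at blank lines and convert each polygon block to typed values.

    Hypothetical helper for illustration; the name and the returned layout
    are assumptions, not something defined in the question or the answer.
    """
    polygons = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [line.strip() for line in block.splitlines()]
        if not lines or lines[0] != "polygon":
            continue  # ignore anything that is not a polygon block
        poly = {}
        points = []
        for line in lines[1:]:
            if ":" in line:
                # Colour component such as "r:255"
                key, value = line.split(":", 1)
                poly[key] = int(value)
            elif ";" in line:
                # Coordinate pair such as "50;100"
                x_text, y_text = line.split(";")
                points.append((float(x_text), float(y_text)))
        poly["points"] = points
        polygons.append(poly)
    return polygons
```

Calling parse_blocks(elements) on the sample string would give one dictionary per polygon, each with integer r, g and b entries and a list of (x, y) float tuples.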
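1

  1. The comparison fline is "polygon" will almost always be false, because is tests whether two names refer to the same object, not whether two strings have the same contents. Use fline == "polygon" instead.

  2. This is not directly related to your problem, but it would be simpler to decode the text with a proper Unicode decoder than to strip the byte order mark by hand and treat the rest as bytes. I personally recommend codecs.open(filename, encoding='utf-8-sig').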
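Putting both points together, a minimal sketch of the line-by-line parser might look like the one below. It is only an illustration under some assumptions: the name load_polygons and the returned dictionary layout are invented here, the drawing call from the question is left out because its exact signature is not known, and io.open (the same function as the built-in open on Python 3) stands in for codecs.open with the same 'utf-8-sig' encoding. It also uses append rather than x[index] = ..., since assigning to an index of an empty list raises IndexError.

```python
import io


def load_polygons(filepath):
    """Parse the polygon text format from the question into a list of dicts.

    Hypothetical helper for illustration; not the asker's load_file.
    """
    polygons = []
    current = None
    # "utf-8-sig" decodes UTF-8 and drops a leading byte order mark if present.
    with io.open(filepath, encoding="utf-8-sig") as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue
            if line == "polygon":  # == compares contents; "is" compares identity
                current = {"r": 0, "g": 0, "b": 0, "x": [], "y": []}
                polygons.append(current)
            elif current is None:
                raise ValueError("data before first 'polygon' line: %r" % line)
            elif line[:2] in ("r:", "g:", "b:"):
                # "r:255" -> current["r"] = 255
                current[line[0]] = int(line[2:])
            else:
                x_text, y_text = line.split(";")
                # append() grows the list as coordinates are read
                current["x"].append(float(x_text))
                current["y"].append(float(y_text))
    return polygons
```

As for why the test looked true in the shell: CPython tends to reuse a single object for identical short string literals, so comparing two literals with is can appear to work, whereas a string read from a file or produced by slicing is usually a separate object. Recent Python versions (3.8 and later) also emit a SyntaxWarning when is is used with a literal.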
