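Python - Comparing strings read from a file
I'm learning Python and want to write an extension for Inkscape, but I've run into a problem comparing strings loaded from a file. What I want to do is load a polygon that I define in a text file: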
polygon
r:255
g:0
b:0
50;50
50;100
100;50
My parsing method looks like this:
def load_file(filepath, parent, log):
    file = open(filepath)
    x = []
    y = []
    r = 0
    g = 0
    b = 0
    index = 0
    for line in file:
        fline = line.lstrip("\xef\xbb\xbf").rstrip("\n")
        log.write("Input string: " + repr(line) + "\n")
        log.write("Formatted: " + repr(fline) + "\n")
        if fline == "":
            continue
        elif fline is "polygon": ## Where the first line should be going
            log.write("\tDetected string as polygon start delimiter\n")
            if index > 0:
                draw_shape(x, y, r, g, b, "Polygon", parent)
                del x[0, len(x)]
                del y[0, len(y)]
                r = g = b = index = 0
            continue
        elif fline[:2] is "r:":
            log.write("\tDetected string as polygon red value delimiter\n")
            r = int(fline[2:])
            continue
        elif fline[:2] is "g:":
            log.write("\tDetected string as polygon green value delimiter\n")
            g = int(fline[2:])
            continue
        elif fline[:2] is "b:":
            log.write("\tDetected string as polygon blue value delimiter\n")
            b = int(fline[2:])
            continue
        else: ## Where the first line actually is going
            log.write("\tDelimiter failed previous detections; assumed to be polygon coordinates\n")
            spl = fline.split(";")
            x[index] = float(spl[0]) ## Error gets thrown here
            y[index] = float(spl[1])
            index += 1
            continue
    draw_shape(x, y, r, g, b, parent)
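This method fails on the very first line. It reads "polygon" and then falls through to the final else block, where it tries to parse coordinates. The log file I've been writing looks like this: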
Process Started
Input string: '\xef\xbb\xbfpolygon\n'
Formatted: 'polygon'
Delimiter failed previous detections; assumed to be polygon coordinates
I stepped through this in the shell, and there line is "process" evaluates to true, so I have no idea what is going wrong here. Can anyone help?
2 Answers
1
Once you have the Unicode file opened successfully, I think the following approach is a bit simpler than what you are doing now:
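import re

elements='''polygon
r:255
g:0
b:0
50;50
50;100
100;50

polygon
r:155
g:22
b:55
55;60
66;100
120;150
155;167'''

# Split the text into one block per polygon at each blank line
for element in re.split(r'^\s*\n',elements,flags=re.MULTILINE):
    if element.startswith('polygon'):
        el=element.splitlines()
        # Lines 1-3 of a block are the "key:value" colour entries
        poly={k:v for k,v in [s.split(':') for s in el[1:4]]}
        # The remaining lines are "x;y" pairs; zip(*...) transposes them
        x,y=zip(*[s.split(';') for s in el[4:]])
        poly.update({'x':x, 'y': y})
        print(poly)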
Output:
{'y': ('50', '100', '50'), 'x': ('50', '50', '100'), 'r': '255', 'b': '0', 'g': '0'}
{'y': ('60', '100', '150', '167'), 'x': ('55', '66', '120', '155'), 'r': '155', 'b': '55', 'g': '22'}
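Note that every value in the dictionaries above is still a string. The code in the question uses int() for the colour channels and float() for the coordinates, so a conversion step along these lines would probably be needed before drawing. This is only a sketch: to_numeric is a hypothetical helper name, and the sample dictionary is copied from the first line of the output above.

```python
def to_numeric(poly):
    """Return a copy of poly with int colour channels and float coordinates."""
    # Colour channels are whole numbers, as in the question's int(fline[2:])
    converted = {key: int(poly[key]) for key in ("r", "g", "b")}
    # Coordinates become lists of floats, as in the question's float(spl[0])
    converted["x"] = [float(value) for value in poly["x"]]
    converted["y"] = [float(value) for value in poly["y"]]
    return converted


# First dictionary from the output above (hypothetical variable name)
sample = {'y': ('50', '100', '50'), 'x': ('50', '50', '100'),
          'r': '255', 'b': '0', 'g': '0'}
numeric = to_numeric(sample)
print(numeric)
```

Returning a new dictionary rather than modifying poly in place keeps the parsed strings available in case a value fails to convert and needs to be reported.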
1
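The comparison
fline is "polygon"
will almost always be false, because is tests whether two names refer to the very same object, not whether two strings have the same contents. Use fline == "polygon"
for the comparison instead. This is not directly related to your question, but it would be simpler to decode the text with a proper Unicode decoding function than to strip the byte order mark by hand and treat the rest as bytes. Personally, I recommend
codecs.open(filename, encoding='utf-8-sig')
.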
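Putting both suggestions together, the parsing loop might look roughly like the sketch below. It is not the asker's final code: parse_polygons and demo are hypothetical names, the Inkscape-specific draw_shape call is left out, and the function simply returns the parsed polygons. It also uses list.append() instead of x[index] = ..., since assigning to an index of an empty list raises an IndexError.

```python
import codecs
import os
import tempfile


def parse_polygons(filepath):
    """Return a list of polygons, each a dict with r, g, b, x and y."""
    polygons = []
    current = None
    # 'utf-8-sig' decodes UTF-8 and drops a leading byte order mark, if any
    with codecs.open(filepath, encoding="utf-8-sig") as handle:
        for line in handle:
            fline = line.strip()
            if fline == "":
                continue
            if fline == "polygon":  # == compares contents, unlike is
                current = {"r": 0, "g": 0, "b": 0, "x": [], "y": []}
                polygons.append(current)
            elif current is None:
                raise ValueError("data before first 'polygon' line: %r" % fline)
            elif fline[:2] in ("r:", "g:", "b:"):
                # The first character of the prefix is the channel name
                current[fline[0]] = int(fline[2:])
            else:
                xs, ys = fline.split(";")
                # append() grows the list one coordinate at a time
                current["x"].append(float(xs))
                current["y"].append(float(ys))
    return polygons


def demo():
    """Write the question's sample file with a UTF-8 BOM, then parse it."""
    text = u"polygon\nr:255\ng:0\nb:0\n50;50\n50;100\n100;50\n"
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as out:
        out.write(codecs.BOM_UTF8 + text.encode("utf-8"))
    try:
        return parse_polygons(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Each returned dictionary could then be passed on to whatever drawing routine the extension uses.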