Streamlining a series of try/except + if statements to speed up processing in Python

import re
import glob

fileCounter = 0
for infile in glob.iglob(r'\input-files\*.txt'):
    fileCounter += 1
    outfile = r'\output-files\output_%s.txt' % fileCounter
    with open(infile, "rb") as inlist, open(outfile, "wb") as outlist:
        for inline in inlist:
            inword = inline.strip('\r\n')
            #apply some text transformations
            #Transformation #1
            try:
                result = re.match(r'^[AEIOUYaeiouy]([bcćdfghjklłmnńprsśtwzżź]|rz|sz|cz|dz|dż|dź|ch)[aąeęioóuy](.*\[=\].*)*', inword).group()
            except:
                result = None
            if result == inword:
                inword = re.sub(r'(?<=^[AEIOUYaeiouy])(?=([bcćdfghjklłmnńprsśtwzżź]|rz|sz|cz|dz|dż|dź|ch)[aąeęioóuy])', '[=]', inword)
            #Transformation #2 etc.
            try:
                result = re.match(r'(.*\[=\].*)*(\w?\w?)[AEIOUYaąeęioóuy]\[=\][ćsśz][ptkbdg][aąeęioóuyrfw](.*\[=\].*)*', inword).group()
            except:
                result = None
            if result == inword:
                inword = re.sub(r'(?<=[AEIOUYaąeęioóuy])\[=\](?=[ćsśz][ptkbdg][aąeęioóuyrfw])', '', inword)
                inword = re.sub(r'(?<=[AEIOUYaąeęioóuy][ćsśz])(?=[ptkbdg][aąeęioóuyrfw])', '[=]', inword)
            outline = inword + "\n"
            outlist.write(outline)
    print "Processed file number %s" % fileCounter
print "*** Processing completed ***"

1 answer

User

#1 · Posted 2024-04-24 14:55:16

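try/except is indeed neither the most efficient nor the most readable way to test the result of re.match(), but the penalty should be more or less constant: performance should not degrade over the course of execution (barring some worst case triggered by your data), so the problem is probably elsewhere.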

FWIW, you can start by replacing the try/except blocks with the proper canonical solution, i.e. instead of:

try:
    result = re.match(someexp, yourline).group()
except:
    result = None

you want:

match = re.match(someexp, yourline)
result = match.group() if match else None

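This will slightly improve performance but, most importantly, it makes the code more readable and maintainable; at the very least it will not hide any unexpected errors.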
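As a minimal sketch of how that pattern could be wrapped for reuse across several transformations: the helper name `matched_text` and the simplified `VOWEL_START` pattern below are assumptions made for illustration, not taken from the question.

```python
import re

# Hypothetical, simplified pattern for illustration only;
# it is NOT one of the expressions from the question.
VOWEL_START = r'^[aeiouy][bcdfg][aeiouy]'


def matched_text(pattern, line):
    """Return the text matched at the start of `line`, or None if no match."""
    match = re.match(pattern, line)
    # re.match() returns None when nothing matches, so test it explicitly
    # instead of relying on an exception.
    return match.group() if match else None


if __name__ == '__main__':
    print(matched_text(VOWEL_START, 'abacus'))
    print(matched_text(VOWEL_START, 'tree'))
```

Each transformation in the loop could then call the helper once and compare the result with the input line, with no exception handling involved.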

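Also note: never use a bare except clause. Always catch only the exception you expect (here it would be AttributeError, since re.match() returns None when there is no match, and None of course has no group attribute).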
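If you do want to keep the try/except form, a narrow handler might look like the sketch below. The function name `group_or_none` is hypothetical. The point it illustrates is that an unrelated error, such as passing None instead of a string, still surfaces instead of being silently turned into None.

```python
import re


def group_or_none(pattern, line):
    """try/except variant that only swallows the expected AttributeError."""
    try:
        return re.match(pattern, line).group()
    except AttributeError:
        # re.match() returned None, which has no .group() method.
        return None


if __name__ == '__main__':
    print(group_or_none(r'\d+', '123abc'))  # pattern matches at the start
    print(group_or_none(r'\d+', 'abc'))     # no match: handled, gives None
    try:
        group_or_none(r'\d+', None)         # wrong argument type
    except TypeError as exc:
        # With a bare except this bug would have been hidden.
        print('unexpected error surfaced: %s' % exc)
```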

This most likely won't solve your problem, but at least you will know that the problem is elsewhere.
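One way to check for yourself how much the two styles differ is a rough timing with the standard `timeit` module. This is only a sketch: the pattern, the sample line and the function names are assumptions, and the numbers will vary with your machine and data.

```python
import re
import timeit

# Hypothetical pattern and input, chosen so that the match fails
# (the case where try/except actually pays for raising an exception).
PATTERN = r'^[aeiouy][bcdfg][aeiouy]'
LINE = 'tree'


def with_try_except(line=LINE):
    try:
        return re.match(PATTERN, line).group()
    except AttributeError:
        return None


def with_conditional(line=LINE):
    match = re.match(PATTERN, line)
    return match.group() if match else None


def compare(number=100000):
    """Return total seconds spent in `number` calls of each variant."""
    return {
        'try/except': timeit.timeit(with_try_except, number=number),
        'conditional': timeit.timeit(with_conditional, number=number),
    }


if __name__ == '__main__':
    for name, seconds in compare().items():
        print('%s: %.3f s' % (name, seconds))
```

If both timings are small compared with the total run time of your script, the slowdown has to come from somewhere else, for example the regular expressions themselves or file I/O.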
