Is str.replace(...).replace(...) a standard idiom in Python?

3 Answers

User

Answer 1 · Edited 2024-05-16 12:59:08

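Do you have an application that is running too slowly, and did you profile it and find that a line like this snippet is what makes it slow? Bottlenecks occur in unexpected places.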
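
To make the profiling advice concrete, here is a minimal sketch using the standard-library cProfile module. The `escape` and `render_page` functions and the sample data are hypothetical stand-ins for a real application, not something from the question; the point is only to show how to check where the time actually goes before rewriting anything.

```python
import cProfile
import io
import pstats


def escape(text):
    # The chained-replace idiom under discussion (shortened to three rules).
    return text.replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;')


def render_page(rows):
    # Hypothetical workload: escape many rows and join them into one page.
    return '\n'.join(escape(row) for row in rows)


profiler = cProfile.Profile()
profiler.enable()
page = render_page(['<b>Tom & Jerry</b>'] * 10000)
profiler.disable()

# Collect the report in a string, sorted by cumulative time per function.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats('cumulative').print_stats(10)
report = buffer.getvalue()
print(report)
```

If `escape` does not dominate a report like this for your real workload, rewriting it is unlikely to pay off.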

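The current snippet traverses the string five times, doing one thing on each pass. You are suggesting traversing it once, probably doing five things at each step (or at least doing something at each step). It isn't clear to me that this will automatically do a better job. The algorithm currently used is O(nm) (assuming the string is longer than the stuff in the rules), where n is the length of the string and m is the number of substitution rules. You could, I think, reduce the complexity to something like O(n log m), and to O(n) in the specific case we have here, where the originals are all single characters (though not for multiple calls to replace in general), but this doesn't matter, since m is 5 while n is unbounded.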

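If m is held constant, the complexity of both solutions really comes down to O(n). It is not clear to me that turning five simple passes into one complex pass is a worthwhile task, and I cannot guess the actual running time of that single pass at the moment. If there were something about it that made it scale better, I would consider it a much more worthwhile task.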

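Doing everything in one pass rather than in consecutive passes also requires answering questions about what to do with conflicting rules and how the rules are applied. With a chain of replace calls, the answers to these questions are clear.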
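
A small sketch of why the order of application matters. The two function names and the sample string are made up for illustration; the only thing that differs between them is where the '&' rule sits in the chain.

```python
def escape_amp_first(text):
    # '&' is handled first, so entities added by later rules stay intact.
    return text.replace('&', '&amp;').replace('<', '&lt;')


def escape_amp_last(text):
    # '&' is handled last, so the '&' inside '&lt;' is escaped a second time.
    return text.replace('<', '&lt;').replace('&', '&amp;')


sample = 'a < b & c'
print(escape_amp_first(sample))  # a &lt; b &amp; c
print(escape_amp_last(sample))   # a &amp;lt; b &amp; c
```

With a replace chain, the order on the page is the order of application, so this kind of conflict is easy to see and to fix.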

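User

Answer 2 · Edited 2024-05-16 12:59:08

How about we just test the various approaches and see which comes out faster (assuming we only care about the fastest way to do it).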

import re
import time

def escape1(text):
    # Chained replace; '&' goes first so entities added later are not re-escaped.
    return (text.replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;')
                .replace("'", '&#39;').replace('"', '&quot;'))

translation_table = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    "'": '&#39;',
    '"': '&quot;',
}

def escape2(text):
    # One dictionary lookup per character.
    return ''.join(translation_table.get(char, char) for char in text)

_escape3_re = re.compile(r'[&<>\'"]')

def _escape3_repl(match):
    s = match.group(0)
    return translation_table.get(s, s)

def escape3(text):
    # A single regex pass with a replacement callback.
    return _escape3_re.sub(_escape3_repl, text)

# str.translate needs a table keyed by code points; str.maketrans builds it from the dict.
_escape4_table = str.maketrans(translation_table)

def escape4(text):
    return text.translate(_escape4_table)

test_strings = (
    'Nothing in there.',
    '<this is="not" a="tag" />',
    'Something & Something else',
    'This one is pretty long. ' * 50
)

for test_string in test_strings:
    print(repr(test_string))
    for func in (escape1, escape2, escape3, escape4):
        start_time = time.perf_counter()
        for i in range(1000):
            x = func(test_string)
        # Total seconds for 1000 calls equals milliseconds per call.
        print('\t%s done in %.3fms' % (func.__name__, time.perf_counter() - start_time))
    print()

Running this gives:

'Nothing in there.'
    escape1 done in 0.002ms
    escape2 done in 0.009ms
    escape3 done in 0.001ms
    escape4 done in 0.005ms

'<this is="not" a="tag" />'
    escape1 done in 0.002ms
    escape2 done in 0.012ms
    escape3 done in 0.009ms
    escape4 done in 0.007ms

'Something & Something else'
    escape1 done in 0.002ms
    escape2 done in 0.012ms
    escape3 done in 0.003ms
    escape4 done in 0.007ms

'This one is pretty long. <snip>'
    escape1 done in 0.008ms
    escape2 done in 0.386ms
    escape3 done in 0.011ms
    escape4 done in 0.310ms

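It looks like just replacing them one after another goes the fastest.

Edit: Running the tests again with 1000000 iterations gives the following for the first three strings (the fourth would take too long on my machine for me to wait =P):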

'Nothing in there.'
    escape1 done in 0.001ms
    escape2 done in 0.008ms
    escape3 done in 0.002ms
    escape4 done in 0.005ms

'<this is="not" a="tag" />'
    escape1 done in 0.002ms
    escape2 done in 0.011ms
    escape3 done in 0.009ms
    escape4 done in 0.007ms

'Something & Something else'
    escape1 done in 0.002ms
    escape2 done in 0.011ms
    escape3 done in 0.003ms
    escape4 done in 0.007ms

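The numbers are pretty much the same. In the first case they are actually even more consistent, since direct string replacement is now the fastest.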
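
For anyone repeating the measurement, a hand-rolled timing loop is fairly sensitive to noise. The sketch below uses the standard-library timeit module and takes the best of several repeats instead. The names `escape_chain` and `escape_translate` are illustrative, only two of the approaches are compared, and no timings are quoted because they depend on the machine and the Python version.

```python
import timeit


def escape_chain(text):
    # Chained replace, '&' first.
    return (text.replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;')
                .replace("'", '&#39;').replace('"', '&quot;'))


# Translation table keyed by code points, built once up front.
_table = str.maketrans({
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    "'": '&#39;',
    '"': '&quot;',
})


def escape_translate(text):
    # Single pass over the string using str.translate.
    return text.translate(_table)


sample = '<this is="not" a="tag" />'
calls = 10000
results = {}
for func in (escape_chain, escape_translate):
    # repeat() returns one total per run; the minimum is the least noisy figure.
    timings = timeit.repeat(lambda: func(sample), number=calls, repeat=5)
    results[func.__name__] = min(timings) / calls * 1e6
    print('%s: %.3f us per call' % (func.__name__, results[func.__name__]))
```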

User

Answer 3 · Edited 2024-05-16 12:59:08

I like something clean, such as:

substitutions = [
    ('<', '&lt;'),
    ('>', '&gt;'),
    ...]

for search, replacement in substitutions:
    string = string.replace(search, replacement)
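
Wrapped in a function, the same idea might look like the sketch below. The full rule list here is an assumption borrowed from the translation table in the benchmark answer, with '&' placed first so that later rules do not re-escape it; the name `escape` and the `substitutions` parameter are illustrative only.

```python
# Order matters: '&' must be replaced before any rule that introduces '&'.
SUBSTITUTIONS = [
    ('&', '&amp;'),
    ('<', '&lt;'),
    ('>', '&gt;'),
    ("'", '&#39;'),
    ('"', '&quot;'),
]


def escape(text, substitutions=SUBSTITUTIONS):
    # Apply each rule in list order, one full pass over the string per rule.
    for search, replacement in substitutions:
        text = text.replace(search, replacement)
    return text


print(escape('<a href="x">Tom & Jerry</a>'))
# &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```

Keeping the rules in a list makes their order explicit and easy to change without touching the loop.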
