How can I check whether a string represents an integer without using try/except?

138

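For whatever reason, the try/except pattern does not perform well in my experience. I often benchmark several approaches to a problem, and the try/except version usually lands near the bottom, if not dead last. Not in every case, but in many. I know a lot of people call it the "Pythonic" way, but this is one point where I part ways with them. To me it is neither particularly fast nor particularly elegant, so I tend to reserve it for error trapping and reporting.

I was going to gripe that PHP, Perl, Ruby, C, and even the shell have simple functions for testing whether a string is an integer, but when I tried to verify those assumptions I ran into trouble. Apparently this gap is a common affliction.

Here is a quick edit of Bruno's post: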

import sys, time, re

g_intRegex = re.compile(r"^([+-]?[1-9]\d*|0)$")

testvals = [
    # integers
    0, 1, -1, 1.0, -1.0,
    '0', '0.','0.0', '1', '-1', '+1', '1.0', '-1.0', '+1.0', '06',
    # non-integers
    'abc 123',
    1.1, -1.1, '1.1', '-1.1', '+1.1',
    '1.1.1', '1.1.0', '1.0.1', '1.0.0',
    '1.0.', '1..0', '1..',
    '0.0.', '0..0', '0..',
    'one', object(), (1,2,3), [1,2,3], {'one':'two'},
    # with spaces
    ' 0 ', ' 0.', ' .0','.01 '
]

def isInt_try(v):
    try:
        int(v)
    except Exception:  # ValueError or TypeError for the non-integer test values
        return False
    return True

def isInt_str(v):
    v = str(v).strip()
    return v=='0' or (v if v.find('..') > -1 else v.lstrip('-+').rstrip('0').rstrip('.')).isdigit()

def isInt_re(v):
    import re
    if not hasattr(isInt_re, 'intRegex'):
        isInt_re.intRegex = re.compile(r"^([+-]?[1-9]\d*|0)$")
    return isInt_re.intRegex.match(str(v).strip()) is not None

def isInt_re2(v):
    return g_intRegex.match(str(v).strip()) is not None

def check_int(s):
    s = str(s)
    if s and s[0] in ('-', '+'):  # guard: s[0] would raise IndexError on an empty string
        return s[1:].isdigit()
    return s.isdigit()


def timeFunc(func, times):
    t1 = time.time()
    for n in range(times):
        for v in testvals: 
            r = func(v)
    t2 = time.time()
    return t2 - t1

def testFuncs(funcs):
    for func in funcs:
        sys.stdout.write( "\t%s\t|" % func.__name__)
    print()
    for v in testvals:
        if type(v) == type(''):
            sys.stdout.write("'%s'" % v)
        else:
            sys.stdout.write("%s" % str(v))
        for func in funcs:
            sys.stdout.write( "\t\t%s\t|" % func(v))
        sys.stdout.write("\r\n") 

if __name__ == '__main__':
    print()
    print("tests..")
    testFuncs((isInt_try, isInt_str, isInt_re, isInt_re2, check_int))
    print()

    print("timings..")
    print("isInt_try:   %6.4f" % timeFunc(isInt_try, 10000))
    print("isInt_str:   %6.4f" % timeFunc(isInt_str, 10000)) 
    print("isInt_re:    %6.4f" % timeFunc(isInt_re, 10000))
    print("isInt_re2:   %6.4f" % timeFunc(isInt_re2, 10000))
    print("check_int:   %6.4f" % timeFunc(check_int, 10000))

Here are the timing results:

timings..
isInt_try:   0.6426
isInt_str:   0.7382
isInt_re:    1.1156
isInt_re2:   0.5344
check_int:   0.3452

A C implementation could scan the string once and be done. A single pass over the string in C would be the right way to do this, I think.
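
To make the single-pass idea concrete, here is a rough sketch of the same scan written in Python rather than C. The name is_int_scan is made up for this illustration, it accepts only an optional sign followed by ASCII digits, and it has not been benchmarked against the functions above.

```python
def is_int_scan(s):
    """Return True if s is an optional sign followed by one or more ASCII digits.

    Hypothetical helper: one left-to-right pass, no exceptions, no regex.
    """
    if not isinstance(s, str) or not s:
        return False
    start = 1 if s[0] in "+-" else 0
    if start == len(s):  # a bare "+" or "-" is not an integer
        return False
    for i in range(start, len(s)):
        if not ("0" <= s[i] <= "9"):
            return False
    return True


print(is_int_scan("-17"))  # True
print(is_int_scan("1.0"))  # False
print(is_int_scan("+"))    # False
```

Like check_int, this sketch accepts '06' and rejects '1.0' and ' 0 ', so it settles none of the "what counts as an integer" questions raised by the test table.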

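EDIT:

I have updated the code above to run on Python 3.5, added the check_int function from the currently most upvoted answer, and switched to the most popular regex I could find for testing whether a string is an integer. This regex rejects strings like 'abc 123', which I have also added as a test value.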
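
For a feel of what that regex accepts and rejects, here is a small sketch that reuses the same pattern as g_intRegex. The helper name matches_int_regex is only for this illustration; note that it does not strip whitespace first, unlike isInt_re and isInt_re2.

```python
import re

# Same pattern as g_intRegex in the benchmark code.
int_regex = re.compile(r"^([+-]?[1-9]\d*|0)$")


def matches_int_regex(s):
    """Hypothetical helper: True if the pattern matches the whole string."""
    return int_regex.match(s) is not None


print(matches_int_regex("123"))      # True
print(matches_int_regex("abc 123"))  # False
print(matches_int_regex("06"))       # False: leading zeros are rejected
print(matches_int_regex("-0"))       # False: a signed zero is rejected
print(matches_int_regex("0\n"))      # True: $ also matches just before a final newline
```

If the trailing-newline case matters, re.fullmatch (or \Z in place of $) is the usual way to require that the entire string matches.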

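I find it interesting that none of the functions tested, including the try method, the popular check_int function, and the most popular integer-testing regex, returns the correct answer for all of the test values (depending, of course, on what you consider the correct answers to be; see the test results below).

The built-in int() function silently truncates the fractional part of a floating-point number and returns only the integer part before the decimal point, unless the float is first converted to a string.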
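
The difference is easy to see side by side. This is a minimal sketch; try_int is a hypothetical helper used only to show the outcome without a traceback.

```python
def try_int(value):
    """Hypothetical helper: return int(value), or None if int() rejects it."""
    try:
        return int(value)
    except (ValueError, TypeError):
        return None


print(try_int(1.9))       # 1   (float input: fractional part dropped)
print(try_int(-1.9))      # -1  (truncation is toward zero, not downward)
print(try_int("1.9"))     # None (string input: int() raises ValueError)
print(try_int(str(1.0)))  # None ('1.0' is rejected as well)
```

This is why isInt_try reports True for the floats 1.1 and -1.1 in the table but False for the strings '1.1' and '1.0'.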

The check_int() function returns False for values like 0.0 and 1.0 (which technically are integers) and returns True for values like '06'.

Here are the current (Python 3.5) test results:

              isInt_try |       isInt_str       |       isInt_re        |       isInt_re2       |   check_int   |
0               True    |               True    |               True    |               True    |       True    |
1               True    |               True    |               True    |               True    |       True    |
-1              True    |               True    |               True    |               True    |       True    |
1.0             True    |               True    |               False   |               False   |       False   |
-1.0            True    |               True    |               False   |               False   |       False   |
'0'             True    |               True    |               True    |               True    |       True    |
'0.'            False   |               True    |               False   |               False   |       False   |
'0.0'           False   |               True    |               False   |               False   |       False   |
'1'             True    |               True    |               True    |               True    |       True    |
'-1'            True    |               True    |               True    |               True    |       True    |
'+1'            True    |               True    |               True    |               True    |       True    |
'1.0'           False   |               True    |               False   |               False   |       False   |
'-1.0'          False   |               True    |               False   |               False   |       False   |
'+1.0'          False   |               True    |               False   |               False   |       False   |
'06'            True    |               True    |               False   |               False   |       True    |
'abc 123'       False   |               False   |               False   |               False   |       False   |
1.1             True    |               False   |               False   |               False   |       False   |
-1.1            True    |               False   |               False   |               False   |       False   |
'1.1'           False   |               False   |               False   |               False   |       False   |
'-1.1'          False   |               False   |               False   |               False   |       False   |
'+1.1'          False   |               False   |               False   |               False   |       False   |
'1.1.1'         False   |               False   |               False   |               False   |       False   |
'1.1.0'         False   |               False   |               False   |               False   |       False   |
'1.0.1'         False   |               False   |               False   |               False   |       False   |
'1.0.0'         False   |               False   |               False   |               False   |       False   |
'1.0.'          False   |               False   |               False   |               False   |       False   |
'1..0'          False   |               False   |               False   |               False   |       False   |
'1..'           False   |               False   |               False   |               False   |       False   |
'0.0.'          False   |               False   |               False   |               False   |       False   |
'0..0'          False   |               False   |               False   |               False   |       False   |
'0..'           False   |               False   |               False   |               False   |       False   |
'one'           False   |               False   |               False   |               False   |       False   |
<obj..>         False   |               False   |               False   |               False   |       False   |
(1, 2, 3)       False   |               False   |               False   |               False   |       False   |
[1, 2, 3]       False   |               False   |               False   |               False   |       False   |
{'one': 'two'}  False   |               False   |               False   |               False   |       False   |
' 0 '           True    |               True    |               True    |               True    |       False   |
' 0.'           False   |               True    |               False   |               False   |       False   |
' .0'           False   |               False   |               False   |               False   |       False   |
'.01 '          False   |               False   |               False   |               False   |       False   |

Just now I tried adding this function:

def isInt_float(s):
    try:
        return float(str(s)).is_integer()
    except ValueError:  # float() rejects strings that are not numeric
        return False

It performs almost as well as check_int (0.3486) and returns True for values like 1.0, 0.0, +1.0, 0. and .0. But it also returns True for '06', so pick your poison.
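
A few more inputs are worth knowing about before choosing the float-based check. The sketch below restates the same logic under the name is_int_float so that it runs on its own; the sample strings are my own additions, not part of the original test values.

```python
def is_int_float(s):
    # Same logic as isInt_float, restated so this snippet is self-contained.
    try:
        return float(str(s)).is_integer()
    except ValueError:
        return False


print(is_int_float("1e3"))  # True: scientific notation parses as 1000.0
print(is_int_float(" 7 "))  # True: float() ignores surrounding whitespace
print(is_int_float("inf"))  # False
print(is_int_float("nan"))  # False
# True: the value rounds to exactly 1.0 when converted to a double.
print(is_int_float("0.99999999999999999999"))
```

So the float route is convenient for accepting '1.0', but it inherits floating-point rounding and accepts notations that int() would refuse.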

Answered 2025-04-15 by Python大师


1115

For positive integers (and zero), you can use the .isdigit method:

>>> '16'.isdigit()
True

It doesn't work with negative integers, though. For those you could try the following:

>>> s = '-17'
>>> s.startswith('-') and s[1:].isdigit()
True

It also doesn't work with the '16.0' format, which in this sense matches how int() treats such strings.

EDIT:

def check_int(s):
    if s and s[0] in ('-', '+'):  # guard: s[0] would raise IndexError on an empty string
        return s[1:].isdigit()
    return s.isdigit()
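
One caveat with str.isdigit: it returns True for some non-ASCII characters that int() will not parse, such as the superscript two. If that matters, a variant along these lines might help. This is only a sketch; check_int_ascii is a hypothetical name and str.isascii requires Python 3.7 or later.

```python
def check_int_ascii(s):
    """Variant of check_int that only accepts ASCII digits (hypothetical name)."""
    digits = s[1:] if s[:1] in ("-", "+") else s
    return digits.isascii() and digits.isdigit()


print("\u00b2".isdigit())         # True: superscript two counts as a digit
print(check_int_ascii("\u00b2"))  # False
print(check_int_ascii("-17"))     # True
print(check_int_ascii(""))        # False
```

Slicing with s[:1] rather than indexing with s[0] also keeps the empty string from raising IndexError.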

Answered 2025-04-15 by Python大师


506

If you are really just annoyed at writing try/except all over the place, write a helper function instead:

def represents_int(s):
    try: 
        int(s)
    except ValueError:
        return False
    else:
        return True

>>> print(represents_int("+123"))
True
>>> print(represents_int("10.0"))
False

Exactly covering every string that Python considers an integer would take far more code. I say just be Pythonic on this one.
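
To give a sense of how broad "strings that Python considers integers" is, here is a small probe of int() with a handful of inputs. The sample list is my own choice for illustration, and the underscore form needs Python 3.6 or later.

```python
samples = [" 42\n", "+42", "-0", "1_000", "\u0664\u0662", "0x2A", "4 2", ""]

results = {}
for text in samples:
    try:
        results[text] = int(text)
    except ValueError:
        results[text] = None  # int() rejected this string

for text, value in results.items():
    print(repr(text), "->", value)
```

The first five samples parse (surrounding whitespace, explicit signs, digit-group underscores, and Arabic-Indic digits are all accepted), while the last three are rejected. A hand-written check would have to reproduce each of these rules to agree with int().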

Answered 2025-04-15 by Python大师
