在Python3中快速输入一行所有数字的方法？

2 投票

1 回答

1154 浏览

数据工程师

提问于 2025-04-18 17:39

我的问题是，在SPOJ上遇到了“超出时间限制”的错误。我原以为这个错误是因为输入数据太大，所以我需要你的帮助来找到处理大输入的最佳方法。

让我先解释一下输入格式和我的代码。

输入

输入的第一行包含测试用例的数量t（1<=t<=100）。接下来有2*t行，每个测试用例有2行。每个测试用例的第一行包含一个数字n（0<=n<=10^6），接着在下一行有n个元素。每个数字的范围是从-10^3到+10^3。

这里有一个输入的例子

3 #number of test cases (t)

4 #number of elements that will come to next line (n) (this can be 10^6)

2 1 2 2 #this line may have 10^6 numbers

6 

1 1 1 2 2 2

5

1 2 4 5 1

问题是要判断一个数字是否出现超过n//2次。

示例输出

YES 2  #because 2 appears 3 times.

NO     # no number occurs more than n//2 times

NO     # no number occurs more than n//2 times.

关于这个问题的更多信息

添加者：Troika::Bytes 日期：2010-02-18 时间限制：1秒源文件限制：50000B 内存限制：256MB 集群：Pyramid（Intel Pentium III 733 MHz）支持的语言：除了PERL 6以外的所有语言

http://www.spoj.com/problems/MAJOR/

最后是我的代码。

from collections import Counter
import re

def major(n):
    s = input().split() # get all the numbers one by one
    y = (Counter(s)).most_common()[0]  # Count how many times a number occurs in the input and return the most occurence with its value in a pair.
    if y[1]> n//2: #if the most occurence is more than half of the "n"
        return "YES " + y[0] # return the value belongs to the most occurence
    else:
        return "NO"

for i in range(int(input())): #get the number of test cases
    print(major(int(input()))) # calling the function to get the "n"

我把输入部分改成了s = re.findall(r'\d+', input())，因为我在想s = input().split()对于像一百万个数字这样的情况会不会太慢。但我还是收到了“超出时间限制”的错误。这可能和Counter函数有关吗？

大数据处理测试用例时间复杂度算法效率数字统计内存限制 spoj 输入优化

1 个回答

你可以通过以下几种方式来优化它：

关闭垃圾回收：
```
import gc
gc.disable()
```
避免使用 string.split 和 re.finditer。
不要用 range，而是使用 while 循环。
用 sys.stdin.readline() 替代 input（这样会快很多），用 sys.stdout.write 替代 print。
当某个数字出现的次数超过 n // 2 时就停止（而且，这个计算只需要做一次）。你可以使用 collections.defaultdict，在更新之前先检查出现的次数。
尽量避免使用函数，把所有代码放在一个循环里。
有时候在 main 函数里导入库可以节省时间。

不幸的是，正如 Martijn Pieters 所提到的，目前没有适用于 Python 3.x 的公认解决方案，只有一个适用于 Python 2.x 的解决方案。根据解决这个问题所消耗的内存来看，numerix 本可以使用 psyco（这是 PyPy 的基础库，比 CPython 快得多）。不幸的是，由于 psyco 已经停止维护，它在执行这个问题的集群上也不再支持，所以可能没有人能提交通过的解决方案，直到他们重新启用 psyco 或 PyPy 的支持。

示例

import gc
gc.disable()

# import psyco
# psyco.full()

def main():
    from collections import defaultdict
    from sys import stdin, stdout
    import re

    pattern = re.compile(r'\d+')
    times = int(stdin.readline())

    while times:
        times -= 1
        threshold = int(stdin.readline()) // 2
        vals = defaultdict(int)
        for x in pattern.finditer(stdin.readline()):
            n = int(x.group(0))
            rep = vals[n] + 1
            if rep > threshold:
                stdout.write('YES ' + str(n) + '\n')
                break
            else:
                vals[n] = rep
        else:
            stdout.write('NO\n')

if __name__ == '__main__':
    main()

编辑

我给 numerix 发了封邮件，他确认他的解决方案使用了 psyco：

是的，我的 MAJOR 解决方案使用了 psyco。我没有尝试过不使用 psyco 的情况，但由于没有其他通过的 Python 解决方案，我想这可能不行，或者需要进行大量的输入输出优化。

回答于 2025-04-18 由 Python大师

分享举报

在Python3中快速输入一行所有数字的方法？

1 个回答

示例

撰写回答