在python中使Eratosthenes的筛选更高效？

Eratosthenes内存约束问题的筛选

我目前正试图为一个Kattis问题实现一个版本的eratosthenes筛，但是，我遇到了一些我的实现无法通过的内存限制

这里是问题statement的链接。简言之，问题要求我首先返回小于或等于n的素数，然后解决一定数量的查询，如果一个数i是否为素数。有一个50 MB内存使用的限制，以及仅使用python的标准库（无numpy等）。内存限制就是我被卡住的地方

以下是我目前的代码：

import sys def sieve_of_eratosthenes(xs, n): count = len(xs) + 1 p = 3 # start at three index = 0 while p*p < n: for i in range(index + p, len(xs), p): if xs[i]: xs[i] = 0 count -= 1 temp_index = index for i in range(index + 1, len(xs)): if xs[i]: p = xs[i] temp_index += 1 break temp_index += 1 index = temp_index return count def isPrime(xs, a): if a == 1: return False if a == 2: return True if not (a & 1): return False return bool(xs[(a >> 1) - 1]) def main(): n, q = map(int, sys.stdin.readline().split(' ')) odds = [num for num in range(2, n+1) if (num & 1)] print(sieve_of_eratosthenes(odds, n)) for _ in range(q): query = int(input()) if isPrime(odds, query): print('1') else: print('0') if __name__ == "__main__": main()

到目前为止，我已经做了一些改进，比如只保留一个所有奇数的列表，这将使内存使用减半。我还确信，在计算素数时，代码会按预期工作（不会得到错误的答案）。我现在的问题是，如何使我的代码更高效地使用内存？我应该使用其他数据结构吗？用布尔值替换我的整数列表？位数组

任何建议都将不胜感激

编辑

在对python中的代码进行了一些调整之后，我遇到了一个难题，我的分段筛选实现无法满足内存需求

相反，我选择用Java实现解决方案，这几乎不费什么力气。代码如下：

public int sieveOfEratosthenes(int n){ sieve = new BitSet((n+1) / 2); int count = (n + 1) / 2; for (int i=3; i*i <= n; i += 2){ if (isComposite(i)) { continue; } // Increment by two, skipping all even numbers for (int c = i * i; c <= n; c += 2 * i){ if(!isComposite(c)){ setComposite(c); count--; } } } return count; } public boolean isComposite(int k) { return sieve.get((k - 3) / 2); // Since we don't keep track of even numbers } public void setComposite(int k) { sieve.set((k - 3) / 2); // Since we don't keep track of even numbers } public boolean isPrime(int a) { if (a < 3) return a > 1; if (a == 2) return true; if ((a & 1) == 1) return !isComposite(a); else return false; } public void run() throws Exception{ BufferedReader scan = new BufferedReader(new InputStreamReader(System.in)); String[] line = scan.readLine().split(" "); int n = Integer.parseInt(line[0]); int q = Integer.parseInt(line[1]); System.out.println(sieveOfEratosthenes(n)); for (int i=0; i < q; i++){ line = scan.readLine().split(" "); System.out.println( isPrime(Integer.parseInt(line[0])) ? '1' : '0'); } }

我个人还没有找到用Python实现此位集解决方案的方法（仅使用标准库）

如果有人在python中偶然发现了一个解决这个问题的简洁实现，使用分段筛、位数组或其他方法，我很想看看解决方案

3条回答

网友

1楼 · 编辑于 2024-05-19 01:14:30

我学到了一个技巧just yesterday——如果你把数字分成6组，6组中只有2个可能是素数。其他的可以被2或3平分。这意味着跟踪6个数字的素性只需要2位；一个包含8位的字节可以跟踪24个数字的素数！这大大减少了筛的内存需求

在Windows10上的Python3.7.5 64位中，以下代码没有超过36.4MB

remainder_bit = [0, 0x01, 0, 0, 0, 0x02,
                 0, 0x04, 0, 0, 0, 0x08,
                 0, 0x10, 0, 0, 0, 0x20,
                 0, 0x40, 0, 0, 0, 0x80]

def is_prime(xs, a):
    if a <= 3:
        return a > 1
    index, rem = divmod(a, 24)
    bit = remainder_bit[rem]
    if not bit:
        return False
    return not (xs[index] & bit)

def sieve_of_eratosthenes(xs, n):
    count = (n // 3) + 1 # subtract out 1 and 4, add 2 3 and 5
    p = 5
    while p*p <= n:
        if is_prime(xs, p):
            for i in range(5 * p, n + 1, p):
                index, rem = divmod(i, 24)
                bit = remainder_bit[rem]
                if bit and not (xs[index] & bit):
                    xs[index] |= bit
                    count -= 1
        p += 2
        if is_prime(xs, p):
            for i in range(5 * p, n + 1, p):
                index, rem = divmod(i, 24)
                bit = remainder_bit[rem]
                if bit and not (xs[index] & bit):
                    xs[index] |= bit
                    count -= 1
        p += 4

    return count


def init_sieve(n):
    return bytearray((n + 23) // 24)

n = 100000000
xs = init_sieve(n)
sieve_of_eratosthenes(xs, n)
5761455
sum(is_prime(xs, i) for i in range(n+1))
5761455

编辑：了解其工作原理的关键是，筛子会创建重复模式。对于素数2和3，模式每2*3或6个数字重复一次，在这6个数字中，4个已经被渲染为不可能是素数，只剩下2个。没有什么限制你选择素数来生成模式，也许除了收益递减定律。我决定尝试在混音中加入5，使图案每2*3*5=30个数字重复一次。在这30个数字中，只有8个可以是素数，这意味着每个字节可以跟踪30个数字，而不是上面的24个！这使您在内存使用方面有20%的优势。

这是更新后的代码。我还简化了一点，去掉了素数的计数

remainder_bit30 = [0,    0x01, 0,    0,    0,    0,    0, 0x02, 0,    0,
                   0,    0x04, 0,    0x08, 0,    0,    0, 0x10, 0,    0x20,
                   0,    0,    0,    0x40, 0,    0,    0, 0,    0,    0x80]

def is_prime(xs, a):
    if a <= 5:
        return (a > 1) and (a != 4)
    index, rem = divmod(a, 30)
    bit = remainder_bit30[rem]
    return (bit != 0) and not (xs[index] & bit)

def sieve_of_eratosthenes(xs):
    n = 30 * len(xs) - 1
    p = 0
    while p*p < n:
        for offset in (1, 7, 11, 13, 17, 19, 23, 29):
            p += offset
            if is_prime(xs, p):
                for i in range(p * p, n + 1, p):
                    index, rem = divmod(i, 30)
                    if index < len(xs):
                        bit = remainder_bit30[rem]
                        xs[index] |= bit
            p -= offset
        p += 30

def init_sieve(n):
    b = bytearray((n + 30) // 30)
    return b

网友

2楼 · 编辑于 2024-05-19 01:14:30

这的确是一个非常具有挑战性的问题。最大可能的N为10^8，假设没有任何开销，每个值使用一个字节将导致几乎100 MB的数据。即使只存储奇数而将数据减半，在考虑开销后，也会使您的数据接近50MB

这意味着解决方案必须使用以下几种策略中的一种或多种：

为素性标志数组使用更有效的数据类型。Python列表维护指向每个列表项的指针数组（64位Python上每个指针有4个字节）。我们实际上需要原始二进制存储，这在标准python中几乎只剩下bytearray
在筛选中每个值只使用一位，而不是整个字节（Bool技术上只需要一位，但通常使用完整字节）
细分以删除偶数，也可能是3、5、7等的倍数。
使用segmented sieve

我最初试图通过在筛选中只存储每个值1位来解决这个问题，虽然内存使用率确实在要求范围内，但Python缓慢的位操作将执行时间推得太长。同时，很难找出复杂的索引以确保可靠地计算正确的位

然后，我使用bytearray实现了仅奇数的解决方案，虽然速度快了很多，但内存仍然是个问题

Bytearray奇数实现：

class Sieve:
    def __init__(self, n):
        self.not_prime = bytearray(n+1)
        self.not_prime[0] = self.not_prime[1] = 1
        for i in range(2, int(n**.5)+1):
            if self.not_prime[i] == 0:
                self.not_prime[i*i::i] = [1]*len(self.not_prime[i*i::i])
        self.n_prime = n + 1 - sum(self.not_prime)
        
    def is_prime(self, n):
        return int(not self.not_prime[n])
        


def main():
    n, q = map(int, input().split())
    s = Sieve(n)
    print(s.n_prime)
    for _ in range(q):
        i = int(input())
        print(s.is_prime(i))

if __name__ == "__main__":
    main()

由此进一步减少内存，应能使其正常工作

编辑： 此外，删除2和3的倍数似乎不足以减少内存，尽管^{}似乎表明我的使用量实际上略低于50MB。🤷‍♂️

网友

3楼 · 编辑于 2024-05-19 01:14:30

我认为您可以尝试使用布尔值列表来标记其索引是否为素数：

def sieve_of_erato(range_max):
    primes_count = range_max
    is_prime = [True for i in range(range_max + 1)]
    # Cross out all even numbers first.
    for i in range(4, range_max, 2):
        is_prime[i] = False
        primes_count -=1
    i = 3
    while i * i <= range_max:
        if is_prime[i]:
            # Update all multiples of this prime number
            # CAREFUL: Take note of the range args.
            # Reason for i += 2*i instead of i += i:
            # Since p and p*p, both are odd, (p*p + p) will be even,
            # which means that it would have already been marked before
            for multiple in range(i * i, range_max + 1, i * 2):
                is_prime[multiple] = False
                primes_count -= 1
        i += 1

    return primes_count


def main():
    num_primes = sieve_of_erato(100)
    print(num_primes)


if __name__ == "__main__":
    main()

稍后，您可以使用is_prime数组通过简单地检查is_prime[number] == True来检查一个数字是否为素数

如果这不起作用，请尝试segmented sieve

另外，您可能会惊讶地发现，有一种方法可以在O(n)而不是O(nloglogn)中生成筛选。检查代码here

Eratosthenes内存约束问题的筛选

编辑

相关问题更多 >

编程相关推荐

热门问题

热门文章