碰撞所需的样本数量有多少(生日悖论)

2024-05-19 00:22:23 发布

您现在位置:Python中文网/ 问答频道 /正文

只是想理解生日悖论。 使用下面的代码,我发现我平均需要12个样本来获得生日碰撞。
不明白为什么23个人生日相撞的几率比正常人低很多。 即使使用PyCrypto中的StrongRandom,结果也不会改变。在

from random import randint
from Crypto.Random.random import StrongRandom
EXPERIMENTS_NUM = 10000
SET_SIZE = 365
SUBSET = 23

where_collision_found = list()
rnd = StrongRandom()
for experiment in range(EXPERIMENTS_NUM):
  for i in range(1,SET_SIZE + 2):
    collision_found = False
    #Generate a subset
    # subset = [rnd.randint(1, SET_SIZE) for x in range(i)]
    subset = [randint(1, SET_SIZE) for x in range(i)]
    # Check for collision
    flags = [False for x in range(SET_SIZE + 1)]
    for k in range(i):
      if flags[subset[k]]: #Collision found
        collision_found = True
      else:
        flags[subset[k]] = True

    if collision_found:
      # print 'Collision found in set:', subset
      break
  where_collision_found.append(i)
print 'average collision:', sum(where_collision_found) / float(len(where_collision_found)), 'in', EXPERIMENTS_NUM, 'experiments'

结果:
average collision: 12.1277 in 10000 experiments


Tags: infromforsizerangewherenumflags
1条回答
网友
1楼 · 发布于 2024-05-19 00:22:23

我不太清楚你的代码在做什么。以下是我刚才所做的:

from random import randrange
N = 365
ns = []
for _ in range(10000):
    n = 0
    seen = set()
    while True:
        b = randrange(N)
        n += 1
        if b in seen:
            break
        seen.add(b)
    ns.append(n)
print(sum(ns) / float(len(ns)))

以及样本运行的输出:

^{pr2}$

这很好。您期望的“23”是分布的中位数;平均值(平均值)预计为24.61659。。。请看这里: https://en.wikipedia.org/wiki/Birthday_problem#Average_number_of_people

相关问题 更多 >

    热门问题