# Drawing the Same Item More Than Once is Surprisingly Likely

When drawing items from a pool with replacement, the probability that you’ll draw the same item more than once is much higher than intuition would suggest.

Often called the Birthday Paradox and framed thus: in a group of just 23 people, there is a greater than 50% chance that at least two people will share a birthday.11 Assuming birthdays are evenly distributed, which they’re not.

1 Assuming birthdays are evenly distributed, which they’re not.

## Math

Let $k$ be the number of items we draw with replacement from the pool with $n$ items. $P(n, k)$ is the probability that at least one item is drawn more than once.

$P(n, k)=\begin{cases} 1 - \frac{_{n}P_{k}}{n^{k}} &k \leq n \\ 1 &k \gt n \end{cases}$

In maxima:

birthday(n, k) := 1 - ( k! * binomial(n, k) / n ^ k);


### Approximations

The numbers get unweildy if we’re trying to work out collision probabilities for, say, a random identifier with 80 bits. We can approximate $P(n,k)$ with:

$P(n, k) \approx 1 - e^{ \frac{ -n^2 } { 2d } }$
birthday_approx(n, k) := 1 - %e ^ (k ^ 2 / (2 * n));


To approximate $k(p,n)$, how many items need to be drawn so that the probability of a collision is $p$:

$k(p, n) \approx \sqrt{ 2n \ln{ \frac{ 1 }{ 1 - p } } }$
collision_k(p, n) := sqrt( 2 * n * log( 1 / ( 1 - p) ) );