Skip to content

5.4 Probabilistic analysis and further uses of indicator random variables

5.4.-1

How many people must there be in a room before the probability that someone has the same birthday as you do is at least $1/2$? How many people must there be before the probability that at least two people have a birthday on July 4 is greater than $1/2$?

For the first question

$$ \begin{aligned} & 1-(\frac{n-1}{n})^k\geq \frac{1}{2}\cr \implies & (\frac{n-1}{n})^k\leq \frac{1}{2}\cr \implies & k\lg(\frac{n-1}{n})\leq -1\cr \implies & k\geq \frac{-1}{\lg(364/365)}\approx 252.652\cr \implies & k\geq 253\cr \end{aligned} $$

For the seacond question

$$ \begin{aligned} P( X \geq 2 ) & = 1 - P( X = 1 ) - P( X = 0 )\cr & = 1 - \frac{k}{n}(\frac{n-1}{n})^{k-1} - (\frac{n-1}{n})^k\cr & = 1 - (\frac{n+k-1}{n})(\frac{n-1}{n})^{k-1}\cr & \geq \frac{1}{2}\cr \implies & \frac{n+k-1}{n}(\frac{n-1}{n})^{k-1} \leq \frac{1}{2}\cr \implies & \lg(\frac{n+k-1}{n}) + (k-1)\lg(\frac{n-1}{n}) \leq -1\cr \end{aligned} $$

use the c++ program below, we can find the smallest $k$ is 613.

#include <iostream>
#include <cmath> // For std::log

int findSmallestK(int n) {
    int k = 1;
    double log_half = std::log2(0.5);
    double log_n_minus_1_over_n = std::log2((n - 1) / static_cast<double>(n));

    while (true) {
        double lhs = std::log2((n + k - 1) / static_cast<double>(n)) + (k - 1) * log_n_minus_1_over_n;
        if (lhs <= log_half) {
            return k;
        }
        ++k;
    }
}

int main() {
    int n;
    std::cout << "Enter n: ";
    std::cin >> n;

    int k = findSmallestK(n);
    std::cout << "The smallest k is: " << k << std::endl;

    return 0;
}

5.4-2

How many people must there be in a room before the probability that two people have the same birthday is at least $0.99$? For that many people, what is the expected number of pairs of people who have the same birthday?

For the first question:

$$ \begin{aligned} 1 - \frac{P(n,k)}{n^k}\geq 0.99\cr 1 - \frac{n!}{n^{k}(n-k)!}\geq 0.99\cr \implies \frac{n!}{n^{k}(n-k)!} \leq 0.01\cr \end{aligned} $$

use the c++ program below, we can find the smallest $k$ is 57.

#include <iostream>
#include <cmath> // For std::log

int findSmallestK(int n)
{
    // Precompute log2(0.01) for comparison
    double log_threshold = std::log2(0.01);


    int k = 1;
    // Compute log2(n!/(n-k)!)
    double log_n_minus_k_factorial = 0;
    // compute log2(n^n) = n * log2(n)
    double log_n_power_n = 0;

    while (k <= n)
    {
        log_n_minus_k_factorial += std::log2(n-k+1);
        log_n_power_n  += std::log2(n);
        // Calculate the left-hand side of the inequality
        double lhs = log_n_minus_k_factorial - log_n_power_n;

        // Check if the inequality is satisfied
        if (lhs <= log_threshold)
        {
            return k;
        }
        ++k;
    }

    // If no such k is found, return -1 or an appropriate error value
    return -1;
}

int main()
{
    int n;
    std::cout << "Enter n: ";
    std::cin >> n;

    int k = findSmallestK(n);
    if (k != -1)
    {
        std::cout << "The smallest k is: " << k << std::endl;
    }
    else
    {
        std::cout << "No valid k found for the given n." << std::endl;
    }

    return 0;
}

For the second question

Each two person has probability of $1/365$ to be a pair. There are $57\choose 2$ combination.

$$ \begin{aligned} E[pairs] & = \frac{57\choose 2}{365}\cr & = \frac{5756}{3652}\cr & \approx 4.37\cr \end{aligned} $$

Or you can use indicator random variables,although I do not like it.

$$ \begin{aligned} X_{i,j} & = \begin{cases} 1 & \text{i, j has same birthday}\cr 0 & \text{else}\cr \end{cases}\cr E[pairs] & = E\left[\sum_{1\leq i \leq j \leq 57}X_{i,j} \right]\cr & = \sum_{1\leq i \leq j \leq 57} E[X_{i,j}]\cr & = {57 \choose 1}\cdot 1 \cdot \frac{1}{365}\cr & \approx 4.37\cr \end{aligned} $$

5.4-3

You toss balls into $b$ bins until some bin contains two balls. Each toss is independent, and each ball is equally likely to end up in any bin. What is the expected number of ball tosses?

Let $B$ donate the number of ball tosses

$$ \begin{aligned} Pr(B=k)=Pr(B\geq k)\cdot Pr(B=k|B\geq k)\cr Pr(B\geq k)=\frac{P(b,k-1)}{b^{k-1}}\cr Pr(B=k|B\geq k)=\frac{k-1}{b}\cr E[B]=\sum_{k=2}^{b+1}\frac{P(b,k-1)}{b^{k-1}}\cdot \frac{k-1}{b}\cdot k\cr \because Pr(B\geq k) = Pr(B=k,k+1,...,b+1)\cr \therefore \sum_{k=t}^{b+1}\frac{P(b,k-1)}{b^{k-1}}\cdot \frac{k-1}{b}=\frac{P(b,k-1)}{b^{k-1}}\cr \therefore E[B] = 1 + \sum_{k=1}^{b}\frac{P(b,k)}{b^{k}}\cr \sum_{k=1}^{b}\frac{P(b,k)}{b^{k}}\approx \sqrt{\frac{\pi b}{2}}-\frac{1}{3} \end{aligned} $$

$\star$ 5.4-4

For the analysis of the birthday paradox, is it important that the birthdays be mutually independent, or is pairwise independence sufficient? Justify your answer.

Pairwise independence is sufficient, since only if the probability that each pair fall on same birthday is $1/n$, we can finish all the analysis after (5.7).

$\star$ 5.4-5

How many people should be invited to a party in order to make it likely that there are three people with the same birthday?

The question is not clear, I guess it means: If you invite people one by one until there are three people with the same birthday, what's the expected value of the people number?

Use trail and error we know the answer is 88

An approximate method is:

Let $Pr(i,j,k)$ be the probability that $i,j,k$ have same birthday.

$Pr(i,j,k)=\frac{1}{n^2} \quad1\leq i\leq j \leq k \leq p$

Let $N$ be the triplets number.

$$ \begin{aligned} E[N] & = {p\choose 3}Pr(i,j,k)\cr & = \frac{p!}{3!(p-3)!n^2}\cr & = \frac{p(p-1)(p-2)}{6n^2}\cr & \geq 1\cr \implies p & \geq 94\cr \end{aligned} $$

Or if we want to find the smallest number of people that they have at least 50% chance to have $W$ triplet to have same birthday.

if a year has $m$ days, and there are $n$ people.

this artical tell us the probability is:

$$ \begin{aligned} P(W\geq 1) & = 1 -\sum_{i=0}^{[n/2]}\frac{m!n!}{i!(n-2i)!(m-n+i)!2^im^n} \end{aligned} $$

if $m=365, P(W\geq 1)=0.511$ for $n=88$, $P(W\geq 1)=0.499$ for $n=87$

$\star$ 5.4-6

What is the probability that a $k$-string (defined on page 1179) over a set of size $n$ forms a $k$-permutation? How does this question relate to the birthday paradox?

$$ \begin{aligned} Pr(k\text{-perm in k-string}) & = \frac{P(n,k)}{n^k}\cr & = \frac{n!}{(n-k)!n^k}\cr \end{aligned} $$

It is the complementary event to birthday paradox, that is, the probability that $k$ people have distinct birthdays.

$\star$ 5.4-7

You toss $n$ balls into $n$ bins, where each toss is independent and the ball is equally likely to end up in any bin. What is the expected number of empty bins? What is the expected number of bins with exactly one ball?

Let $E$ be the event that a bin is empty, let $B$ be the number of empty bins, let $O$ be the event that a bin has exactly one ball, let $N$ be the number of bins with exactly one ball.

$$ \begin{aligned} Pr(E) & = (\frac{n-1}{n})^{n}\cr E[B] & = n(\frac{n-1}{n})^{n}\cr Pr(O) & = \frac{1}{n} \cdot (\frac{n-1}{n})^{n-1} \cdot {n\choose 1}\cr & = (\frac{n-1}{n})^{n-1}\cr E[N] & = nPr(O)\cr & = n(\frac{n-1}{n})^{n-1}\cr \end{aligned} $$

$\star$ 5.4-8

Sharpen the lower bound on streak length by showing that in n flips of a fair coin, the probability is at least $1-1/n$ that a streak of length $\lg n-2\lg\lg n$ consecutive heads occurs.

Let A_{i,j} be the event that a streak of heads of length at least j begins with the $i$th coin flip.

$$ \begin{aligned} Pr(\bigcap_{i=1}^{n-\lg n + 2\lg\lg n} \neg A_{i,\lg n - 2\lg\lg n}) & \leq \prod_{i=1}^{ n/(\lg n - 2\lg\lg n)}Pr(\neg A_{i(\lg n - 2\lg\lg n),\lg n - 2\lg\lg n})\cr & = \prod_{i=1}^{ n/(\lg n - 2\lg\lg n)} (1-\frac{1}{2^{\lg n - 2\lg\lg n}})\cr & = \prod_{i=1}^{ n/(\lg n - 2\lg\lg n)} (1-\frac{\lg^{2}n}{n})\cr & \leq (e^{-\frac{\lg^{2}n}{n}})^{n/(\lg n - 2\lg\lg n)}\cr & = e^{\frac{-\lg^{2}n}{\lg n-2\lg\lg n}}\cr & = \frac{1}{n}e^{\frac{-2(\lg n)\lg\lg n}{\lg n-2\lg\lg n}}\cr & \leq \frac{1}{n}\cr Pr(\bigcup_{i=1}^{n-\lg n + 2\lg\lg n} A_{i,\lg n - 2\lg\lg n}) & = 1 - Pr(\bigcap_{i=1}^{n-\lg n + 2\lg\lg n} \neg A_{i,\lg n - 2\lg\lg n})\cr & \geq 1-\frac{1}{n}\cr \end{aligned} $$