# Proof

For a natural number $$n\ge 1$$, let $$f_n(A)$$ be the absolute frequency, in which the event $$A$$ could occur, if we repeat the Bernoulli experiment $$n$$ times. We can interpret $$f_n$$ as a random variable counting the number $$k$$ of the realizations of $$A$$ with $$0\le f_n(A)=k\le n$$. With this interpretation, it follows from the binomial distribution that the probability mass function of $$f_n$$ is given by

$p(f_n=k)=\binom nk P^k(1-P)^{n-k}\quad\quad(k=0,1,\ldots n),\quad\quad( * )$

where $$P:=p(A)$$ denotes the probability of the event $$A$$. Similarly, the relative frequency $$F_n(A):=f_n(A)/n$$ can be interpreted as a random variable $$F_n$$ with the possible realizations $$0\le F_n\le n/n=1$$. The probability mass function of $$F_n$$ does not change
$p\left(F_n=\frac kn\right)=\binom nk P^k(1-P)^{n-k}\quad\quad(k=0,1,\ldots n),$ because $F_n=\frac kn\Longleftrightarrow f_n=k.\quad\quad(* *)$

Let $$\epsilon > 0$$ be an arbitrarily small real number. Note that the events $$(F_n < P - \epsilon)$$ and $$(F_n > P + \epsilon)$$ in each Bernoulli experiment repeated $$n$$ times are mutually exclusive. From the definition of probability and the definition of absolute value it follows that

$p(|F_n-P| > \epsilon)=p(F_n < P-\epsilon) + p(F_n > P+\epsilon).$

Because of $$( * * )$$ and $$( * )$$ , we can conclude further that

$\begin{array}{rcl} p(|F_n-P| > \epsilon)&=&p(f_n < n(P-\epsilon)) + p(f_n > n(P+\epsilon))\\ &=&\sum_{k < n(P-\epsilon)}p(f_n=k)+\sum_{k > n(P+\epsilon)}p(f_n=k)\\ &=&\sum_{k < n(P-\epsilon)}\binom nk P^k(1-P)^{n-k}+\sum_{k > n(P+\epsilon)}\binom nk P^k(1-P)^{n-k}. \end{array}\quad\quad ( * * * )$

Applying the distributivity law for real numbers and the rules of calculation with inequations, we observe that the summation indices $$k$$ of both sums on the right side of $$( * * * )$$ fulfill a common property $$( \times )$$

$\left.\begin{array}{rclcr} k < n(P-\epsilon)&\Rightarrow& k < nP-n\epsilon&\Rightarrow& n\epsilon < nP-k\\ k > n(P+\epsilon)&\Rightarrow& k > nP+n\epsilon&\Rightarrow& k - nP > n\epsilon \end{array}\right\}\Rightarrow(n\epsilon)^2 < (nP-k)^2\quad\quad ( \times )$

By multiplying the respective left and right sides of the equation $$( * * * )$$ by the inequation $$( \times )$$, we get the inequation

$\begin{array}{rcl} (n\epsilon)^2p(|F_n-P| > \epsilon)& < &\sum_{k < n(P-\epsilon)}(nP-k)^2\binom nk P^k(1-P)^{n-k}+\sum_{k > n(P+\epsilon)}(nP-k)^2\binom nk P^k(1-P)^{n-k}. \end{array}$

Because $$p(f_n=k)=0$$ for all $$k < 0$$ and all $$k > n$$, we get finally the inequation

$\begin{array}{rcl} (n\epsilon)^2p(|F_n-P| > \epsilon)& < &\sum_{k =0}^n(nP-k)^2\binom nk P^k(1-P)^{n-k}\\ &=&\sum_{k =0}^n(n^2P^2 - 2nPk +k^2)\binom nk P^k(1-P)^{n-k}\\ &=&\underbrace{n^2P^2\sum_{k =0}^n\binom nk P^k(1-P)^{n-k}}_{=:S_1}-\underbrace{2nP\sum_{k =0}^nk\binom nk P^k(1-P)^{n-k}}_{=:S_2}+\underbrace{\sum_{k =0}^nk^2\binom nk P^k(1-P)^{n-k}}_{=:S_3}\\ \end{array}\quad\quad ( \times\times )$

Applying the sum of binomial coefficients (i) for all $$n\ge 0$$, we obtain the result:

$S_1=n^2P^2\sum_{k =0}^n\binom nk P^k(1-P)^{n-k}=n^2P^2\cdot 1=n^2P^2.$

Similarly, applying the sum of binomial coefficients (ii) for all for all $$n\ge 1$$, we obtain the result:

$S_2=-2nP\sum_{k =0}^nk\binom nk P^k(1-P)^{n-k}=-2nP\cdot nP=-2n^2P^2.$

For the third sum, applying the sum of binomial coefficients (iii) , we get for $$n\ge 0$$:

$S_3=\sum_{k =0}^nk^2\binom nk P^k(1-P)^{n-k}=nP(1-P)+n^2P^2.$

Thus, it follows from $$(\times\times)$$ and because the $$P(1-P)$$ for $$P\in[0,1]$$ has its maximum at $$\frac 12$$:

$\begin{array}{rcll} (n\epsilon)^2p(|F_n-P| > \epsilon)& < &nP(1-P)\\ &\Longleftrightarrow&\\ p(|F_n-P| > \epsilon)& < &\frac{P(1-P)}{n\epsilon^2}&\text{for a fixed }\epsilon > 0\text{ and all }n\ge 1\\ &\Longleftrightarrow&\\ p(|F_n-P| > \epsilon)& < &\frac{1}{4n\epsilon^2}&\text{for a fixed }\epsilon > 0\text{ and all }n\ge 1.\\ \end{array}\quad\quad ( \times\times\times )$

It follows

$\lim_{n\to\infty}p(|F_n(A)-P| > \epsilon)=0\quad\quad\text{for every fixed }\epsilon > 0.$

The probability of the complement event is given by

$\lim_{n\to\infty}p(|F_n(A)-P|\le \epsilon)=1\quad\quad\text{for every fixed }\epsilon > 0.$

Therefore, it is almost certain that the sequence members $$F_n$$ will approximate the probability $$P$$ with virtually any accuracy, if $$n$$ is large enough.

Thank you to the contributors under CC BY-SA 4.0!

Github:

### References

#### Bibliography

1. Bosch, Karl: "Elementare Einführung in die Wahrscheinlichkeitsrechnung", vieweg Studium, 1995, 6th Edition