Application Center - Maplesoft


The Normal Distribution and the Central Limit Theorem


 

 

The following was implemented in Maple by Marcus Davidsson (2008), davidsson_marcus@hotmail.com,

and is based upon the work by Zietz (2004), Dynamic Programming: An Introduction by Example.

 

 

 

 

 

 

1) Theoretical Distribution 

 

 

restart 

 

 

We start by noting that 

 

If we toss the coin once, we can get either

T, H (1)

If we toss the coin twice, we can get any of

TT, HH, TH, HT (2)

If we toss the coin three times, we can get any of

HHH, TTT, THH, HTH, HHT, HTT, THT, TTH (3)
 

 

 

The total number of possible outcomes (combinations) after n coin tosses is

$\text{total number of combinations} = 2^n$ (4)

For example, for one coin toss the total number of combinations is

$2^1 = 2$ (5)

for two coin tosses it is

$2^2 = 4$ (6)

and for three coin tosses it is

$2^3 = 8$ (7)
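The doubling of the number of outcomes with each extra toss can be checked by brute force. The following is a minimal Maple sketch, not part of the original worksheet; the procedure name `outcomes` is a hypothetical helper introduced here for illustration.

```maple
# Hypothetical helper: list every outcome of n tosses as a string of "H" and "T".
outcomes := proc(n::nonnegint)
    local prev, s;
    if n = 0 then
        return [""];          # zero tosses: one (empty) outcome
    end if;
    prev := outcomes(n - 1);
    # Each shorter outcome can be extended by either a head or a tail.
    return [seq(cat(s, "H"), s in prev), seq(cat(s, "T"), s in prev)];
end proc:

# Compare the number of enumerated outcomes with 2^n.
for k from 1 to 3 do
    printf("n = %d: %d outcomes (2^n = %d)\n", k, nops(outcomes(k)), 2^k);
end do:
```

Each recursion step doubles the length of the list, which is exactly why the count equals 2^n.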
 

 

 

We now count the number of heads in each outcome.

If we toss the coin once, we can get either

T, H (8)

which means that there is 1 outcome with zero heads and 1 outcome with one head.

If we toss the coin twice, we can get any of

TT, HH, TH, HT (9)

which means that there is 1 outcome with zero heads, 2 outcomes with one head and 1 outcome with two heads.

If we toss the coin three times, we can get any of

HHH, TTT, THH, HTH, HHT, HTT, THT, TTH (10)

which means that there is 1 outcome with zero heads, 3 outcomes with one head, 3 outcomes with two heads and 1 outcome with three heads.

 

 

 

 

The above argument generalizes to Pascal's triangle, shown below.

[Image: Pascal's triangle]

The first thing to note is that each cell is the sum of the two cells above it.

The second thing to note is that each element in the triangle is a number of successful combinations.

For example, the left-hand edge of the triangle gives the number of outcomes with zero heads for each number of coin tosses.

If we move one step to the right, we get the number of outcomes with exactly one head for each number of coin tosses.
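The rule that each cell is the sum of the two cells above it is enough to generate the whole triangle. The sketch below is an illustration added here, assuming only base Maple; the variable names `N` and `row` are hypothetical.

```maple
# Build rows 0..N of Pascal's triangle using only the
# "sum of the two cells above" rule.
N := 4:
row := [1]:
for n from 0 to N do
    printf("n = %d: %a\n", n, row);
    # Pad the current row with a zero on the left and on the right,
    # then add the two padded copies cell by cell to get the next row.
    row := [seq([0, op(row)][j] + [op(row), 0][j], j = 1 .. nops(row) + 1)];
end do:
```

The last line printed is the row for n = 4, namely [1, 4, 6, 4, 1]; its first entry counts the outcomes with zero heads and its second entry the outcomes with one head.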

 

 

 

We could also have calculated the value in each cell (the number of successful combinations with x heads in n tosses) using the formula

$\text{number of successful combinations} = \frac{n!}{x!\,(n-x)!}$ (11)

where ! denotes the factorial, for example $3! = 3 \cdot 2 \cdot 1 = 6$.

For example, when n=4 and x=1 we get

$\frac{4!}{1!\,(4-1)!} = 4$ (12)

which is also the number found in the above triangle at n=4 and x=1.
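The factorial formula can be compared with Maple's built-in `binomial` function. This is a small sketch for illustration; the name `combos` is a hypothetical helper, not something from the original worksheet.

```maple
# Hypothetical helper: number of outcomes with exactly x heads in n tosses.
combos := (n, x) -> n!/(x!*(n - x)!):

combos(4, 1);                      # 4, as in (12)
binomial(4, 1);                    # built-in equivalent, also 4
seq(combos(4, x), x = 0 .. 4);     # the whole row n = 4: 1, 4, 6, 4, 1
```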

 

 

 

If we now want the probability of getting one head (x=1) after tossing the coin four times (n=4), we can simply take the ratio of the number of successful combinations to the total number of combinations, here 4/16 = 0.25. We can therefore rewrite Pascal's triangle in terms of probabilities, as seen below.

[Image: Pascal's triangle expressed as probabilities]
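As a sketch of this ratio in Maple (an added illustration that uses only the built-in `binomial` and `add`), the row for four tosses can be turned into probabilities by dividing each count by the total number of combinations:

```maple
# Probabilities of 0, 1, 2, 3, 4 heads in 4 tosses:
# successful combinations divided by the 2^4 = 16 total combinations.
seq(binomial(4, x)/2^4, x = 0 .. 4);     # 1/16, 1/4, 3/8, 1/4, 1/16

# The probabilities in a row must add up to one.
add(binomial(4, x)/2^4, x = 0 .. 4);     # 1
```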

 

 

We now note that

the probability of 2 heads (or zero heads) after 2 tosses is

$0.5 \cdot 0.5 = 0.25$ (13)

the probability of 3 heads (or zero heads) after 3 tosses is

$0.5 \cdot 0.5 \cdot 0.5 = 0.125$ (14)

the probability of 4 heads (or zero heads) after 4 tosses is

$0.5 \cdot 0.5 \cdot 0.5 \cdot 0.5 = 0.0625$ (15)
 

 

 

We can calculate the probability of getting x heads in n tosses of a coin that lands heads with probability p (p=0.5 for a random coin) as follows:

$P(n, x) = \frac{n!}{x!\,(n-x)!}\, p^x (1-p)^{n-x}$ (16)
 

 

For example, we can calculate the probability of getting 3 heads (x) after 3 coin tosses (n) as

n := 3: x := 3: v := 20 - x + 1: n!*0.5^x*(1 - 0.5)^(n - x)/(x!*(n - x)!); ...

.1250000000 (17)

We can calculate the probability of getting 2 heads (x) after 3 coin tosses (n) as

n := 3: x := 2: v := 20 - x + 1: n!*0.5^x*(1 - 0.5)^(n - x)/(x!*(n - x)!); ...

.3750000000 (18)

We can calculate the probability of getting 4 heads (x) after 4 coin tosses (n) as

n := 4: x := 4: v := 20 - x + 1: n!*0.5^x*(1 - 0.5)^(n - x)/(x!*(n - x)!); ...

0.6250000000e-1 (19)
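The three calculations above repeat the same expression with different values of n and x, so it can be convenient to wrap the formula in a procedure. The following is a minimal sketch; the name `P` and its three-argument form are assumptions made here, and `binomial` is Maple's built-in binomial coefficient.

```maple
# Hypothetical helper: probability of exactly x heads in n tosses,
# where each toss lands heads with probability p.
P := (n, x, p) -> binomial(n, x)*p^x*(1 - p)^(n - x):

P(3, 3, 0.5);    # compare with (17): 0.125
P(3, 2, 0.5);    # compare with (18): 0.375
P(4, 4, 0.5);    # compare with (19): 0.0625
```

Passing p as an argument also makes it easy to try a coin that is not fair, for instance `P(3, 2, 0.7)`.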
 

 

 

When the number of coin tosses is large, the binomial distribution converges to the normal distribution, as outlined below.

[Image: convergence of the binomial distribution to the normal distribution]
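The convergence can also be checked numerically. The sketch below is an added illustration, not the author's code: it compares the binomial probabilities for n = 100 tosses with a normal density whose mean n·p and standard deviation sqrt(n·p·(1-p)) are the standard matching parameters. All names (`binom_pmf`, `normal_pdf`, `mu`, `sigma`) are hypothetical.

```maple
n := 100: p := 0.5:
mu := n*p:                       # mean of the binomial distribution
sigma := sqrt(n*p*(1 - p)):      # standard deviation of the binomial distribution

# Exact binomial probability and the approximating normal density.
binom_pmf  := x -> binomial(n, x)*p^x*(1 - p)^(n - x):
normal_pdf := x -> exp(-(x - mu)^2/(2*sigma^2))/(sigma*sqrt(2*Pi)):

for k in [40, 45, 50, 55, 60] do
    printf("x = %d: binomial %.5f, normal %.5f\n",
           k, binom_pmf(k), evalf(normal_pdf(k)));
end do:
```

The two columns should agree closely; repeating the comparison with a small n such as 4 shows a visibly larger gap.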

 

 

 

 

 

2) Hypothesis Testing 

 

 

 

restart 

 

We can also calculate the cumulative probability of getting x or more heads in n coin tosses.

Example-1

We can calculate the probability of getting exactly 10 heads in 20 coin tosses as

n := 20: x := 10: v := 20 - x + 1: n!*0.5^x*(1 - 0.5)^(n - x)/(x!*(n - x)!) ...

.1761970520 (20)

We can calculate the probability of getting 10 heads or more in 20 coin tosses as

n := 20: x := 10: convert([seq(n!*0.5^x*(1 - 0.5)^(n - x)/(x!*(n - x)!), x = x .. n)], '`+`'); ...

.5880985260 (21)

If that probability is lower than 0.05, we reject the hypothesis that the coin is random (fair) at the 5% significance level. Here the probability is 0.588, well above 0.05, so we cannot reject the hypothesis that the coin is random.

Example-2

We can calculate the probability of getting exactly 14 heads in 20 coin tosses as

n := 20: x := 14: v := 20 - x + 1: n!*0.5^x*(1 - 0.5)^(n - x)/(x!*(n - x)!) ...

0.3696441650e-1 (22)

We can calculate the probability of getting 14 heads or more in 20 coin tosses as

n := 20: x := 14: convert([seq(n!*0.5^x*(1 - 0.5)^(n - x)/(x!*(n - x)!), x = x .. n)], '`+`'); ...

0.5765914916e-1 (23)

If that probability is lower than 0.05, we reject the hypothesis that the coin is random (fair) at the 5% significance level. Here the probability is 0.0577, slightly above 0.05, so we still cannot reject the hypothesis that the coin is random.

 

 

 

 

Example-3

We can calculate the probability of getting exactly 15 heads in 20 coin tosses as

n := 20: x := 15: v := 20 - x + 1: n!*0.5^x*(1 - 0.5)^(n - x)/(x!*(n - x)!) ...

0.1478576660e-1 (24)

We can calculate the probability of getting 15 heads or more in 20 coin tosses as

n := 20: x := 15: convert([seq(n!*0.5^x*(1 - 0.5)^(n - x)/(x!*(n - x)!), x = x .. n)], '`+`'); ...

0.2069473266e-1 (25)

If that probability is lower than 0.05, we reject the hypothesis that the coin is random (fair) at the 5% significance level. Here the probability is 0.0207, below 0.05, so we conclude that the coin is most likely not random.
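The three examples follow the same recipe: sum the probabilities of x or more heads and compare the result with 0.05. A compact sketch of that recipe is given below; the procedure name `tail` and the wording of the printed decision are assumptions introduced here, not part of the original worksheet.

```maple
# Hypothetical helper: probability of x0 or more heads in n tosses
# when each toss lands heads with probability p.
tail := proc(n, x0, p)
    local k;
    return add(binomial(n, k)*p^k*(1 - p)^(n - k), k = x0 .. n);
end proc:

# Repeat Examples 1 to 3 (10, 14 and 15 heads or more in 20 tosses).
for x0 in [10, 14, 15] do
    pv := tail(20, x0, 0.5);
    printf("P(X >= %d) = %.4f -> %s\n", x0, pv,
           `if`(pv < 0.05, "below 0.05", "not below 0.05"));
end do:
```

The printed probabilities should match (21), (23) and (25). Only the last one falls below the 0.05 threshold.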

 

 

 

Gambler's Fallacy

If we conclude that the coin is random, then the outcome of each toss is independent of all the others.

This means that past outcomes do not help us predict the next toss: the probability of a head stays at 0.5 no matter what came before.

However, if we can show that the outcomes have not been generated by a random process, then we can use the observed outcomes to quantify the probability of success without falling into the gambler's fallacy.

 

 

3) Empirical Distribution 

 

 

The law of large numbers and the central limit theorem are two important theorems in statistics.

I will illustrate the basic mechanics by again considering a simple coin toss example.

We assume that we have a room filled with nn people. Each person is asked to flip a coin 10 times and write down the total number of heads they get.

We can now plot the distribution of the total number of heads that each person observed, with the number of observed heads on the x-axis and the frequency of each number of heads on the y-axis.

We note that when the number of observations is large, the distribution converges to the theoretical distribution suggested by Pascal's triangle, which is approximately normal.
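The experiment described above can be sketched in a few lines of Maple. This is an illustration under stated assumptions rather than the worksheet's own code: the procedure name `simulate` is hypothetical, heads are coded as 1 and tails as 0, and the observed shares are compared with the binomial probabilities for 10 tosses.

```maple
# Hypothetical helper: nn people each toss a coin n times;
# returns the list of head counts, one entry per person.
simulate := proc(n::posint, nn::posint)
    local coin, i, j;
    coin := rand(0 .. 1);                # 1 = head, 0 = tail
    return [seq(add(coin(), i = 1 .. n), j = 1 .. nn)];
end proc:

randomize():                             # new seed on every run
n := 10:
for nn in [5, 1000] do
    heads := simulate(n, nn);
    printf("nn = %d\n", nn);
    for k from 0 to n do
        # Share of people who saw k heads versus the theoretical probability.
        printf("  %2d heads: observed share %.3f, theoretical %.3f\n",
               k, evalf(numboccur(heads, k)/nn), binomial(n, k)*0.5^n);
    end do;
end do:
```

Because the seed changes on every run, the observed shares differ from run to run; the theoretical column stays fixed.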

 

 

 

 

3.1) When the number of people (nn) is small 

 

 

The room contains 5 people. Each person is asked to toss the coin 10 times and write down the number of heads they get.

 

 

restart: n := 10: nn := 5: randomize(): coin := rand(0 .. 1): coin_1 := proc(n) seq(coin(), i = 1 .. n) end proc: x_1 := seq([coin_1(n)], i = 1 .. nn): ...
 

[Plot: frequency of the observed number of heads, nn = 5]
 

 

 

We can see that our distribution is unstable (it changes every time we run the code) and does not resemble the normal distribution suggested by Pascal's triangle.

 

 

 

 

3.2) When the number of people (nn) is large 

 

 

restart: n := 10: nn := 1000: randomize(): coin := rand(0 .. 1): coin_1 := proc(n) seq(coin(), i = 1 .. n) end proc: x_1 := seq([coin_1(n)], i = 1 .. nn): ...
 

[Plot: frequency of the observed number of heads, nn = 1000]
 

 

 

 

We can now see that our distribution is stable (it changes very little each time we run the code) and that it resembles the normal distribution suggested by Pascal's triangle.

We can also see that the distribution is centered around a mean of 5. This makes sense: with a random coin we expect heads approximately half of the time and tails approximately half of the time, so 10 tosses give 5 heads on average.
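The claim about the mean can be checked directly. The following sketch is an added illustration (the names `heads` and `sample_mean` are hypothetical); it repeats the experiment with 1000 people and compares the average number of heads with n·p = 10·0.5 = 5.

```maple
n := 10: nn := 1000:
randomize():
coin := rand(0 .. 1):                      # 1 = head, 0 = tail

# Number of heads observed by each of the nn people.
heads := [seq(add(coin(), i = 1 .. n), j = 1 .. nn)]:

sample_mean := evalf(add(h, h in heads)/nn);   # varies slightly from run to run
theoretical_mean := n*0.5;                      # 5.0
```

With nn = 1000 the sample mean typically lands close to 5; lowering nn to 5 makes it fluctuate much more, in line with the law of large numbers.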

 

 

Legal Notice: The copyright for this application is owned by the authors. Neither Maplesoft nor the authors are responsible for any errors contained within and are not liable for any damages resulting from the use of this material. This application is intended for non-commercial, non-profit use only. Contact the authors for permission if you wish to use this application in for-profit activities.