BIGpedia.com - Checking if a coin is fair - Encyclopedia and Dictionary Online

Checking if a coin is fair


Sometimes when choosing a coin (particularly for a coin flip), it may be desirable to determine if the coin is fair – that is, if the probability of obtaining a given side (commonly heads or tails) in the toss is 50%.

Posterior probability density function

One way of verifying this is to calculate the posterior probability density function of Bayesian probability theory.

A test is performed by tossing the coin n times and noting the number of heads h and tails t:

H = h (Total number of heads is h)
T = t (Total number of tails is t)
N = n = h + t (Total number of tosses is n)

Next, let r be the actual probability of obtaining heads in a single toss of the coin. This is the quantity to be estimated. Using Bayes' theorem, the posterior probability density of r conditional on H and T is expressed as follows:

f(r | H=h, T=t) =    \frac {\Pr(H=h | r, N=h+t) \, f(r)} {\int_0^1 \Pr(H=h |r, N=h+t) \, f(r) \, dr}. \!

The prior summarizes what is known about the distribution of r in the absence of any observation. It is assumed in this case that the prior distribution of r is uniform over the interval [0, 1]. That is, f(r) = 1. That assumption should be considered provisional – if some additional background information is found, the prior would be modified accordingly.

f(r) = 1

The probability of obtaining h heads in n tosses of a coin with a probability of heads equal to r is given by a binomial distribution:

\Pr(H=h | r, N=h+t) = {h+t \choose h} \, r^h \, (1-r)^t. \!

Putting it all together:

f(r | H=h, T=t)  = \frac{{h+t \choose h}\,r^h\,(1-r)^t}         {\int_0^1 {h+t \choose h}\,r^h\,(1-r)^t\,dr}  = \frac{r^h\,(1-r)^t}{\int_0^1 r^h\,(1-r)^t\,dr}  .

This is in fact a beta distribution (the conjugate prior for the binomial distribution), whose denominator can be expressed in terms of the beta function:

f(r | H=h, T=t) = \frac{1}{\mathrm{B}(h+1,t+1)} \; r^h\,(1-r)^t. \!

Because a uniform prior is assumed, and because h and t are integers, this can also be written in terms of factorials:

f(r | H=h, T=t) = \frac{(h+t+1)!}{h!\,\,t!} \; r^h\,(1-r)^t. \!

Example

For example, let n=10, h=7, i.e. the coin is tossed 10 times and 7 heads are obtained:

f(r | H=7, T=3) = \frac{(7+3+1)!}{7!\,\,3!} \; r^7 \, (1-r)^3 = 1320 \, r^7 \, (1-r)^3 \!

The graph on the right shows the probability density function of r given that 7 heads were obtained in 10 tosses. (Note: r is the probability of obtaining heads when tossing the same coin once.)
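
The coefficient 1320 and the normalisation of this density can be checked numerically. The following is a minimal sketch in Python; the function names `posterior_coefficient`, `posterior_pdf` and `integrate` are illustrative choices, not part of the original analysis, and the midpoint rule is just one simple way to approximate the integral.

```python
from math import factorial

def posterior_coefficient(h, t):
    """Normalising constant (h+t+1)! / (h! t!) for a uniform prior."""
    return factorial(h + t + 1) // (factorial(h) * factorial(t))

def posterior_pdf(r, h, t):
    """Posterior density of r after observing h heads and t tails."""
    return posterior_coefficient(h, t) * r**h * (1 - r)**t

def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))

coeff = posterior_coefficient(7, 3)
total = integrate(lambda r: posterior_pdf(r, 7, 3), 0.0, 1.0)

print(coeff)            # 1320
print(round(total, 4))  # area under the density, very close to 1

# A few density values, as a rough stand-in for reading the graph
for r in (0.3, 0.5, 0.7, 0.9):
    print(r, round(posterior_pdf(r, 7, 3), 3))
```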

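The posterior probability that r lies close to one half, say between 0.45 and 0.55, is

\Pr(0.45 < r <0.55)  = \int_{0.45}^{0.55} f(r | H=7, T=3) \,dr  \approx 13\%  \!

which is quite small when compared with the probability of the alternative hypothesis (a biased coin, with r outside this range), about 87%. On this evidence it is likely that the coin is biased.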
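
The 13% figure can be reproduced without numerical integration. For integer h and t, the cumulative distribution of the posterior (a beta distribution with parameters h+1 and t+1) equals an upper binomial tail probability, which is a standard identity for the regularised incomplete beta function. The sketch below relies on that identity; the function names are illustrative.

```python
from math import comb

def posterior_cdf(x, h, t):
    """Pr(r <= x) for the posterior after h heads and t tails (uniform prior).

    Uses the identity I_x(a, b) = Pr(Bin(a + b - 1, x) >= a)
    with a = h + 1 and b = t + 1.
    """
    n = h + t + 1
    return sum(comb(n, j) * x**j * (1 - x)**(n - j) for j in range(h + 1, n + 1))

def interval_probability(lo, hi, h, t):
    """Posterior probability that r lies between lo and hi."""
    return posterior_cdf(hi, h, t) - posterior_cdf(lo, h, t)

p_near_fair = interval_probability(0.45, 0.55, 7, 3)
print(f"Pr(0.45 < r < 0.55) = {p_near_fair:.2f}")  # about 0.13
```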


It is notable that the shape of the plotted curve is fully determined by the numerator r^h \, (1-r)^t while the denominator determines only the scaling of the plotted curve.

This means that the shape of the curve can be plotted using just the term r^h \, (1-r)^t and by observing the plotted curve, one can ascertain whether the coin is biased and the rough extent of the bias.

The value of r at which f(r | H=h, T=t) attains its maximum is the posterior mode, r_max = h / (h + t). In the example above, r_max = 7 / 10 = 0.7.
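
The expression for the posterior mode follows from maximising the logarithm of the numerator r^h (1-r)^t. A short derivation, assuming h > 0 and t > 0 so that the maximum lies strictly inside the interval (0, 1), is:

```latex
\frac{d}{dr}\Bigl[\, h \ln r + t \ln(1-r) \Bigr] = \frac{h}{r} - \frac{t}{1-r} = 0
\quad\Longrightarrow\quad h\,(1-r) = t\,r
\quad\Longrightarrow\quad r_{\max} = \frac{h}{h+t}
```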

Warning: Observing 10 tosses of the coin is not enough to determine the true probability of obtaining heads, because the uncertainty in the estimate (for example, the width of an interval at the 95% level) is very large. The graph of the probability density function of r shows that the true value of r could plausibly lie anywhere from about 0.3 to 0.97, so if possible further coin tosses should be performed to narrow the curve of the pdf.
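
One way to quantify how wide the posterior still is after 10 tosses is a central 95% credible interval, that is, the range between the 2.5% and 97.5% quantiles of the posterior. This is a sketch under that assumption (the article itself only reads the range off the graph); the quantiles are found by bisection on the cumulative distribution, and all function names are illustrative.

```python
from math import comb

def posterior_cdf(x, h, t):
    """Pr(r <= x) for the posterior after h heads and t tails (uniform prior)."""
    n = h + t + 1
    return sum(comb(n, j) * x**j * (1 - x)**(n - j) for j in range(h + 1, n + 1))

def posterior_quantile(q, h, t, tol=1e-10):
    """Value x with Pr(r <= x) = q, found by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if posterior_cdf(mid, h, t) < q:
            lo = mid   # quantile lies to the right of mid
        else:
            hi = mid   # quantile lies at or to the left of mid
    return (lo + hi) / 2

lower = posterior_quantile(0.025, 7, 3)
upper = posterior_quantile(0.975, 7, 3)
print(f"central 95% interval for r: {lower:.2f} to {upper:.2f}")
```

The interval covers a large part of [0, 1], which is the point of the warning: with only 10 tosses, both a nearly fair coin and a strongly biased one remain plausible.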

How many times should the coin be tossed

To determine how many times a coin should be tossed, two pieces of information are required:

  1. The confidence level (Z)
  2. The maximum acceptable error (E)
  • The confidence level is denoted by Z and is given by the Z-value of a standard normal distribution. This value can be read off a standard statistics table:
Z = 1.0 gives 68.27% confidence
Z = 2.0 gives 95.45% confidence
Z = 3.0 gives 99.73% confidence
Z = 3.3 gives 99.90% confidence
  • The maximum acceptable error E is defined by | p - p_actual | < E, where p is the estimated probability of obtaining heads and p_actual is the actual probability of obtaining heads (the quantity r of the previous section).
  • In statistics, the estimate of a proportion from a sample is denoted by p. This estimate has a standard error (standard deviation of the error) given by:
s_p = \sqrt{ \frac {p \, (1-p) } {n} }

This standard error s_p has its maximum value when p = (1 - p) = 0.5.

Hence, assuming the worst case, p is set to 0.5 to obtain the maximum possible value of s_p, and the maximum acceptable error is given by

E = Z \, s_p = Z \sqrt{ \frac {p \, (1-p) } {n} } = Z \sqrt{ \frac {0.5 \times 0.5 } {n} } = Z \sqrt{ \frac { 1 } {4 \, n} } = \frac {Z}{2 \, \sqrt{n}}

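Therefore, the formula relating the maximum error to the number of coin tosses is

E = \frac {Z}{2 \, \sqrt{n}}, \quad \text{or equivalently} \quad n = \frac {Z^2}{4 \, E^2}

provided that n \cdot p \ge 5 and n \cdot q \ge 5, where q = (1-p). This is a common rule of thumb for the normal approximation to the binomial distribution (justified by the Central Limit Theorem) to be adequate.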

Example

1. If the maximum error of 0.01 is desired, how many times should the coin be tossed?

E = \frac {Z}{ 2 \, \sqrt{n} }
n = \frac {Z^2} {4 \, E^2} = \frac {Z^2} {4 \times 0.01^2} = 2500 \ Z^2
n = 2500\, at 68.27% confidence (Z=1)
n = 10000\, at 95.45% confidence (Z=2)
n = 27225\, at 99.90% confidence (Z=3.3)
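
The calculation above can be sketched in Python as follows. The function names `confidence_level` and `tosses_needed` are illustrative; the confidence level for a given Z is obtained from the standard normal distribution via the error function, and the result is rounded before taking the ceiling to guard against floating-point noise.

```python
from math import ceil, erf, sqrt

def confidence_level(z):
    """Two-sided confidence level of a standard normal for a given Z-value."""
    return erf(z / sqrt(2))

def tosses_needed(max_error, z):
    """Worst-case (p = 0.5) number of tosses: n = Z^2 / (4 E^2), rounded up."""
    n = z**2 / (4 * max_error**2)
    # round() removes tiny floating-point errors before rounding up
    return ceil(round(n, 6))

for z in (1.0, 2.0, 3.3):
    n = tosses_needed(0.01, z)
    print(f"Z = {z}: n = {n} at {100 * confidence_level(z):.2f}% confidence")
```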

2. If the coin is tossed 10000 times, what is the maximum error of the estimated value of p (obtaining heads)?

E = \frac {Z}{ 2 \, \sqrt{n} }
E = \frac {Z}{ 2 \, \sqrt{ 10000 } } = \frac {Z}{ 200 }
E = 0.005\, at 68.27% confidence (Z=1)
E = 0.01\, at 95.45% confidence (Z=2)
E = 0.0165\, at 99.90% confidence (Z=3.3)
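
The reverse calculation, together with the n·p ≥ 5 and n·q ≥ 5 check stated earlier, can be sketched as follows. The function names are illustrative, and the default p = 0.5 reflects the worst-case assumption used in the derivation.

```python
from math import sqrt

def max_error(n, z, p=0.5):
    """Maximum error E = Z * sqrt(p (1 - p) / n); p = 0.5 is the worst case."""
    return z * sqrt(p * (1 - p) / n)

def normal_approximation_ok(n, p):
    """Rule of thumb from the text: n*p >= 5 and n*(1 - p) >= 5."""
    return n * p >= 5 and n * (1 - p) >= 5

n = 10000
for z in (1.0, 2.0, 3.3):
    print(f"Z = {z}: E = {max_error(n, z):.4f}")

print(normal_approximation_ok(n, 0.5))   # True
print(normal_approximation_ok(10, 0.7))  # False: 10 * 0.3 = 3 is below 5
```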

Other applications

The above mathematical analysis for determining if a coin is fair can also be applied to other uses. For example:

  • Determining the defect rate of a product when subjected to a particular (but well defined) condition. Sometimes a product can be very difficult or expensive to make. Furthermore, if testing such products results in their destruction, as few products as possible should be tested. Using the same analysis, the probability density function of the product defect rate can be found.
  • Two-party polling. If a small sample poll is taken where there are only two mutually exclusive choices, then this is equivalent to tossing a single biased coin multiple times. The same analysis can therefore be applied to determine the actual voting ratio.
  • Finding the proportion of females in an animal group, that is, determining the sex ratio in a large group of an animal species. Provided that the random sample is very small relative to the population, the analysis is similar to determining the probability of obtaining heads in a coin toss.


The contents of this article are licensed from Wikipedia.org under the GNU Free Documentation License.

01-04-2007 01:21:04