Testing of Hypothesis

Parameters and Statistics

Parameters are the defining characteristics of a population, e.g. average household size or average income. A normal distribution, for example, is fully defined by two parameters: its mean and its standard deviation (ref. formula for the normal distribution).

Statistics, which are calculated from samples, are used to estimate these parameters. There are also non-parametric statistics, which are used for populations with no assumed defining parameters.

    In testing of hypothesis, we are interested in

  1. whether a parameter is equal to a certain value (e.g. the probability of obtaining a "6" in throwing a dice = 1/6), or
  2. whether the parameters of two populations are equal (e.g. whether the average weight of children in two-parent families is equal to that of children in single-parent families).

Deductive Logic

Formal logic is also called deductive logic. One of the laws in logic that is relevant to testing of hypothesis is:

        If A implies B (A -> B), then not B implies not A (~B -> ~A).

        ["A -> B" can be read as "If A is true then B is also true".]

        ["~B -> ~A" can be read as "If B is false then A is also false".]

    For example, if the statement "If it rains, this floor will get wet" is true, then if this floor is not wet, it is not raining.

    However, if A -> B, it is not necessarily true that ~A -> ~B. In the above example, if it is not raining, the floor may still be wet (e.g. someone has poured water onto it).

    Furthermore, if A -> B, it is also not necessarily true that B -> A. In the above example, if the floor is wet, it does not necessarily mean that it is raining.
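The three claims above can be checked mechanically by listing every combination of truth values for A and B. The following is a minimal sketch; the helper name implies is an assumption introduced here for illustration.

```python
from itertools import product

def implies(p, q):
    """Material implication: p -> q is false only when p is true and q is false."""
    return (not p) or q

# All four combinations of truth values for (A, B)
rows = list(product([True, False], repeat=2))

# (A -> B) and (~B -> ~A) have the same truth value in every row
contrapositive_ok = all(implies(a, b) == implies(not b, not a) for a, b in rows)

# Rows where A -> B holds but ~A -> ~B fails
inverse_counterexamples = [(a, b) for a, b in rows
                           if implies(a, b) and not implies(not a, not b)]

# Rows where A -> B holds but B -> A fails
converse_counterexamples = [(a, b) for a, b in rows
                            if implies(a, b) and not implies(b, a)]

print(contrapositive_ok)
print(inverse_counterexamples)
print(converse_counterexamples)
```

The only counterexample in both lists is A false and B true, which corresponds to "it is not raining, but the floor is wet".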

Logic of hypothesis testing

    In real life, we rarely find situations where (A -> B) fully applies, because we are almost always dealing with incomplete data (such as having samples instead of studying the whole population). Hypothesis testing works like this:

    If A is true, the probability of B occurring is low; but we have now observed that B has occurred; therefore we conclude that A is probably false [this is called rejecting the hypothesis]; or

    If A is true, the probability of B occurring is not low; so even though we have observed that B has occurred, we cannot conclude that A is false [this is called not rejecting the hypothesis].

Null Hypothesis

    The null hypothesis is the hypothesis on which we base our calculation of the probabilities of random variables, in order to test whether a certain assertion about a parameter is correct.

    For example, suppose we observe 7 "6"s in an experiment of casting a dice 20 times, and we would like to test whether the dice is fair (i.e. balanced or unbiased). To test this, we start with a null hypothesis so that we can calculate probabilities:

Ho: The dice is fair, i.e. the probability of having a "6" in one throw is 1/6. [P(6) = 1/6]

    Based on the above null hypothesis, we can calculate the probabilities of various events and determine whether we can reject the hypothesis.

    If the dice is fair, then in casting the dice 20 times, the probabilities of observing 0, 1, 2, ... "6"s are

    p(0 "6") = (5/6)^20 = 0.026084

    p(1 "6") = 20C1 (1/6) (5/6)^19 = 0.104336

    p(2 "6") = 20C2 (1/6)^2 (5/6)^18 = 0.198239

    etc.
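These probabilities follow the binomial formula, so they can be computed for any number of "6"s. The following is a minimal sketch using only the Python standard library; the function name binom_pmf is an assumption introduced here for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 20        # number of throws
p = 1 / 6     # P(6) under Ho (fair dice)

# Probability of each possible count of "6" from 0 to 20
probs = [binom_pmf(k, n, p) for k in range(n + 1)]

for k in range(3):
    print(f'p({k} "6") = {probs[k]:.6f}')
```

Because the list covers every possible outcome, its entries should add up to 1, which is a quick check on the calculation.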

To test the hypothesis, we ask: "What is the probability of observing 7 or more "6"s in casting a dice 20 times?" Under Ho, p(7 or more "6") = 0.0371. [The reason we ask about 7 or more "6"s instead of exactly 7 "6"s is that we suspect the dice is biased towards "6", and if so we would expect to observe more "6"s. Moreover, the probability of observing exactly 7 "6"s, or exactly any particular number of "6"s, is usually very small.] Since, if Ho is true, the probability of observing 7 or more "6"s is so low (< 0.05, by convention), we reject the null hypothesis.
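The probability of 7 or more "6"s is the sum of the binomial probabilities for 7, 8, ..., 20 "6"s. The following is a minimal sketch of that calculation and of the comparison with the conventional 0.05 level; the names binom_tail, p_value and threshold are assumptions introduced here for illustration.

```python
from math import comb

def binom_tail(k_min, n, p):
    """P(X >= k_min) when X counts successes in n independent trials
    with success probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_min, n + 1))

# Probability of 7 or more "6" in 20 throws if the dice is fair
p_value = binom_tail(7, 20, 1 / 6)

threshold = 0.05                  # conventional cut-off
reject_h0 = p_value < threshold   # True means we reject Ho

print(f"{p_value:.4f}", reject_h0)   # prints: 0.0371 True
```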

Alternate Hypothesis (Ha or H1)

    Literally, it is the hypothesis that we will accept when the null hypothesis is rejected. In most cases, it is the hypothesis that expresses what we suspect to be wrong with Ho. In the above example, we would have:

H1: P(6) > 1/6

Type I and Type II error

                   Do not reject Ho    Reject Ho

    Ho is true     OK                  Type I error

    Ho is false    Type II error       OK

A Type I error is rejecting Ho when it is in fact true; its probability is denoted α (alpha).

A Type II error is not rejecting Ho when it is in fact false; its probability is denoted β (beta).
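As a minimal sketch of how the two error probabilities can be computed for the dice example, assume that we reject Ho whenever 7 or more "6"s appear in 20 throws. The alternative value P(6) = 1/3 used below is a hypothetical choice made only for illustration; it does not come from the text above.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 20
critical = 7       # assumed rule: reject Ho when the count of "6" is >= 7
p_null = 1 / 6     # P(6) under Ho
p_alt = 1 / 3      # hypothetical biased dice, chosen only for illustration

# Probability of a Type I error: rejecting Ho although the dice is fair
alpha = sum(binom_pmf(k, n, p_null) for k in range(critical, n + 1))

# Probability of a Type II error: not rejecting Ho although P(6) = p_alt
beta = sum(binom_pmf(k, n, p_alt) for k in range(critical))

print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```

Under these assumptions the test rarely rejects a fair dice, but it fails to detect this particular bias in nearly half of the experiments, which illustrates the trade-off between the two errors.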