Biostat Last Research


    RESEARCH

    IN

    BIOSTAT

    Submitted by:

    Alonzo, Jessa Marie

    Balbin, Carmina

    Magalona, Norie Rose

    Magsalin, Alexander Hubert

    Natividad, Leslie Ann

    Valdez, Darrel Jan

    Submitted to:

    Mr. Joselito Roque

    (Professor)

    1. Test of Hypothesis

    A. Differentiate:

    i. Null and Alternative Hypothesis
    ii. One-tailed Test and Two-tailed Test

    B. What is the Level of Significance / Critical Value?

    C. Test of Significance

    2. Differentiate Parametric and Non-parametric Test
    3. Define and determine when it is appropriate to use:

    - Z-test
    - T-test
    - Correlation and Regression
    - Analysis of Variance
    - Chi-square Test

    Illustrate examples each.

    4. Types of Non-parametric Test - Valdez, Darrel Jan

    Magalona, Norie Rose

    Magsalin, Alexander Hubert

    Balbin, Carmina

    Natividad, Leslie Ann

    Alonzo, Jessa Marie

    1) Test of Hypothesis

    A. Differentiate:

    i. Null and Alternative Hypothesis

    Null hypothesis

    The null hypothesis is an hypothesis about a population parameter. The purpose of

    hypothesis testing is to test the viability of the null hypothesis in the light of experimental

    data. Depending on the data, the null hypothesis either will or will not be rejected as a

    viable possibility.

    Consider a researcher interested in whether the time to respond to a tone is affected by

    the consumption of alcohol. The null hypothesis is that μ1 - μ2 = 0, where μ1 is the mean
    time to respond after consuming alcohol and μ2 is the mean time to respond otherwise.
    Thus, the null hypothesis concerns the parameter μ1 - μ2 and the null hypothesis is that the

    parameter equals zero.

    The null hypothesis is often the reverse of what the experimenter actually believes; it is

    put forward to allow the data to contradict it. In the experiment on the effect of alcohol,

    the experimenter probably expects alcohol to have a harmful effect. If the experimental

    data show a sufficiently large effect of alcohol, then the null hypothesis that alcohol has

    no effect can be rejected.

    It should be stressed that researchers very frequently put forward a null hypothesis in the

    hope that they can discredit it. For a second example, consider an educational researcher

    who designed a new way to teach a particular concept in science, and wanted to test

    experimentally whether this new method worked better than the existing method. The

    researcher would design an experiment comparing the two methods. Since the null

    hypothesis would be that there is no difference between the two methods, the researcher

    would be hoping to reject the null hypothesis and conclude that the method he or she

    developed is the better of the two.

    The symbol H0 is used to indicate the null hypothesis. For the example just given, the null

    hypothesis would be designated by the following symbols:

    H0: μ1 - μ2 = 0

    or by

    H0: μ1 = μ2.

    The null hypothesis is typically a hypothesis of no difference as in this example where it

    is the hypothesis of no difference between population means. That is why the word "null"

    in "null hypothesis" is used -- it is the hypothesis of no difference.

    Despite the "null" in "null hypothesis," there are occasions when the parameter is not

    hypothesized to be 0. For instance, it is possible for the null hypothesis to be that the

    difference between population means is a particular value. Or, the null hypothesis could

    be that the mean SAT score in some population is 600. The null hypothesis would then be

    stated as: H0: μ = 600. Although the null hypotheses discussed so far have all involved the

    testing of hypotheses about one or more population means, null hypotheses can involve

    any parameter. An experiment investigating the correlation between job satisfaction and

    performance on the job would test the null hypothesis that the population correlation (ρ)
    is 0. Symbolically, H0: ρ = 0.

    Some possible null hypotheses are given below:

    H0: μ = 0
    H0: μ = 10
    H0: μ1 - μ2 = 0

    H0: π = .5
    H0: π1 - π2 = 0
    H0: μ1 = μ2 = μ3
    H0: ρ1 - ρ2 = 0

    When a one-tailed test is conducted, the null hypothesis includes the direction of the
    effect. A one-tailed test of the differences between means might test the null hypothesis
    that μ1 - μ2 ≥ 0. If M1 - M2 were much less than 0, then the null hypothesis would be
    rejected in favor of the alternative hypothesis: μ1 - μ2 < 0.

    Alternative hypothesis

    In statistical hypothesis testing, the alternative hypothesis (or maintained

    hypothesis or research hypothesis) and the null hypothesis are the two rival hypotheses

    which are compared by a statistical hypothesis test. An example might be where water

    quality in a stream has been observed over many years and a test is made of the null

    hypothesis that there is no change in quality between the first and second halves of the

    data against the alternative hypothesis that the quality is poorer in the second half of the record.

    The concept of an alternative hypothesis in testing was devised by Jerzy

    Neyman and Egon Pearson, and it is used in the Neyman-Pearson lemma. It forms a

    major component in modern statistical hypothesis testing. However it was not part

    of Ronald Fisher's formulation of statistical hypothesis testing, and he violently opposed

    its use.[1] In Fisher's approach to testing, the central idea is to assess whether the observed

    dataset could have resulted from chance if the null hypothesis were assumed to hold,

    notionally without preconceptions about what other model might hold. Modern statistical

    hypothesis testing accommodates this type of test since the alternative hypothesis can be

    just the negation of the null hypothesis.

    ii. One-tailed Test and Two-tailed Test

    One-tailed test

    A statistical test in which the critical region consists of all values of a test statistic that are

    less than a given value or greater than a given value, but not both.

    We choose a critical region. In a one-tailed test, the critical region will have just one part (the

    red area below). If our sample value lies in this region, we reject the null hypothesis in favour

    of the alternative.

    Suppose we are looking for a definite decrease. Then the critical region will be to the left.

    Note, however, that in the one-tailed test the value of the parameter can be as high as you

    like.

    Example

    Suppose we are given that X has a Poisson distribution and we want to carry out a hypothesis

    test on the mean, λ, based upon a sample observation of 3.

    Suppose the hypotheses are:

    H0: λ = 9

    H1: λ < 9

    We want to test if it is "reasonable" for the observed value of 3 to have come from a Poisson

    distribution with parameter 9. So what is the probability that a value as low as 3 has come

    from a Po(9)?

    P(X ≤ 3) = 0.0212 (this has come from a Poisson table)

    The probability is less than 0.05, so there is less than a 5% chance that the value has come

    from a Poisson(9) distribution. We therefore reject the null hypothesis in favour of the

    alternative at the 5% level.

    However, the probability is greater than 0.01, so we would not reject the null hypothesis in

    favour of the alternative at the 1% level.
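    As a quick check of the arithmetic above, the tail probability can be recomputed in a few lines of
    Python. SciPy is an assumption here; the original text read the value from a printed Poisson table.

        from scipy.stats import poisson

        # P(X <= 3) when X ~ Poisson(9), i.e. the chance of a value as low as 3 under H0
        p = poisson.cdf(3, mu=9)
        print(round(p, 4))  # ~0.0212: below 0.05 (reject at 5%) but above 0.01 (do not reject at 1%)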

    Two-tailed test

    The two-tailed test is a statistical test used in inference, in which a given statistical

    hypothesis, H0 (the null hypothesis), will be rejected when the value of the statistic is either

    sufficiently small or sufficiently large. The test is named after the "tail" of data under the far

    left and far right of a bell-shaped normal data distribution, or bell curve. However, the

    terminology is extended to tests relating to distributions other than normal.

    "In general a test is called two-sided or two-tailed if the null hypothesis is rejected for values

    of the test statistic falling into either tail of its sampling distribution, and it is called one-

    sided or one-tailed if the null hypothesis is rejected only for values of the test statistic falling

    into one specified tail of its sampling distribution".[1] For example, if our alternative

    hypothesis is μ ≠ 42.5, rejecting the null hypothesis of μ = 42.5 for small or for large
    values of the sample mean, the test is called two-tailed or two-sided. If our alternative
    hypothesis is μ > 1.4, rejecting the null hypothesis of μ = 1.4 only for large values of the
    sample mean, it is then called one-tailed or one-sided.

    If the distribution from which the samples are derived is considered to be normal, Gaussian,

    or bell-shaped, then the test is referred to as a one- or two-tailed T test. If the test is

    performed using the actual population mean and variance, rather than an estimate from a

    sample, it would be called a one- or two-tailed Z test.

    The statistical tables for Z and for t provide critical values for both one- and two-tailed tests.

    That is, they provide the critical values that cut off an entire alpha region at one or the other

    end of the sampling distribution as well as the critical values that cut off the 1/2 alpha regions

    at both ends of the sampling distribution.

    In a two-tailed test, we are looking for either an increase or a decrease. So, for example,

    H0 might be that the mean is equal to 9 (as before). This time, however, H1 would be that the

    mean is not equal to 9. In this case, therefore, the critical region has two parts:

    Example

    Let's test the parameter p of a Binomial distribution at the 10% level.

    Suppose a coin is tossed 10 times and we get 7 heads. We want to test whether or not the

    coin is fair. If the coin is fair, p = 0.5 . Put this as the null hypothesis:

    H0: p = 0.5

    H1: p ≠ 0.5

    Now, because the test is 2-tailed, the critical region has two parts. Half of the critical region

    is to the right and half is to the left. So the critical region contains both the top 5% of the

    distribution and the bottom 5% of the distribution (since we are testing at the 10% level).

    If H0 is true, X ~ Bin(10, 0.5).

    If the null hypothesis is true, what is the probability that X is 7 or above?

    P(X ≥ 7) = 1 - P(X < 7) = 1 - P(X ≤ 6) = 1 - 0.8281 = 0.1719

    Is this in the critical region? No, because the probability that X is at least 7 is not less than

    0.05 (5%), which is what we need it to be.

    So there is not significant evidence at the 10% level to reject the null hypothesis.
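    A minimal sketch of the same calculation in Python (SciPy assumed), reproducing the upper-tail
    probability for the coin example:

        from scipy.stats import binom

        # P(X >= 7) for X ~ Bin(10, 0.5), via the complement of the CDF at 6
        p_upper = 1 - binom.cdf(6, n=10, p=0.5)
        print(round(p_upper, 4))  # ~0.1719: not in the top 5%, so H0 is not rejected at the 10% level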

    B. What is the Level of Significance / Critical Value?

    Level of significance

    In statistics, a result is called statistically significant if it is unlikely to have occurred

    by chance. The phrase test of significance was coined by Ronald Fisher.

    As used in statistics, significant does not mean important or meaningful, as it does in

    everyday speech. For example, a study that included tens of thousands of participants might

    be able to say with great confidence that residents of one city were more intelligent than

    people of another city by 1/20 of an IQ point. This result would be statistically significant,

    but the difference is small enough to be utterly unimportant. Many researchers urge that tests

    of significance should always be accompanied by effect-size statistics, which approximate the

    size and thus the practical importance of the difference.

    The amount of evidence required to accept that an event is unlikely to have arisen by chance

    is known as the significance level or critical p-value: in traditional Fisherian statistical

    hypothesis testing, the p-value is the probability of observing data at least as extreme as that

    observed, given that the null hypothesis is true. If the obtained p-value is small then it can be

    said either the null hypothesis is false or an unusual event has occurred. It is worth stressing

    that p-values do not have any repeat sampling interpretation.

    An alternative statistical hypothesis testing framework is the Neyman-Pearson frequentist

    school, which requires both a null and an alternative hypothesis to be defined and

    investigates the repeat sampling properties of the procedure, i.e. the probability that a

    decision to reject the null hypothesis will be made when it is in fact true and should not have

    been rejected (this is called a "false positive" or Type I error) and the probability that a

    decision will be made to accept the null hypothesis when it is in fact false (Type II error).

    More typically, the significance level of a test is such that the probability of mistakenly

    rejecting the null hypothesis is no more than the stated probability. This allows the test to be

    performed using non-significant statistics which has the advantage of reducing the

    computational burden while wasting some information.

    It is worth stressing that Fisherian p-values are philosophically different from Neyman-

    Pearson Type I errors. This confusion is unfortunately propagated by many statistics

    textbooks.

    Use in practice

    The significance level is usually denoted by the Greek symbol α (lowercase alpha). Popular

    levels of significance are 5% (0.05), 1% (0.01) and 0.1% (0.001). If a test of

    significance gives a p-value lower than the α-level, the null hypothesis is rejected. Such

    results are informally referred to as 'statistically significant'. For example, if someone argues

    that "there's only one chance in a thousand this could have happened by coincidence," a

    0.001 level of statistical significance is being implied. The lower the significance level, the

    stronger the evidence required. Choosing level of significance is an arbitrary task, but for

    many applications, a level of 5% is chosen, for no better reason than that it is conventional.

    In some situations it is convenient to express the statistical significance as 1 - α. In general,

    when interpreting a stated significance, one must be careful to note what, precisely, is being

    tested statistically.

    Different α-levels trade off countervailing effects. Smaller levels of α increase confidence in

    the determination of significance, but run an increased risk of failing to reject a false null

    hypothesis (a Type II error, or "false negative determination"), and so have less statistical

    power. The selection of an α-level thus inevitably involves a compromise between

    significance and power, and consequently between the Type I error and the Type II error.

    More powerful experiments - usually experiments with more subjects or replications - can

    obviate this choice to an arbitrary degree.

    In some fields, for example nuclear and particle physics, it is common to express statistical

    significance in units of "σ" (sigma), the standard deviation of a Gaussian distribution. A

    statistical significance of "nσ" can be converted into a p-value using the error

    function: p = 1 - erf(n / √2).

    The use of σ implicitly assumes a Gaussian distribution of measurement values. For example,
    if a theory predicts a parameter to have a value of, say, 100, and one measures the parameter
    to be 109 ± 3, then one might report the measurement as a "3σ deviation" from the theoretical
    prediction. In terms of p-values, this statement is equivalent to saying that "assuming the theory is
    true, the likelihood of obtaining the experimental result by coincidence is 0.27%" (since
    1 - erf(3/√2) = 0.0027).
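    The 0.27% figure can be reproduced with the Python standard library alone; this is only a numerical
    check of the expression quoted above.

        import math

        # two-sided tail probability of a 3-sigma deviation under a Gaussian model
        p = 1 - math.erf(3 / math.sqrt(2))
        print(round(p, 4))  # ~0.0027, i.e. about 0.27%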

    Fixed significance levels such as those mentioned above may be regarded as useful in

    exploratory data analyses. However, modern statistical advice is that, where the outcome of a

    test is essentially the final outcome of an experiment or other study, the p-value should be

    quoted explicitly. And, importantly, it should be quoted whether or not the p-value is judged to be

    significant. This is to allow maximum information to be transferred from a summary of the

    study into meta-analyses.

    Critical value

    In differential topology, a critical value of a differentiable function f : M → N
    between differentiable manifolds is the image f(x) of a critical point x.

    The basic result on critical values is Sard's lemma. The set of critical values can be quite

    irregular; but in Morse theory it becomes important to consider real-valued functions on a

    manifold M, such that the set of critical values is in fact finite. The theory of Morse

    functions shows that there are many such functions; and that they are even typical, or generic

    in the sense of Baire category.

    A critical value is used in significance testing. It is the value that a test statistic must exceed

    in order for the null hypothesis to be rejected. For example, the critical value of t (with

    12 degrees of freedom using the 0.05 significance level) is 2.18. This means that for the

    probability value to be less than or equal to 0.05, the absolute value of the t statistic must be

    2.18 or greater. It should be noted that the all-or-none rejection of a null hypothesis is not

    recommended.

    Statistics

    In statistics, a critical value is the value corresponding to a given significance level. This

    cutoff value determines the boundary between those samples resulting in a test statistic that

    leads to rejecting the null hypothesis and those that lead to a decision not to reject the null

    hypothesis. If the absolute value of the calculated value from the statistical test is greater than

    the critical value, then the null hypothesis is rejected in favour of the alternative hypothesis,

    and vice versa. You can never 'accept' an alternative hypothesis, you can only reject the null

    hypothesis in favour of the alternative.
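    The critical value of t quoted earlier (2.18 for 12 degrees of freedom at the 0.05 level, two-tailed)
    can also be obtained programmatically; the SciPy call below is an assumption, shown only as an
    illustration of how such a table lookup works.

        from scipy.stats import t

        # upper 2.5% point of the t distribution with 12 degrees of freedom
        crit = t.ppf(1 - 0.05 / 2, df=12)
        print(round(crit, 2))  # ~2.18: reject H0 when |t| exceeds this value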

    C. Test of Significance

    Once sample data has been gathered through an observational study or experiment, statistical
    inference allows analysts to assess evidence in favor of some claim about the population
    from which the sample has been drawn. The methods of inference used to support or reject
    claims based on sample data are known as tests of significance.

    Every test of significance begins with a null hypothesis H0. H0 represents a theory that has
    been put forward, either because it is believed to be true or because it is to be used as a basis
    for argument, but has not been proved. For example, in a clinical trial of a new drug, the null
    hypothesis might be that the new drug is no better, on average, than the current drug. We
    would write H0: there is no difference between the two drugs on average.

    The alternative hypothesis, Ha, is a statement of what a statistical hypothesis test is set up to

    establish. For example, in a clinical trial of a new drug, the alternative hypothesis might be

    that the new drug has a different effect, on average, compared to that of the current drug. We

    would write Ha: the two drugs have different effects, on average. The alternative hypothesis

    might also be that the new drug is better, on average, than the current drug. In this case we

    would write Ha: the new drug is better than the current drug, on average.

    The final conclusion once the test has been carried out is always given in terms of the null

    hypothesis. We either "reject H0 in favor of Ha" or "do not reject H0"; we never conclude

    "reject Ha", or even "accept Ha".

    If we conclude "do not reject H0", this does not necessarily mean that the null hypothesis is
    true; it only suggests that there is not sufficient evidence against H0 in favor of Ha; rejecting

    the null hypothesis then, suggests that the alternative hypothesis may be true.

    Hypotheses are always stated in terms of a population parameter, such as the mean μ. An
    alternative hypothesis may be one-sided or two-sided. A one-sided hypothesis claims that a
    parameter is either larger or smaller than the value given by the null hypothesis. A two-sided
    hypothesis claims that a parameter is simply not equal to the value given by the null

    hypothesis -- the direction does not matter.

    Hypotheses for a one-sided test for a population mean take the following form:

    H0: μ = k
    Ha: μ > k

    or

    H0: μ = k
    Ha: μ < k.

    Hypotheses for a two-sided test for a population mean take the following form:

    H0: μ = k
    Ha: μ ≠ k.

    A confidence interval gives an estimated range of values which is likely to include an

    unknown population parameter, the estimated range being calculated from a given set of

    sample data. (Definition taken from Valerie J. Easton and John H. McColl's Statistics

    Glossary v1.1)

    Example

    Suppose a test has been given to all high school students in a certain state. The mean test

    score for the entire state is 70, with standard deviation equal to 10. Members of the school

    board suspect that female students have a higher mean score on the test than male students,

    because the mean score from a random sample of 64 female students is equal to 73.

    Does this provide strong evidence that the overall mean for female students is higher?

    The null hypothesisH0 claims that there is no difference between the mean score for female

    students and the mean for the entire population, so that μ = 70. The alternative hypothesis

    claims that the mean for female students is higher than the entire student population mean, so

    that μ > 70.
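    A hedged sketch of the one-sided z test implied by this example (population mean 70, standard
    deviation 10, sample of 64 female students with mean 73); the p-value step is an added illustration,
    not part of the original text.

        from scipy.stats import norm

        mu0, sigma, n, xbar = 70, 10, 64, 73
        z = (xbar - mu0) / (sigma / n ** 0.5)      # z = 2.4
        p_one_sided = 1 - norm.cdf(z)              # P(Z >= 2.4) ~ 0.008
        print(round(z, 2), round(p_one_sided, 4))  # small p-value: evidence that the mean exceeds 70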

    2) Differentiate Parametric and Non-Parametric Test

    Parametric statistics is the branch of statistics that assumes data have come from a type of probability

    distribution and makes inferences about the parameters of the distribution. Most well-known

    elementary statistical methods are parametric.

    Generally speaking parametric methods make more assumptions than non-parametric methods. If

    those extra assumptions are correct, parametric methods can produce more accurate and precise

    estimates. They are said to have more statistical power. However, if those assumptions are

    incorrect, parametric methods can be very misleading. For that reason they are often not

    considered robust. On the other hand, parametric formulae are often simpler to write down and

    faster to compute. In some, but definitely not all cases, their simplicity makes up for their non-

    robustness, especially if care is taken to examine diagnostic statistics.

    Because parametric statistics require a probability distribution, they are not distribution-free.

    History

    Statistician Jacob Wolfowitz coined the statistical term "parametric" in order to define its

    opposite in 1942:

    "Most of these developments have this feature in common, that the distribution functions of the

    various stochastic variables which enter into their problems are assumed to be of known

    functional form, and the theories of estimation and of testing hypotheses are theories of

    estimation of and of testing hypotheses about, one or more parameters. . ., the knowledge of

    which would completely determine the various distribution functions involved. We shall refer to

    this situation. . .as the parametric case, and denote the opposite case, where the functional forms

    of the distributions are unknown, as the non-parametric case."

    Example

    Suppose we have a sample of 99 test scores with a mean of 100 and a standard deviation of 10. If

    we assume all 99 test scores are random samples from a normal distribution we predict there is a

    1% chance that the 100th test score will be higher than 123.65 (that is the mean plus 2.365

    standard deviations) assuming that the 100th test score comes from the same distribution as the

    others. The normal family of distributions all have the same shape and are parameterized by

    mean and standard deviation. That means if you know the mean and standard deviation, and that

    the distribution is normal, you know the probability of any future observation. Parametric

    statistical methods are used to compute the 2.365 value above, given

    99 independent observations from the same normal distribution.

    A non-parametric estimate of the same thing is the maximum of the first 99 scores. We don't

    need to assume anything about the distribution of test scores to reason that before we gave the

    test it was equally likely that the highest score would be any of the first 100. Thus there is a 1%

    chance that the 100th is higher than any of the 99 that preceded it.

    Non-parametric test

    In statistics, the term non-parametric statistics has at least two different meanings:

    1. The first meaning of non-parametric covers techniques that do not rely on data belonging to any particular distribution. These include, among others:

    distribution-free methods, which do not rely on assumptions that the data are drawn from a given probability distribution. As such it is the opposite of parametric statistics.

    It includes non-parametric statistical models, inference and statistical tests.

    non-parametric statistics (in the sense of a statistic over data, which is defined to be a function on a sample that has no dependency on a parameter), whose interpretation

    does not depend on the population fitting any parametrized distributions. Statistics

    based on the ranks of observations are one example of such statistics and these play a

    central role in many non-parametric approaches.

    2. The second meaning of non-parametric covers techniques that do not assume that the structure of a model is fixed. Typically, the model grows in size to accommodate the

    complexity of the data. In these techniques, individual variables are typically assumed to

    belong to parametric distributions, and assumptions about the types of connections

    among variables are also made. These techniques include, among others:

    non-parametric regression, which refers to modeling where the structure of the

    relationship between variables is treated non-parametrically, but where nevertheless

    there may be parametric assumptions about the distribution of model residuals.

    non-parametric hierarchical Bayesian models, such as models based on the Dirichlet process, which allow the number of latent variables to grow as necessary to fit the

    data, but where individual variables still follow parametric distributions and even the

    process controlling the rate of growth of latent variables follows a parametric

    distribution.

    Applications and purpose

    Non-parametric methods are widely used for studying populations that take on a ranked order

    (such as movie reviews receiving one to four stars). The use of non-parametric methods may be

    necessary when data have a ranking but no clear numerical interpretation, such as when

    assessing preferences; in terms of levels of measurement, for data on an ordinal scale.

    As non-parametric methods make fewer assumptions, their applicability is much wider than the

    corresponding parametric methods. In particular, they may be applied in situations where less is

    known about the application in question. Also, due to the reliance on fewer assumptions, non-

    parametric methods are more robust.

    Another justification for the use of non-parametric methods is simplicity. In certain cases, even

    when the use of parametric methods is justified, non-parametric methods may be easier to use.

    Due both to this simplicity and to their greater robustness, non-parametric methods are seen by

    some statisticians as leaving less room for improper use and misunderstanding.

    The wider applicability and increased robustness of non-parametric tests comes at a cost: in cases

    where a parametric test would be appropriate, non-parametric tests have less power. In other

    words, a larger sample size can be required to draw conclusions with the same degree of

    confidence.

    Non-parametric models

    Non-parametric models differ from parametric models in that the model structure is not

    specified a priori but is instead determined from data. The term non-parametric is not meant to

    imply that such models completely lack parameters but that the number and nature of the

    parameters are flexible and not fixed in advance.

    - A histogram is a simple nonparametric estimate of a probability distribution.
    - Kernel density estimation provides better estimates of the density than histograms.
    - Nonparametric regression and semiparametric regression methods have been developed
      based on kernels, splines, and wavelets.
    - Data Envelopment Analysis provides efficiency coefficients similar to those obtained
      by Multivariate Analysis without any distributional assumption.

    Methods

    Non-parametric (or distribution-free) inferential statistical methods are mathematical

    procedures for statistical hypothesis testing which, unlike parametric statistics, make no

    assumptions about the probability distributions of the variables being assessed. The most

    frequently used tests include

    - Anderson-Darling test
    - Cochran's Q
    - Cohen's kappa
    - Friedman two-way analysis of variance by ranks
    - Kaplan-Meier
    - Kendall's tau
    - Kendall's W
    - Kolmogorov-Smirnov test
    - Kruskal-Wallis one-way analysis of variance by ranks
    - Kuiper's test
    - Logrank test
    - Mann-Whitney U or Wilcoxon rank sum test
    - Median test
    - Pitman's permutation test
    - Rank products
    - Siegel-Tukey test
    - Spearman's rank correlation coefficient
    - Wald-Wolfowitz runs test
    - Wilcoxon signed-rank test

    3) Define and determine when it is appropriate to use:

    a. Z-test

    It is a statistical test where normal distribution is applied and is basically used for dealing

    with problems relating to large samples when n ≥ 30.

    There are different types of Z-test, each for a different purpose. Some of the popular types are

    outlined below:

    1. z test for single proportion is used to test a hypothesis on a specific value of the population proportion.

    Statistically speaking, we test the null hypothesis H0: p = p0 against the alternative hypothesis

    H1: p ≠ p0, where p is the population proportion and p0 is a specific value of the population

    proportion we would like to test for acceptance.

    The example on tea drinkers given below requires this test. In that example, p0 = 0.5. Notice

    that in this particular example, proportion refers to the proportion of tea drinkers.

    2. z test for difference of proportions is used to test the hypothesis that two populations have the same proportion.

    For example suppose one is interested to test if there is any significant difference in the habit of

    tea drinking between male and female citizens of a town. In such a situation, Z-test for difference

    of proportions can be applied.

    One would have to obtain two independent samples from the town- one from males and the other

    from females and determine the proportion of tea drinkers in each sample in order to perform this

    test.

    3. z-test for single mean is used to test a hypothesis on a specific value of the population mean.

    Statistically speaking, we test the null hypothesis H0: μ = μ0 against the alternative hypothesis
    H1: μ ≠ μ0, where μ is the population mean and μ0 is a specific value of the population mean that we
    would like to test for acceptance.

    Unlike the t-test for single mean, this test is used if n ≥ 30 and the population standard deviation is
    known.

    4. z test for single variance is used to test a hypothesis on a specific value of the population variance.

    Statistically speaking, we test the null hypothesis H0: σ² = σ0² against H1: σ² ≠ σ0², where σ² is the
    population variance and σ0² is a specific value of the population variance that we would like to test
    for acceptance.

    In other words, this test enables us to test if the given sample has been drawn from a population

    with specific variance σ0². Unlike the chi square test for single variance, this test is used if n ≥ 30.

    5. Z-test for testing equality of variance is used to test the hypothesis of equality of two population variances when the sample size of each sample is 30 or larger.

    Example:

    n = sample size

    For example suppose a person wants to test if both tea & coffee are equally popular in a

    particular town. Then he can take a sample of size say 500 from the town out of which suppose

    280 are tea drinkers. To test the hypothesis, he can use Z-test.

    Assumption:

    Irrespective of the type of Z-test used, it is assumed that the populations from which the

    samples are drawn are normal.
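    A minimal sketch of the single-proportion Z-test applied to the tea-drinker example (sample of 500
    with 280 tea drinkers, H0: p = 0.5). The formula is the standard large-sample one; the SciPy import
    is an assumption.

        from scipy.stats import norm

        n, x, p0 = 500, 280, 0.5
        p_hat = x / n
        z = (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5   # ~2.68
        p_two_sided = 2 * (1 - norm.cdf(abs(z)))        # ~0.007, so H0: p = 0.5 would be rejected at 5%
        print(round(z, 2), round(p_two_sided, 4))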

    b. T-test

    The Student's t-test is a statistical method that is used to see if two sets of data differ
    significantly. The method assumes that the test statistic follows a Student's t-distribution
    if the null hypothesis is true. This null hypothesis will usually stipulate
    that there is no significant difference between the means of the two data sets.

    It is best used to try and determine whether there is a difference between two independent

    sample groups. For the test to be applicable, the sample groups must be completely independent,

    and it is best used when the sample size is too small to use more advanced methods.

    Before using this type of test it is essential to plot the sample data from the two samples and
    make sure that it has a reasonably normal distribution, or the Student's t-test will not be suitable.

    It is also desirable to randomly assign samples to the groups, wherever possible.

    Restrictions:

    The two sample groups being tested must have a reasonably normal distribution. If the

    distribution is skewed, then the Student's t-test is likely to give misleading results. The

    distribution should have only one main peak (= mode) near the mean of the group.

    If the data do not adhere to the above requirements, then either a large data sample is needed or,

    preferably, a more complex form of data analysis should be used.

    Results:

    The Student's t-test can let you know if there is a significant difference in the means of the two

    sample groups and disprove the null hypothesis. Like all statistical tests, it cannot prove anything, as

    there is always a chance of experimental error occurring. But the test can support a hypothesis.

    However, it is still useful for measuring small sample populations and determining if there is a

    significant difference between the groups.

    Example:

    You might be trying to determine if there is a significant difference in test scores between

    two groups of children taught by different methods.

    The null hypothesis might state that there is no significant difference in the mean test scores of

    the two sample groups and that any difference is down to chance.

    The Student's t-test can then be used to try to disprove the null hypothesis.
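    A short sketch of the two-group comparison described in this example, using SciPy's
    independent-samples t test. The scores below are made-up illustrative data, not from the text.

        from scipy.stats import ttest_ind

        method_a = [72, 85, 78, 90, 66, 81, 75, 88]   # hypothetical scores, teaching method A
        method_b = [70, 74, 68, 79, 65, 72, 71, 77]   # hypothetical scores, teaching method B
        t_stat, p_value = ttest_ind(method_a, method_b)
        print(round(t_stat, 2), round(p_value, 4))    # a small p-value argues against the null hypothesis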

    c. Correlation and Regression

    Correlation Types

    Correlation is a measure of association between two variables. The variables are not designated

    as dependent or independent. The two most popular correlation coefficients are: Spearman's

    correlation coefficient rho and Pearson's product-moment correlation coefficient.

    When calculating a correlation coefficient for ordinal data, select Spearman's technique. For

    interval or ratio-type data, use Pearson's technique.

    The value of a correlation coefficient can vary from minus one to plus one. A minus one

    indicates a perfect negative correlation, while a plus one indicates a perfect positive correlation.

    A correlation of zero means there is no relationship between the two variables. When there is a

    negative correlation between two variables, as the value of one variable increases, the value of

    the other variable decreases, and vice versa. In other words, for a negative correlation, the

    variables work opposite each other. When there is a positive correlation between two variables,

    as the value of one variable increases, the value of the other variable also increases. The

    variables move together.

    The standard error of a correlation coefficient is used to determine the confidence intervals

    around a true correlation of zero. If your correlation coefficient falls outside of this range, then it

    is significantly different than zero. The standard error can be calculated for interval or ratio-type

    data (i.e., only for Pearson's product-moment correlation).

    The significance (probability) of the correlation coefficient is determined from the t-statistic. The

    probability of the t-statistic indicates whether the observed correlation coefficient occurred by

    chance if the true correlation is zero. In other words, it asks if the correlation is significantly

    different than zero. When the t-statistic is calculated for Spearman's rank-difference correlation

    coefficient, there must be at least 30 cases before the t-distribution can be used to determine the

    probability. If there are fewer than 30 cases, you must refer to a special table to find the

    probability of the correlation coefficient.

    Example:

    A company wanted to know if there is a significant relationship between the total number of

    salespeople and the total number of sales. They collect data for five months.

    Variable 1    Variable 2
    207           6907
    180           5991
    220           6810
    205           6553
    190           6190
    --------------------------------
    Correlation coefficient = .921
    Standard error of the coefficient = .068
    t-test for the significance of the coefficient = 4.100
    Degrees of freedom = 3
    Two-tailed probability = .0263
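    The figures above can be re-checked in a few lines (SciPy assumed): pearsonr returns the correlation
    coefficient and the two-tailed probability directly.

        from scipy.stats import pearsonr

        salespeople = [207, 180, 220, 205, 190]
        sales = [6907, 5991, 6810, 6553, 6190]
        r, p = pearsonr(salespeople, sales)
        print(round(r, 3), round(p, 4))  # r ~ .921 with two-tailed p ~ .026, matching the output above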

    Another Example:

    Respondents to a survey were asked to judge the quality of a product on a four-point Likert scale

    (excellent, good, fair, poor). They were also asked to judge the reputation of the company that

    made the product on a three-point scale (good, fair, poor). Is there a significant relationship

    between respondents' perceptions of the company and their perceptions of the quality of the product?

    Since both variables are ordinal, Spearman's method is chosen. The first variable is the rating for

    the quality of the product. Responses are coded as 4=excellent, 3=good, 2=fair, and 1=poor. The

    second variable is the perceived reputation of the company and is coded 3=good, 2=fair, and

    1=poor.

    Variable 1    Variable 2
    4             3
    2             2
    1             2
    3             3
    4             3
    1             1
    2             1
    -------------------------------------------

    Correlation coefficient rho = .830

    t-test for the significance of the coefficient = 3.332

    Number of data pairs = 7

    Probability must be determined from a table because of the small sample size.
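    For the survey data above, Spearman's rho can be computed as follows (SciPy assumed). Note that
    spearmanr applies a tie correction, so its value (about .82) differs slightly from the .830 reported
    above, which comes from the simple rank-difference formula without a tie correction.

        from scipy.stats import spearmanr

        quality = [4, 2, 1, 3, 4, 1, 2]       # product quality ratings
        reputation = [3, 2, 2, 3, 3, 1, 1]    # company reputation ratings
        rho, p = spearmanr(quality, reputation)
        print(round(rho, 3), round(p, 4))     # with only 7 pairs, the p-value should be read with caution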

    Regression

    Simple regression is used to examine the relationship between one dependent and one

    independent variable. After performing an analysis, the regression statistics can be used to

    predict the dependent variable when the independent variable is known. Regression goes beyond

    correlation by adding prediction capabilities.

    People use regression on an intuitive level every day. In business, a well-dressed man is

    thought to be financially successful. A mother knows that more sugar in her children's diet

    results in higher energy levels. The ease of waking up in the morning often depends on how late

    you went to bed the night before. Quantitative regression adds precision by developing a

    mathematical formula that can be used for predictive purposes.

    For example, a medical researcher might want to use body weight (independent variable)

    to predict the most appropriate dose for a new drug (dependent variable). The purpose of running

    the regression is to find a formula that fits the relationship between the two variables. Then you

    can use that formula to predict values for the dependent variable when only the independent

    variable is known. A doctor could prescribe the proper dose based on a person's body weight.

    The regression line (known as the least squares line) is a plot of the expected value of the

    dependent variable for all values of the independent variable. Technically, it is the line that

    "minimizes the squared residuals". The regression line is the one that best fits the data on a

    scatterplot.

    Using the regression equation, the dependent variable may be predicted from the

    independent variable. The slope of the regression line (b) is defined as the rise divided by the

    run. The y intercept (a) is the point on the y axis where the regression line would intercept the y

    axis. The slope and y intercept are incorporated into the regression equation. The intercept is

    usually called the constant, and the slope is referred to as the coefficient. Since the regression

    model is usually not a perfect predictor, there is also an error term in the equation.

    In the regression equation, y is always the dependent variable and x is always the

    independent variable. Here are three equivalent ways to mathematically describe a linear

    regression model.

    y = intercept + (slope × x) + error

    y = constant + (coefficient × x) + error

    y = a + bx + e

    The significance of the slope of the regression line is determined from the t-statistic. It is

    the probability that the observed correlation coefficient occurred by chance if the true correlation

    is zero. Some researchers prefer to report the F-ratio instead of the t-statistic. The F-ratio is equal

    to the t-statistic squared.

    The t-statistic for the significance of the slope is essentially a test to determine if the

    regression model (equation) is usable. If the slope is significantly different than zero, then we

    can use the regression model to predict the dependent variable for any value of the independent

    variable.

    On the other hand, take an example where the slope is zero. It has no prediction ability

    because for every value of the independent variable, the prediction for the dependent variable

    would be the same. Knowing the value of the independent variable would not improve our ability

    to predict the dependent variable. Thus, if the slope is not significantly different than zero, don't

    use the model to make predictions.

    The coefficient of determination (r-squared) is the square of the correlation coefficient.

    Its value may vary from zero to one. It has the advantage over the correlation coefficient in that it

    may be interpreted directly as the proportion of variance in the dependent variable that can be

    accounted for by the regression equation. For example, an r-squared value of .49 means that 49%

    of the variance in the dependent variable can be explained by the regression equation. The other

    51% is unexplained.

    The standard error of the estimate for regression measures the amount of variability in the points

    around the regression line. It is the standard deviation of the data points as they are distributed

    around the regression line. The standard error of the estimate can be used to develop confidence

    intervals around a prediction.

    Example:

    A company wants to know if there is a significant relationship between its advertising

    expenditures and its sales volume. The independent variable is advertising budget and the

    dependent variable is sales volume. A lag time of one month will be used because sales are

    expected to lag behind actual advertising expenditures. Data was collected for a six month

    period. All figures are in thousands of dollars. Is there a significant relationship between

    advertising budget and sales volume?

    Indep. Var.    Depen. Var.
    4.2            27.1
    6.1            30.4
    3.9            25.0
    5.7            29.7
    7.3            40.1
    5.9            28.8
    --------------------------------------------------
    Model: y = 10.079 + (3.700 × x) + error

    Standard error of the estimate = 2.568

    t-test for the significance of the slope = 4.095

    Degrees of freedom = 4

    Two-tailed probability = .0149

    r-squared = .807

    You might make a statement in a report like this: A simple linear regression was performed

    on six months of data to determine if there was a significant relationship between advertising

    expenditures and sales volume. The t-statistic for the slope was significant at the .05 critical

    alpha level, t(4)=4.10, p=.015. Thus, we reject the null hypothesis and conclude that there was a

    positive significant relationship between advertising expenditures and sales volume.

    Furthermore, 80.7% of the variability in sales volume could be explained by advertising

    expenditures.
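    The reported fit can be re-estimated with SciPy's linregress (an assumed library choice); small
    differences from the figures above would simply reflect rounding in the source.

        from scipy.stats import linregress

        advertising = [4.2, 6.1, 3.9, 5.7, 7.3, 5.9]   # independent variable (thousands of dollars)
        sales = [27.1, 30.4, 25.0, 29.7, 40.1, 28.8]   # dependent variable (thousands of dollars)
        fit = linregress(advertising, sales)
        print(round(fit.intercept, 3), round(fit.slope, 3))      # constant and coefficient of the model
        print(round(fit.rvalue ** 2, 3), round(fit.pvalue, 4))   # r-squared and two-tailed p for the slope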

    d. Analysis of Variance

    An important technique for analyzing the effect of categorical factors on a response is to

    perform an Analysis of Variance. An ANOVA decomposes the variability in the response

    variable amongst the different factors. Depending upon the type of analysis, it may be important

    to determine: (a) which factors have a significant effect on the response, and/or (b) how much of

    the variability in the response variable is attributable to each factor.

    STATGRAPHICS Centurion provides several procedures for performing an analysis of variance:

    1. One-Way ANOVA - used when there is only a single categorical factor. This is equivalent to

    comparing multiple groups of data.

    2. Multifactor ANOVA - used when there is more than one categorical factor, arranged in a

    crossed pattern. When factors are crossed, the levels of one factor appear at more than one level

    of the other factors.

    3. Variance Components Analysis - used when there are multiple factors, arranged in a

    hierarchical manner. In such a design, each factor is nested in the factor above it.

    4. General Linear Models - used whenever there are both crossed and nested factors, when some

    factors are fixed and some are random, and when both categorical and quantitative factors are

    present.

    One-Way ANOVA

    A one-way analysis of variance is used when the data are divided into groups according

    to only one factor. The questions of interest are usually: (a) Is there a significant difference

    between the groups?, and (b) If so, which groups are significantly different from which others?

    Statistical tests are provided to compare group means, group medians, and group standard

    deviations. When comparing means, multiple range tests are used, the most popular of which is

    Tukey's HSD procedure. For equal size samples, significant group differences can be determined

    by examining the means plot and identifying those intervals that do not overlap.
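    A minimal Python sketch of a one-way ANOVA (SciPy assumed, separate from the STATGRAPHICS
    procedures described above); the three groups below are made-up data for a single categorical factor
    with three levels.

        from scipy.stats import f_oneway

        group_1 = [23.1, 24.8, 22.9, 25.0, 23.7]   # hypothetical responses, level 1
        group_2 = [26.2, 27.0, 25.8, 26.9, 27.4]   # hypothetical responses, level 2
        group_3 = [22.0, 21.5, 23.3, 22.8, 21.9]   # hypothetical responses, level 3
        f_stat, p_value = f_oneway(group_1, group_2, group_3)
        print(round(f_stat, 2), round(p_value, 4))  # small p-value: at least one group mean differs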

    Multifactor ANOVA

    When more than one factor is present and the factors are crossed, a multifactor ANOVA

    is appropriate. Both main effects and interactions between the factors may be estimated. The

    output includes an ANOVA table and a new graphical ANOVA from the latest edition of

    Statistics for Experimenters by Box, Hunter and Hunter (Wiley, 2005). In a graphical ANOVA,

    the points are scaled so that any levels that differ by more than the variation exhibited in the distribution of the
    residuals are significantly different.

    Variance Components Analysis

    A Variance Components Analysis is most commonly used to determine the level at which

    variability is being introduced into a product. A typical experiment might select several batches,

    several samples from each batch, and then run replicates tests on each sample. The goal is to

    determine the relative percentages of the overall process variability that is being introduced at

    each level.

    General Linear Model

    The General Linear Models procedure is used whenever the above procedures are not

    appropriate. It can be used for models with both crossed and nested factors, models in which one

    or more of the variables is random rather than fixed, and when quantitative factors are to be

    combined with categorical ones. Designs that can be analyzed with the GLM procedure include

    partially nested designs, repeated measures experiments, split plots, and many others. For

    example, pages 536-540 of the book Design and Analysis of Experiments (sixth edition) by

    Douglas Montgomery (Wiley, 2005) contains an example of an experimental design with both

    crossed and nested factors. For that data, the GLM procedure produces several important tables,

    including estimates of the variance components for the random factors.

    e. Chi-Square Test

    Any statistical test that uses the chi square distribution can be called a chi square test. It is
    applicable both for large and small samples, depending on the context.

    There are different types of chi square test, each for a different purpose. Some of the popular

    types are outlined below.

    Chi square test for testing goodness of fit is used to decide whether there is any difference between the observed (experimental) value and the expected (theoretical) value.

    For example given a sample, we may like to test if it has been drawn from a normal population.

    This can be tested using chi square goodness of fit procedure.

    Chi square test for independence of two attributes. Suppose N observations are considered and classified according to two characteristics, say A and B. We may be interested to test

    whether the two characteristics are independent. In such a case, we can use Chi square test

    for independence of two attributes.

    The example considered below, testing for independence of success in the English test vis-à-vis

    immigrant status is a case fit for analysis using this test.

    Chi square test for single variance is used to test a hypothesis on a specific value of the
    population variance. Statistically speaking, we test the null hypothesis H0: σ² = σ0² against the
    research hypothesis H1: σ² ≠ σ0², where σ² is the population variance and σ0² is a specific value of
    the population variance that we would like to test for acceptance.

    In other words, this test enables us to test if the given sample has been drawn from a

    population with specific variance σ0². This is a small sample test to be used only if the sample size is

    less than 30 in general.

    Example:

    For example suppose a person wants to test the hypothesis that success rate in a particular

    English test is similar for indigenous and immigrant students.

    If we take a random sample of, say, 80 students and measure both
    indigenous/immigrant status as well as success/failure status of each student, the chi square test

    can be applied to test the hypothesis.
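    A hedged sketch of the test of independence for this example. The counts in the 2x2 table are
    invented for illustration (they total 80 as in the text); only the procedure follows the description
    above.

        from scipy.stats import chi2_contingency

        #            pass  fail
        table = [[28, 12],        # indigenous students (hypothetical counts)
                 [24, 16]]        # immigrant students  (hypothetical counts)
        chi2, p, dof, expected = chi2_contingency(table)
        print(round(chi2, 3), round(p, 4), dof)  # a large p-value: no evidence the attributes are related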

    Assumptions:

    The Chi square test for single variance has an assumption that the population from which

    the sample has been drawn is normal. This normality assumption need not hold for the chi square goodness

    of fit test and test for independence of attributes.

    However, while implementing these two tests, one has to ensure that the expected frequency

    in any cell is not less than 5. If it is less, then the cell has to be pooled with the preceding or succeeding

    cell so that expected frequency of the pooled cell is at least 5.

    Non-Parametric and Distribution Free:

    It has to be noted that the Chi square goodness of fit test and test for independence of

    attributes depend only on the set of observed and expected frequencies and degrees of freedom.

    These two tests do not need any assumption regarding distribution of the parent population from

    which the samples are taken.

    Since these tests do not involve any population parameters or characteristics, they are

    also termed as non parametric or distribution free tests. An additional important fact on these two

    tests is that they are sample-size independent and can be used for any sample size as long as the

    assumption on minimum expected cell frequency is met.

    4) Types of Non-Parametric Test

    Basically, there is at least one nonparametric equivalent for each parametric general type of test.

    In general, these tests fall into the following categories:

    - Tests of differences between groups (independent samples);
    - Tests of differences between variables (dependent samples);
    - Tests of relationships between variables.

    Differences between independent groups. Usually, when we have two samples that we want to

    compare concerning their mean value for some variable of interest, we would use the t-test for

    independent samples; nonparametric alternatives for this test are the Wald-Wolfowitz runs test, the

    Mann-Whitney U test, and the Kolmogorov-Smirnov two-sample test. If we have multiple groups,

    we would use analysis of variance (see ANOVA/MANOVA); the nonparametric equivalents to this

    method are the Kruskal-Wallis analysis of ranks and the Median test.
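    As an illustration of one of the alternatives just listed, a Mann-Whitney U test for two independent
    groups (SciPy assumed; the measurements are made up):

        from scipy.stats import mannwhitneyu

        group_a = [3.1, 2.7, 4.0, 3.6, 2.9, 3.8]   # hypothetical measurements, group A
        group_b = [2.2, 2.5, 3.0, 2.1, 2.6, 2.4]   # hypothetical measurements, group B
        u_stat, p_value = mannwhitneyu(group_a, group_b, alternative="two-sided")
        print(u_stat, round(p_value, 4))           # small p suggests the two distributions differ in location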

    Differences between dependent groups. If we want to compare two variables measured in the same

    sample we would customarily use the t-test for dependent samples (in Basic Statistics for example, if

    we wanted to compare students' math skills at the beginning of the semester with their skills at the

    end of the semester). Nonparametric alternatives to this test are the Sign test and Wilcoxon's matched

    pairs test. If the variables of interest are dichotomous in nature (i.e., "pass" vs. "no pass") then

    McNemar's Chi-square test is appropriate. If there are more than two variables that were measured in

    the same sample, then we would customarily use repeated measures ANOVA. Nonparametric

    alternatives to this method are Friedman's two-way analysis of variance and Cochran Q test (if the

    variable was measured in terms of categories, e.g., "passed" vs. "failed"). Cochran Q is particularly

    useful for measuring changes in frequencies (proportions) across time.

    Relationships between variables. To express a relationship between two variables, one usually

    computes the correlation coefficient. Nonparametric equivalents to the standard correlation

    coefficient are Spearman R, Kendall Tau, and coefficient Gamma (see Nonparametric correlations).

    If the two variables of interest are categorical in nature (e.g., "passed" vs. "failed" by "male" vs.

    "female") appropriate nonparametric statistics for testing the relationship between the two variables

    are the Chi-square test, the Phi coefficient, and the Fisher exact test. In addition, a simultaneous test

    for relationships between multiple cases is available: Kendall coefficient of concordance. This test is

    often used for expressing inter-rater agreement among independent judges who are rating (ranking)

    the same stimuli.

    Descriptive statistics. When one's data are not normally distributed, and the measurements at best

    contain rank order information, then computing the standard descriptive statistics (e.g., mean,

    standard deviation) is sometimes not the most informative way to summarize the data. For example,

    in the area of psychometrics it is well known that the rated intensity of a stimulus (e.g., perceived

    brightness of a light) is often a logarithmic function of the actual intensity of the stimulus (brightness

    as measured in objective units of Lux). In this example, the simple mean rating (sum of ratings

    divided by the number of stimuli) is not an adequate summary of the average actual intensity of the

    stimuli. (In this example, one would probably rather compute the geometric mean.) Nonparametrics

    and Distributions will compute a wide variety of measures of location (mean, median, mode, etc.)

    and dispersion (variance, average deviation, quartile range, etc.) to provide the "complete picture" of

    one's data.

    When to Use Which Method

    It is not easy to give simple advice concerning the use of nonparametric procedures. Each

    nonparametric procedure has its peculiar sensitivities and blind spots. For example, the Kolmogorov-

    Smirnov two-sample test is not only sensitive to differences in the location of distributions (for

    example, differences in means) but is also greatly affected by differences in their shapes. The

    Wilcoxon matched pairs test assumes that one can rank order the magnitude of differences in

    matched observations in a meaningful manner. If this is not the case, one should rather use the Sign

    test. In general, if the result of a study is important (e.g., does a very expensive and painful drug

    therapy help people get better?), then it is always advisable to run different nonparametric tests;

    should discrepancies in the results occur contingent upon which test is used, one should try to

    understand why some tests give different results. On the other hand, nonparametric statistics are less

    statistically powerful (sensitive) than their parametric counterparts, and if it is important to detect

    even small effects (e.g., is this food additive harmful to people?) one should be very careful in the

    choice of a test statistic.

    Large data sets and nonparametric methods. Nonparametric methods are most appropriate when the

    sample sizes are small. When the data set is large (e.g., n > 100) it often makes little sense to use

    nonparametric statistics at all. Elementary Concepts briefly discusses the idea of the central limit

    theorem. In a nutshell, when the samples become very large, then the sample means will follow the

    normal distribution even if the respective variable is not normally distributed in the population, or is

    not measured very well. Thus, parametric methods, which are usually much more sensitive (i.e.,

    have more statistical power) are in most cases appropriate for large samples. However, the tests of

    significance of many of the nonparametric statistics described here are based on asymptotic (large

    sample) theory; therefore, meaningful tests can often not be performed if the sample sizes become

    too small. Please refer to the descriptions of the specific tests to learn more about their power and

    efficiency.

    Nonparametric Correlations

    The following are three types of commonly used nonparametric correlation coefficients

    (Spearman R, Kendall Tau, and Gamma coefficients). Note that the chi-square statistic computed for

    two-way frequency tables, also provides a careful measure of a relation between the two (tabulated)

    variables, and unlike the correlation measures listed below, it can be used for variables that are

    measured on a simple nominal scale.

    Spearman R. Spearman R (Siegel & Castellan, 1988) assumes that the variables under

    consideration were measured on at least an ordinal (rank order) scale, that is, that the individual

    observations can be ranked into two ordered series. Spearman R can be thought of as the regular

    Pearson product moment correlation coefficient, that is, in terms of proportion of variability

    accounted for, except that Spearman R is computed from ranks.

    Kendall tau. Kendall tau is equivalent to Spearman R with regard to the underlying assumptions. It

    is also comparable in terms of its statistical power. However, Spearman R and Kendall tau are

    usually not identical in magnitude because their underlying logic as well as their computational

    formulas are very different. Siegel and Castellan (1988) express the relationship of the two measures

    in terms of the inequality given below. More importantly, Kendall tau and Spearman R imply different

    interpretations: Spearman R can be thought of as the regular Pearson product moment correlation

    coefficient, that is, in terms of proportion of variability accounted for, except that Spearman R is

    computed from ranks. Kendall tau, on the other hand, represents a probability, that is, it is the

    difference between the probability that in the observed data the two variables are in the same order

    versus the probability that the two variables are in different orders.

    -1 ≤ 3 × Kendall tau - 2 × Spearman R ≤ 1

    Gamma. The Gamma statistic (Siegel & Castellan, 1988) is preferable to Spearman R or Kendall

    tau when the data contain many tied observations. In terms of the underlying assumptions, Gamma is

    equivalent to Spearman R or Kendall tau; in terms of its interpretation and computation it is more

    similar to Kendall tau than Spearman R. In short, Gamma is also a probability; specifically, it is

    computed as the difference between the probability that the rank ordering of the two variables agree

    minus the probability that they disagree, divided by 1 minus the probability of ties. Thus, Gamma is

    basically equivalent to Kendall tau, except that ties are explicitly taken into account.