Mathematics

Hypothesis Test for Correlation

A hypothesis test for correlation is a statistical method used to determine whether there is a significant linear relationship between two variables. It tests the null hypothesis that there is no correlation against the alternative hypothesis that there is a correlation, and it produces a p-value that indicates the strength of evidence against the null hypothesis.

Written by Perlego with AI-assistance

7 Key excerpts on "Hypothesis Test for Correlation"

Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.
  • Mathematics for Enzyme Reaction Kinetics and Reactor Performance
    • F. Xavier Malcata(Author)
    • 2020(Publication Date)
    • Wiley
      (Publisher)
    19 Statistical Hypothesis Testing
    A statistical hypothesis is a hypothesis that is testable, on the basis of observing a process modeled by a set of random variables that follow a probability distribution known a priori. Hypothesis testing (or confirmatory data analysis) is a method of statistical inference that resorts to tests of significance to determine the probability that a statement is true, and at what likelihood such a statement may be accepted as true. Usually, two datasets are compared, or data obtained from sampling is compared against synthetic data produced via an idealized model; a working hypothesis is then put forward for the statistical relationship between the data, versus a complementary hypothesis.
    Therefore, the basic process of hypothesis testing consists of four sequential steps: (i) formulate the null hypothesis, H0 (commonly stated as if the observations are the result of pure chance), and the alternative hypothesis, H1 (commonly stated as if the observations unfold an actual effect, along with an unavoidable component of variation by chance); (ii) identify an appropriate test statistic that can be used to assess the truth of the null hypothesis (dependent on the nature of the data and of the test); (iii) compute the P‐value, or associated probability that a test statistic at least as significant as the one determined from the sample data would be obtained, assuming that the null hypothesis held true (the smaller said probability, the stronger the evidence against the null hypothesis); and (iv) compare the P‐value with an acceptable significance level, α (or threshold of probability, often 1% or 5%) – if P ≤ α, then the observed effect is statistically significant, so the null hypothesis is ruled out, and the alternative hypothesis is concomitantly accepted as valid.
    The probability of rejecting the null hypothesis (H0, expressed as an equality, =) is a typical function of five factors – whether the test is two‐tailed (i.e. H1 is expressed as ≠) or one‐tailed (i.e. H1 is expressed as either < or >), the value of α, the intrinsic variance of the data, the amount of deviation from H0, and the size of the sample. This rationale is illustrated in Fig. 19.1. Decision on whether or not to accept the null hypothesis is based on the actual value taken by the test statistic – the distribution of which is specified on the assumption that H0 holds true, and typically follows one of the continuous distributions discussed so far; if said value is very unlikely, then H0 should be rejected and H1 concomitantly accepted. In order to take this decision on a quantitative basis, a probability value α must be chosen a priori – below which H0 becomes sufficiently unlikely to be (safely) rejected; the portion(s) of the area below the probability density function curve accounting for α will be continuous whenever a unilateral test is at stake, or else split (equally) in half in the case of a bilateral test – with the applicable case being determined by the nature of H1
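    To make steps (i)–(iv) concrete for the correlation case, here is a minimal Python sketch (assuming NumPy and SciPy are available; the simulated data and variable names are purely illustrative, not taken from the text):

```python
# Minimal sketch of the four-step procedure for a correlation test.
# Data and variable names are illustrative only; requires NumPy and SciPy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=30)
y = 0.5 * x + rng.normal(size=30)        # simulated data with a real linear relationship

# (i) H0: rho = 0 versus H1: rho != 0 (two-tailed)
# (ii) test statistic: t = r * sqrt(n - 2) / sqrt(1 - r**2)
n = len(x)
r = np.corrcoef(x, y)[0, 1]
t_stat = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)

# (iii) P-value: probability of a statistic at least this extreme under H0,
#       from Student's t with n - 2 degrees of freedom
p_two_tailed = 2 * stats.t.sf(abs(t_stat), df=n - 2)

# (iv) compare with the significance level alpha
alpha = 0.05
print(f"r = {r:.3f}, t = {t_stat:.2f}, two-tailed P = {p_two_tailed:.4f}")
print("reject H0" if p_two_tailed <= alpha else "fail to reject H0")
```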
  • Correlation and Regression

    Applications for Industrial Organizational Psychology and Management

    Chapter I ). In fact, testing Pearson product moment correlations for significance is relatively straightforward, although the procedures depend upon the type of hypothesis being tested.
    TESTING A SINGLE CORRELATION AGAINST ZERO
    The hypothesis H0 : ρ = 0, where ρ is the underlying population correlation, is by far the most frequently tested correlational hypothesis. In fact, many researchers run around asking each other, “Was your r significant?” This is researcher shorthand for, “Was your sample value of r significantly different from the hypothesized value of ρ = 0?” or “Assuming the population correlation was 0, was your sample value of r so far away from 0 that it was not likely to have occurred by chance?”
    Notationally, let ρ be the value of the population correlation, let r be our old friend the sample correlation, and let n be the sample size. Then, it can be shown that the value r√(n − 2)/√(1 − r²) has a t-distribution with (n − 2) degrees of freedom. Formally, we have
    t = r√(n − 2)/√(1 − r²), with (n − 2) degrees of freedom (Equation III.A)
    Note that the value of r is tested by computing a fairly straightforward (but nonlinear) function of r and comparing the result to well-known critical values of Student’s t-distribution. For example, suppose the predictive validity of the Scholastic Aptitude Test (SAT) is being questioned. Assume that a researcher uses a simple random sample of n = 122 and obtains a measure of success in college (Y) as well as previous SAT scores (X) for each individual. Suppose the resulting correlation is r = .27. Then, Equation III.A gives t = .27√120/√(1 − .27²) ≈ 3.07.
    The obtained sample value of t = 3.07 is larger than the two-sided, p = .05 critical t -value of 1.980 (see Appendix, Table A.1 ). Therefore, this correlation is “significant.” More correctly stated, we have rejected the null hypothesis (that the underlying value of ρ is 0) in favor of the alternative hypothesis that ρ is nonzero (but you’ll usually just hear the cry, “My r is significant!”). [By the way, the same value of 3.07 is statistically significant at the p
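    The SAT example above can be checked numerically. The following sketch (an illustration only, assuming SciPy is available) plugs r = .27 and n = 122 into Equation III.A and compares the result with the two-sided critical t-value:

```python
# Reproducing the SAT worked example: r = .27, n = 122 (illustrative sketch).
import math
from scipy import stats

r, n = 0.27, 122
t = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)    # Equation III.A
t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - 2)      # two-sided critical value at p = .05

print(f"t = {t:.2f}, critical value = {t_crit:.3f}")   # t ≈ 3.07 > 1.980
print("reject H0: rho = 0" if abs(t) > t_crit else "fail to reject H0")
```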
  • Research Methods for Public Administrators
    • Gary Rassel, Suzanne Leland, Zachary Mohr, Elizabethann O'Sullivan(Authors)
    • 2020(Publication Date)
    • Routledge
      (Publisher)
    Tests of statistical significance allow a researcher to determine the probability that variables related in a random sample are not related in the population. A test of statistical significance cannot indicate that a relationship is strong or important; it may not even indicate if the direction is as hypothesized. Nor does a finding of statistical significance imply that other parts of the research were carried out correctly. The test only makes a statistical statement about the nature of a relationship.
    To carry out a test of statistical significance, a researcher
    1. states the null and the research hypotheses
    2. selects an alpha level
    3. selects and computes a test statistic
    4. makes a decision
    Figure 12.4 illustrates these steps.
    Figure 12.4: Testing Statistical Significance of a Relationship
    The null hypothesis postulates that an independent and a dependent variable are not related. For some statistical tests, the null hypothesis may state that the relationship goes in a certain direction or does not exceed a specific value.
    In hypothesis testing, a researcher runs the risk of making two errors. First, he may reject a null hypothesis that is actually true. The researcher, then, has “accepted” an untrue research hypothesis. This error is called a Type I error. Second, the researcher may accept a null hypothesis that is untrue. The researcher has failed to accept a true research hypothesis; that is called a Type II error.
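    A small simulation can make the Type I error concrete. The sketch below (illustrative only; it is not taken from the text) repeatedly tests a correlation when the null hypothesis is actually true, so every rejection is a Type I error, and the long-run rejection rate should be close to the chosen alpha level:

```python
# Illustrative simulation: H0 is true (the variables are unrelated), so every
# rejection is a Type I error; their rate should be close to alpha.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha, n, trials = 0.05, 30, 5000

false_rejections = 0
for _ in range(trials):
    x = rng.normal(size=n)
    y = rng.normal(size=n)              # independent of x, so rho = 0
    _, p = stats.pearsonr(x, y)
    if p <= alpha:
        false_rejections += 1           # Type I error: rejecting a true H0

print(f"observed Type I error rate ≈ {false_rejections / trials:.3f} (expected ≈ {alpha})")
```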
    In practice, researchers are more likely to make a judgment about a hypothesis based on the associated probability, the sample size, the magnitude of the effect, and the practical consequences of their decisions than they are to decide on a specific alpha level set a priori. The associated probability is the probability of a specific Χ2 or t
  • Econometrics
    • K. Nirmal Ravi Kumar(Author)
    • 2020(Publication Date)
    • CRC Press
      (Publisher)
    The null hypothesis (HO ) is that:
    HO : ρ = 0 (there is no relationship between the two variables X and Y, when all population values are observed), and an alternative hypothesis (HA ) is framed against this as:
    HA : the alternative hypothesis could take any one of the following three forms, viz., (ρ ≠ 0), (ρ < 0) or (ρ > 0). That is, if the researcher has no idea whether or how the two variables are related, then a two-tailed HA , i.e. HA : (ρ ≠ 0), is formulated. If the researcher suspects, or has knowledge, that the two variables are negatively related, then HA : (ρ < 0) is formulated (i.e. a one-tailed HA ), and if the researcher predicts a positive relationship between the variables, then HA : (ρ > 0) is formulated (i.e. a one-tailed HA ).
    The test statistic for the hypothesis test is the sample (observed) correlation coefficient ‘r’ obtained from a sample. The sampling distribution of ‘r’ is approximated by a ‘t’ distribution with n − 2 Degrees of Freedom (df). The formula to compute the ‘tcal ’ value is given by:
    t = r / SE(r)     (Equation 2.14)
    where r = sample correlation coefficient, the Standard Error (SE) of r = √((1 − r²)/(n − 2)), and n = number of observations. So, considering the formula of SE(r), we can write:
    t = r√(n − 2) / √(1 − r²)
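    The following sketch applies Equation 2.14 with hypothetical values of r and n (chosen only for illustration), confirms that the two forms of the statistic agree, and obtains the two-tailed P-value from a t distribution with n − 2 df:

```python
# Sketch of Equation 2.14 with illustrative values (r = 0.45, n = 25).
import math
from scipy import stats

r, n = 0.45, 25
se_r = math.sqrt((1 - r**2) / (n - 2))                 # standard error of r
t_cal = r / se_r                                       # Equation 2.14
t_alt = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)     # equivalent closed form
assert abs(t_cal - t_alt) < 1e-9

p_two_tailed = 2 * stats.t.sf(abs(t_cal), df=n - 2)
print(f"t = {t_cal:.3f}, two-tailed P = {p_two_tailed:.4f}")
```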
  • Choosing and Using Statistics
    1. It is quite rare to find two variables that are normally distributed and therefore suitable for Pearson’s correlation. Test the data to see if they follow a normal distribution. Consider the alternatives.
    2. The statistical significance of correlation is not a good guide to the real significance of the correlation. With large sample sizes the value of r required to achieve statistical significance (i.e. to show that there is some relationship between the two variables) is rather low. It is perhaps better to use the value of r² as an indicator of the real significance, as this value shows the amount of variation in one variable explained by the other (see the short demonstration below).
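    A short demonstration with hypothetical numbers (illustrative only, not taken from the text):

```python
# Illustrative demo: with a very large sample even a tiny correlation is
# "statistically significant", yet r**2 shows it explains almost nothing.
import math
from scipy import stats

r, n = 0.03, 10_000                                    # hypothetical values
t = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)
p = 2 * stats.t.sf(abs(t), df=n - 2)

print(f"two-tailed P = {p:.4f}")                       # below 0.05, so 'significant'
print(f"r squared = {r**2:.4f}")                       # ~0.1% of the variation explained
```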
  • An example

    A marine biologist working on Adélie penguins (Pygoscelis adeliae) has measured the sizes of birds forming pairs. The measure used is the length of a bone in the leg which is known, from previous studies, to be a good indication of size. It is measured to the nearest 0.1 mm. The null hypothesis is that male size is not correlated with female size. Unfortunately data could only be collected from six pairs.
    Pair   Female   Male
    1      17.1     16.5
    2      18.5     17.4
    3      19.7     17.3
    4      16.2     16.8
    5      21.3     19.5
    6      19.6     18.3
    It is assumed that both variables are normally distributed. This data set is a little small to test this assumption, although a larger sample from the population might be more suitable (see ‘Do frequency distributions differ?’ in the previous chapter, page 72). In this case the null hypothesis is rejected, as there is a significant positive correlation between male and female size, indicating that there is positive assortative mating in this species. The value of r is 0.88 and r² is 0.77, indicating that 77% of the variation in the size of one sex is explained by the size of the other.
    SPSS This is one of the easiest tests to carry out in SPSS. Input the data in two columns and add appropriate column labels. The cases (pairs in the example) do not require a separate column. From the ‘Analyze’ menu, select ‘Correlate’ then ‘Bivariate…’. In the dialogue box move the names of the two variables into the ‘Variables:’ box and make sure that the ‘Pearson’ option and ‘Two-tailed’ are checked. Click ‘OK’.
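    The same analysis can be reproduced outside SPSS. The sketch below (an illustration using SciPy rather than the SPSS procedure described above) computes Pearson’s r and its two-tailed P-value for the six pairs:

```python
# Pearson correlation for the six penguin pairs, computed with SciPy rather
# than the SPSS procedure described in the text (illustrative only).
from scipy import stats

female = [17.1, 18.5, 19.7, 16.2, 21.3, 19.6]   # bone length, mm
male   = [16.5, 17.4, 17.3, 16.8, 19.5, 18.3]

r, p = stats.pearsonr(female, male)
print(f"r = {r:.2f}, r squared = {r**2:.2f}, two-tailed P = {p:.3f}")
# r ≈ 0.88 and P < 0.05, matching the conclusion reached in the text.
```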
  • Quantitative and Statistical Research Methods
    • William E. Martin, Krista D. Bridgmon(Authors)
    • 2012(Publication Date)
    • Jossey-Bass
      (Publisher)

    Chapter 2

    LOGICAL STEPS OF CONDUCTING QUANTITATIVE RESEARCH: HYPOTHESIS-TESTING PROCESS

    LEARNING OBJECTIVES
    • Understand the logic and purpose of the hypothesis-testing process in scientific research.
    • Identify the components and application of alternative and null hypotheses.
    • Examine the meaning of alpha level and commonly used criterion levels of alpha (α) used in research.
    • Explore the elements used to choose an appropriate statistic for use to test a null hypothesis.
    • Understand decision rules associated with rejecting and failing to reject a null hypothesis.
    • Realize the importance of including effect size and confidence interval information to further clarify making a decision concerning the null hypothesis.
    The hypothesis-testing process is a logical sequence of steps to conduct the statistical analyses in a quantitative research study. Indeed, hypothesis testing is the most widely used statistical tool in scientific research (Salsburg, 2001, p. 114). However, we must remember that no method, including obtaining statistical results from hypothesis testing, is the absolute final answer to a research problem. As Snedecor and Cochran (1967) stated, “But the basic ideas in statistics assist us in thinking clearly about the problem, provide some guidance about the conditions that must be satisfied if sound inferences are to be made, and enable us to detect many inferences that have no good logical foundation” (p. 3).

    HYPOTHESIS-TESTING PROCESS

    There are six steps of the hypothesis-testing process that provide the procedure for conducting the statistical analyses used in this book. Descriptions and key concepts are discussed for each hypothesis step.
    1. Establish the alternative (research) hypothesis (Ha ).
    An alternative (research) hypothesis (Ha ) is a speculative statement about the relations between two or more variables used in a quantitative research study (Kerlinger & Lee, 2000). A researcher initially develops one or more research hypotheses about the direction and expected results of a study. In experimental and quasi-experimental research, the variables stated in an alternative hypothesis reflect the changes in an outcome (dependent variable) that can be attributed to a cause (independent variable) (Martin & Bridgmon, 2009). Researchers focusing on the predictive relationships among variables often use the terms predictor variable (independent) and criterion variable
  • Medical Statistics

    A Textbook for the Health Sciences

    • Stephen J. Walters, Michael J. Campbell, David Machin(Authors)
    • 2020(Publication Date)
    • Wiley-Blackwell
      (Publisher)
    6 Hypothesis Testing, P‐values and Statistical Inference
    6.1 Introduction
    6.2 The Null Hypothesis
    6.3 The Main Steps in Hypothesis Testing
    6.4 Using Your P-value to Make a Decision About Whether to Reject, or Not Reject, Your Null Hypothesis
    6.5 Statistical Power
    6.6 One-sided and Two-sided Tests
    6.7 Confidence Intervals (CIs)
    6.8 Large Sample Tests for Two Independent Means or Proportions
    6.9 Issues with P-values
    6.10 Points When Reading the Literature
    6.11 Exercises

    Summary

    The main aim of statistical analysis is to use the information gained from a sample of individuals to make inferences or form judgements about the parameters (e.g. the mean) of a population of interest. This chapter will discuss two of the basic approaches to statistical analysis: estimation (with confidence intervals (CIs )) and hypothesis testing (with P‐values). The concepts of the null hypothesis, statistical significance, the use of statistical tests, P‐values and their relationship to CIs are introduced. The difficulties with the use and mis‐interpretation of P‐values are discussed.
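    To connect the two approaches in the correlation setting specifically, the sketch below uses the standard Fisher z-transformation (a method not described in this excerpt; the values of r and n are hypothetical) to obtain both a P-value and a 95% confidence interval for a sample correlation:

```python
# Illustrative sketch: P-value and 95% CI for a sample correlation, using the
# standard Fisher z-transformation (not described in this excerpt).
import math
from scipy import stats

r, n = 0.45, 60                      # hypothetical sample correlation and size

# Hypothesis test: t statistic with n - 2 degrees of freedom
t = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)
p = 2 * stats.t.sf(abs(t), df=n - 2)

# 95% CI: transform r, build the interval on the z scale, transform back
z = math.atanh(r)                    # Fisher z-transform
se = 1 / math.sqrt(n - 3)
lo, hi = math.tanh(z - 1.96 * se), math.tanh(z + 1.96 * se)

print(f"P = {p:.4f}, 95% CI for rho: ({lo:.2f}, {hi:.2f})")
```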

    6.1 Introduction

    We have seen that, in sampling from a population which can be assumed to have a Normal distribution, the sample mean x̄ can be regarded as estimating the corresponding population mean μ. Similarly, s² estimates the population variance, σ². We therefore describe the distribution of the population with the information given by the sample statistics x̄ and s²