Correlation Coefficient for One and Two Samples in Hypothesis Testing

**Highlights:**

- Correlation.
- Correlation Coefficient.
- Correlation Coefficient For One Sample.
- Correlation Coefficient For Two Samples.

A sample is a subset of the population that a researcher is investigating. It can also be seen as the part of the population that serves as the researcher’s respondents. When the need arises to check the relationship between variables of interest to the researcher, correlation is used.

Correlation is a statistical technique for determining the relationship between two or more variables of interest. This is achieved with the help of the correlation coefficient.

The correlation coefficient is a numerical value that shows the direction and magnitude of the linear relationship between variables. It can be positive or negative, and it ranges from -1 through zero (0) to +1. A correlation coefficient of zero indicates no linear relationship between the variables of interest.

A positive correlation coefficient indicates a direct relationship: an increase in one variable is accompanied by an increase in the second, and a value of +1 indicates a perfect positive relationship. A negative correlation coefficient indicates an inverse relationship: an increase in one variable is accompanied by a decrease in the second, and a value of -1 indicates a perfect negative relationship. This work discusses the correlation coefficient for one and two samples in hypothesis testing.

**CORRELATION COEFFICIENT FOR ONE SAMPLE**

Here, we determine the extent to which a relationship exists between two variables using one sample, where the sample is meant to be representative of the population under study.

For example, there may be a relationship between the heights and weights of a group of students, or between the scores of students in two different subjects. We can use the Pearson Product Moment Correlation coefficient to determine the extent of that relationship.
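As a minimal sketch of how the Pearson coefficient can be computed from raw paired data, the function below uses the deviation-from-the-mean form of the formula. The function name `pearson_r` and the height and weight values are assumptions made up for illustration; they do not come from any study.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 items)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical heights (cm) and weights (kg), invented for illustration only.
heights = [150, 160, 165, 172, 180]
weights = [48, 55, 60, 68, 75]
print(round(pearson_r(heights, weights), 3))
```

The result always lies between -1 and +1, and its sign gives the direction of the relationship.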


**TESTING THE CORRELATION COEFFICIENT OF ONE SAMPLE USING THE t-TEST METHOD**

- State the hypotheses (null and alternative).
- Choose the appropriate level of significance (0.05).
- Compute the test statistic (t-test).
- Apply the decision rule: if the absolute value of t-calculated is greater than t-critical, we reject the null hypothesis; otherwise we fail to reject it.

To test the correlation coefficient for one sample, we use the t-test formula below:

$$t = r\sqrt{\frac{n-2}{1-r^2}}$$

where r = correlation coefficient,

n = sample size, r² = the square of the correlation coefficient (the proportion of variance explained),

n – 2 = degrees of freedom and 1 – r² = the unexplained (error) proportion of variance.
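The formula above can be sketched as a small Python function. The name `t_statistic` and the input values r = 0.5 and n = 27 are hypothetical, chosen only to show the call.

```python
import math

def t_statistic(r, n):
    """t statistic for testing whether a correlation differs from zero."""
    if n <= 2:
        raise ValueError("n must be greater than 2 so that n - 2 is positive")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    # t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom.
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical inputs for illustration only.
t = t_statistic(0.5, 27)
print(round(t, 3))
```

The guard on r is needed because the denominator 1 – r² becomes zero when r is exactly +1 or -1.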

**DECISION USING THE P-VALUE METHOD**

If the p-value is less than the significance level (alpha = 0.05), we reject the null hypothesis and conclude that there is a significant linear relationship between the variables, that is, the correlation coefficient is significantly different from zero (0). If the p-value is greater than the significance level (0.05), we fail to reject the null hypothesis.

**DECISION USING THE CRITICAL VALUE METHOD**

Here we compare t-calculated with the t-critical value from the table. If the absolute value of t-calculated is greater than the t-critical value, the null hypothesis is rejected and the alternative hypothesis is upheld; otherwise we fail to reject the null hypothesis.
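For readers who want to see where a p-value comes from, here is a rough sketch that obtains a two-tailed p-value by numerically integrating the Student t density with Simpson's rule, using only the Python standard library. The function names, the step count and the inputs t = 2.5 and df = 20 are assumptions for illustration; in practice a statistics package would normally supply this value.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large df.
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3  # area under the density between 0 and |t|
    return max(0.0, 1 - 2 * half_area)

alpha = 0.05
p = two_tailed_p(2.5, 20)  # hypothetical t and degrees of freedom
print(round(p, 4), "reject H0" if p < alpha else "fail to reject H0")
```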

**EXAMPLE**

A teacher takes a sample of 10 students in New York and administers an intelligence test and a spatial reasoning (SR) test to them. Decide whether there is a significant relationship between the students’ scores on the two tests. The scores obtained from the two tests are given below.

| S/No | Intelligence Test | SR Test |
|---|---|---|
| 1 | 3 | 6 |
| 2 | 1 | 2 |
| 3 | 6 | 9 |
| 4 | 7 | 10 |
| 4 | 5 | 1 |
| 5 | 0 | 3 |
| 6 | 3 | 4 |
| 7 | 8 | 9 |
| 8 | 4 | 1 |
| 9 | 5 | 8 |
| 10 | 1 | 2 |

**SOLUTION**

| S/No | Intelligence Test (X) | SR Test (Y) | XY | X² | Y² |
|---|---|---|---|---|---|
| 1 | 3 | 6 | 18 | 9 | 36 |
| 2 | 1 | 2 | 2 | 1 | 4 |
| 3 | 6 | 9 | 54 | 36 | 81 |
| 4 | 7 | 10 | 70 | 49 | 100 |
| 4 | 5 | 1 | 5 | 25 | 1 |
| 5 | 0 | 3 | 0 | 0 | 9 |
| 6 | 3 | 4 | 12 | 9 | 16 |
| 7 | 8 | 9 | 72 | 64 | 81 |
| 8 | 4 | 1 | 4 | 16 | 1 |
| 9 | 5 | 8 | 40 | 25 | 64 |
| 10 | 1 | 2 | 2 | 1 | 4 |
| Total | 43 | 55 | 279 | 235 | 397 |

H0: There is no significant relationship between students’ scores on the intelligence test and the spatial reasoning test.

**H0: ρ = 0**

Ha: There is a significant relationship between students’ scores on the intelligence test and the spatial reasoning test.

**Ha: ρ ≠ 0**

The null hypothesis will be tested at the 0.05 level of significance.

The appropriate statistical tool is the t-test for correlation, because the sample size is less than 30.

$$t = r\sqrt{\frac{n-2}{1-r^2}}$$

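First, we find the correlation coefficient between the two tests using the Pearson Product Moment Correlation (PPMC) technique, because both variables are measured on a continuous scale.

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left(n\sum X^2 - (\sum X)^2\right)\left(n\sum Y^2 - (\sum Y)^2\right)}}$$

$$r = \frac{10 \times 279 - 43 \times 55}{\sqrt{(10 \times 235 - 43 \times 43)(10 \times 397 - 55 \times 55)}} = \frac{425}{\sqrt{501 \times 945}} \approx 0.62$$

Hence **r** = 0.62.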
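The arithmetic for r can be checked with a short Python sketch that plugs the column totals from the table and n = 10, as used in the text, into the computational form of the Pearson formula. The variable names are chosen here for illustration.

```python
import math

# Column totals from the solution table and the sample size used in the text.
n = 10
sum_x, sum_y = 43, 55
sum_xy, sum_x2, sum_y2 = 279, 235, 397

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(numerator)    # 425
print(round(r, 2))  # 0.62
```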

Using the formula with n = 10, r = 0.62 and r² ≈ 0.38:

$$t = r\sqrt{\frac{n-2}{1-r^2}} = 0.62 \times \sqrt{\frac{10-2}{1-0.38}} = 0.62 \times 3.59 \approx 2.22$$

With r = 0.62 and n = 10, the degrees of freedom (DF) = n – 2 = 10 – 2 = 8.

The critical value table for t was consulted at 8 degrees of freedom for alpha = 0.05 (two-tailed):

t-critical = 2.306

**Decision:** since t-cal (2.22) is less than t-critical (2.306), **we do not reject the null hypothesis**.

**Conclusion**: The null hypothesis “there is no significant relationship between students’ scores on the intelligence test and the spatial reasoning test” is not rejected. This implies that, at the 0.05 level, the sample correlation of 0.62 is not significantly different from zero.
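The decision step for this example can be sketched in Python as follows, taking r = 0.62, n = 10 and the tabulated critical value 2.306 from the text. Because the code does not round intermediate values, its t differs slightly in the second decimal place from the hand calculation; the decision is the same.

```python
import math

r, n = 0.62, 10     # values from the worked example
t_critical = 2.306  # two-tailed critical value quoted in the text

df = n - 2
t_calc = r * math.sqrt(df / (1 - r ** 2))

# Reject H0 only when |t| exceeds the critical value.
reject_null = abs(t_calc) > t_critical
print(df, reject_null)
```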

**CORRELATION COEFFICIENTS FOR TWO SAMPLES**

Here we test whether the correlation coefficients of two populations are equal or different by taking a sample from each population and comparing the correlation coefficients of the samples. We can use the Fisher Z-transformation to test this.

$$Z = \frac{z_1 - z_2}{\sqrt{\frac{1}{n_1 - 3} + \frac{1}{n_2 - 3}}}, \qquad z_i = \frac{1}{2}\ln\left(\frac{1 + r_i}{1 - r_i}\right)$$

where z1 = transformed correlation coefficient for sample one (computed from the sample correlation r1),

z2 = transformed correlation coefficient for sample two (computed from the sample correlation r2),

n1 = sample size for the first sample,

n2 = sample size for the second sample.

**EXAMPLE**

A sample of 63 students from New York is taken, comparing boys’ interest in mathematics with girls’, and the correlation coefficient (rn) for the sample is 0.70. Is this significantly different from the correlation coefficient (rw) of 0.64 for a sample of 40 students from Washington?

**SOLUTION**

**H0:** There is no significant difference between the correlation coefficient for New York students and the correlation coefficient for Washington students.

**Ha:** There is a significant difference between the correlation coefficient for New York students and the correlation coefficient for Washington students.

The appropriate statistical tool to use here is the z-test for correlation, because both sample sizes are greater than 30.

The null hypothesis will be tested at the 0.05 level of significance.

**Decision Rule**: if the absolute value of Z-calculated is greater than Z-critical, we reject the null hypothesis; otherwise we fail to reject it.

Here we have two independent samples from two populations. To determine whether a significant difference exists between their correlation coefficients, we first apply the Fisher Z-transformation to each coefficient so that the two can be compared on an approximately normal scale.

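Now applying the Fisher transformation to each correlation coefficient:

$$z_n = \frac{1}{2}\ln\left(\frac{1 + 0.70}{1 - 0.70}\right) \approx 0.867, \qquad z_w = \frac{1}{2}\ln\left(\frac{1 + 0.64}{1 - 0.64}\right) \approx 0.758$$

$$Z = \frac{0.867 - 0.758}{\sqrt{\frac{1}{63 - 3} + \frac{1}{40 - 3}}} = \frac{0.109}{0.209} \approx 0.52$$

So Z-calculated = 0.52

Z-critical at the 0.05 level of significance (two-tailed) = 1.96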

Since Z-cal (0.52) is less than Z-critical (1.96), we **do not reject the null hypothesis**.

**Conclusion**: The null hypothesis “there is no significant difference between the correlation coefficient for New York students and the correlation coefficient for Washington students” is not rejected. This implies that the strength of the relationship observed in the two states is similar.
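The two-sample comparison can be reproduced with a short Python sketch of the standard Fisher Z test for two independent correlations. The function names `fisher_z` and `compare_correlations` are chosen here for illustration; the inputs are the values from the example.

```python
import math

def fisher_z(r):
    """Fisher Z-transformation of a correlation coefficient (same as atanh)."""
    return 0.5 * math.log((1 + r) / (1 - r))

def compare_correlations(r1, n1, r2, n2):
    """Z statistic for the difference between two independent correlations."""
    if min(n1, n2) <= 3:
        raise ValueError("each sample size must be greater than 3")
    z1, z2 = fisher_z(r1), fisher_z(r2)
    # Standard error of the difference between the transformed values.
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / se

z_calc = compare_correlations(0.70, 63, 0.64, 40)
z_critical = 1.96  # two-tailed critical value at alpha = 0.05

print(round(z_calc, 2))          # 0.52
print(abs(z_calc) > z_critical)  # False, so H0 is not rejected
```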

Since a sample is representative of a population, correlation coefficients are an essential tool for determining the extent to which a relationship exists between variables.

We can test the correlation coefficient of one sample using the t-test with either the p-value method or the critical value method, and extend the approach to two samples using the Fisher Z-transformation.