# Correlation Technique – Phi Coefficient and Point Biserial

Highlights:

• Definition of correlation
• Correlation coefficient
• Correlation range
• Correlation techniques
• Phi coefficient
• Point Biserial coefficient.

Correlation is a statistical technique used to establish the relationship between two or more variables. The strength of that relationship is usually expressed with the aid of a correlation coefficient.

Correlation coefficient: The degree of association is measured by a correlation coefficient, denoted by r. It is sometimes called Pearson product moment correlation coefficient. The correlation coefficient can either be positive or negative. It ranges from -1 through 0 to 1.

When the correlation coefficient is zero, it implies no linear relationship between the variables under consideration. When it is 1, it indicates a perfect positive relationship: the variables move together exactly. When it is -1, it indicates a perfect inverse relationship: the variables are still completely dependent on each other, but one decreases exactly as the other increases.

• When a correlation coefficient is positive, it means that as one variable increases, the other increases as well, e.g. study time and student achievement: the more you study, the greater your achievement tends to be in what you are studying.
• If the correlation is negative, it means an inverse relationship exists between the variables: an increase in one variable goes with a decrease in the other.

However, it is important to note that correlation does not imply causation.

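Correlation Range Interpretation

The ranges below refer to the size of the coefficient regardless of its sign, so -0.8 is as strong as 0.8.

1.) 0 – 0.3 = weak/low correlation

2.) 0.31 – 0.69 = moderate correlation

3.) 0.7 – 1 = high correlation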
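
As a rough illustration, the ranges above can be turned into a small helper. This is only a sketch: the function name `interpret_correlation` is hypothetical, it assumes the ranges apply to the absolute value of the coefficient, and it assumes values falling between the listed bands (for example 0.305) count as moderate.

```python
def interpret_correlation(r: float) -> str:
    """Label the strength of a correlation coefficient using the ranges above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    strength = abs(r)  # assumption: the sign gives direction, the size gives strength
    if strength <= 0.3:
        return "weak/low"
    if strength < 0.7:
        return "moderate"
    return "high"


print(interpret_correlation(0.2))    # weak/low
print(interpret_correlation(-0.85))  # high
```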

## Types of Correlation Technique

| | Nominal | Ordinal | Interval/Ratio |
|---|---|---|---|
| Nominal | Phi coefficient, Contingency coefficient (C), Cramér's V | – | – |
| Ordinal | Rank Biserial | Spearman rank | – |
| Interval/Ratio | Point Biserial | Biserial | Pearson Product Moment Coefficient |

Phi Coefficient: This is applied when both variables are nominal and dichotomous. Dichotomous means the variable has exactly two levels, like gender (male and female).

For example, suppose a researcher is interested in finding the relationship between gender and student interest in biology and chemistry. Use the data given in table 1 below.

Table 1

| Gender | Biology/Chemistry |
|---|---|
| 1 | 1 |
| 1 | 0 |
| 0 | 0 |
| 1 | 1 |
| 0 | 0 |
| 1 | 1 |
| 1 | 0 |
| 0 | 1 |
| 0 | 1 |
| 0 | 0 |

Now we code or assign

Female = 1

Male = 0

Biology = 1

Chemistry = 0

This will be represented in a 2 x 2 matrix form.

| Gender | Biology (1) | Chemistry (0) | Total |
|---|---|---|---|
| Male (0) | 2 (A) | 3 (B) | 5 (A+B) |
| Female (1) | 3 (C) | 2 (D) | 5 (C+D) |
| Total | 5 (A+C) | 5 (B+D) | |

##### For Male

The first cell under biology, which is 2, is obtained by careful inspection: count the male students who showed interest in biology, that is, the pair 0,1. Go to table 1, count how many times the pair 0,1 appears, and write it down.

The second cell under chemistry, which is 3, is obtained through the same procedure. It is the number of male students who showed interest in chemistry, that is, the pair 0,0, so go to table 1 and count the number of times you see 0,0.

The total is obtained by summing 2 and 3: A+B = 5.

##### For Female

The first cell under biology for female, which is 3, is obtained by careful inspection: count the female students who showed interest in biology, that is, the pair 1,1. Go to table 1, count how many times the pair 1,1 appears, and write it down.

The second cell under chemistry for female, which is 2, is obtained through the same procedure. It is the number of female students who showed interest in chemistry, that is, the pair 1,0, so go to table 1 and count the number of times you see 1,0.

The total is obtained by summing 3 and 2: C+D = 5.
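
The counting by inspection described for the four cells can be checked with a short script. The sketch below assumes each row of table 1 is read as a (gender, subject) pair; the variable names are illustrative only.

```python
from collections import Counter

# Coded data from table 1 as (gender, subject) pairs:
# Female = 1, Male = 0, Biology = 1, Chemistry = 0.
pairs = [(1, 1), (1, 0), (0, 0), (1, 1), (0, 0),
         (1, 1), (1, 0), (0, 1), (0, 1), (0, 0)]

counts = Counter(pairs)

A = counts[(0, 1)]  # male students interested in biology
B = counts[(0, 0)]  # male students interested in chemistry
C = counts[(1, 1)]  # female students interested in biology
D = counts[(1, 0)]  # female students interested in chemistry

print(A, B, C, D)    # 2 3 3 2
print(A + B, C + D)  # row totals: 5 5
```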

Also, for the columns, A+C = 5 and B+D = 5.

Applying this formula:

$$\phi = \frac{BC - AD}{\sqrt{(A+B)(A+C)(C+D)(B+D)}}$$

where BC = 3 × 3 = 9, AD = 2 × 2 = 4, (A+B) = 5, (A+C) = 5, (C+D) = 5, (B+D) = 5.

$$\phi = \frac{9 - 4}{\sqrt{5 \times 5 \times 5 \times 5}} = \frac{5}{25} = 0.2$$

This implies there is a weak or low relationship between gender and students’ interest in biology and chemistry.
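
The same calculation can be sketched in Python. The function name `phi_coefficient` is hypothetical, and the numerator BC - AD assumes the cell layout used in this example (rows Male, Female; columns Biology, Chemistry); with a different layout the sign convention may change.

```python
import math


def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Phi for a 2 x 2 table with cells A, B (first row) and C, D (second row)."""
    denominator = math.sqrt((a + b) * (a + c) * (c + d) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (b * c - a * d) / denominator


# Cell counts from the 2 x 2 table: A = 2, B = 3, C = 3, D = 2.
phi = phi_coefficient(2, 3, 3, 2)
print(round(phi, 2))  # 0.2
```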

### Point biserial correlation coefficient

This is used to determine the relationship between two variables when one of the variables is measured on a continuous scale (interval/ratio) and the second variable is a nominal dichotomous variable, like gender.

The formula used here is given as

$$\text{Ypb} = \frac{\text{MeanY1} - \text{MeanY0}}{\text{SdY}} \times \sqrt{P \times Q}$$

Where,

MeanY1 = mean of scores of students who got the item correct

MeanY0 = mean of scores of students who got the item wrong

SdY = standard deviation of the scores of students

P = proportion of students who got the item correct

Q = proportion of students who got the item wrong

Example: Suppose a researcher is interested in finding the relationship between students' responses to an item and their achievement in an integrated science examination, given the table below.

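| S/N | Responses (x) | Scores (y) |
|---|---|---|
| 1 | 1 | 15 |
| 2 | 1 | 10 |
| 3 | 0 | 9 |
| 4 | 1 | 6 |
| 5 | 1 | 8 |
| 6 | 0 | 17 |
| 7 | 0 | 18 |
| 8 | 0 | 11 |
| 9 | 1 | 20 |
| 10 | 0 | 6 |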

Solution

First, we code the responses of students;

Correct response = 1

Wrong response = 0

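| S/N | Responses (x) | Scores (y) | y² |
|---|---|---|---|
| 1 | 1 | 15 | 225 |
| 2 | 1 | 10 | 100 |
| 3 | 0 | 9 | 81 |
| 4 | 1 | 6 | 36 |
| 5 | 1 | 8 | 64 |
| 6 | 0 | 17 | 289 |
| 7 | 0 | 18 | 324 |
| 8 | 0 | 11 | 121 |
| 9 | 1 | 20 | 400 |
| 10 | 0 | 6 | 36 |
| Total | | | 1676 |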

MeanY1 = mean of scores of students who got the item correct(1)

So we have  (15+10+6+8+20)/5 = 11.8

MeanY0 = mean of scores of students who got the item wrong(0)

This will be (9+17+18+11+6)/5 = 12.2

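SdY = standard deviation of the scores of students = 4.86, using the raw score formula with N = 10, a sum of scores of 120 and a sum of squared scores of 1676:

$$\text{SdY} = \sqrt{\frac{\sum y^2}{N} - \left(\frac{\sum y}{N}\right)^2} = \sqrt{\frac{1676}{10} - 12^2} = \sqrt{23.6} \approx 4.86$$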
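
For readers who want to verify the standard deviation, here is a minimal sketch of the raw score formula, assuming it divides by N (the population form); that assumption is what reproduces the 4.86 used here.

```python
import math

# Scores (y) from the example table.
scores = [15, 10, 9, 6, 8, 17, 18, 11, 20, 6]

n = len(scores)
sum_y = sum(scores)                  # sum of the scores
sum_y2 = sum(y * y for y in scores)  # sum of the squared scores

# Raw score formula: square root of (sum of squares / N) minus (mean squared).
sd_y = math.sqrt(sum_y2 / n - (sum_y / n) ** 2)

print(sum_y2)          # 1676
print(round(sd_y, 2))  # 4.86
```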

P = proportion of students who got the item correct = 5/10 = 0.5

Q = proportion of students who got the item wrong = 5/10 = 0.5

$$\text{Ypb} = \frac{11.8 - 12.2}{4.86} \times \sqrt{0.5 \times 0.5} = \frac{-0.4}{4.86} \times 0.5 \approx -0.041$$
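
The whole point biserial calculation can be put together in a short Python sketch. The function name `point_biserial` is hypothetical, and the standard deviation is assumed to be the population form (dividing by N), which matches the 4.86 in the worked example.

```python
import math

# Coded responses (1 = correct, 0 = wrong) and scores from the example table.
responses = [1, 1, 0, 1, 1, 0, 0, 0, 1, 0]
scores = [15, 10, 9, 6, 8, 17, 18, 11, 20, 6]


def point_biserial(x, y):
    """Point biserial coefficient: (MeanY1 - MeanY0) / SdY * sqrt(P * Q)."""
    n = len(y)
    y1 = [score for flag, score in zip(x, y) if flag == 1]  # item correct
    y0 = [score for flag, score in zip(x, y) if flag == 0]  # item wrong
    mean_y1 = sum(y1) / len(y1)
    mean_y0 = sum(y0) / len(y0)
    mean_y = sum(y) / n
    # Population standard deviation of all scores (divides by N).
    sd_y = math.sqrt(sum((score - mean_y) ** 2 for score in y) / n)
    p = len(y1) / n  # proportion who got the item correct
    q = len(y0) / n  # proportion who got the item wrong
    return (mean_y1 - mean_y0) / sd_y * math.sqrt(p * q)


r_pb = point_biserial(responses, scores)
print(round(r_pb, 3))  # -0.041
```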

This shows a very weak negative (inverse) relationship between the students' responses to the item and their overall score in integrated science. At -0.041 the coefficient is close to zero, so there is practically no relationship.

If the correlation is moderate or high and negative, it means students who got the item wrong tend to score higher in the exam, whereas if the correlation is moderate or high and positive, it means students who got the item correct tend to score higher in the exam.
