Sample Size, Effect Size, and Power

An effect size is the strength or magnitude of the difference between two sets of data or, in outcome studies, between two time points for the same population. Put another way, it is the degree to which the null hypothesis is false.

In statistical hypothesis testing and power analysis, an effect size is the size of a difference between a mathematical characteristic (often the mean) of the distribution of a dependent variable at one level of an independent variable and the same characteristic of the distribution at a different level of the independent variable. Effect size is a different concept from statistical significance, and it is often worth computing an effect size measure even when a conventional threshold for statistical significance, such as p < .05, has not been met. A non-significant result does not show that the effect is absent; it shows only that you do not yet have enough evidence that the effect is real.
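As a rough illustration of why effect size and statistical significance are separate ideas, the sketch below holds a standardised effect size fixed and varies only the sample size. It assumes a two-sided, two-sample z-test with equal group sizes and a known common standard deviation, which is a simplification of what a real analysis would use; the function name two_sample_p_value is hypothetical and chosen only for this example.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_p_value(d: float, n_per_group: int) -> float:
    """Two-sided p-value for a standardised mean difference d.

    Assumes a z-test with equal group sizes and a known common SD
    (an illustrative simplification, not a full t-test).
    """
    # Standard error of the difference in means is SD * sqrt(2 / n),
    # so the test statistic in SD units is d * sqrt(n / 2).
    z = d * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same effect size (d = 0.5), two different sample sizes.
p_small_sample = two_sample_p_value(0.5, 10)
p_large_sample = two_sample_p_value(0.5, 100)
print(f"n = 10 per group:  p = {p_small_sample:.3f}")
print(f"n = 100 per group: p = {p_large_sample:.4f}")
```

Under these assumptions the effect size is identical in both cases, yet only the larger sample gives a p-value below .05.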

For example, consider the two sexes, male and female. If aliens were to land on Earth, how long would it take them to realise that, on average, males are taller than females? The answer depends on the effect size of the difference in height between men and women. The larger the effect size, the sooner they would suspect that men are taller. If the difference in height were very slight, and quite a few women were taller than men, then the effect size would be small and it would take quite a while (and much sampling) to notice that men are, on average, taller than women.
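The link between effect size and the amount of sampling needed can be sketched with a standard normal-approximation formula for comparing two means. The significance level (0.05, two-sided), the target power (0.80) and the function name n_per_group are assumptions made for this illustration, not values taken from the text above.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size per group needed to detect effect size d.

    Uses the normal approximation n = 2 * (z_alpha + z_power)^2 / d^2
    for a two-sided comparison of two means with equal group sizes.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


# Smaller effects need far more observations before they can be detected.
for d in (0.2, 0.5, 0.8, 2.0):
    print(f"d = {d}: about {n_per_group(d)} per group")
```

Because the required sample size scales with 1 / d^2, halving the effect size roughly quadruples the number of people the aliens would need to measure.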

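In its simplest form, an effect size is the difference between two means divided by the pooled standard deviation for those means (this particular type of effect size is frequently referred to as Cohen's d), thus:

$$ d = \frac{\text{Mean}_{g1} - \text{Mean}_{g2}}{SD_{\text{pooled}}} $$

Where: SD = Standard Deviation; g = Group. For two groups of equal size, the pooled standard deviation is

$$ SD_{\text{pooled}} = \sqrt{\frac{SD_{g1}^2 + SD_{g2}^2}{2}} $$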

Advice on how to interpret the resulting effect size varies, but the most widely accepted benchmarks are those of Cohen (1992), where 0.2 indicates a small effect, 0.5 a medium effect and 0.8 a large effect.
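Cohen's benchmarks can be expressed as a small lookup, sketched below. Treating each benchmark as the lower bound of its category, and labelling anything under 0.2 separately, are assumptions of this sketch; the function name describe_effect_size is hypothetical.

```python
def describe_effect_size(d: float) -> str:
    """Label the magnitude of d using Cohen's (1992) benchmarks.

    Each benchmark is treated as a lower bound, which is one common
    reading of the guideline rather than a strict rule.
    """
    magnitude = abs(d)  # the sign only shows which group has the larger mean
    if magnitude >= 0.8:
        return "large"
    if magnitude >= 0.5:
        return "medium"
    if magnitude >= 0.2:
        return "small"
    return "below the small benchmark"


for value in (0.1, 0.3, 0.6, 1.2):
    print(value, describe_effect_size(value))
```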

So, in the above example, suppose we had the following figures for men's and women's heights (actually taken from a UK representative sample of 1000 men and 1000 women):

Men

Mean Height = 1754 mm; Standard Deviation = 70.00 mm

Women

Mean Height = 1620 mm; Standard Deviation = 64.90 mm

The derived effect size (using Cohen's d) would equal 1.99: the difference in means of 134 mm divided by a pooled standard deviation of about 67.5 mm. This is very large indeed, and our aliens should have no problem detecting the difference.
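The calculation for the height figures above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the two groups are the same size (as they are here, with 1000 in each), so the pooled standard deviation is the square root of the average of the two variances; the function name cohens_d is chosen for illustration.

```python
from math import sqrt


def cohens_d(mean_1: float, sd_1: float, mean_2: float, sd_2: float) -> float:
    """Cohen's d for two groups of equal size."""
    # With equal group sizes, the pooled variance is the mean of the two variances.
    pooled_sd = sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_1 - mean_2) / pooled_sd


# Heights in mm: men (mean 1754, SD 70.00) and women (mean 1620, SD 64.90).
d = cohens_d(1754, 70.00, 1620, 64.90)
print(round(d, 2))  # 1.99
```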

Freely available software (freeware) will compute most effect size statistics for you (e.g., G*Power, The Effect Size Generator).