The Chi-Square Test for Hardy-Weinberg Proportions
(1) For a locus with an A/G SNP polymorphism, given the
observed genotype counts that total to 200, the observed frequency
f(obs) of each genotype is (observed count) / (total). The count of
each allele c(allele) is then two for each homozygote plus one for
each heterozygote, for example c(A) = (2)(40) + 70 = 150. Noting
that the number of alleles is twice the number of genotypes, the
frequency of each allele f(allele) is its count divided by 400:
f(A) = 150 / 400 = 0.375. Then, f(G) = 250 / 400 = 0.625,
or equivalently (1 - f(A)) = (1 - 0.375). Calculate
the expected genotype frequencies
from these observed allele frequencies: if p =
f(A) and q = f(G), then p² = f(AA), 2pq = f(AG),
and q² = f(GG).
Expected genotype counts in each case are then
(frequency) X (total observed).
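
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variable names are invented here, and the GG count of 90 is not quoted in the text but is inferred from c(G) = 250 and a total of 200 genotypes.

```python
# Observed genotype counts for the A/G example.
# AA = 40 and AG = 70 come from the text; GG = 90 is inferred (assumption).
observed = {"AA": 40, "AG": 70, "GG": 90}
total = sum(observed.values())                   # 200 genotypes

# Two copies per homozygote, one copy per heterozygote.
count_A = 2 * observed["AA"] + observed["AG"]    # 150
count_G = 2 * observed["GG"] + observed["AG"]    # 250

# There are twice as many alleles as genotypes.
p = count_A / (2 * total)                        # f(A) = 0.375
q = count_G / (2 * total)                        # f(G) = 0.625

# Hardy-Weinberg expected frequencies and counts.
expected_freq = {"AA": p * p, "AG": 2 * p * q, "GG": q * q}
expected_count = {g: f * total for g, f in expected_freq.items()}

for g in observed:
    print(g, observed[g], expected_freq[g], expected_count[g])
```

The expected counts are left unrounded here (28.125, 93.75, 78.125); whether and how to round them is a separate question.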
The proportional deficiency of heterozygotes (F)
can be calculated here from knowledge of the observed heterozygosity
(70 / 200 = 0.350) and the expected heterozygosity, calculated
as (2)(0.375)(0.625) = 0.469; thus F = (0.469
- 0.350) / 0.469 = 0.253. [A deficiency of heterozygotes is
also called the Inbreeding Coefficient (F)
if it is attributable to preferential union of similar
gametes and (or) preferential mating of similar genotypes.
The math of this will be dealt with elsewhere.]
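
As a quick check of the arithmetic, F can be computed as the shortfall of observed heterozygosity relative to expected heterozygosity. The sketch below assumes that sign convention (positive F means fewer heterozygotes than expected). The variable names are illustrative.

```python
observed_AG, total = 70, 200     # heterozygote count and sample size from the example
p, q = 0.375, 0.625              # allele frequencies f(A), f(G)

h_obs = observed_AG / total      # observed heterozygosity, 0.350
h_exp = 2 * p * q                # expected heterozygosity, 0.46875

# Positive when heterozygotes are deficient, negative when in excess.
F = (h_exp - h_obs) / h_exp
print(round(h_obs, 3), round(h_exp, 3), round(F, 3))
```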
Calculate the Chi-Square value (X²)
contributed by each genotype as the squared difference between
the observed and expected counts, divided
by the expected count. For AA,
(AAobs - AAexp)² / (AAexp) = (40 - 28)² / (28.125) = 5.120,
where the expected count of 28.125 is rounded to 28 in the
numerator. The calculation is repeated for each
genotype, and the Chi-Square value for the test is the
sum over all genotypes, in this case 12.606.
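
A minimal sketch of the same calculation follows, assuming the observed counts 40, 70 and 90 (the last inferred from the allele counts) and using unrounded expected counts throughout. Because the result depends on how intermediate values are rounded, this version is not expected to reproduce the quoted total exactly.

```python
observed = {"AA": 40, "AG": 70, "GG": 90}   # GG = 90 is inferred (assumption)
total = sum(observed.values())

# Allele frequencies estimated from the observed genotypes.
p = (2 * observed["AA"] + observed["AG"]) / (2 * total)
q = 1 - p

# Unrounded Hardy-Weinberg expected counts.
expected = {"AA": p * p * total, "AG": 2 * p * q * total, "GG": q * q * total}

# (observed - expected) squared, divided by expected, for each genotype.
contributions = {
    g: (observed[g] - expected[g]) ** 2 / expected[g] for g in observed
}
chi_square = sum(contributions.values())

for g, x in contributions.items():
    print(g, round(x, 3))
print("X2 =", round(chi_square, 3))
```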
To evaluate the statistical significance of this
value, it is necessary to know the number of degrees of
freedom (df) in the experimental data, which is
reported and evaluated along with the result. In general, the
df in any experiment is one less than the number of
categories compared, (n - 1). The principle is that, if
you know you looked at N experimental results that
could have fallen into any of three categories a, b, c,
the value of the first category a can be anything (up
to N), and the value of the second category b
can be anything up to (N - a). Having determined a
and b, the third value is now pre-determined: c =
(N - a - b). So, only two of the
three values are free to vary.
For diploid genotypic data with two alleles A & G
and three genotypes AA, AG, & GG,
this might suggest df = 2. However, from first
principles and on reflection, if we know q = f(G),
then f(A) = (1-q), and f(AA) = (1-q)²,
f(AG) = (2)(1-q)(q) and f(GG) = q²,
so that the expected values of the three genotype
categories are pre-determined by knowledge of either one of
the allele frequencies, which pre-determines the other, and
therefore df = 1. In general, for nucleotide or other
allelic data with n alleles, df = (# of genotype classes) -
(# of alleles) = n(n-1)/2, which is 1 for two alleles and 3 for three.
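
For the two-allele case with df = 1, the probability associated with a Chi-Square value can be obtained from the standard library alone, because a chi-square variable with one degree of freedom is the square of a standard normal variable. The sketch below relies on that identity and is valid only for df = 1. The function names and the star thresholds (0.05, 0.01, 0.001) are conventional assumptions, not taken from the text.

```python
import math

def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square value with exactly 1 df."""
    return math.erfc(math.sqrt(chi_square / 2))

def stars(p):
    """Conventional significance labels (assumed thresholds)."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"

# 3.841 is the familiar 5% critical value for df = 1.
for x in (3.841, 12.606):
    p = p_value_df1(x)
    print(x, p, stars(p))
```

For more than one degree of freedom a statistics library or a printed table would be needed instead.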
(2) The same principles apply to calculations for the C
/ A / T SNP polymorphism. Note that the count of each
allele will involve two of the three heterozygote classes.
Note also that the calculation of heterozygosity can be done
either by adding the frequencies of the (three) heterozygote
classes directly, or by adding the frequencies of the
(three) homozygotes and subtracting the total from 1.
As the number of alleles increases, it becomes computationally
more efficient to use the latter calculation.
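
The two routes to heterozygosity can be compared directly. The genotype counts below are HYPOTHETICAL, invented purely for illustration, since the three-allele data are not reproduced in this text. The variable names are likewise assumptions.

```python
from itertools import combinations

# HYPOTHETICAL genotype counts for a C/A/T locus (not from the text).
counts = {"CC": 30, "AA": 20, "TT": 10, "CA": 50, "CT": 40, "AT": 50}
total = sum(counts.values())                      # 200 genotypes
alleles = "CAT"

# Allele counts: each genotype contributes one count per copy it carries.
allele_count = {
    a: sum(genotype.count(a) * n for genotype, n in counts.items())
    for a in alleles
}
freq = {a: c / (2 * total) for a, c in allele_count.items()}

# Observed heterozygosity: sum the heterozygotes, or 1 minus the homozygotes.
homozygotes = {a + a for a in alleles}
h_obs_direct = sum(n for g, n in counts.items() if g not in homozygotes) / total
h_obs_complement = 1 - sum(counts[g] for g in homozygotes) / total

# Expected heterozygosity: sum of 2*pi*pj over pairs, or 1 minus sum of pi squared.
h_exp_direct = sum(2 * freq[a] * freq[b] for a, b in combinations(alleles, 2))
h_exp_complement = 1 - sum(f * f for f in freq.values())

print(allele_count, h_obs_direct, h_exp_direct, h_exp_complement)
```

With k alleles the direct sum runs over k(k-1)/2 heterozygote classes, whereas the complement needs only k homozygote terms, which is why the second route scales better.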
Two further points.
(3) Calculation of
expected genotype counts from frequency hypotheses often
results in the expectation of a 'fractional individual.' If we
were testing for a 3:1 genotypic ratio among 17 individuals,
we cannot expect to see 12.75 and 4.25, so we round to the
closest integer, here 13 and 4, which still adds to 17.
This is applicable to the multiple-category data above. A
related problem arises when, among 18 individuals, calculated
expectations are 13.5 and 4.5: if we round both to 13 and 4,
we are one short of the expected total and the test calculation is biased.
One convention is to round one expectation up and the other down, in
the same trend observed in the data. That is, if we have a 3:1
hypothesis and observe 15 & 3, we round the expectation of the
first class up and the second class down, to 14 & 4. If we
observed 12 & 6, we round the expectation of the first class
down and the second class up, to 13 & 5. This reduces the
possibility of obtaining a significant result due to a
computational bias, and increases the reliability of the result.
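
The rounding convention can be written as a small helper. This is a sketch of one reading of the rule for a two-class ratio hypothesis, in which exact halves are rounded toward the observed count and all other values to the nearest integer. The function name is hypothetical.

```python
import math

def rounded_expectations(observed, ratio):
    """Integer expected counts for two classes under a ratio hypothesis.

    Exact halves are rounded in the direction of the observed count;
    everything else is rounded to the nearest integer.
    """
    total = sum(observed)
    exact = [total * r / sum(ratio) for r in ratio]
    result = []
    for obs, exp in zip(observed, exact):
        if exp % 1 == 0.5:
            # Tie: follow the trend seen in the data.
            result.append(math.ceil(exp) if obs > exp else math.floor(exp))
        else:
            # floor(x + 0.5) avoids Python's round-half-to-even behaviour.
            result.append(math.floor(exp + 0.5))
    return result

print(rounded_expectations((13, 4), (3, 1)))   # 17 individuals
print(rounded_expectations((15, 3), (3, 1)))   # 18 individuals, first class high
print(rounded_expectations((12, 6), (3, 1)))   # 18 individuals, first class low
```

Rounding toward the data shrinks each deviation slightly, so the convention errs on the side of not declaring significance.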
(4) Chi-square calculations must only be performed with
count data, never with frequencies or
percentages. Because the calculation squares the
deviation, X² values are heavily influenced by the absolute
magnitude of the numbers. For 6 & 4 observed versus 5
& 5 expected, X² = 1² / 5 + (-1)² / 5 = 2 / 5 = 0.40 ns,
whereas with 60 & 40 versus 50 & 50 expected,
X² = 10² / 50 + (-10)² / 50 = 200 / 50 = 4.0*, and with
600 & 400 versus 500 & 500 expected,
X² = 100² / 500 + (-100)² / 500 = 20,000 / 500 = 40.0***.
The proportional deviation is the same
in each case (20%), but the actual deviation is
squared, and so contributes much more strongly to X²
as n increases. This is also a reminder that larger
sample sizes produce more sensitive experiments. [HOMEWORK: to prove this point
further, calculate Chi-square for 0.6:0.4 observed versus
0.5:0.5 expected. Do you see a pattern?]
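
The three count-based comparisons above can be reproduced with a short loop. This is a minimal sketch, and the function name is illustrative.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected) squared over expected, across classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Same 6:4 versus 5:5 proportions at three sample sizes.
results = {}
for scale in (1, 10, 100):
    obs = (6 * scale, 4 * scale)
    exp = (5 * scale, 5 * scale)
    results[scale] = chi_square(obs, exp)
    print(obs, "vs", exp, "->", results[scale])
```

Each tenfold increase in sample size multiplies the statistic by ten, even though the proportions never change.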
Text material
© 2019 by Steven M. Carr