Heritability, IQ, & Education
The relative importance of environment
and heredity in determining human traits such as
intelligence and educability has long been debated. For example, studies have
repeatedly found a difference of roughly 15 to 20 points between the average
IQ scores of black and white children in the United States. Arthur
Jensen (1923-2012), an educational psychologist at UC
Berkeley, wrote a widely discussed 1969 article in the
Harvard Educational Review, entitled "How much can we boost IQ
and scholastic achievement?" His conclusion
was: not much. From studies of monozygotic
and dizygotic twins and other data, Jensen estimated the
within-group heritability
(h2) of IQ test scores as h2 ~ 0.7. He therefore concluded
that investment in educational enrichment was
unlikely to affect such performance. Similar arguments were
advanced by Richard Herrnstein and Charles Murray in "The
Bell Curve" (1994), and by J. Philippe Rushton (U Western
Ontario) in "Race, Evolution, & Behavior" (1997).
IQ tests were
originally invented to measure a range of mental acuity and to identify
children in need of special help to perform well in
school: no particular significance was attached to the
exact numerical values. The initial use was thus to direct resources towards
persons who needed them. The use of actual numerical
scores as a means of ranking intellectual ability first
became widespread in the screening of European immigrants (many
of whom could not understand the language in which the
test was administered) and of US Army recruits during World
War One (many of whom had never held a pencil). In these settings, IQ testing was used
as a means of discriminating among persons, and withholding resources from
those regarded as "inferior." Misuse of these
tests is discussed in Stephen
Jay Gould's "The
Mismeasure of Man", which was revised and
reissued in 1996 in response to The Bell Curve. Amongst other concerns:
1) Unlike measures of
height or weight, or tests of blood phenylalanine or serum
cholesterol levels, an IQ score has no clear physical referent:
it is highly questionable whether
there exists an intrinsic
brain function equivalent to the abstract concept
of "intelligence". Although it is
possible to devise and record scores from IQ tests, the
assumption that attaching a number to an abstract concept
makes it "real"
is the reification
fallacy. Historically, it was assumed that
intellectual ability was directly correlated with brain
size, and craniometric statistics were used to "prove"
that certain social (e.g., criminals) or ethnic groups
(e.g., African or American Blacks) had smaller brains and
were therefore less intelligent. Currently, quantitative
investigations of brain anatomy or neurological functions
such as processing speed encounter the same limitation:
such measurable phenomena cannot directly estimate the
abstract concept of 'intelligence.'
2) IQ tests have been
shown to do a good job at predicting aptitude for educational
activities, as have other test batteries such as
the SAT, MCAT, LSAT, and
GRE (for aptitude
in undergraduate, medical, law, and graduate schools,
respectively). This is not surprising, because success in
such activities is often measured in the same way as "IQ", by means of
standardized tests. IQ test
scores undoubtedly correlate with other performance
indices that are generally regarded as desirable.
3) "Heritability is not the same
as heredity: neither equals inevitability." The explicit or implicit
assumption that genetically-influenced traits are constant
and/or unalterable is scientifically
fallacious. Heritability (h2) as defined by
geneticists is a specific statistical measure, the
fraction of observed trait
variance
due to genetic variance.
High
heritability of agricultural production traits such as egg
or milk quality means that it is possible to improve such
traits by selective artificial breeding: demonstration of
a high heritability for IQ
test scores means only that it would be possible to
breed selectively for improved performance on IQ tests. It is well established that high within-group
heritability does
not predict the cause of between-group
differences in trait means. High heritability also does
not mean that a trait cannot be altered by a change in
environment. A toy numerical sketch of this distinction follows below.
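The logic of this distinction can be made concrete with a toy simulation. The sketch below (Python, with entirely hypothetical numbers) assumes a deliberately simple additive model, trait = genetic value + environment, so that within-group heritability is just h2 = Var(genetic)/Var(trait). Both groups are drawn from the same genetic distribution, but one experiences a uniform environmental disadvantage: within-group h2 comes out high and essentially identical in both groups, yet the gap between the group means is entirely environmental. It is an illustration of the argument, not a model of real IQ data.

    import random

    random.seed(0)

    def simulate_group(n, env_shift):
        """Trait = genetic value + individual environment + group-wide environmental shift."""
        genetic = [random.gauss(100, 10) for _ in range(n)]   # same gene pool in both groups
        environ = [random.gauss(0, 5) for _ in range(n)]      # individual environmental noise
        trait = [g + e + env_shift for g, e in zip(genetic, environ)]
        return genetic, trait

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    # Group B suffers a uniform 15-point environmental disadvantage (e.g., poorer schooling).
    gA, tA = simulate_group(10_000, env_shift=0)
    gB, tB = simulate_group(10_000, env_shift=-15)

    # Within-group heritability is high and essentially identical in both groups ...
    for name, g, t in [("A", gA, tA), ("B", gB, tB)]:
        print(f"group {name}: within-group h2 ~ {variance(g) / variance(t):.2f}")

    # ... yet the ~15-point gap in group means is 100% environmental in this model.
    print(f"difference in group means: {sum(tA)/len(tA) - sum(tB)/len(tB):.1f} points")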
4) Data from twin studies
can be used to estimate heritability from the observed
differences between identical and non-identical twins. IQ
scores of the former are more highly correlated than those
of the latter, as predicted by a heritability model.
However, in such data sets it is
extremely difficult to separate the effects of genetic
relatedness ('heritability') from those of
environmental similarity ('familiality'). For example,
mother tongue and political party
affiliation are highly correlated between parents and
offspring, but have no genetic basis.
Some of the classic twin studies are highly
questionable. The most famous proponent of educational
testing in Britain, Sir Cyril Burt
(1883-1971), is now known to have altered a great deal of
his twin data in order to support his preferred
conclusions. Burt's results were used for decades to
channel children through the British educational system,
using standardized tests to determine perceived inborn
merit. More modern twin studies [e.g., the Minnesota
Twin Study] continue to show a strong correlation
between similarity of IQ scores
and degree of relatedness. Twins raised apart are
markedly less similar than those raised together; reported similarities are sometimes
striking, but
are often overstated. A worked sketch of the classical twin-study estimate follows below.
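For illustration, that estimate can be written out with Falconer's classical formula for the twin design: since identical (MZ) twins share essentially all of their genes and fraternal (DZ) twins on average half, a simple additive model gives h2 ~ 2(r_MZ - r_DZ), with a shared-environment ('familiality') component c2 ~ 2 r_DZ - r_MZ. The correlation values in the sketch below are placeholders, not data from any particular study; the arithmetic also exposes the weakness noted above, since any extra environmental similarity of MZ twins inflates (r_MZ - r_DZ) and hence the heritability estimate.

    def falconer(r_mz, r_dz):
        """Rough (h2, c2) estimates from MZ and DZ twin IQ-score correlations."""
        h2 = 2 * (r_mz - r_dz)   # additive genetic component
        c2 = 2 * r_dz - r_mz     # shared ('familial') environment component
        return h2, c2

    # Placeholder correlations chosen for illustration only.
    h2, c2 = falconer(r_mz=0.85, r_dz=0.60)
    print(f"h2 ~ {h2:.2f}, c2 ~ {c2:.2f}")   # prints: h2 ~ 0.50, c2 ~ 0.35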
5) Periodic resurgence of
interest in classification of humans or human groups
as of lesser or greater intrinsic worth based on IQ (or other
"objective" tests) typically corresponds with times of
social conservatism, as in post-World War I America,
Germany in the 1930s, and the Reagan era of the 1980s.
The history of the eugenics
movement is particularly instructive: it has often served
political agendas that promote "us versus them" thinking and that divert
scarce resources away from particular groups, education, and/or other social
programs, on the assumption that environmental
modification can have no effect on "innate" group
differences.