Saturday, 1 June 2013

Educational attainment, intelligence, and the relentless cracking of the genetic code

Readers of this blog will be familiar with my views on behavioural scientists’ use of very small samples, from which they draw very large conclusions, in sharply opposing directions, very frequently. This makes for good headlines and weak science. So, it gives me great pleasure to read an article in Science about educational attainment which has a sample size of 101,069 persons, and then promptly checks its findings on a further sample of 25,490 other people. With one bound they propel themselves into the stratosphere of behavioural science: a massive sample of “discovery” and a very large “replication” sample. Psychologists, with some notable exceptions, generally limit themselves to a small “discovery” sample which they treat as if it were the entire universe, and leave replication to others.

GWAS of 126,559 Individuals Identifies Genetic Variants Associated with Educational Attainment. Science Xpress

You may think it churlish of me not to give the authors’ names, but as befits a collaborative study, the author list is the length of a short letter to Nature, and the 175 references are a longer paper in themselves.

The cautious naming of a treasured sample as being one of “discovery” is very wise. We use samples like a net cast into the sea to try to discover what fish are like in all oceans. Our conclusions as we pick through our fish will contain many elements which are characteristic of that particular catch, and not of all other catches. We have to dip the net into another sea at another time in order to better understand what creatures live in the oceans.

As you might suspect, the scientists in this paper are gene hunters. They know that the only way to try to make sense of the genetic code is to hit the problem with massive samples, thus squeezing out random characteristics and homing in on the real causes of variance. They found that three SNPs had genome-wide significance as regards educational attainment, and also found that a score derived from all the measured SNPs, significant or not, each having a tiny but additive effect, accounted for 2% of educational attainment and 2.5% of cognitive function.
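For readers who like to see the idea in motion, here is a toy simulation of an additive score of this kind. It is a sketch under loud assumptions, not the authors' method: the genotypes, allele frequencies and effect sizes are all invented, the function names are my own, and the score uses the simulated "true" effects rather than weights estimated in a discovery sample. The noise is deliberately scaled so that the score should explain roughly 2% of the variance in the outcome.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation, written out by hand to stay dependency-free."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_variance_explained(n_people=2000, n_snps=200, target_r2=0.02, seed=1):
    """Return the share of outcome variance explained by an additive SNP score.

    Everything here is simulated: no real genotypes or effect sizes are used.
    """
    rng = random.Random(seed)
    # Hypothetical allele frequencies and per-SNP effects (assumptions).
    allele_freqs = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
    effects = [rng.gauss(0.0, 1.0) for _ in range(n_snps)]

    scores = []
    for _ in range(n_people):
        # Each genotype is 0, 1 or 2 copies of the counted allele.
        genotype = [(rng.random() < f) + (rng.random() < f) for f in allele_freqs]
        # The score is simply the effect-weighted sum of allele counts.
        scores.append(sum(b * g for b, g in zip(effects, genotype)))

    # Add enough noise that the score accounts for about target_r2 of the variance.
    mean_score = sum(scores) / n_people
    score_var = sum((s - mean_score) ** 2 for s in scores) / n_people
    noise_sd = (score_var * (1.0 - target_r2) / target_r2) ** 0.5
    outcomes = [s + rng.gauss(0.0, noise_sd) for s in scores]

    return pearson_r(scores, outcomes) ** 2


if __name__ == "__main__":
    print(f"variance explained: {simulate_variance_explained():.3f}")
```

With only a couple of thousand simulated people the estimate wobbles noticeably around the 2% target from one seed to the next, which is a small illustration of why the real study needed a six-figure sample.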

Can we now trumpet “The genes for IQ have been found”? The authors make no such error. With commendable caution they say that these areas of the genome are associated with health, cognitive and central nervous system expression, so they are worth following up, and that their study provides a benchmark for power analyses in social science genetics.

This is a very important study, and sets a high standard for others to follow. What does this mean for the genetics of intelligence?

Criterion heterogeneity is the technical term for the “rubber ruler” effect. Suppose we try to study intelligence by asking every school to name their most able student. Some schools in unfavoured catchment areas will nominate a student who would not be rated outstanding in a slightly better school with brighter students. Suppose we try to do better by measuring the number of years that the student spends getting educated. (We discount those rare students who are too bright to spend much time at school and leave early to found their own companies). As a rule of thumb, the brighter you are, the longer you spend in education, because you go to college, and may then continue to even higher degrees. The authors, no slouches, were wise to this problem, and used the International Standard Classification of Education Scale (1997) to calculate years of education, and whether or not the person went to college. Frankly, this is not all that much use when compared to a standard scale like a national exam with a grade point total, but they had to recruit over several countries and this was the best way to get things on a common metric, however crude.

The subjects were all Caucasian, i.e. white, and most of them were about 30 years of age, by which time they should have completed college. 23.1% had a college degree. The authors do not mention this, but since white IQ is 100 just about anywhere in the world, it suggests that those with IQs of about 111 and above were getting into college (on a scale with a standard deviation of 15, 23.1% of the white population have IQs of about 111 and above). Depending on your attitudes to further education you may see it as a great thing that persons at that level of intellect are in college, or a waste of money and a dreadful lowering of standards. To me it suggests that “college” covered a wide range of courses. The more demanding colleges recruit from those with IQs of 115 and above (top 16% of the population) and elite colleges require IQ 130 (top 2.3%). This trade-off between intellect and educational quality is depicted in a previous post “Social class and university entrance”.
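The percentile arithmetic in this paragraph is easy to check for yourself. The sketch below assumes IQ is normally distributed with a mean of 100 and a standard deviation of 15; the standard deviation is my assumption, since the paragraph only gives the mean, and the function names are made up for the illustration.

```python
from statistics import NormalDist

# Assumed IQ scale: normal, mean 100, SD 15 (the SD is an assumption).
IQ = NormalDist(mu=100, sigma=15)


def cutoff_for_top_share(share):
    """IQ score above which the given share of the population falls."""
    return IQ.inv_cdf(1 - share)


def share_above(score):
    """Share of the population scoring above the given IQ."""
    return 1 - IQ.cdf(score)


if __name__ == "__main__":
    print(f"cutoff for top 23.1%: {cutoff_for_top_share(0.231):.1f}")
    print(f"share above 115: {share_above(115):.1%}")
    print(f"share above 130: {share_above(130):.1%}")
```

Changing `sigma` shows how sensitive the college cutoff is to the scale chosen, which matters because some tests are normed with a standard deviation other than 15.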

However, not all is lost. The diligent authors found that the peace loving Swedes had given all their military service conscripts a proper IQ test, and the very same genetic markers did a better job of predicting IQ for this subset, accounting for a princely 2.5% of the variance. So, the continuous measure of intelligence was slightly easier to predict than the lumpy and not so informative educational measures. By way of comparison with other personal characteristics, the same genetic analysis predicted 10% of the variance for height. By way of historical comparison, until the last two years the amount of variance of intelligence which could be explained by genetic analysis was zero.

Even larger samples with IQ measures and further analysis of the genetic code may well increase the intelligence variance accounted for. In all probability there are very many genes which contribute to what we call intelligence, all with slight but useful effects.

The hunt continues.


  1. "10% of the variance for height" seems to me to be reasonably successful. Or is it the case that "Height is just a social construct" and therefore they must be wrong? It's just so hard to keep up with the counterfactuals to which one is expected to make obeisance these days.

  2. Hi James. It's not the case that "a score derived from [three SNPs with genome wide significance] accounted for 2% of educational attainment and 2.5% of cognitive function".

    All SNPs regardless of significance accounted for 2%. The three significant SNPs accounted for a 100th of that: just 0.02%.

    "Education" is a multidimensional aggregation. That aggregation likely reduces power dramatically, as Sophie van der Sluis has shown.

    van der Sluis, S., Verhage, M., Posthuma, D., & Dolan, C. V. (2010). Phenotypic complexity, measurement bias, and poor phenotypic resolution contribute to the missing heritability problem in genetic association studies. PLoS One, 5(11), e13929.


    1. Tim, many thanks. Sloppy writing on my part, so thanks for correcting. van der Sluis very interesting. Aggregation is a problem, but the crudity of the educational measure makes it worse. As to measurement invariance, who actually achieves it? I think this is the fearsome Dolan once again, demanding a purity not yet attained by many psychometricians.