Tuesday, 5 November 2013

Bad reporting generates bad arguments


The Sunday Times gets 3.5 million readers a week. Anything mentioned in it has a chance of influencing public understanding and opinion in Britain. Last week it ran a piece on Professor Robert Plomin’s recent book “G is for Genes”, which focuses on the genetic contribution to scholastic achievement. The newspaper’s coverage was misleading. For example, when the journalist asked Plomin whether he intended to start a school (?), he clearly said he had no such plans. (He is a geneticist trying to find the genes for intelligence and scholastic ability.) When he was asked whether he intended to carry out genetic screening on children, Plomin said “no can do”. The story was then run saying Plomin wanted to start a school and carry out genetic screening. Predictably, there were anguished reactions from readers. This week there was even a letter from Professor Nick Mackintosh criticising Plomin’s work.

Now, the popular press is rarely the best medium for academic debate. I have asked Professor Plomin about his interview with The Sunday Times, and I have reported here his account of the slanted journalism served on him, apparently the fifth time this has happened. I have also asked Professor Mackintosh whether his letter was edited in a way that may have altered its meaning and, if so, to send me the fuller version. In the absence of that further information at the moment, I am responding to his letter as it appeared in the newspaper on Sunday.

Professor Mackintosh made 4 points:

1 “Plomin’s own data implies that genetic differences between pupils account for less than 60% of the variation in GCSE exam results”.

Answer: Yes, 58%.

2 “although IQ scores certainly do predict educational success, there is only a limited correlation between IQ and GCSE results. So environmental differences and factors other than IQ are also very significant.”

Answer: No, the correlation between IQ and GCSE results is not “limited”. The largest recent study (Deary et al., 2007), of over 70,000 English children, found a correlation of r=0.81 between general intelligence measured at 11 years of age and GCSE scores at age 16. This is extremely high predictive power (accounting for about 66% of the variance, since 0.81 squared is 0.656). The colossal sample size gives us exceptional confidence in the robustness of the results. By way of comparison, most educational psychology publications have sample sizes of a few hundred, and are far less robust. As further support for the common sense view that intelligence is involved in academic achievement, we can be even more precise about the impact of intelligence on different subjects. IQ scores on their own accounted for 58.6% of the variance in results in Mathematics, 48% in English and down to 18.1% in Art and Design, that subject being the least intellectually demanding (Deary et al., 2007).
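For readers who want to check the arithmetic, here is a minimal sketch of how a correlation converts into "variance accounted for", and back again. Squaring r = 0.81 gives roughly two thirds of the variance. The function names are mine, and I am assuming the subject-level percentages quoted above can be read as shares of variance, so the implied correlations are illustrative rather than figures taken from the paper.

```python
import math

def variance_explained(r):
    """Share of variance in one variable linearly accounted for by the other."""
    return r ** 2

def implied_correlation(share):
    """Inverse step: the correlation magnitude implied by a variance share."""
    return math.sqrt(share)

# Correlation quoted in the post: general intelligence at 11 vs GCSE scores at 16.
overall = variance_explained(0.81)

# Subject-level shares quoted in the post, treated here (an assumption)
# as fractions of variance.
subject_shares = {"Mathematics": 0.586, "English": 0.48, "Art and Design": 0.181}
subject_r = {name: implied_correlation(s) for name, s in subject_shares.items()}

print(f"r = 0.81 -> {overall:.1%} of variance")
for name, r in subject_r.items():
    print(f"{name}: implied r = {r:.2f}")
```

The point of the inverse step is that even the weakest subject in the list still implies a correlation that would be considered substantial in most social science.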

I. J. Deary, S. Strand, P. Smith and C. Fernandes (2007) Intelligence and educational achievement. Intelligence 35, 1, pp. 13-21. For private study, email the author at the University of Edinburgh and ask for a copy.

The second part of Mackintosh’s comment, about environmental influences, has never been denied by Plomin. If you can explain 58% of the variance from the genetics then there is a residue of 42% that you cannot explain from the genetics. That will be a mixture of family upbringing and school effects, and also of unexplained variance, which does not belong to anyone, but which is usually lumped in under the general category of “environmental” variance.

3 “the pursuit of the genes "for" intelligence has so far proved singularly unsuccessful”

Answer: Partly true.  We do not know the names of the genes which lead to these very strong genetic effects. We know there is an effect and several teams are still researching precisely how this comes about. We now know that genetic influence on complex traits and common disorders is due to many genes of very small effect, which means it will take huge samples to detect them. We have learned not to search for single candidate genes. Very recent research is suggesting some possible results.

4 “A recent report by the US academic Christopher Chabris and his colleagues, based on three studies that tested nearly 10,000 people, failed to replicate any previously reported claim to have identified a gene associated with variation in IQ. Its title was Most Reported Genetic Associations with General Intelligence Are Probably False Positives - that is to say untrue.”

Answer: It is true that they could not replicate previously reported claims. However, they agree with the most recent study they were able to include. Current estimates are that, to do a proper hunt for the genes which may be involved in intelligence, sample sizes of 125,000 are required, and even then it is a very difficult task. If you read the Chabris paper, you find that the authors are well aware of this. Their analysis, like similar critical analyses in other areas of the life sciences, focuses on candidate gene studies that look at a handful of genes thought to be important.  The point they are making is that candidate gene studies do not replicate.  The field has moved on to more systematic genome-wide approaches.


They clearly state that intelligence is highly heritable, as is height. They also go on to distinguish between heritability in total and the search for the specific genes in particular.  The Chabris paper is recently published (Sept 2012) but if you look at the reference list the most recent papers are 2011 and most are 2008. In psychology that would be counted as recent. In genetic research it counts as some time ago, because analytic power is increasing as each new genetic chip comes on to the market. Previously, candidate gene studies genotyped a few candidate genes because they did not have access to genome-wide genotyping arrays. Crucially, Chabris et al. have included the Davies et al. (August 2011) paper which starts out the modern set of results on very large samples. Here are their actual words about the differences between the older work and the most recent study they were able to include:

“The failure thus far to find genes associated with g does not mean that g has no genetic component. Davies et al. (2011) used data from five different genome-wide association studies (GWAS) and failed to identify any individual markers robustly associated with crystalized or fluid intelligence. They then applied a recently developed method (Yang et al., 2010; Visscher et al., 2010) for testing the cumulative effects of all the genotyped SNPs. In essence, this method calculates the overall genetic similarity between each pair of individuals in a sample and then correlates this genetic similarity with phenotypic similarity across all pairs. Following Yang et al. (2010), we dropped one twin per pair, and then estimated all pairwise genetic relationships in the resulting sample. We then dropped individuals whose relatedness exceeded .025, just as in Davies et al. (2011). Davies et al. reported that the ~550,000 SNPs in their data could jointly explain 40% of the variation in crystalized g (N = 3,254) and 51% of the variation in fluid g (N = 3,181). We applied the same procedure to the STR sample from Study 3 and estimated that the ~630,000 SNPs in our data jointly account for 47% of the variance in g (p < .02), confirming the Davies et al. (2011) findings in an independent sample. These and our other results, together with the failure of whole-genome association studies of g to date, are consistent with general intelligence being a highly polygenic trait on which common genetic variants individually have only small effects.”

So, there you have it. Chabris et al. say that the current results are consistent with the points that Plomin and Deary and others have been making!

“consistent with general intelligence being a highly polygenic trait on which common genetic variants individually have only small effects.”
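The quoted method ("calculates the overall genetic similarity between each pair of individuals ... and then correlates this genetic similarity with phenotypic similarity across all pairs") can be made concrete with a toy sketch. This is not the software or the estimator used by Davies et al. or Chabris et al.; it is a simplified regression of phenotype products on genetic similarity, with made-up genotypes and scores, and every name in it is hypothetical. Real analyses use hundreds of thousands of SNPs, thousands of unrelated people, and a more careful estimation procedure.

```python
from itertools import combinations
from statistics import mean, pstdev

def standardise(values):
    """Centre and scale a list to mean 0, SD 1 (assumes the values vary)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def pairwise_genetic_similarity(genotypes):
    """Average product of standardised allele counts for each pair of people.

    genotypes: one list per person, each holding allele counts (0, 1 or 2) per SNP.
    """
    n_people, n_snps = len(genotypes), len(genotypes[0])
    columns = [standardise([person[j] for person in genotypes]) for j in range(n_snps)]
    z = [[columns[j][i] for j in range(n_snps)] for i in range(n_people)]
    return {
        (a, b): mean([x * y for x, y in zip(z[a], z[b])])
        for a, b in combinations(range(n_people), 2)
    }

def pairwise_phenotype_products(phenotypes):
    """Product of standardised scores for each pair: high when both deviate alike."""
    z = standardise(phenotypes)
    return {(a, b): z[a] * z[b] for a, b in combinations(range(len(z)), 2)}

def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Made-up data: four people, three SNPs, invented test scores.
toy_genotypes = [[0, 0, 2], [0, 0, 2], [2, 2, 0], [2, 2, 0]]
toy_scores = [10, 11, 20, 21]

similarity = pairwise_genetic_similarity(toy_genotypes)
products = pairwise_phenotype_products(toy_scores)
pairs = sorted(similarity)
he_slope = slope([similarity[p] for p in pairs], [products[p] for p in pairs])
print(f"slope of phenotypic on genetic similarity: {he_slope:.2f}")
```

A positive slope means that pairs who are more alike genetically are also more alike in their scores. Note that this approach estimates the joint contribution of all the SNPs without needing to name a single gene.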

Some other general points:

It is a very minor quibble, but Mackintosh’s description of false positives as “untrue” carries a slight suggestion of deceit. Many detections of prostate cancer are false positives. It would have been better to have said “mistaken”. There is a difference between an untruth and a mistake.

By the way, these most recent findings may themselves fail to replicate. Sample sizes may need to be larger, and the analysis of how genes have their effects will have to get much more sophisticated.

In the current state of gene hunting, it is possible that many putative intelligence genes are false positives. There are about 30,000 genes to look at, and intelligence probably depends on several hundred genes acting in unison. However, even if it proves very difficult to track down these genes for several decades, that long search will not invalidate the long-established finding that identical twins are more alike in intelligence than fraternal twins. Heritability estimates hold true even when we do not know the names of the genes that cause them. Although many counter-proposals have been put forward, the only explanation which stands up again and again is that something about genetic relatedness brings about this strong similarity in intellect.
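To show why twin comparisons yield a heritability estimate without naming any genes, here is a minimal sketch using Falconer's classic formula, which doubles the gap between the identical-twin and fraternal-twin correlations. The twin correlations below are invented for illustration and are not taken from Plomin's data or any particular study; modern twin research fits more elaborate models than this.

```python
def falconer(r_mz, r_dz):
    """Rough variance decomposition from twin correlations (Falconer's formula).

    r_mz: correlation between identical (monozygotic) twins
    r_dz: correlation between fraternal (dizygotic) twins
    Returns (heritability, shared environment, non-shared environment plus error).
    """
    h2 = 2 * (r_mz - r_dz)   # identical twins share about twice the segregating genes
    c2 = 2 * r_dz - r_mz     # similarity not attributable to genes
    e2 = 1 - r_mz            # whatever makes even identical twins differ
    return h2, c2, e2

# Hypothetical correlations, chosen only to make the arithmetic easy to follow.
h2, c2, e2 = falconer(r_mz=0.80, r_dz=0.50)
print(f"heritability {h2:.2f}, shared environment {c2:.2f}, non-shared {e2:.2f}")
```

The three components sum to one by construction, and nothing in the calculation requires knowing which genes are involved.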

In sum, I think that Mackintosh’s letter does not give an even-handed account of Plomin’s views or of the most recent research. It may have been prompted by the newspaper’s misreporting, but it is always possible to check these things by emailing the author. The letter is not a helpful contribution to a debate which is often debased by poor reporting, selective attention, and polarised opinions.

The irony is that the Chabris et al. paper supports Plomin’s main thesis. Of course, one has to read it all the way through to find that out. I do not know how many of the 3.5 million readers of The Sunday Times will have done so, but it is probably a small number, unless of course the readers of this blog are also readers of that newspaper. In that case, can you send the link to the relevant journalists?


  1. In my opinion Mackintosh's response that IQ explains less than 60% of the variation in GCSE results is poor, and questions his academic abilities in quantitative analysis. The residual of 42% might well have myriad environmental determinants which are extremely difficult to isolate and hence will have little value for policy makers. By contrast, the explanatory power of IQ is high and accords with one's everyday experience. Indeed, it is quite conceivable that at least some of the 42% residual is indirectly linked to IQ ("externalities"), such as the IQ of fellow pupils and indeed the IQ of teaching staff.

  2. Yes, we need to hold the "environmental" explanation to the same standards required of the genetic explanation.

    1. That's probably the best single sentence I've ever read in regard to this matter & re: what's been going on in the research "community" for the last 25 or 30 years -- well-done. (hear him, hear him, huzzah!)

  3. "correlations of r=0.81 between general intelligence measured at 11 years of age and GCSE scores at age 16": correct me if I'm wrong, but such high correlation coefficients must be almost unknown in Social Science.

    Allow me a quibble too. "There is a difference between an untruth and a mistake." True, but "untruth" is a general term, embracing such types as "mistake" and "lie". Still, I accept your implication that it would have been better if he'd said "mistaken"; better, that is, for the intellectual standard of the debate, not necessarily better for someone with a different purpose.

  4. i'm 58% thru Plomin & Asbury's carefully worded book (one writer branches off into teacher-y platitudes, but the info is useful & it's an especially good synopsis:) sounds like the reaction to it is like to the Bell Curve writ small - even tho the book is very careful to word things as nicely & cautiously as possible (too careful if you ask me! the Bell Curve was similar - both strive to be circumspect & positive where possible & what they get for trying to be nice & being all things to all people is the same knee-jerk point & sputter reaction they would've gotten for being blunt & rude:) Dr. Plomin (an outstanding researcher for decades) came into the field a somewhat liberal young man, & has had to deal with learning the truth & power of genetics. He is clearly a nice person (& brilliant & fastidious researcher) & says things in the most polite way while telling the truth. God help him:)
    Deary's excellent research - that .81 correlation - isn't that between a linear composite of X variables weighted to maximally correlate with a linear composite of Y variables (i know he didn't do it as a canonical correlation, but i think that's essentially what it is - which wrings all the predictive/correlative power possible out of plural X & plural Y, but is probably the best way & right way to do it!)

  5. With the amount of misinformation, misunderstanding, and outright lies that make it into the mainstream press, who needs the internet?

    It annoys me when the press feeds one researcher a malformed quote or summary of another researcher's work, looking for the former to criticize the latter. This the former will quite often do, and doesn't realize until afterwards that he is criticizing a complete strawman.

  6. This recently published study has been taken to prove that no heritable traits will ever be identified (one person even claimed that it made it clear that mental illnesses could not be linked to genes!):


    I cannot see that the study really contradicts what you've explained here at all, even though there may be more than one set of DNA inside a human brain. Have I misunderstood something?

    1. There is nothing in the original paper that could lead one to draw such a conclusion. The "Mosaic Copy Number Variation in Human Neurons" article seems top notch, but all it does is report that the DNA of individual neuronal cells differs.

      This adds an additional layer of complexity to the "old" problem of finding disease-linked mutations, but does not make finding genetic explanations of disease impossible. Before it was hard to distinguish the (probably great majority of) mutations that do not matter in disease from those that do. Now this will likely become even harder as different cells harbor different mutations.

      To quote the relevant parts of the science-editorial discussing that paper:

      "It is often assumed that genome sequencing will explain disease cases by revealing the causative genetic blemish—the mutation that stands out on a background of otherwise flawless molecular function. But whole-genome analysis shows that dysfunction abounds. Rare and common structural variants, including deletions of long genomic segments, pervade every genome. ... Far from pinpointing single mutations on a background of perfect function, genome sequencing has instead generated its own needle-in-a-haystack problem: distinguishing the variants that truly matter to an illness from the far-larger number of functional variants that are present in every genome. It is now clear that, beyond simple, monogenic disorders, understanding complex disease will require sequencing thousands of genomes and ascertaining the patterns shared among the genomes of many affected individuals."

      "Such mutations could be part of the genetic architecture underlying intellectual disability, developmental delay, and the more severe, syndromic forms of autism—although somatic mutations seem less likely to explain substantial fractions of highly heritable disorders such as schizophrenia and bipolar disorder."

      In the original paper the authors mention that such mutations have been found in the DNA of monozygotic twins. This might partly explain why MZ twins sometimes differ on traits that show moderate heritability (mental disease, sexual orientation etc.) Perhaps some of the environment part of heritability studies can be explained by such random mutations? No-one knows at the moment, but it would be interesting to find out.

    2. Thank you so much for this comment, which certainly clarifies the finding for me.

  7. My reading of this paper is that the authors have found it difficult to sequence DNA from individual neural cells, finding that individual cells appear to differ from the overall average of cells. I cannot judge this paper, but it seems to relate to transcription errors, and I do not know if it has wider implications. Doubt it, but it is worth replicating.