Saturday, 7 December 2013

Representativeness beats size


Humans are various. If you take a representative sample you find that they differ considerably. If you take a highly selective sample you find that they are more similar. Children admitted to non-selective, fully subsidised schools are a pretty good sample of the population. A very few children may fail to attend because of some infirmity, including severe mental backwardness, and a few may be tutored at home, but otherwise, by and large, the whole range of ability will be fairly represented. Population samples, such as the Scottish surveys, give a good picture of ability differences. In crude terms, if you set such an unselected class a task, the fastest child will be about five times faster than the slowest. (We know that to be the case, but not exactly why intellects differ to this significant degree.) Any representative sample shows us that intelligence matters. Unrepresentative samples often suggest that it doesn’t.

What happens when we take a “convenience” sample? Most psychology results are based on college students, because they are convenient to study. That means that they tend to be in the upper half of the ability distribution. The more selective the institution, the higher the ability and the narrower the range. In such universities ability differences are diminished. Intelligence becomes less visible because high intelligence was a condition of entry. Other differences in personality and attitudes consequently become more apparent.

The curious impression that intelligence “disappears” after you select for it can lead to many misunderstandings. For example, if you set up a survey on the internet which appeals to clever and diligent people, you tend to get a lot of them replying. “Please take this test of memory and concentration so that we can help people with memory problems” will recruit a different sort of person than a request to “Rank these 10 celebrities by their sex appeal”. For all I know, the latter task may contribute more to human happiness, but I wager the first group will be brighter, more conscientious, and possibly more likely to be open-minded and liberal in their attitudes. Although most people know about Zimbardo’s “Stanford Prison Experiment”, which suggested that ordinary people can become sadistic prison guards, fewer people know that in a replication of the recruiting stages of the experiment the personalities of volunteers for “prison” experiments turned out to be more tough-minded than those who applied for the more mundane control experiment. Selection happens even when you don’t know it, and blunt instruments can’t always detect it.

These reflections are engendered by stories that never die. As long ago as 2 January I did a dismissive review of yet another paper trumpeting the finding that IQ does not exist. The popular press love those stories.

The same story is still doing the rounds, like a vampire that does not understand that a stake has been driven through its heart. There are three main problems with the paper: 1) online intelligence tests attract more intelligent people, which reduces common variance and boosts group factors; 2) despite that, the researchers still found a g factor, but then obscured that fact by forcing a group factor solution on which they based their conclusions; and 3) they made very large claims from a selective sample and a selective analysis.

I do not want to give more air time to a questionable interpretation of data, but I do want to draw attention to the fact that representativeness is more important than the total number of persons who take part in a study. If a million persons take an online intelligence test that is an impressive result, but it is very probably far less representative than the properly stratified random sample of 2,400 persons required to re-norm an intelligence test. It might be easy to get a million persons to take part in a quiz about Elvis Presley, including giving an opinion as to whether he was actually dead, but it would be a better assessment of public opinion to select 2,000 citizens at random. As to the reality of the mortality of The King, it would probably be best to consult one person, the coroner, but that is another matter.
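The point about size versus representativeness can be illustrated with a toy simulation. This is only a sketch under invented assumptions: the population is given normally distributed scores (mean 100, SD 15), and the hypothetical `volunteers` rule, under which brighter people are more likely to take the online test, is made up for illustration and is not taken from any of the studies discussed here.

```python
import math
import random
import statistics


def compare_samples(seed=0, population_size=200_000, survey_n=2_400):
    """Compare a large self-selected sample with a small random one."""
    rng = random.Random(seed)

    # Assumed population: scores drawn from a normal distribution (100, 15).
    population = [rng.gauss(100, 15) for _ in range(population_size)]

    def volunteers(score):
        # Hypothetical self-selection rule: the chance of taking the online
        # test rises with the score (logistic curve centred on 110).
        p = 1.0 / (1.0 + math.exp(-(score - 110) / 10))
        return rng.random() < p

    online = [s for s in population if volunteers(s)]  # big but self-selected
    survey = rng.sample(population, survey_n)          # small but random

    def summary(scores):
        return {
            "n": len(scores),
            "mean": statistics.fmean(scores),
            "sd": statistics.stdev(scores),
        }

    return {
        "population": summary(population),
        "online": summary(online),
        "survey": summary(survey),
    }


if __name__ == "__main__":
    for name, stats in compare_samples().items():
        print(f"{name:>10}: n={stats['n']:>7}  "
              f"mean={stats['mean']:6.1f}  sd={stats['sd']:5.1f}")
```

Under these assumed settings the self-selected group should come out many times larger than the random survey, yet with a mean well above the population mean and a narrower spread, while the small random sample stays close to the population on both. The exact figures depend entirely on the invented selection rule.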

Although I do not value large but unrepresentative samples, is it always pointless to study the effects of intelligence when there is a severe restriction on the range of intelligence? No, not if you know what you are doing as a researcher, and you pay due attention to the effects of restriction of range, and conduct several longitudinal studies so as to check on the stability of your results.

One of the most contested courses in the United Kingdom is medicine. Competition is very tough. It attracts bright people who want to help others. I spent 40 years in their company, hoping that psychology would influence and improve their treatment of patients. Does it matter whether medical students are intelligent or not?

One researcher who took an interest in this matter was Prof Chris McManus, who was in the privileged position of having completed his medical education before turning to psychology. (He also got the 2002 Ig Nobel Prize for Medicine for his 1976 Nature paper on scrotal asymmetry.) He tried to work out what were the best predictors of success in medicine. The last time he told me about his results was in the rarefied company of the few lecturers who taught psychology as applied to medicine at the University of London, about three decades ago.

What are the most recent results derived from long term studies of doctors in the United Kingdom? Do they need to be bright and scholastically able, or are other aspects more important? Should we bother about their exam results?

The Academic Backbone: longitudinal continuities in educational achievement from secondary school and medical school to MRCP(UK) and the specialist register in UK medical students and doctors. IC McManus, Katherine Woolf, Jane Dacre, Elisabeth Paice and Chris Dewberry. BMC Medicine 2013 11:242

What McManus has found is that, even in this restricted range of bright persons, there is still an effect of intelligence and educational achievement. Good doctors have to have an academic backbone, and that influences success in their medical careers.

“A-levels correlated somewhat less with undergraduate and post-graduate performance, but there was restriction of range in entrants. General Certificate of Secondary Education (GCSE)/O-level results also predicted undergraduate and post-graduate outcomes, but less so than did A-level results, but there may be incremental validity for clinical and post-graduate performance. The AH5 (intelligence test) had some significant correlations with outcome, but they were inconsistent. Sex and ethnicity also had predictive effects on measures of educational attainment, undergraduate, and post-graduate performance. Women performed better in assessments but were less likely to be on the Specialist Register. Non-white participants generally underperformed in undergraduate and post-graduate assessments, but were equally likely to be on the Specialist Register. There was a suggestion of smaller ethnicity effects in earlier studies.
Conclusions: The existence of the Academic Backbone concept is strongly supported, with attainment at secondary school predicting performance in undergraduate and post-graduate medical assessments, and the effects spanning many years. The Academic Backbone is conceptualized in terms of the development of more sophisticated underlying structures of knowledge (‘cognitive capital’ and ‘medical capital’). The Academic Backbone provides strong support for using measures of educational attainment, particularly A-levels, in student selection.”

In summary, even when there is a restriction of range in intelligence and scholastic attainment one can show an effect of these variables, and also understand that the restriction of range attenuates that effect. Furthermore, once you have selected for intellect by setting a high bar for entry, the variance accounted for by intelligence in that population will be diminished, but one cannot jump to the erroneous conclusion that intelligence is no longer predictive. It is still worth selecting medical student applicants because they are intelligent and have high scholastic achievements.
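The attenuation described above can be sketched with simulated data. Everything here is an assumption made for illustration: a true correlation of 0.6 between an entry score and a later outcome, and admission of only the top 10% of applicants on the entry score. The correction applied at the end is the standard textbook formula for direct range restriction (Thorndike's Case 2); whether and how any particular study corrects for range restriction is a separate matter, and the function names are hypothetical.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correct_for_restriction(r, sd_full, sd_restricted):
    """Thorndike Case 2 correction for direct selection on the predictor."""
    u = sd_full / sd_restricted
    return r * u / math.sqrt(1 - r ** 2 + (r ** 2) * (u ** 2))


def simulate_selection(true_r=0.6, n=50_000, top_fraction=0.10, seed=1):
    rng = random.Random(seed)
    # Assumed model: standardised entry score x and later outcome y,
    # constructed so that their population correlation is true_r.
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = true_r * x + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1)
        pairs.append((x, y))

    # Admit only the top fraction of applicants on the entry score.
    pairs.sort(key=lambda p: p[0], reverse=True)
    admitted = pairs[: int(n * top_fraction)]

    all_x, all_y = zip(*pairs)
    adm_x, adm_y = zip(*admitted)

    full_r = pearson(all_x, all_y)
    restricted_r = pearson(adm_x, adm_y)
    corrected_r = correct_for_restriction(
        restricted_r, statistics.pstdev(all_x), statistics.pstdev(adm_x)
    )
    return {"full_r": full_r, "restricted_r": restricted_r,
            "corrected_r": corrected_r}


if __name__ == "__main__":
    for name, value in simulate_selection().items():
        print(f"{name:>13}: {value:.2f}")
```

In this toy setup the correlation among the admitted group should fall to roughly half its value among all applicants, even though the underlying relationship is unchanged, and the corrected figure should land close to the full-range value. The correction relies on linearity and equal scatter across the range, which real admissions data need not satisfy.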

Finally, it is better to commend a good paper than to lament a weak one.


  1. The incompetent study by Hampshire et al. purportedly refuting general intelligence has been expertly eviscerated by Ashton et al. here and here. Among other things, they did a confirmatory factor analysis of the online IQ data used by Hampshire et al., showing that a hierarchical g solution fits the data better than the non-g solution favored by Hampshire et al. -- even though the data are clearly rather bad, as evidenced by their abnormally low common variance. They also note that the brain scan data show orthogonal factors only because Hampshire et al. assume that the factors are orthogonal, not because of any evidence. Moreover, the brain scan factors are based on within-individual variation rather than between-individual differences, and extrapolating from within-individual data to between-individual data requires heroic assumptions.

    1. Didn't realise they had used within-individual scan data on the brain scans. I simply stopped reading when I realised that n=16

  2. "What happens when we take a “convenience” sample?" If we were cunning, we'd use the sample but then devote some time to trying to identify a population of which it might be a representative sample.

    Alternatively, accept that much of psychology concerns only what American psychology undergraduates say that they think they feel.

  3. I do not understand why such a good journal (well, high-impact at least) accepted such a crummy paper? A self-selected sample is a mortal sin in any study, and especially when measuring an individual differences trait such as intelligence.

    Furthermore, I'd just like to end by pointing out that Jensen predicted some of the findings in this study:

    "However, the fact that g has all the characteristics of a polygenic trait (with a substantial component of nongenetic variance) and is correlated with a number of complexly determined aspects of brain anatomy and physiology, as indicated in Chapter 6, makes it highly probable that g, though unitary at a psychometric level of analysis, is not unitary at a biological level."