There has long been a debate as to whether educationalists should be streamed, so that the brighter practitioners should not be held up by the slower pace of their less able colleagues. The contrary view is that educationalists of different levels of ability should be mixed together, so that the clever ones can lead the intellectually impaired to better things. It is not clear where the Institute of Education stands on this important policy matter.
This debate is remarkably similar to the question as to whether children should be streamed in schools. Before all else, do a thought experiment: when you stream children, what result would count as success? Certainly if all streamed children do better than un-streamed children then that would count as a clear win. It would show that “correct pace” teaching was good for all. However, what if bright children race ahead whenever they do not have to wait for their less bright peers? Should that be counted a success, or a partial success, or a failure? The economic and cultural contribution of the brightest minds appears to be considerably greater than that of average citizens, so it might be best to give them a clear run, and settle accounts later with redistributive taxation. On the other hand, if you value the mean value of achievement for the group as a whole, then brighter children should be held back to encourage the others.
On the issue of streaming, Samantha Parsons & Sue Hallam, both at the Institute of Education, have written “The impact of streaming on attainment at age seven: evidence from the Millennium Cohort Study”, Oxford Review of Education, 24 September 2014. http://www.tandfonline.com/loi/core20
Their work has been prominently reported, which is a good thing. It is based on a very good sample, which is also a good thing. Most citizens will read the newspaper accounts only, so here is the Guardian headline as a guide:
School streaming helps brightest pupils but nobody else, say researchers: Splitting classes by ability undermines efforts to help disadvantaged children, finds research into English primaries
So much for what the public will read and believe to have been proved. What does the actual study reveal?
The Millennium sample is a good size, is representative, and has an increased representation of minority, poor and immigrant groups. The sample is somewhat better than the population averages. The sample studied in the paper was N=2544, of whom 83% were not streamed. The sample size is fine by social science standards, and much better than the modal values in publications, though negligible compared to the 70,000+ in the Deary et al (2007) education paper.
What is less satisfactory is that the authors do their study on the basis of Key Stage 1, when the children are 7. These ratings are done by teachers on the basis of “informal tests”. I do not know if these are actual tests with published characteristics, or just an overall impression. They also have an earlier baseline teacher assessment called the Foundation Stage Profile. For children at school in England these assessments are made on the basis of the teacher’s accumulating observations and knowledge of the whole child.
Seven years of age is rather early to come to any conclusions about teaching methods. This is the earliest age, from a psychometric point of view, at which we can get an indication of whether children have reading problems of any significance. It is also a little hard to believe that 7-year-olds have achievements in science. These teacher assessments are somewhat weak, and insensitive to actual differences in ability. I have looked at them in relation to court cases, and would not put too much reliance on them. As a rule of thumb, if you want to know how well teachers teach, do not rely on teachers’ assessments of progress. Use national examinations marked by others.
Now we turn to the crux of the paper: the difference between schools that stream and schools that don’t. We need to know if schools that stream are different from those schools which don’t in terms of parental background, child ability, and other teaching methods. In particular, we need to know if the scholastic achievements of children in the un-streamed schools have the same means and standard deviations as the achievements of children in the streamed schools. Otherwise the overall scores of the un-streamed children and the streamed children may differ for reasons that are not directly due to streaming.
For example, schools which find they have a very broad range of child abilities (large standard deviation) might have to do streaming; schools with a narrower range of abilities (low standard deviation) might not bother. We need to check that a fair comparison is being made.
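The check described above amounts to putting the mean and standard deviation of the two groups side by side. Here is a minimal sketch of that comparison, using only the Python standard library. The function names and the scores are made up for illustration and are not data from the paper; a real check would use the actual Key Stage 1 totals and a formal test of equal variances.

```python
from statistics import mean, stdev


def describe(scores):
    """Return (n, mean, sample standard deviation) for a list of scores."""
    return len(scores), mean(scores), stdev(scores)


def compare_spread(streamed, unstreamed):
    """Summarise both groups and report the ratio of their variances."""
    n_s, mean_s, sd_s = describe(streamed)
    n_u, mean_u, sd_u = describe(unstreamed)
    return {
        "streamed": (n_s, mean_s, sd_s),
        "unstreamed": (n_u, mean_u, sd_u),
        "mean_difference": mean_s - mean_u,
        # A ratio far from 1 means the two groups differ in spread.
        "variance_ratio": (sd_s ** 2) / (sd_u ** 2),
    }


# Made-up illustrative scores, NOT data from the paper.
streamed_scores = [8, 10, 12, 14, 16, 18, 20]
unstreamed_scores = [13, 14, 15, 15, 16, 17, 15]

summary = compare_spread(streamed_scores, unstreamed_scores)
for key, value in summary.items():
    print(key, value)
```

In this invented example the two groups have similar means but very different spreads, which is exactly the situation in which a raw comparison of streamed and un-streamed schools would be unfair.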
The results in Fig 1 suggest that those who were streamed (17% of this sample) were duller and more variable than the majority who were un-streamed. Looking within the streamed children, the brightest are only a little above the average of the un-streamed majority. Case proved that streaming is not worth it? Not at all.
This is yet another case where very simple statistics would be a great help. Showing the actual distribution of the Key Stage 1 total scores for the streamed 17% and the un-streamed 83% would be useful. The streamed children are out-numbered nearly five to one. 222 children were in the ‘top’ stream, 130 in the ‘middle’ stream and 94 in the ‘bottom’ stream. These are reasonable numbers, but hardly substantial ones. We must check that the decision to stream children is not influenced by student heterogeneity. As far as I can see, these checks have not been done.
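The group sizes quoted above can be checked with a few lines of arithmetic. This sketch assumes that the three stream counts together make up the whole streamed group and that everyone else in the N=2544 sample was un-streamed; the post does not state this explicitly, so treat the derived figures as approximate.

```python
# Figures quoted in the post.
total_sample = 2544
stream_counts = {"top": 222, "middle": 130, "bottom": 94}

# Assumption: the three streams are the entire streamed group.
streamed = sum(stream_counts.values())
unstreamed = total_sample - streamed

streamed_share = streamed / total_sample  # proportion of the sample streamed
ratio = unstreamed / streamed             # un-streamed children per streamed child

print(f"streamed: {streamed}, un-streamed: {unstreamed}")
print(f"streamed share: {streamed_share:.1%}")
print(f"un-streamed to streamed ratio: {ratio:.1f} to 1")
```

On these assumptions the streamed group is 446 children, about 17.5% of the sample, and the un-streamed children out-number them by roughly 4.7 to one.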
The authors have done regression analyses so as to predict the Key Stage 1 scores. This potentially obscures the position in that it denies us a clear contrast between the streamed/un-streamed groups. Instead, you have to try to derive these differences from the beta coefficients.
The authors note: Standardised regression coefficients do not directly indicate the effect of a unit change in the outcome, they rather represent change in terms of standard deviations. The predictor with the biggest regression coefficient is the most important predictor of the outcome, regardless of the direction of the relationship.
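To see why beta coefficients are hard to translate back into score differences, it helps to recall how a standardised coefficient relates to a raw slope: the slope is multiplied by the standard deviation of the predictor and divided by the standard deviation of the outcome. The sketch below illustrates this for a single predictor, with made-up numbers that are not from the paper; the function names are hypothetical.

```python
from statistics import mean, stdev


def slope(x, y):
    """Ordinary least-squares slope of y on a single predictor x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def standardised_beta(x, y):
    """Rescale the raw slope into standard-deviation units."""
    return slope(x, y) * stdev(x) / stdev(y)


# Made-up illustrative values, NOT data from the paper.
baseline = [1, 2, 3, 4, 5]
outcome_points = [2, 4, 5, 4, 5]
outcome_scaled = [10 * v for v in outcome_points]  # same outcome, different units

print(slope(baseline, outcome_points), slope(baseline, outcome_scaled))
print(standardised_beta(baseline, outcome_points),
      standardised_beta(baseline, outcome_scaled))
```

The raw slope changes tenfold when the outcome is rescaled, while the standardised coefficient stays the same. That makes betas convenient for ranking predictors, but it also means that recovering an actual gap in Key Stage 1 points requires the standard deviations, which is why a plain table of group means and standard deviations would be more informative.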
One little-reported conclusion: The child’s earlier academic performance, as measured by the Foundation Stage Profile (FSP) score, was identified as the most significant predictor of later academic attainment as measured by KS1 performance.
Another little-reported conclusion: Among the family socio-economic characteristics, parental education remained significantly associated with the KS1 outcomes, after controlling for all other variables in the model. Household income appeared to be an independent risk factor for overall KS1 performance, as did lone parenthood for KS1 maths attainment.
Comment: The finding on parental education matches what Heiner Rindermann found in many international samples: parental education is more important than parental wealth. That raises the possibility that unmeasured genetic factors make a contribution.
Although the authors have not provided what I regard as a proper comparison between schools, they surprisingly say:
These differences have developed over a short period of time, since the children began compulsory schooling. The findings support the divergence hypothesis (e.g. Linchevski & Kutscher, 1998) which is of particular concern given that prior teacher rated ability at age five was taken into account, along with a range of child and family and school factors.
I am not persuaded on the basis of this paper that “these differences have developed” as a consequence of schooling. I will of course check to see what further analyses they may have done. There might be no differences in standard deviations between the two groups, so it may be a moot point.
Under “Implications” they write: The evidence from this and earlier research demonstrates that streaming does not of itself raise attainment for all children (e.g. Barker Lunn, 1970; Ferri, 1971) and widens the gap between low and high attaining pupils. Schools need to take this into account when planning the ability grouping structures that they adopt.
I do not think they can argue that, on the basis of their results. They have already said that the prior measures of the Foundation Stage Profile account for a large part of the variance in children’s attainments. The foundation profile has a large gaping hole in it (see below). They have not fully explored the reasons for the possible differences between the streamed and un-streamed children, such that streaming might be applied where there are wide differences in ability.
What dog did not bark in the night? There are no cognitive ability measures reported. None. Why do so many authors fail to consider that intelligence may be a factor in educational attainment? Why leave this out, when it can be measured quickly, and always accounts for a significant proportion of educational outcomes?
Finally, here is my summary:
Although sample sizes are small and the prior measures of ability are weak, those prior abilities are the best predictors of attainments at age 7, and although we cannot be sure that streamed schools haven’t got a wider range of abilities than un-streamed schools, nonetheless it looks as if streaming does not lift the overall abilities of students.
Snappy headlines are one of my most evident whole-person special skills.