There has been quite a bit of comment about the Owen et al. study, published online in Nature on April 20, 2010. A quick synopsis: the BBC show Bang Goes the Theory worked with the study authors to test the hypothesis that commercially available brain training programs transfer to general cognitive abilities. The conclusion was that, despite improvements on the trained tasks, “no evidence was found for transfer effects to untrained tasks, even when those tasks were cognitively closely related.”
The study was conducted through the show’s web site. Of 52,617 participants who registered, approximately 20% (11,430) completed full participation in the study, which consisted of two benchmarking assessments 6 weeks apart with variants of neuropsychological tests, plus at least two training sessions. People were randomly assigned to one of three groups that were asked to train for about 10 min a day, three times a week, for the 6-week period, though they could train more or less frequently. One of the two experimental groups was a “brain training” group that completed tasks including simple arithmetic, finding missing pieces, matching symbols to a target, ordering rotating numbers by numerical value, updating, and memory for items. Most of the training tasks were 90 sec each; the rotating numbers task was 3 min. These activities are similar to those used in “edutainment” programs that can be played online or with a handheld device. The other experimental group was trained on reasoning tasks: identifying the relative weights of objects based on a visual “seesaw,” selecting the “odd” item in a concept-formation-type task, a task involving thinking through the effects of one action on current and future states, and three planning tasks, including drawing a continuous line around a grid while ascertaining that the line will not hinder later moves, a version of the Tower of Hanoi task, and a tile-sliding game. The control group spent time answering questions about obscure facts and organizing them chronologically using any available online resource. Results indicated that the two experimental groups performed better than the control group on only one outcome test, of grammatical reasoning; there were no differences between either experimental group and the controls on the remaining tests. The experimental groups had improved on the trained tasks but not on the transfer tasks.
Although some news reports suggest that these findings are definitive, there are a number of concerns. First, because only a few tests were used, the findings may have been overgeneralized to all forms of brain training. Second, questions have been raised about the amount of time allocated to training and about testing in the home environment. The study reported no relationship between exposure to training and outcome, suggesting that the amount of time was not critical. However, there may not have been sufficient training overall, and there may be a threshold of exposure that must be reached before transfer is observable. Third, there are questions about whether recruiting participants from a show website that is openly skeptical of various claims produced a sample biased against finding positive effects of training on generalized outcomes.
1. Substantial, selective and unexplained dropout rates
There was a substantial dropout rate for the study: as best as can be determined, 52,617 participants registered for the trial, but only 11,430 completed the pre- and post-training benchmarking assessments and at least two 10-minute training sessions. If there were no other inclusion criteria, this indicates a substantial dropout rate. Assignment to the experimental and control groups was equiprobable, yet it is not indicated how many people actually dropped out of the reasoning/planning group, which had a final sample size of 4,678. Dropout rates were higher for the “brain training” group (13%) and the control group (32%). In a clinical trial, such high and selective dropout rates would be considered very problematic. On balance, however, the control group did not score better than the experimental groups on the baseline tests.
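The overall completion figures quoted above can be checked with a quick back-of-the-envelope calculation (a sketch using only the two totals reported here; the article does not report per-group enrollment, so group-level dropout cannot be reconstructed this way):

```python
# Back-of-the-envelope check of the overall completion and dropout figures.
registered = 52_617  # participants who registered for the trial
completed = 11_430   # completed both benchmarks and at least two sessions

completion_rate = completed / registered
dropout_rate = 1 - completion_rate

print(f"Completion rate: {completion_rate:.1%}")  # ~21.7%, i.e. "approximately 20%"
print(f"Overall dropout: {dropout_rate:.1%}")     # ~78.3%
```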
2. Questionable outcome measurement and interpretation
I add to these questions a serious methodological concern about the measurement of the outcome data. Of the four transfer tests, only one (reasoning) was scored as total correct. The other three (verbal short-term memory, spatial working memory, and paired associates) were scored as span tests (maximum items correct within a trial). Span tests are notoriously insensitive to change, because the maximum working memory span is 4 “chunks” of information when rehearsal processes are prevented (see Cowan, 2001). Verhaeghen, Cerella, and Basak (2004) found that approximately 10 hours of training on one working memory span task produced a broadening of span from one to four chunks in young adults when rehearsal was prevented. Thus the use of span as the type of measurement is questionable. Even when span measures potentially allow for rehearsal, as in the Owen study with a standard digit span task, the range of performance is quite narrow: digit span is approximately 7 ± 2 items. Total-correct scoring produces a considerably wider range of performance, which allows for “growth.” This can be seen in the mean scores and error bars of Figure 1 in the Owen et al. (2010) article, by comparing reasoning to each of the other three outcome measures.
The article indicates that the outcome measures have been used to show effects of drugs on cognitive performance, and this is indeed the case, but the cited articles use total-correct rather than span measures for the paired associates and spatial working memory outcomes (e.g., Turner et al., 2003). Notably, on the reasoning measure in the Owen et al. study, there was significant improvement in the two training groups compared to the controls, with effect sizes of .17 and .22. These are considered small effect sizes, but they are not much different from the effect sizes of modafinil on backwards digit span, stop-signal response time, and visual memory in a much smaller sample of 60 adults (Turner et al., 2003). The authors of the modafinil study wrote that their results “suggest that modafinil offers significant potential as a cognitive enhancer, particularly with respect to its effects on planning, accuracy and inhibition” (p. 268). We note that modafinil has a moderate effect size of .52 on spatial planning.
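For readers unfamiliar with how standardized effect sizes like .17 and .22 are computed, here is a minimal sketch of Cohen's d (the standardized mean difference). The group means and standard deviations below are hypothetical illustrations, not values from either study:

```python
import math

def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_sd = math.sqrt(
        ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    )
    return (mean_a - mean_b) / pooled_sd

# Hypothetical improvement scores: a training group vs. controls.
d = cohens_d(mean_a=2.1, sd_a=3.0, n_a=4678, mean_b=1.5, sd_b=3.0, n_b=4678)
print(f"d = {d:.2f}")  # 0.20: "small" by Cohen's conventions (~.2 small, ~.5 medium)
```

In very large samples such as this one, even effects this small can reach statistical significance, which is why the discussion above focuses on effect size rather than p-values alone.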
In conclusion, in my opinion, the Owen et al. (2010) study contributes to the literature on computerized brain training by showing that a substantial number of individuals can be recruited to participate, with a wide range of actual amounts of practice, and that transfer did not occur on the tasks measured as spans, but small effects, similar in size to drug effects, did appear on the one test measured as number correct. Transfer effects have been observed in studies with older adults as well as younger ones in more controlled research environments; it remains to be seen whether the data the Nature study authors collected on older adults, which were not included in the published article, will show different results. Few studies have been conducted on the role of automated cognitive training in healthy adults, and more are needed before we can draw final conclusions about its value in tests of transfer from brain training activities. We also note that transfer is assumed to occur in educational environments; enormous sums of money are spent on training young people not just so that they can do well in school, but so that they can lead productive lives afterwards.
– Elizabeth Zelinski, Ph.D., is a Professor of Gerontology and Psychology at the Leonard Davis School of Gerontology. Dr. Zelinski has joint appointments in the Psychology Department, Neurosciences and the Study of Women and Men in Society (SWMS) Programs. Dr. Zelinski graduated summa cum laude from Pace University and received her graduate degrees in psychology, with a specialization in aging, from the University of Southern California. Dr. Zelinski is the principal investigator of the Long Beach Longitudinal Study, and was Co-PI in the IMPACT study.