Last week I wrote a somewhat critical post about a paper entitled "Social Demographic Change and Autism"1. I suggested that calling a conclusion based on estimated data a "major discovery" might not be warranted.
My primary objection at the time was that if you are basing a conclusion on the differences between identical and fraternal twins, you should be able to tell which twins are identical and which are fraternal. However, a commenter suggested that perhaps I was being a little too hard on the study, so I spent a little bit of time and gathered some data to see if I had the right or wrong idea. After taking a closer look, I think I had the right idea and I am going to explain why below.
The paper in question used data from the California Department of Developmental Services (DDS) to support the following three conclusions -
"First, the estimated heritability of autism has been dramatically overstated. Second, heritability estimates can change over remarkably short periods of time because of increases in germ cell mutations. Third, social demographic change can yield genetic changes that, at the population level, combine to contribute to the increased prevalence of autism."
These conclusions are based on two primary observations from the data.
The first observation is that the generally accepted concordance rates of autism in twins are wrong and that there is not a large difference in concordance between identical and fraternal twins. The second observation is that as the average parental age went up, the concordance among all same sex twins went up while the concordance among opposite sex twins went down. The authors then use these facts to suggest that there is a relationship between parental age and de novo mutations, specifically copy number variations, that could explain these facts. They conclude that increasing parental age could be a large factor in the increased prevalence of autism.
After reading the paper a few times and spending some time gathering and analyzing data, it is my opinion that the data in the paper do not support the conclusions. I am going to start with the idea that copy number variations (CNVs) can lead to an increased risk of autism and work backwards from there.
The first problem I see is that this paper has no direct measurement of any CNVs in the children with autism nor any data showing that they are more common in the older parents included in this study. While in general I would agree that other research has shown that CNVs can vary as a function of parental age, it does not follow that the specific children included in this study had an unusual number of CNVs or that these CNVs played a causal role in these children's autism.
There have been a large number of studies in the recent past that have looked for CNVs in children with autism and the overall conclusions are mixed. On one hand, you have studies2 that actually looked for CNVs in a sizable group of children with autism and found significant ones in only 7% of the population, and on the other3 you have studies that found that CNVs as a whole aren't more common in children with autism, just "rare" ones are. In both cases, the majority of the children had their own (almost) unique mutations that were specific to them and not shared by other children.
So this raises the question of how these CNVs can be responsible for the increased risk of autism. Even if we accept the facts that they are more common in children with autism and become more common as the parents get older, how can a large number of almost unique mutations lead to the same condition? You can't simply suggest that CNVs are the reason; there has to be an actual process that goes from the CNV to a diagnosis of autism. The paper suggests no such mechanism.
As an aside, CNVs have also been found in other, possibly related, conditions such as schizophrenia4 and ADHD. In one recent study5, CNVs were found in 16% of children with ADHD but were also found in 8% of the typical controls. Based on results like this, I have to wonder how common CNVs are in the general population and how much autism-specific risk can be assigned to an arbitrary CNV.
The next problem with CNVs is the presumed relationship between the type of twin and the presence or absence of a CNV -
"Recall that because MZ twins are developed from a single pair of matched egg and sperm cells, any de novo mutations will be found in both twins. In contrast, DZ twins develop from two distinct pairs of egg and sperm cells. Because de novo mutations are rare events, the chance that both DZ twins will share the same de novo mutation is extremely low."
The problem with this presumption is that it isn't necessarily true - identical twins do not always share the same CNV. While this idea seems counterintuitive, it has been demonstrated by studies6 and I have personally seen this result (my identical twin daughters have different CNVs). I don't have any hard facts on the flip side of the relationship, that fraternal twins are unlikely to share CNVs, but I have some doubts that that statement is true.
The argument here would come down to what is causing the CNV and when these mutations occur. The authors assume that CNVs are present in parents' egg or sperm but I am not sure that is always the case. If identical twins can have different CNVs then that implies that mutations can happen after fertilization which means that the mutations are not always in the parent's genetic material. And if that is the case, then it should be possible for fraternal twins to both have their own CNVs. The question would just be how often this happens, but I have not seen any studies that looked specifically at this issue.
When you combine the above problems with the fact that there is no direct data on the actual CNVs in the paper, I would suggest that the conclusion that older parents have more CNVs, which in turn caused more autism, might not be supportable.
Next, let's look at the twin-specific findings of this paper. Before I get into the details, I wanted to talk a little bit about the properties of the different types of twins. For the purposes of this discussion, I am going to talk in generalities and gloss over some of the rare types of twins and higher order multiples as these will just confuse the issue. Just be aware that the discussion below isn't the entire picture when it comes to multiple births.
Identical, or monozygotic (MZ), twins are the result of a single fertilized egg splitting into two parts. MZ twins are basically a random event and the rate of MZ twinning does not vary by region, race, maternal age, fertility treatments, or time period. MZ twins are not hereditary and cannot be "passed down" through generations. The rate of MZ twinning is basically a constant worldwide and has been for the measurable past7 - about 1 in 250 pregnancies.
Fraternal, or dizygotic (DZ), twins are the result of multiple eggs being fertilized at once or, as is more common as of late, fertility treatments. The rate of DZ twinning does vary by region, race, maternal age, fertility treatments, and time periods. Natural DZ twinning is hereditary and can be passed down to children (although it is the mother's family history that matters, not the father's). The rate of DZ twinning is not a constant and changes over time, and the number of DZ twins has been growing rather rapidly since the 1980s.
All types of twins have a substantially greater risk for having prenatal complications or for being born premature than do non-twins8. MZ twins are more likely than DZ twins to have complications because they are more likely to share resources, such as the placenta, in the womb. Also, there are certain prenatal complications that only happen to MZ twins, such as twin-to-twin transfusion syndrome.
It has been demonstrated that prematurity and other prenatal complications come with an increased risk of autism, so while I don't have any exact figures I think it is safe to say that twins of all types would run a greater chance of having autism than non-twins. I would also suggest that MZ twins could have a greater risk of autism than DZ twins if only because of their increased prenatal risks. (Although as a side note, I would also argue for an increased risk based on the fact that they would be more likely to share the same genetic and environmental vulnerabilities that might lead to autism).
So, back to the paper. As I talked about last week, the data in this paper does not distinguish between identical and fraternal twins, so to arrive at the breakdown between the different types of twins the authors had to estimate. The estimate that they used was based on the facts that all opposite sex twins are DZ twins and that the chance of a set of DZ twins being the opposite sex is the same as the chance of them being the same sex. Based on those assumptions and data about the total number of twin births in California, they arrived at the result that 55% of all of the same sex twins in the paper's data were MZ twins.
There are a few problems with this method of estimating.
Let's start with the fact that the proportion of same sex twins that are MZ twins is not a constant and changes significantly even over the study period (1992 to 2000). To show this, I used national birth data from a few different sources as well as birth data from California from the CDC9, 10, 11. The following are estimates based on the total number of twins born, the number of births per year, and the fact that MZ twins occur at a constant rate of about 1 in 250.
First, let's look at the breakdown of twin births in the US from 1980 to 2000. As you can see from the chart below, the overall number of sets of twins has been growing rapidly but most of this growth comes from an increased number of DZ and not MZ twins.
Second, consider the following California-specific twin data. While I could only find complete birth data from 1995 onwards, I believe this data is sufficient to show the skewing in the proportion of twins. For the following chart, I used the authors' figure for identical twinning in California (4.4 per 1,000) which is slightly higher than the accepted figure (1 in 250). As you can see, the number of DZ twins increased every year while the number of MZ twins decreased slightly as the general birth rate decreased.
I also replicated the calculation for the percentage of same sex twins that are identical and that is labeled R in the chart below. As you can see, from 1995 to 2000 the value for R changed from 55.4% to 49.5%. As a side note, if you take the rate of MZ twins to be slightly lower (1 in 250), the value for R changes from 51.7% in 1995 to 46% in 2000. Or in other words, this proportion is highly sensitive to the actual breakdown of twins and is quite volatile.
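To make that sensitivity concrete, here is a minimal Python sketch of one way R can be computed. It assumes, as described above, that MZ twins occur at a fixed rate and that DZ twins split evenly between same sex and opposite sex sets. The function names are hypothetical, and the overall twin rate is not taken from the birth data directly; it is backed out from the R values quoted above, so treat this as a consistency check rather than the exact calculation behind the chart.

```python
def mz_share_of_same_sex(mz_rate: float, twin_rate: float) -> float:
    """Estimated share of same sex twin sets that are MZ (R in this post).

    Both rates are twin sets per birth. Assumes half of DZ sets are same sex.
    """
    dz_rate = twin_rate - mz_rate
    same_sex_dz_rate = dz_rate / 2
    return mz_rate / (mz_rate + same_sex_dz_rate)


def implied_twin_rate(r: float, mz_rate: float) -> float:
    """Invert the formula above: R = 2m / (m + t), so t = 2m / R - m."""
    return 2 * mz_rate / r - mz_rate


AUTHORS_MZ_RATE = 4.4 / 1000   # figure the paper's authors used for California
ACCEPTED_MZ_RATE = 1 / 250     # commonly accepted worldwide figure

# R values quoted in this post for the authors' MZ rate
reported_r = {1995: 0.554, 2000: 0.495}

recomputed = {}
for year, r in reported_r.items():
    twin_rate = implied_twin_rate(r, AUTHORS_MZ_RATE)
    recomputed[year] = mz_share_of_same_sex(ACCEPTED_MZ_RATE, twin_rate)
    print(year, f"R at 1 in 250: {recomputed[year]:.1%}")
```

Under these assumptions the sketch reproduces the 51.7% and 46% figures, which shows how a change of less than half a twin set per 1,000 births in the assumed MZ rate moves R by several percentage points.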
The third problem is that the distribution of twins by parental age is also changing over the study period. In general, the total number of births in California from 1995 to 2000 fell but the average age of the mother was going up. This increase in the maternal age was mostly caused by younger women having fewer children and older women having more. For example, in 1995 there were 478,125 children born to mothers under 35 while 73,920 were born to mothers 35 and older but by 2000, the under 35 crowd had decreased to 446,594 while the older mothers increased to 85,365.
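The shift is easier to see as a share of all births. The short Python sketch below uses only the four counts quoted above; the dictionary layout and the helper name are just for illustration.

```python
# California births by maternal age group, as quoted in this post
births = {
    1995: {"under_35": 478_125, "35_plus": 73_920},
    2000: {"under_35": 446_594, "35_plus": 85_365},
}


def older_mother_share(counts: dict) -> float:
    """Fraction of all births that were to mothers aged 35 and older."""
    return counts["35_plus"] / sum(counts.values())


for year, counts in births.items():
    total = sum(counts.values())
    print(year, f"total births: {total:,}", f"35+ share: {older_mother_share(counts):.1%}")
```

With these counts, total births fall from 552,045 to 531,959 while the share born to mothers 35 and older rises from about 13.4% to about 16.0%.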
Now, the authors acknowledged this general shift in maternal age in the paper and talked about it. However, what they left out was that the average age for mothers of twins was also going up at the same time and that it was rising faster.
The difference between these two lines is all DZ twins. Remember, the chance of having MZ twins does not vary by age but the chance of DZ twins does (older age leads to a higher chance of DZ twins). As a result, the age of the mothers of MZ twins can only increase the same amount as all mothers, not more as is shown in the chart above.
As a result, not only does the relationship between the number of MZ and DZ twins change with time, it also changes with the mother's age. When you look at mothers older than 35, they will have more DZ twins and proportionately fewer MZ twins (lower R) than do mothers that are younger.
There are potential biases beyond the ones above, such as race. I did not take the time to chart these as I think the general idea is clear - the relationship between the number of DZ and MZ twins is not simple and cannot be represented as a single number over any significant length of time.
When you combine the above problems you can see that getting the breakdown between MZ and DZ twins is not a simple thing and is related to several factors. None of these factors were controlled for in the study.
Now ignore all of that for the moment and consider the twins that were included in the study. Out of the 56,631 sets of twins born in California between 1992 and 2000, only 503 were included - less than 1% of all of the twins. The authors make the assumption that relationships that hold for the entire population will also hold for this specific subset. If this selection of twins were random, I would agree that the relationships should hold.
But we are not talking about a random selection, we are talking about a very specific selection - twins where at least one twin had a diagnosis of autism - and that is not a random process. If all types of twins were at the same risk and had the same relation to the other risk factors for autism then the relationship might hold. But we know that the prevalence of autism varies by race, parental age, and with prenatal complications and we know that the breakdown of twins varies by these factors as well. As a result, I think that even if the R value that the authors calculated from the entire population was a valid number, it would not necessarily apply to the group of twins included in the study.
The only way to get at the actual breakdown of twins would have been to do an actual count of the different types of twins. Without this breakdown, one of the major points of the paper - the idea that the heritability of autism is overstated - is completely unsupported by the data.
As a further blow to this conclusion, consider that the authors only included data for twins that both had a diagnosis of autism - not Asperger's, CDD, Rett's, or even PDD-NOS. So if one twin had a diagnosis of autism and the other had PDD-NOS, they would have been counted as non-concordant. But as has been shown elsewhere12, twins - even identical ones - are not always concordant for severity. There are many twins where one has a diagnosis of autism and the other has PDD-NOS, or one has autism and the other has Asperger's. As a result, the current paper would likely underestimate the concordance for all types of twins.
But let's assume that none of the above applies and that all twins have the identical risk of having at least one twin with autism, that the relationship between MZ and DZ twins is a constant for the study period, and that said relationship could be used for the study population without an issue. There would still be a problem with the breakdown because the concordance of the different types of twins was determined using a "simple linear transformation". That can't work unless you are already assuming that MZ and DZ twins should have a similar breakdown.
Let me give an example. Assume that you have 10 sets of male-male twins and that you know that in 6 of them both twins have autism. Further assume that you know that 5 of the sets are MZ and 5 are DZ. If you use a simple linear breakdown based on the fact that 50% of the twins are MZ, you could conclude that 3 of the 5 MZ sets are concordant and 3 of the 5 DZ sets are concordant. But given the same data, you could just as easily conclude that 5 of 5 MZ sets are concordant while only 1 of 5 DZ sets is, or anything in between. The difference in the concordance rates would be significant - 60% for both MZ and DZ in the first and 100% for MZ and 20% for DZ in the second. I am not saying that the paper used this exact simple split, but it wasn't much more complex than that.
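To see how little the totals pin down, here is a small Python sketch that enumerates every MZ/DZ split consistent with the numbers in this example (5 MZ sets, 5 DZ sets, 6 concordant sets overall). The function name is hypothetical and this is only an illustration of the ambiguity, not the paper's actual procedure.

```python
def consistent_splits(n_mz: int, n_dz: int, concordant_total: int):
    """List every (MZ concordant, DZ concordant, MZ rate, DZ rate) combination
    that adds up to the observed number of concordant sets."""
    splits = []
    for mz_conc in range(n_mz + 1):
        dz_conc = concordant_total - mz_conc
        # Keep only splits where the DZ count is actually possible
        if 0 <= dz_conc <= n_dz:
            splits.append((mz_conc, dz_conc, mz_conc / n_mz, dz_conc / n_dz))
    return splits


splits = consistent_splits(n_mz=5, n_dz=5, concordant_total=6)
for mz_conc, dz_conc, mz_rate, dz_rate in splits:
    print(f"MZ {mz_conc}/5 ({mz_rate:.0%})  DZ {dz_conc}/5 ({dz_rate:.0%})")
```

Every row fits the same observed totals, yet MZ concordance ranges from 20% to 100% across them. Only an actual count of which sets are MZ and which are DZ can choose between the rows.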
The point is that a linear transformation is going to skew the results if there is a grouping with low concordance combined with a group of high concordance. The lower group would appear to be much higher than it should be while the higher group would appear to be much lower.
That last point is probably the most important because that is exactly what this paper showed - that DZ twins have a higher concordance than previously thought and MZ twins have a much lower concordance. The authors are claiming this result even though almost every other paper on this subject has found something different - that DZ twins have a much lower concordance than do MZ twins. So, I would have to conclude that the breakdown of the data in this study is flawed, especially considering the fact that the other studies on the subject12 were able to properly split out MZ and DZ twins.
As for the last idea about parental age being associated with a greater risk of autism in same sex twins, that could be explained by other factors. And when it comes down to the numbers, the increase in the average age is rather small (1.5 years, from 30.6 to 32) and only slightly larger than the general increase in maternal age across all children.
As I mentioned above, older mothers in California were becoming more common among all children and especially for DZ twins. In the general population in California, the average maternal age from 1995 to 2000 went from approximately 27.2 to 27.8 while in mothers of twins it went from 29 to 30. So is an increase in parental age of 1.5 years significant when the population as a whole is showing about the same increase?
The other thing I would point out, and this may just be me nitpicking, is that the authors knew that the specific data set they used had already been used to establish a relationship between parental age and risk of autism. On the first page of the paper, the authors talk about how another study found a relationship between parental age and autism and acknowledge that the other study used the same data set as they did. I don't think it can be considered a surprise (or major discovery) if you find a relationship that others have already shown in the same data set.
Whew. If you have managed to read all the way down to here, I have to commend you. You will be receiving a gold jabberwocky sticker as a reward. Now, to wrap this all up.
It is my opinion that the conclusions of the paper aren't supported by the data in the study. The breakdown between identical and non-identical twins is likely flawed, the association with parental age could be an artifact of a general trend in the data, the specific type of mutation (CNVs) is relatively rare and unique even when seen in people with autism, and there is no known mechanism for these disparate CNVs to all lead to autism.
All this is not to say that the entire paper is worthless. I actually found this paper to be very interesting. I believe it is likely that a social demographic change could account for some of the increase in autism prevalence and I would agree that the wrong conclusion has been drawn from earlier twin studies. Also, the raw data that is published in the paper actually fits in quite nicely with the data from other recent studies and helps fill out the overall picture (that is a subject for another time).
But, my beliefs about the ideas in the paper notwithstanding, I don't think that the conclusions of this paper are supported by the data included.
1. Liu, K., Zerubavel, N., & Bearman, P. (2010). Social demographic change and autism. Demography, 47(2), 327-43. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/20608100
2. Shen, Y., Dies, K. A., Holm, I. A., Bridgemohan, C., Sobeih, M. M., Caronna, E. B., et al. (2010). Clinical Genetic Testing for Patients With Autism Spectrum Disorders. Pediatrics. doi: 10.1542/peds.2009-1684.
3. Pinto, D., Pagnamenta, A. T., Klei, L., Anney, R., Merico, D., Regan, R., et al. (2010). Functional impact of global rare copy number variation in autism spectrum disorders. Nature, 1-5. Nature Publishing Group. doi: 10.1038/nature09146.
5. Williams, N. M., Zaharieva, I., Martin, A., Langley, K., Mantripragada, K., Fossdal, R., et al. (2010). Rare chromosomal deletions and duplications in attention-deficit hyperactivity disorder: a genome-wide analysis. The Lancet. doi: 10.1016/S0140-6736(10)61109-9.
6. Bruder, C. E., Piotrowski, A., Gijsbers, A. A., Andersson, R., Erickson, S., de Ståhl, T. D., et al. (2008). Phenotypically concordant and discordant monozygotic twins display different DNA copy-number-variation profiles. American journal of human genetics, 82(3), 763-71. doi: 10.1016/j.ajhg.2007.12.011.
7. Bortolus, R., Parazzini, F., Chatenoud, L., Benzi, G., Bianchi, M. M., Marini, A., et al. (1999). The epidemiology of multiple births. Human reproduction update, 5(2), 179-87. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/12517657
8. Deutsches Aerzteblatt International (2010, October 8). Risks in multiple pregnancies. ScienceDaily. Retrieved October 9, 2010, from http://www.sciencedaily.com/releases/2010/10/101008105718.htm
9. Martin, J. A., & Park, M. M. (1999). Trends in twin and triplet births: 1980-97. National vital statistics reports : from the Centers for Disease Control and Prevention, National Center for Health Statistics, National Vital Statistics System, 47(24), 1-16. Retrieved from http://www.cdc.gov/nchs/data/nvsr/nvsr47/nvs47_24.pdf
11. United States Department of Health and Human Services (US DHHS), Centers for Disease Control and Prevention (CDC), National Center for Health Statistics (NCHS), Division of Vital Statistics, Natality public-use data 1995-2002, on CDC WONDER On-line Database, November 2005. Accessed at http://wonder.cdc.gov/natality-v2002.html on Oct 8, 2010
12. Rosenberg, R. E., Law, J. K., Yenokyan, G., McGready, J., Kaufmann, W. E., Law, P. A., et al. (2009). Characteristics and concordance of autism spectrum disorders among 277 twin pairs. Archives of pediatrics & adolescent medicine, 163(10), 907-14. doi: 10.1001/archpediatrics.2009.98.