Mining expression data to identify robust patterns of age-dependent regulation
Posted Apr 06 2009 7:10pm
In recent years, dozens of large-scale gene expression studies (many of them available through the Gene Aging Nexus) have tracked the transcriptional changes that occur with aging. However, these studies usually identify few genes showing statistically significant changes; worse, there is poor overlap across studies: genes found to be highly significant in one study are often not significant in others.
It’s true that these problems are common to microarray studies of other phenotypes – experimental noise and biological variability make this type of data hard to interpret – but for aging the difficulties seem especially pronounced. Aging is complex and global: it happens in every tissue (and possibly differently in every tissue), at both the cellular and organismal levels, and involves many independent biochemical pathways. On top of that, rates of aging can vary substantially for different individuals in the same species, while within the same individual, transcriptional noise increases with age.
So how can we identify a set of genes that are consistently age-associated? In the latest issue of Bioinformatics, de Magalhães et al. (the developers of HAGR) develop a statistical methodology for identifying trends of age-regulation across studies and apply it to a collection of 27 different mammalian microarray studies of aging:
Meta-analysis of age-related gene expression profiles identifies common signatures of aging
Motivation: Numerous microarray studies of aging have been conducted, yet given the noisy nature of gene expression changes with age, elucidating the transcriptional features of aging and how these relate to physiological, biochemical and pathological changes remains a critical problem. Results: We performed a meta-analysis of age-related gene expression profiles using 27 datasets from mice, rats and humans. Our results reveal several common signatures of aging, including 56 genes consistently overexpressed with age, the most significant of which was APOD, and 17 genes underexpressed with age. We characterized the biological processes associated with these signatures and found that age-related gene expression changes most notably involve an overexpression of inflammation and immune response genes and of genes associated with the lysosome. An underexpression of collagen genes and of genes associated with energy metabolism, particularly mitochondrial genes, as well as alterations in the expression of genes related to apoptosis, cell cycle and cellular senescence biomarkers, were also observed. By employing a new method that emphasizes sensitivity, our work further reveals previously unknown transcriptional changes with age in many genes, processes and functions. We suggest these molecular signatures reflect a combination of degenerative processes but also transcriptional responses to the process of aging. Overall, our results help to understand how transcriptional changes relate to the process of aging and could serve as targets for future studies. Availability: http://genomics.senescence.info/uarrays/signatures.html
To summarize their basic method: the authors reanalyzed data in each of the 27 microarray studies separately to produce a list of differentially expressed genes for each one. Then, they counted up the number of times a gene was differentially expressed with age in the group of studies, and determined whether that number was significantly larger than what would be expected by chance.
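To make the counting step concrete, here is a minimal sketch of that kind of "vote counting" test in Python. It is an illustration under stated assumptions, not the authors' actual code: it assumes the studies are independent and that every gene has the same chance probability (p_chance) of landing on any one study's list, and it scores each gene with a one-sided binomial tail. The gene names, the study lists and the value of p_chance are all made up for the example.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def count_hits(study_gene_lists):
    """Count, for each gene, how many studies list it as differentially expressed."""
    counts = {}
    for genes in study_gene_lists:
        for gene in set(genes):  # a gene counts at most once per study
            counts[gene] = counts.get(gene, 0) + 1
    return counts


def score_genes(study_gene_lists, p_chance):
    """Map each gene to (number of studies, binomial tail p-value).

    p_chance is an ASSUMED per-study probability that a gene shows up
    on a study's list by chance alone; it is a parameter of this sketch.
    """
    n_studies = len(study_gene_lists)
    return {
        gene: (k, binomial_tail(k, n_studies, p_chance))
        for gene, k in count_hits(study_gene_lists).items()
    }


# Hypothetical toy input: one list of differentially expressed genes per study.
studies = [
    ["GENE_A", "GENE_B"],
    ["GENE_A"],
    ["GENE_A", "GENE_C"],
    ["GENE_A"],
    ["GENE_A", "GENE_B"],
]

scores = score_genes(studies, p_chance=0.05)
for gene, (k, pval) in sorted(scores.items(), key=lambda item: item[1][1]):
    print(f"{gene}: listed in {k} of {len(studies)} studies, p = {pval:.3g}")
```

A real analysis would also need to correct these p-values for the thousands of genes tested at once, and would have to deal with genes that are not measured on every array platform; the sketch ignores both.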
Of the 73 genes they found to be consistently age-regulated, 13 have been previously validated (e.g. by qRT-PCR) – a corroboration that strongly supports the new method. The other 60 genes have yet to be investigated.
A couple of points worth noting:
This is the first rigorous, large-scale integration of mammalian aging microarray data. Mining collections of dozens or even hundreds of gene expression datasets to identify global trends is becoming increasingly popular, especially in cancer research (which seems to be the area that sees the most sophisticated applications of bioinformatics). But for aging, an area where the data are noisier and the need for integrative computational approaches is perhaps even stronger, few studies have compared more than a handful of expression datasets at once, and none have done so in mammals. Several studies have compared multiple mammalian microarray datasets, but only on a smaller scale (e.g. Goertzel et al. investigated the effect of calorie restriction on mouse aging; as part of larger studies, Zahn et al. and Adler et al. compared aging in humans and mice).
Their analysis is designed to pick out genes that participate in a general aging program. The microarray studies used in this meta-analysis span a diverse range of tissues, and even multiple species (human, mouse, and rat), so genes emerge as significant here only if they demonstrate a strong age-associated profile across a range of very different conditions. While this approach will likely fail to identify genes that are age-regulated only in a single tissue, the advantage is that the genes that do come out of this analysis are likely to be the really interesting ones: components of a common aging program that operates in multiple tissues.