Mining expression data to identify robust patterns of age-dependent regulation

In recent years, dozens of large-scale gene expression studies (many of them available through the Gene Aging Nexus) have tracked the transcriptional changes that occur with aging. However, these studies usually identify few genes showing statistically significant changes; worse, there is poor overlap across studies – i.e. genes found to be very significant in one study are often not significant in others.

It’s true that these problems are common to microarray studies of other phenotypes – experimental noise and biological variability make this type of data hard to interpret – but for aging the difficulties seem especially pronounced. Aging is complex and global: it happens in every tissue (and possibly differently in every tissue), at both the cellular and organismal levels, and involves many independent biochemical pathways. On top of that, rates of aging can vary substantially for different individuals in the same species, while within the same individual, transcriptional noise increases with age.

So how can we identify a set of genes that are consistently age-associated? In the latest issue of Bioinformatics, de Magalhães et al. (the developers of HAGR) present a statistical methodology for identifying trends of age-regulation across studies and apply it to a collection of 27 different mammalian microarray studies of aging:

Meta-analysis of age-related gene expression profiles identifies common signatures of aging

Motivation: Numerous microarray studies of aging have been conducted, yet given the noisy nature of gene expression changes with age, elucidating the transcriptional features of aging and how these relate to physiological, biochemical and pathological changes remains a critical problem.
Results: We performed a meta-analysis of age-related gene expression profiles using 27 datasets from mice, rats and humans. Our results reveal several common signatures of aging, including 56 genes consistently overexpressed with age, the most significant of which was APOD, and 17 genes underexpressed with age. We characterized the biological processes associated with these signatures and found that age-related gene expression changes most notably involve an overexpression of inflammation and immune response genes and of genes associated with the lysosome. An underexpression of collagen genes and of genes associated with energy metabolism, particularly mitochondrial genes, as well as alterations in the expression of genes related to apoptosis, cell cycle and cellular senescence biomarkers, were also observed. By employing a new method that emphasizes sensitivity, our work further reveals previously unknown transcriptional changes with age in many genes, processes and functions. We suggest these molecular signatures reflect a combination of degenerative processes but also transcriptional responses to the process of aging. Overall, our results help to understand how transcriptional changes relate to the process of aging and could serve as targets for future studies.

To summarize their basic method: the authors reanalyzed data in each of the 27 microarray studies separately to produce a list of differentially expressed genes for each one. Then, they counted up the number of times a gene was differentially expressed with age in the group of studies, and determined whether that number was significantly larger than what would be expected by chance.
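The counting step in that summary can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' actual code: it assumes that, under the null hypothesis, each study independently flags a gene as differentially expressed with a fixed probability `p_null`, so the number of studies flagging a gene follows a binomial distribution. The function names, the gene labels, and the values of `p_null` and `alpha` are all hypothetical, and a real analysis would also need a multiple-testing correction.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def consistently_regulated(calls, p_null, alpha):
    """Find genes flagged in more studies than expected by chance.

    calls:  dict mapping gene -> list of booleans, one per study in which
            the gene was measured (True = differentially expressed with age).
    p_null: assumed per-study probability of a call under the null
            (hypothetical parameter for this sketch).
    alpha:  significance cutoff (no multiple-testing correction applied here).
    Returns a dict gene -> (studies_flagged, studies_measured, p_value).
    """
    hits = {}
    for gene, flags in calls.items():
        n = len(flags)   # number of studies in which the gene was measured
        k = sum(flags)   # number of studies calling it differentially expressed
        p_value = binomial_tail(k, n, p_null)
        if p_value < alpha:
            hits[gene] = (k, n, p_value)
    return hits


# Toy input with made-up gene labels, for illustration only.
toy_calls = {
    "GENE_A": [True] * 8 + [False] * 2,   # flagged in 8 of 10 studies
    "GENE_B": [True] + [False] * 9,       # flagged in 1 of 10 studies
}
for gene, (k, n, p_value) in consistently_regulated(toy_calls, 0.05, 0.01).items():
    print(f"{gene}: flagged in {k}/{n} studies, p = {p_value:.3g}")
```

Using the number of studies in which each gene was actually measured, rather than the total of 27, matters because not every array platform covers every gene.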

Of the 73 genes they found to be consistently age-regulated, 13 have been previously validated (e.g. by qRT-PCR) – a corroboration that strongly supports the new method. The other 60 genes have yet to be investigated.

A couple of points worth noting:

  • This is the first rigorous, large-scale integration of mammalian aging microarray data
    Mining collections of dozens or even hundreds of gene expression datasets to identify global trends is becoming increasingly popular, especially in cancer research (cancer seems to be the research area that sees the most sophisticated applications of bioinformatics). But for aging – an area where the data are noisier, and there is perhaps an even stronger need for integrative computational approaches – few studies have compared more than a handful of expression datasets at once, and none have done so in mammals. The mammalian comparisons published so far have been on a smaller scale (e.g. Goertzel et al. investigated the effect of calorie restriction on mouse aging; as part of larger studies, Zahn et al. and Adler et al. compared aging in humans and mice).
  • Their analysis is designed to pick out genes that participate in a general aging program
    The microarray studies used in this meta-analysis span a diverse range of tissues, and even multiple species (human, mouse, and rat), so genes emerge as significant here only if they demonstrate a strong age-associated profile across a range of very different conditions. While this approach will likely fail to identify those genes that are age-regulated only in a single tissue, the advantage is that those genes that do come out of this analysis are likely to be the really interesting ones – components of a common aging program that operates in multiple tissues.

de Magalhães, J., Curado, J., & Church, G. (2009). Meta-analysis of age-related gene expression profiles identifies common signatures of aging. Bioinformatics, 25(7), 875-881. DOI: 10.1093/bioinformatics/btp073



  1. Is there an easy way to determine the location of these genes with respect to the subtelomeric region of chromosomes? I have this nagging suspicion that location is important, and might explain the correlation with shortened telomeres. But as a simple-minded engineer and not a biologist, I don’t know how to use the on-line tools to answer my question.

  2. Databases like Entrez Gene have information about the chromosomal position of every gene, so you could use that to determine the location of interesting genes: basically, if a gene has a location that’s close to 1 (the tip of the short arm) or close to the size of the chromosome (the tip of the long arm), it would be close to the subtelomeric region. You might also try the UCSC Genome Browser for similar information; it has a nice graphical interface and I think it’s set up for programmatic interaction as well.
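For anyone who wants to automate that check, here is a minimal Python sketch of the position test described above. It assumes you have already looked up each gene's start and end coordinates and the length of its chromosome (for example from Entrez Gene or the UCSC Genome Browser). The function names and the default `window` size are hypothetical, and the true extent of a subtelomeric region varies by chromosome, so treat the cutoff as a placeholder.

```python
def distance_to_nearest_end(start: int, end: int, chrom_length: int) -> int:
    """Distance in bases from a gene to the closer chromosome tip.

    Coordinates are 1-based and inclusive: position 1 is the tip of the
    short arm and position chrom_length is the tip of the long arm.
    """
    if not (1 <= start <= end <= chrom_length):
        raise ValueError("expected 1 <= start <= end <= chrom_length")
    return min(start - 1, chrom_length - end)


def is_near_chromosome_end(start: int, end: int, chrom_length: int,
                           window: int = 500_000) -> bool:
    """True if the gene lies within `window` bases of either chromosome tip.

    The default window is an arbitrary placeholder, not an accepted
    definition of the subtelomeric region.
    """
    return distance_to_nearest_end(start, end, chrom_length) <= window


# Made-up coordinates on a made-up 1,000,000-base chromosome.
print(is_near_chromosome_end(920_000, 950_000, 1_000_000, window=100_000))
print(is_near_chromosome_end(400_000, 600_000, 1_000_000, window=100_000))
```

Running this over the full gene list and comparing against randomly chosen genes would show whether the age-regulated set is actually enriched near chromosome ends.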
