Aging is Accompanied by a Systemic Downregulation of Long Transcripts
Cells are state machines whose behavior is regulated by the pace of production of specific proteins from their genetic blueprints, a process called gene expression. Feedback loops exist between cell activities, the surrounding environment, signals coming and going, and gene expression. Researchers here examine the first part of the gene expression process, in which RNA transcript molecules are generated, and find that there is an association between the size of these molecules and changes in abundance with age. This suggests that some fundamental part of the machinery of transcription is degraded with age, likely producing dysfunction in a range of cellular behaviors. The researchers point the finger at SFPQ, though it might be a little early in the investigation of this effect to say anything with confidence about why it happens and what the root causes might be.
The transcriptome responds rapidly, selectively, strongly, and reproducibly to a wide variety of molecular and physiological insults experienced by an organism. While the transcripts of thousands of genes have been reported to change with age, the magnitude by which most transcripts change is small in comparison with classical examples of gene regulation and there is little consensus among different studies. We hence hypothesize that aging is associated with a hitherto uncharacterized process that affects the transcriptome in a systemic manner. We predict that such a process could integrate heterogeneous, and molecularly distinctive, environmental insults to promote phenotypic manifestations of aging.
We use an unsupervised machine learning approach to identify the sources of age-dependent changes in the transcriptome. To this end, we measure and survey the transcriptome of 17 mouse organs from 6 biological replicates at 5 different ages from 4 to 24 months raised under standardized conditions. To identify whether there are universal architectural or regulatory features informative about age-dependent changes, we systematically analyze feature importance across models. The most informative feature for those models is the median length of mature transcript molecules, which is closely followed by the number of transcription factors, the length of the gene, and the median length of the coding sequence. We conclude that during aging, transcript length is the most informative feature.
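To make the idea of ranking features by importance concrete, here is a minimal sketch in Python. It is not the authors' pipeline: it assumes a plain least-squares model, permutation importance as the scoring method, and entirely synthetic data in which the effect sizes are made up for illustration. The function name permutation_importance and the feature names are hypothetical.

```python
import numpy as np

def permutation_importance(X, y, n_repeats=20, seed=0):
    """Mean increase in squared error when each column of X is shuffled."""
    rng = np.random.default_rng(seed)
    # Fit ordinary least squares with an intercept column (an assumed model choice).
    design = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)

    def mse(features):
        pred = np.column_stack([features, np.ones(len(features))]) @ coef
        return float(np.mean((pred - y) ** 2))

    baseline = mse(X)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            shuffled = X.copy()
            # Shuffling one column breaks its link to y while keeping its distribution.
            shuffled[:, j] = rng.permutation(shuffled[:, j])
            scores[j] += mse(shuffled) - baseline
    return scores / n_repeats

# Synthetic, standardized gene features; the coefficients below are invented.
rng = np.random.default_rng(42)
n_genes = 2000
feature_names = ["median_transcript_length", "n_transcription_factors", "median_cds_length"]
X = rng.standard_normal((n_genes, len(feature_names)))
log2_fold_change = -0.5 * X[:, 0] + 0.1 * X[:, 1] + 0.3 * rng.standard_normal(n_genes)

importance = permutation_importance(X, log2_fold_change)
for name, score in sorted(zip(feature_names, importance), key=lambda pair: -pair[1]):
    print(f"{name}: {score:.3f}")
```

A feature whose shuffling degrades predictions the most is the one the model relies on most. In this toy setup that is transcript length by construction, so the sketch shows the mechanics of the ranking and says nothing about real data.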
We report a hitherto unknown phenomenon, a systemic age-dependent length-driven transcriptome imbalance that for older organisms disrupts the homeostatic balance between short and long transcript molecules for mice, rats, killifish, and humans. We also demonstrate that in a mouse model of healthy aging, length-driven transcriptome imbalance correlates with changes in expression of splicing factor proline and glutamine rich (Sfpq), which regulates transcriptional elongation according to gene length. Furthermore, we demonstrate that length-driven transcriptome imbalance can be triggered by environmental hazards and pathogens. Our findings reinforce the picture of aging as a systemic homeostasis breakdown and suggest a promising explanation for why diverse insults affect multiple age-dependent phenotypes in a similar manner.
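One simple way to picture a length-driven imbalance is as a rank correlation between transcript length and the old-versus-young fold change of each transcript. The sketch below assumes that framing, which may differ from the statistic used in the paper, and runs on synthetic data with an invented negative slope, matching the downregulation of long transcripts described in the title. The helper spearman_rho is hypothetical and does not handle tied values.

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman rank correlation; assumes no tied values in x or y."""
    # Double argsort turns values into 0-based ranks.
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])

# Synthetic transcripts: lengths and fold changes are made up for illustration.
rng = np.random.default_rng(0)
n_transcripts = 5000
log10_length = rng.normal(3.3, 0.4, n_transcripts)
# Assumed relationship: longer transcripts are slightly less abundant in old samples.
log2_fold_change = -0.2 * (log10_length - 3.3) + rng.normal(0.0, 0.3, n_transcripts)

rho = spearman_rho(log10_length, log2_fold_change)
print(f"Spearman rho between length and fold change: {rho:.3f}")
```

A value near zero would indicate balance between short and long transcripts, while a negative value would indicate that longer transcripts are relatively depleted in the older samples.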
The more important aspect of this research is the use of "unsupervised machine learning" to reach its conclusions. This is just the beginning of the acceleration in advances that will likely be enabled by this approach. There is no theoretical limit to the interrelationships that a machine learning agent can identify; it is constrained only by data.
It may not yet be time to call for the development of a "commons", or some sort of pooling of machine learning data that makes economic sense for all those contributing, but that time is fast approaching.
@Tom Schaefer
Machine learning is just a tool for finding interesting correlations. Unsupervised learning requires huge amounts of data. It is useful, but not miraculous.