Further Elaboration on the Problem of Controls in the Study of Aging and Longevity
Researchers here discuss a well-known problem in mouse studies of aging: the inconsistency in outcomes for many of the modestly age-slowing interventions tested to date. The high cost of life span studies means that there are fewer attempts to replicate results than one would like, and study sizes tend to be smaller than is ideal. Researchers have pointed out that differences between studies in the setup of control groups may be a sizable part of the problem, and the authors of this paper agree.
Although lifespan extension remains the gold standard for assessing interventions proposed to impact the biology of aging, there are important limitations to this approach. Our reanalysis of lifespan studies from multiple sources suggests that short lifespans in the control group exaggerate the relative efficacy of putative longevity interventions. Due to the high cost and long timeframes of mouse studies, it is rare that a particular longevity intervention will be independently replicated by multiple groups.
Incorporating many of these suggestions for optimal mouse husbandry and avoiding pitfalls of other lifespan studies, the rigorous National Institute on Aging Interventions Testing Program (ITP) has become a gold standard for mouse longevity studies. In the ITP, studies are performed on both sexes, with large sample sizes and across three different centers to address idiosyncratic issues of mouse husbandry. Furthermore, the UM-HET3 mice used by the ITP are relatively long-lived compared to most inbred strains and genetically heterogeneous, thereby reducing the likelihood that mice die of strain-specific pathologies, a factor that may confound lifespan data.
A majority of compounds tested by the ITP have not been previously published to extend lifespan in mice, thus we lack a "ground truth" for their expected effect size. Notably, however, the ITP has failed to replicate published lifespan extension for several compounds such as metformin, resveratrol, and nicotinamide riboside, raising concerns about the robustness of published mouse longevity data. Although differences in genetic background, age of treatment onset, husbandry, and dosing between the original study and the ITP cohorts may explain replication failures, another potential factor is methodological rigor.
In this manuscript, we reanalyze data from caloric restriction (CR) studies performed in multiple species, the ITP, and other large mouse lifespan studies, with a particular focus on control lifespan as one potential explanation for inflated effect sizes and lack of replicability. As a solution, we emphasize the importance of long-lived controls in mouse studies, which should reach a median lifespan of around 900 ± 50 days, or the comparison to appropriate historical controls, and we term this the "900-day rule".
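To make the arithmetic behind this concern concrete, here is a minimal sketch. It is not the authors' analysis. The function names and all lifespan values are hypothetical, chosen only for illustration, and the sketch assumes the rule can be read as a lower bound of 850 days on the control median. It shows how the same treated median lifespan produces a much larger percentage extension when the control group is short-lived.

```python
def percent_extension(control_median_days, treated_median_days):
    """Median lifespan extension as a percentage of the control median."""
    return 100.0 * (treated_median_days - control_median_days) / control_median_days


def meets_900_day_rule(control_median_days, target=900.0, tolerance=50.0):
    """Assumed reading of the rule: control median is at least target - tolerance."""
    return control_median_days >= target - tolerance


# Hypothetical numbers: the same treated median (950 days) compared against
# a short-lived control group and a long-lived control group.
treated = 950.0
for control in (750.0, 900.0):
    gain = percent_extension(control, treated)
    ok = meets_900_day_rule(control)
    print(f"control={control:.0f} d  extension={gain:.1f}%  meets rule: {ok}")
```

In this hypothetical case the 750-day control yields an apparent extension of about 26.7%, while the 900-day control yields about 5.6% for the identical treated group, which is the kind of inflation the reanalysis describes.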
My discussion of this paper, of a prior paper from Leon Peshkin of Harvard on the same phenomenon, and especially of how this phenomenon should also be applied to human trials, is all here:
https://x.com/KarlPfleger/status/1841487160136126603