An Illustration of the Complexity of Genetic Contributions to Longevity
Very few genetic variants robustly correlate with longevity across different study populations, and those that do, such as variants of APOE and FOXO3A, have small effects, only visible in the mortality statistics of large numbers of people. This indicates that the genetics of longevity, the way in which variations in metabolism and the response to high levels of age-related cell and tissue damage in later life can produce modestly different mortality rates, is a matter of many thousands of tiny, interacting contributions, very sensitive to environmental factors. It appears ever less likely that there will be any small, easily implemented set of genetic changes that can be made to humans in order to produce significant lengthening of life. Thus the study of genetics and longevity isn't the place to be looking for cost-effective ways to produce radical life extension of decades and more. This paper is one of many recent illustrations of this point; none of the described problems would be anywhere near as much of a challenge if there were a large genetic effect on aging and longevity with simple, narrow origins there to be found. That would stand out from the data much more readily.
The results of many genome-wide association studies (GWAS) of complex traits suffer from a lack of replication. Differences in population genetic structures among study populations are considered to be possible contributors to this problem. One aspect of population structure - the differences in genetic frequencies among subgroups of individuals comprising the population - was traditionally linked with the effects of population stratification. Another one - the presence of linkage disequilibrium (LD) in many parts of the human genome including those that contain causal single-nucleotide polymorphisms (SNPs) - was actively exploited in GWAS of complex traits. Methods of fine mapping following the "discovery" phase are used for evaluating causal SNPs. One could expect that the non-replication problem due to differences in LD patterns among study populations in GWAS would disappear if the detected marker SNP is a causal one, i.e., if it contributes to the variability of a trait. It turns out that the differences in LD levels around a functional SNP may still contribute to the non-replication problem.
The estimated associations in this case depend on whether the detected functional SNP is in LD with another functional SNP, on the effects of these SNPs on the trait in the absence of LD (pure effects), and on the level of LD between the corresponding SNP loci. This property has important consequences for interpretation of the results of genetic analyses of complex traits. In the presence of LD, the estimated effects of a causal SNP may be spurious and may incorrectly characterize the biological relationships between the SNP and the trait. In contrast, the pure effect of a given causal SNP estimated in the absence of LD with other such SNPs may correctly characterize the biological connections between the SNP and the trait. Therefore, for example, performing genetic analyses of African populations (that have lower levels of LD for many SNP pairs than populations of European origin) has the potential to reduce bias in the estimated effects of functional SNPs on a trait caused by the presence of LD between functional loci. This condition is, however, not sufficient because of the possible presence of hidden gene/gene interaction effects, gene/environment correlations, and gene/environment interaction effects.
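The dependence of an estimated effect on LD can be made concrete with a small calculation. The sketch below is not the paper's model. It assumes a simplified haploid, purely additive trait with two causal SNPs, A and B, and no interaction or environmental effects. The function names, effect sizes, and allele frequencies are hypothetical, chosen only for illustration. Under those assumptions, regressing the trait on A alone gives an expected slope equal to A's pure effect plus B's pure effect scaled by D / (p_A (1 - p_A)), where D is the LD coefficient between the two loci.

```python
def haplotype_freqs(p_a, p_b, d):
    """Frequencies of the four haplotypes (allele at A, allele at B),
    given allele frequencies p_a, p_b and the LD coefficient d."""
    freqs = {
        (1, 1): p_a * p_b + d,
        (1, 0): p_a * (1 - p_b) - d,
        (0, 1): (1 - p_a) * p_b - d,
        (0, 0): (1 - p_a) * (1 - p_b) + d,
    }
    if min(freqs.values()) < 0:
        raise ValueError("d is outside the range allowed by p_a and p_b")
    return freqs


def marginal_slope(beta_a, beta_b, p_a, p_b, d):
    """Expected slope when the trait is regressed on SNP A alone.

    Assumed model (illustrative): trait = beta_a * A + beta_b * B + noise,
    with A and B coded 0/1 on a single haplotype. For a 0/1 predictor the
    slope equals the difference in mean trait between A=1 and A=0.
    """
    freqs = haplotype_freqs(p_a, p_b, d)
    b_given_a1 = freqs[(1, 1)] / p_a        # P(B=1 | A=1)
    b_given_a0 = freqs[(0, 1)] / (1 - p_a)  # P(B=1 | A=0)
    return beta_a + beta_b * (b_given_a1 - b_given_a0)


# Hypothetical values: A has a favorable pure effect, B a harmful one.
for d in (0.0, 0.05, 0.1):
    slope = marginal_slope(beta_a=0.5, beta_b=-2.0, p_a=0.3, p_b=0.3, d=d)
    print(f"D = {d:.2f}: estimated effect of A = {slope:+.3f}")
```

With these illustrative numbers the pure effect of A never changes, yet its estimated effect is positive in the population without LD and negative once D reaches 0.1. This is one way a variant can look favorable in one study population and harmful in another.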
Human lifespan and many other aging-, health-, and longevity-related traits are multifactorial phenotypes, that is, they are affected by many genetic and non-genetic factors. The relationships between genes and these phenotypes have special features that distinguish them from other complex traits, influence the methods used for their genetic analysis, and affect the interpretation of the research results. The genetic variants that influence aging-, health-, and longevity-related traits generate age-dependent changes in the population genetic structure, i.e., changes in the frequencies of genetic variants and in the levels of linkage disequilibrium (LD) among them. This feature has important implications for studies focused on the replication of GWAS research findings: independent populations involved in such studies often have different genetic structures, due in part to the differences in the population age distribution at the time of biospecimen collection. As a result, the frequencies of the genetic variants associated with these traits and their LD patterns may differ even if the genetic structures in the corresponding population cohorts were the same at birth.
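A short calculation shows how mortality selection alone can shift the frequency of a variant with age. This is a minimal sketch under stated assumptions, not the authors' method: baseline mortality follows a Gompertz hazard, carriers of the variant have that hazard multiplied by a constant ratio (proportional hazards), and the parameter values `a`, `b`, the birth frequency, and the hazard ratio are hypothetical. It covers only the change in carrier frequency, not the change in LD.

```python
import math


def gompertz_survival(age, a=5e-5, b=0.09):
    """Probability of surviving from birth to `age` under a Gompertz
    hazard h(x) = a * exp(b * x). The defaults are illustrative only."""
    return math.exp(-(a / b) * (math.exp(b * age) - 1.0))


def carrier_frequency(age, p_birth, hazard_ratio, a=5e-5, b=0.09):
    """Share of carriers among people still alive at `age`.

    Assumes carriers experience `hazard_ratio` times the baseline hazard,
    so their survival is the baseline survival raised to that power.
    """
    s_noncarrier = gompertz_survival(age, a, b)
    s_carrier = s_noncarrier ** hazard_ratio
    alive_carriers = p_birth * s_carrier
    alive_noncarriers = (1.0 - p_birth) * s_noncarrier
    return alive_carriers / (alive_carriers + alive_noncarriers)


# Two cohorts identical at birth but sampled at different ages.
for age in (0, 60, 80, 90, 100):
    freq = carrier_frequency(age, p_birth=0.2, hazard_ratio=0.8)
    print(f"age {age:3d}: carrier frequency = {freq:.3f}")
```

Under these assumptions a protective variant becomes progressively more common among survivors, so two studies that collect biospecimens at different ages would observe different frequencies even when the cohorts were genetically identical at birth.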
Detecting statistically significant associations of genetic variants with complex traits is not the end of the genetic analyses. One reason is that the relationship between a detected marker SNP and the complex trait of interest is not necessarily a causal one. More often the marker serves as a proxy for the real effect of some unobserved causal SNPs (due to linkage disequilibrium (LD) between the marker and the causal SNPs) and, hence, does not itself have a direct biological effect on the phenotype. To generate insights about the biological mechanisms responsible for the trait's variability, one has to identify the causal SNPs responsible for the association signal. To identify such SNPs, a number of efficient fine-mapping procedures have been recommended. The main limitation of existing methods is that they seek to identify a single causal variant that is independent of (not in LD with) other causal variants. Since this is not sufficiently realistic, a new approach that allows for efficient detection of multiple causal variants has been proposed. The case where two or more causal SNPs are in LD creates additional problems for interpretation of the results of genetic association studies.
In this paper we show that the estimates of the effects of a causal SNP on lifespan depend on the genetic structure of the population under study (e.g., the level of LD of the SNP with other causal SNPs). Genetic association studies of this trait using data from populations with different LD levels are likely to produce different results. We show that differences in population genetic structures can explain why genetic variants favorable for longevity in one population appear as harmful risk factors in another population. Population structure may also be responsible for the age-specific effects of genetic variants on mortality risk. Differences in genetic structures in distinct populations may be responsible for the low level of replicability of GWAS of human aging-, health-, and longevity-related traits.