Primer to Analysis of Genomic Data Using R by Cedric Gondro

Author: Cedric Gondro
Publisher: Springer International Publishing, Cham


4.6 Population Genetics

Population genetics deals with heredity in populations and the dynamics of the forces that result in genetic change. The field revolves around the estimation of allele frequencies and how they change over time as populations respond to evolutionary processes such as selection, genetic drift, mutation, and migration. It also investigates genetic processes such as recombination, linkage, and population stratification, as well as environmental adaptation, speciation, and evolutionary relationships. Most of its principles were initially derived within a theoretical framework, but advances in molecular technologies have turned population genetics into a more applied subject, since we now have a handle on the structure and variability of populations at the DNA level. These theoretical models can now be tested on real experimental data and, thanks to its solid theoretical foundations, population genetics has become an important toolkit for genomic analysis. It is important for conservation and ecology studies and provides insights into evolution and nature, but it is also informative in association studies and genomic prediction (e.g., to account for population stratification in a GWAS). Here we will focus on the basic population metrics (selection, diversity, linkage, relationships) and some applications using SNP array data, without delving into theoretical details (a good introductory text for population genetics is [51]).
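Since the field revolves around allele frequencies, a minimal sketch of how they can be estimated from SNP genotypes may help. The toy matrix, the 0/1/2 genotype coding, and all object names (`geno`, `freqB`, `maf`, `expHet`, `obsHet`) are assumptions made for illustration only; real array data would first need to be recoded into this form.

```r
# Toy genotype matrix: rows = animals, columns = SNP.
# Coding (an assumption): 0 = AA, 1 = AB, 2 = BB, NA = missing call.
geno <- matrix(c(0, 1, 2, 1,
                 1, 1, 2, 0,
                 2, 0, 2, NA,
                 1, 1, 2, 0),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("animal", 1:4), paste0("snp", 1:4)))

# Frequency of allele B: count of B alleles divided by twice the number of
# non-missing genotypes (two alleles per diploid individual).
freqB <- colSums(geno, na.rm = TRUE) / (2 * colSums(!is.na(geno)))

# Minor allele frequency: the smaller of p and 1 - p.
maf <- pmin(freqB, 1 - freqB)

# Expected heterozygosity under Hardy-Weinberg proportions (2pq) and the
# observed proportion of heterozygous genotypes.
expHet <- 2 * freqB * (1 - freqB)
obsHet <- colMeans(geno == 1, na.rm = TRUE)

round(data.frame(freqB, maf, expHet, obsHet), 3)
```

Missing calls are excluded SNP by SNP, so each frequency is based only on the animals that were actually genotyped at that marker. Note that `snp3` is fixed for allele B in this toy example and therefore contributes no diversity.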

A couple of notes on using SNP array data for population genetics. First, the obvious: you can only do this if an array is available for the species you are interested in. Sequence data is becoming quite cheap, however, so this limitation should soon disappear. Second, arrays are purposely designed around SNP with high minor allele frequencies, which is not ideal for overall estimates of diversity but adequate for comparative studies. Third, they are subject to ascertainment bias, i.e., the data source from which the SNP were selected can affect the results. For example, an array may have been based on sequence data from just two breeds; it will then show high levels of diversity in these breeds (because SNP common in them were selected for the array), whilst in other breeds these SNP may be less common, which would suggest less diversity (the representation of the panel is unbalanced). These last two points are quite relevant and can lead to distorted interpretations if not taken into account.
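The effect of ascertainment bias can be illustrated with a deliberately simplified simulation. Everything in it is an assumption made for the sketch: the two hypothetical breeds have allele frequencies drawn independently from the same U-shaped distribution (so their true diversity is equal by construction), and the "array" simply keeps SNP with a minor allele frequency of at least 0.2 in the discovery breed. It is not meant to represent how any real array was designed.

```r
set.seed(42)   # arbitrary seed, only for reproducibility
nSnp <- 5000   # number of simulated loci (illustrative choice)

# Hypothetical allele frequencies for two breeds, drawn independently from
# the same U-shaped distribution: equal true diversity by construction.
p <- cbind(breedA = rbeta(nSnp, 0.5, 0.5),
           breedB = rbeta(nSnp, 0.5, 0.5))

maf <- pmin(p, 1 - p)    # minor allele frequency per SNP and breed
het <- 2 * p * (1 - p)   # expected heterozygosity per SNP and breed

# "Array design": keep only SNP that are common in the discovery breed (A).
# The 0.2 MAF threshold is an illustrative choice.
onArray <- maf[, "breedA"] >= 0.2

hetAll   <- colMeans(het)              # mean heterozygosity, all loci
hetArray <- colMeans(het[onArray, ])   # mean heterozygosity, array loci only

round(rbind(allLoci = hetAll, arrayLoci = hetArray), 3)
```

Across all loci the two breeds should look about equally diverse, whereas on the array loci breed A appears more diverse than breed B purely because of how the SNP were chosen. Comparisons between populations that were not part of the discovery panel are therefore safer than comparisons between discovery and non-discovery populations.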


