A multivariate statistical approach for the estimation of the ethnic origin of unknown genetic profiles in forensic genetics
- Publication Year :
- 2019
- Publisher :
- Elsevier, 2019.
- Abstract :
- DNA typing and genetic profile data interpretation are among the most relevant topics in forensic science; among other applications, the capability of genetic profiles to distinguish biogeographic information about population groups, subgroups and affiliations has been extensively explored in the last decade. For investigative and intelligence purposes, it is extremely useful to identify subjects and estimate their biogeographic origins by examining the DNA profiles recovered from evidence at a crime scene. Current approaches for Biogeographic Ancestry (BGA) estimation using STR profiles are usually based on Bayesian methods, which quantify the evidence in terms of a likelihood ratio that either supports or does not support the hypothesis that a certain profile belongs to a specific ethnic group. The present study provides an alternative to the likelihood ratio method that involves multivariate data analysis strategies for the estimation of multiple populations. Starting from the well-known NIST US autosomal STR dataset involving African-American, Asian, and Caucasian individuals, and moving towards further and more geographically restricted populations (such as Northern Africans vs sub-Saharan Africans, Afghans vs Iraqis and Italians vs Romanians), multivariate techniques such as Sparse and Logistic Principal Component Analysis (SL-PCA), Sparse Partial Least Squares-Discriminant Analysis (sPLS-DA) and Support Vector Machines (SVM) were employed and their discriminating power was compared. Both sPLS-DA and SVM provided robust classifications, yielding models with high sensitivity and specificity that are capable of discriminating populations on an ethnic basis. This application may represent a powerful and dynamic tool for law enforcement agencies whenever a standard autosomal STR profile is obtained from biological evidence collected at a crime scene or recovered during mass-disaster and missing-person investigations.
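As a rough illustration of the likelihood ratio approach that the abstract contrasts with the multivariate methods, the sketch below computes a ratio of profile probabilities under two populations. It is not the study's method. It assumes Hardy-Weinberg genotype proportions and independence between loci, and the population labels and allele frequencies are hypothetical values chosen for illustration, not real population data.

```python
from math import prod

# Hypothetical allele frequencies per population and locus.
# These numbers are illustrative assumptions, NOT real population data.
FREQS = {
    "pop_A": {"D3S1358": {15: 0.30, 16: 0.25}, "TH01": {6: 0.20, 9.3: 0.10}},
    "pop_B": {"D3S1358": {15: 0.25, 16: 0.30}, "TH01": {6: 0.10, 9.3: 0.35}},
}


def genotype_prob(locus_freqs, a1, a2):
    """Hardy-Weinberg genotype probability: p^2 if homozygous, 2pq otherwise."""
    p, q = locus_freqs[a1], locus_freqs[a2]
    return p * p if a1 == a2 else 2 * p * q


def profile_prob(profile, population):
    """Probability of a multi-locus profile, assuming independent loci."""
    return prod(
        genotype_prob(FREQS[population][locus], a1, a2)
        for locus, (a1, a2) in profile.items()
    )


def likelihood_ratio(profile, pop_num, pop_den):
    """LR = P(profile | pop_num) / P(profile | pop_den)."""
    return profile_prob(profile, pop_num) / profile_prob(profile, pop_den)


# A two-locus profile: heterozygous at D3S1358, homozygous at TH01.
profile = {"D3S1358": (15, 16), "TH01": (6, 6)}
lr = likelihood_ratio(profile, "pop_A", "pop_B")
print(f"LR (pop_A vs pop_B) = {lr:.2f}")
```

A ratio above 1 favours the numerator population and a ratio below 1 favours the denominator population. With only two candidate groups this comparison is straightforward, whereas the multivariate classifiers named in the abstract are aimed at assigning a profile among several populations at once.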
- Subjects :
- Forensic Genetics
Genetic Markers
0301 basic medicine
Multivariate statistics
Ethnic origin Prediction
Support Vector Machine
Multivariate analysis
Genotype
Population genetics
SVM
Bayesian probability
Population
PLS-DA
Pathology and Forensic Medicine
03 medical and health sciences
0302 clinical medicine
Statistics
Genetics
Humans
Crime scene
Short Tandem Repeats (STRs)
030216 legal & forensic medicine
Ethnic origin
Least-Squares Analysis
Multivariate data analysis
education
Estimation
Principal Component Analysis
education.field_of_study
PCA
Settore BIO/18
Racial Groups
Discriminant Analysis
Genetic Profile
DNA Fingerprinting
Genetics, Population
030104 developmental biology
Geography
Biogeographical ancestry (BGA)
Prediction
Principal component analysis
Microsatellite Repeats
Details
- Language :
- English
- Database :
- OpenAIRE
- Accession number :
- edsair.doi.dedup.....38e116c2f05b054f86e2bd84258d73e0