1. Use of SNP genotypes to identify carriers of harmful recessive mutations in cattle populations
- Author
Stefano Biffani, Hubert Pausch, Ezequiel L. Nicolazzi, H. Schwarzenbacher, Filippo Biscarini, and Yuri Pirola
- Subjects
Male, 0301 basic medicine, Heterozygote, Support Vector Machine, Genotype, KNN, Population, Reproducibility of Result, SNP genotypes, Recessive mutations, Carrier identification, Lasso-penalised logistic regression, Support vector machines, MAG, Haplotypes, Cattle, Genes, Recessive, Single-nucleotide polymorphism, Biology, Polymorphism, Single Nucleotide, support vector machines, 03 medical and health sciences, ddc:570, Haplotype, Recessive mutation, Genetics, Animals, SNP, SNP genotype, education, education.field_of_study, Animal, Genetic Carrier Screening, Reproducibility of Results, Life sciences, 3. Good health, Algorithm, 030104 developmental biology, Sample size determination, Mutation, Female, False positive rate, Brown Swiss, Algorithms, Research Article, Biotechnology
- Abstract
Background: Single nucleotide polymorphism (SNP) genotype data are increasingly available in cattle populations and, among other uses, can predict carriers of specific mutations. A practical statistical method for accurately classifying individuals into carriers and non-carriers is therefore valuable. In this paper, we compared, through cross-validation, five classification models for identifying carriers of the TUBD1 recessive mutation on BTA19 (Bos taurus autosome 19), which is known to be associated with high calf mortality: Lasso-penalized logistic regression (Lasso), Support Vector Machines with either a linear or a radial kernel (SVML and SVMR), k-nearest neighbors (KNN), and multi-allelic gene prediction (MAG). A population of 3116 Fleckvieh and 392 Brown Swiss animals genotyped with the 54K SNP-chip was available for the analysis.
Results: In general, SNP genotypes proved very effective for identifying mutation carriers. The best predictive models were Lasso, SVML and MAG, with average error rates of 0.2 %, 0.4 % and 0.6 %, respectively, in Fleckvieh, and 1.2 %, 0.9 % and 1.7 % in Brown Swiss. For these three models, the false positive rate was 0.1 %, 0.1 % and 0.2 %, respectively, in Fleckvieh, and 3.0 %, 2.4 % and 1.6 % in Brown Swiss; the false negative rate was 4.4 %, 7.6 % and 1.0 % in Fleckvieh, and 0.0 %, 0.1 % and 0.8 % in Brown Swiss. MAG appeared to be more robust to sample size reduction: with 25 % of the data, its average error rate was 0.7 % in Fleckvieh and 2.2 % in Brown Swiss, compared with 2.1 % and 5.5 % for Lasso, and 2.6 % and 12.0 % for SVML.
Conclusions: The use of SNP genotypes is a very effective and efficient technique for identifying mutation carriers in cattle populations. Very few misclassifications were observed, both overall and within the carrier and non-carrier classes. This indicates that the approach is very reliable for potential applications in cattle breeding.
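The three quantities reported in the abstract (error rate, false positive rate, false negative rate) and the cross-validation scheme can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the genotype coding (0/1/2 allele counts), the Manhattan distance, the fold assignment, the toy data, and the function names `classification_rates`, `knn_predict` and `cross_validate` are all assumptions made for illustration, and the rates follow the conventional confusion-matrix definitions, which the abstract does not spell out.

```python
from collections import Counter


def classification_rates(y_true, y_pred):
    """Return (error_rate, false_positive_rate, false_negative_rate).

    Labels are assumed to be 1 = carrier, 0 = non-carrier.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    error = (fp + fn) / len(pairs)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # share of non-carriers called carriers
    fnr = fn / (fn + tp) if (fn + tp) else 0.0  # share of carriers that were missed
    return error, fpr, fnr


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training genotypes closest to x.

    Distance is the sum of absolute allele-count differences (an assumption).
    """
    dists = sorted(
        (sum(abs(a - b) for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cross_validate(X, y, n_folds=5, k=3):
    """Predict each sample from the other folds, then score all predictions."""
    preds = [None] * len(X)
    for fold in range(n_folds):
        # Hypothetical fold assignment: sample i belongs to fold i % n_folds.
        train = [i for i in range(len(X)) if i % n_folds != fold]
        test = [i for i in range(len(X)) if i % n_folds == fold]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        for i in test:
            preds[i] = knn_predict(train_X, train_y, X[i], k)
    return classification_rates(y, preds)


# Toy genotypes (invented): the first three SNPs tag the carrier haplotype.
toy_X = [
    [1, 1, 1, 0], [1, 1, 1, 1], [1, 1, 1, 2], [1, 1, 0, 0], [1, 1, 1, 0],
    [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 2], [0, 0, 1, 0], [0, 0, 0, 1],
]
toy_y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

if __name__ == "__main__":
    error, fpr, fnr = cross_validate(toy_X, toy_y)
    print(f"error={error:.3f} FPR={fpr:.3f} FNR={fnr:.3f}")
```

Reporting the false positive and false negative rates separately matters here because carriers are typically a small minority: a model can reach a low overall error rate while still missing a noticeable share of carriers, as the Fleckvieh figures above show.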
ISSN:1471-2164
- Published
- 2016