Analysis of the genetic basis of fiber-related traits and flowering time in upland cotton using machine learning.
- Source :
- TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik [Theor Appl Genet] 2025 Jan 24; Vol. 138 (1), pp. 36. Date of Electronic Publication: 2025 Jan 24.
- Publication Year :
- 2025
Abstract
- Cotton is an important crop for fiber production, but the genetic basis underlying key agronomic traits, such as fiber quality and flowering days, remains complex. While machine learning (ML) has shown great potential in uncovering the genetic architecture of complex traits in other crops, its application in cotton has been limited. Here, we applied five machine learning models (AdaBoost, Gradient Boosting Regressor, LightGBM, Random Forest, and XGBoost) to identify loci associated with fiber quality and flowering days in cotton. We compared two SNP dataset down-sampling methods for model training and found that selecting SNPs with an Fscale value greater than 0 outperformed randomly selected SNPs in terms of model accuracy. We further performed machine learning quantitative trait loci (mlQTLs) analysis for 13 traits related to fiber quality and flowering days. These mlQTLs were then compared to those identified through genome-wide association studies (GWAS), revealing that the machine learning approach not only confirmed known loci but also identified novel QTLs. Additionally, we evaluated the effect of population size on model accuracy and found that larger population sizes resulted in better predictive performance. Finally, we proposed candidate genes for the identified mlQTLs, including two argonaute 5 proteins, Gh_A09G104100 and Gh_A09G104400, for the FL3/FS2 locus, as well as GhFLA17 and Syntaxin-121 (Gh_D09G143700) for the FSD09_2/FED09_2 locus. Our findings demonstrate the efficacy of machine learning in enhancing the identification of genetic loci in cotton, providing valuable insights for improving cotton breeding strategies.
Competing Interests: Declarations. Conflict of interest: The authors declare that they have no conflict of interest. Ethics approval: Not applicable. Consent to participate: Not applicable. Consent for publication: Not applicable.
(© 2025. The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.)
Details
- Language :
- English
- ISSN :
- 1432-2242
- Volume :
- 138
- Issue :
- 1
- Database :
- MEDLINE
- Journal :
- TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik
- Publication Type :
- Academic Journal
- Accession number :
- 39853381
- Full Text :
- https://doi.org/10.1007/s00122-025-04821-2