1. Disentangling genotype and environment specific latent features for improved trait prediction using a compositional autoencoder
- Author
- Anirudha Powadi, Talukder Zaki Jubery, Michael C. Tross, James C. Schnable, and Baskar Ganapathysubramanian
- Subjects
hierarchical disentanglement, latent disentanglement, plant phenotyping, days to pollen, yield, GxE, Plant culture, SB1-1110
- Abstract
In plant breeding and genetics, predictive models traditionally rely on compact representations of high-dimensional data, often obtained with methods such as Principal Component Analysis (PCA) and, more recently, autoencoders (AE). However, these methods do not separate genotype-specific from environment-specific features, which limits their ability to accurately predict traits influenced by both genetic and environmental factors. We hypothesize that disentangling these representations into genotype-specific and environment-specific components can enhance predictive models. To test this, we developed a compositional autoencoder (CAE) that decomposes high-dimensional data into distinct genotype-specific and environment-specific latent features. The CAE framework employs a hierarchical architecture within an autoencoder to separate these entangled latent features. Applied to a maize diversity panel dataset, the CAE demonstrated superior modeling of environmental influences and outperformed PCA, Partial Least Squares Regression (PLSR), and vanilla autoencoders, improving predictive performance 7 times for the ‘Days to Pollen’ trait and 10 times for ‘Yield’. By disentangling latent features, the CAE provides a powerful tool for precision breeding and genetic research. This work significantly enhances trait prediction models, advancing agricultural and biological sciences.
- Published
- 2024