Ellen G. Denny, Pamela S. Soltis, Myla F. J. Aronson, Charles C. Davis, Brian J. Stucky, Pierre Bonnet, J. Mason Heberling, Elizabeth R. Ellwood, Alexis Joly, Hervé Goëau, Titouan Lorieul, Gil Nelson, Laura Brenskelle, Emily K. Meineke, Alexander E. White, Susan J. Mazer, Patrick W. Sweeney, Katelin D. Pearson
Florida State University [Tallahassee] (FSU); WINLAB Rutgers University, Rutgers, The State University of New Jersey [New Brunswick] (RU), Rutgers University System (Rutgers); Botanique et Modélisation de l'Architecture des Plantes et des Végétations (UMR AMAP), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Institut de Recherche pour le Développement (IRD [France-Sud])-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE); Département Systèmes Biologiques (Cirad-BIOS), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad); University of Florida [Gainesville] (UF); Harvard University [Cambridge]; University of Arizona; Natural History Museum of Los Angeles County; Carnegie Museum of Natural History [Pittsburgh]; Scientific Data Management (ZENITH), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Inria Sophia Antipolis - Méditerranée (CRISAM), Institut National de Recherche en Informatique et en Automatique (Inria); University of California [Santa Barbara] (UCSB), University of California (UC); University of California [Davis] (UC Davis); Florida Museum of Natural History [Gainesville]; Peabody Museum of Natural History, Yale University [New Haven]; Smithsonian Institution
Machine learning (ML) has great potential to drive scientific discovery by harvesting data from images of herbarium specimens—preserved plant material curated in natural history collections—but ML techniques have only recently been applied to this rich resource. ML has particularly strong prospects for the study of plant phenological events such as growth and reproduction. As a major indicator of climate change, driver of ecological processes, and critical determinant of plant fitness, plant phenology is an important frontier for the application of ML techniques for science and society. In the present article, we describe a generalized, modular ML workflow for extracting phenological data from images of herbarium specimens, and we discuss the advantages, limitations, and potential future improvements of this workflow. Strategic research and investment in specimen-based ML methods, along with the aggregation of herbarium specimen data, may give rise to a better understanding of life on Earth.