Michel S Naslavsky, Marilia O Scliar, Guilherme L Yamamoto, Jaqueline Yu Ting Wang, Stepanka Zverinova, Tatiana Karp, Kelly Nunes, José Ricardo Magliocco Ceroni, Diego Lima de Carvalho, Carlos Eduardo da Silva Simões, Daniel Bozoklian, Ricardo Nonaka, Nayane Dos Santos Brito Silva, Andreia da Silva Souza, Heloísa de Souza Andrade, Marília Rodrigues Silva Passos, Camila Ferreira Bannwart Castro, Celso T Mendes-Junior, Rafael L V Mercuri, Thiago L A Miller, Jose Leonel Buzzo, Fernanda O Rego, Nathalia M Araújo, Wagner C S Magalhães, Regina Célia Mingroni-Netto, Victor Borda, Heinner Guio, Carlos P Rojas, Cesar Sanchez, Omar Caceres, Michael Dean, Mauricio L Barreto, Maria Fernanda Lima-Costa, Bernardo L Horta, Eduardo Tarazona-Santos, Diogo Meyer, Pedro A F Galante, Victor Guryev, Erick C Castelli, Yeda A O Duarte, Maria Rita Passos-Bueno, Mayana Zatz
Universidade de São Paulo (USP), Hospital Israelita Albert Einstein, Harvard Medical School, Laboratório DASA, University Medical Center Groningen, Universidade Estadual Paulista (UNESP), Hospital Sirio-Libanes, Universidade Federal de Minas Gerais (UFMG), Instituto Mário Penna, Instituto Nacional de Salud, Universidad de Huánuco, National Cancer Institute, Universidade Federal da Bahia (UFBA), Fundação Oswaldo Cruz, Universidade Federal de Pelotas, and Universidad Peruana Cayetano Heredia
Made available in DSpace on 2022-04-29T08:46:40Z (GMT). Previous issue date: 2022-12-01
National Institute of General Medical Sciences
As whole-genome sequencing (WGS) becomes the gold standard tool for studying population genomics and medical applications, data on diverse non-European and admixed individuals are still scarce. Here, we present a high-coverage WGS dataset of 1,171 highly admixed elderly Brazilians from a census-based cohort, providing over 76 million variants, of which ~2 million are absent from large public databases. WGS enables identification of ~2,000 mobile element insertions without previous description, nearly 5 Mb of genomic segments absent from the human genome reference, and over 140 alleles from HLA genes absent from public resources. We reclassify and curate pathogenicity assertions for nearly four hundred variants in genes associated with dominantly inherited Mendelian disorders and calculate the incidence of selected recessive disorders, demonstrating the clinical usefulness of the present study. Finally, we observe that whole-genome and HLA imputation could be significantly improved compared to available datasets, since rare variation represents the largest proportion of input from WGS. These results demonstrate that even smaller sample sizes of underrepresented populations bring relevant data for genomic studies, especially when exploring analyses allowed only by WGS.
Human Genome and Stem Cell Research Center, University of São Paulo, SP
Department of Genetics and Evolutionary Biology, Biosciences Institute, University of São Paulo, SP
Hospital Israelita Albert Einstein, SP
Instituto da Criança, Faculdade de Medicina da Universidade de São Paulo, SP
Orthopedic Research Labs, Boston Children’s Hospital and Department of Genetics, Harvard Medical School
Laboratório DASA
Laboratory of Genome Structure and Ageing, European Research Institute for the Biology of Ageing, University Medical Center Groningen
São Paulo State University (UNESP), Molecular Genetics and Bioinformatics Laboratory, School of Medicine, State of São Paulo
São Paulo State University (UNESP), Department of Pathology, School of Medicine, State of São Paulo
Departamento de Química, Faculdade de Filosofia Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, São Paulo
Centro de Oncologia Molecular, Hospital Sirio-Libanes
Department of Biochemistry, Institute of Chemistry, University of São Paulo, São Paulo
Bioinformatics Graduate program, University of São Paulo
Departamento de Genética Ecologia e Evolução, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, MG
Núcleo de Ensino e Pesquisa, Instituto Mário Penna, MG
Laboratorio de Biotecnologia y Biologia Molecular, Instituto Nacional de Salud
Universidad de Huánuco
Division of Cancer Epidemiology and Genetics, National Cancer Institute
Instituto de Saúde Coletiva, Universidade Federal da Bahia, BA
Center for Data and Knowledge Integration for Health, Institute Gonçalo Muniz, Fundação Oswaldo Cruz, BA
Instituto de Pesquisas René Rachou, Fundação Oswaldo Cruz, MG
Programa De Pós-Graduação em Saúde Pública, Universidade Federal de Minas Gerais, MG
Programa de Pós-Graduação em Epidemiologia, Universidade Federal de Pelotas, RS
Mosaico Translational Genomics Initiative, Universidade Federal de Minas Gerais, MG
Facultad de Salud Pública y Administración, Universidad Peruana Cayetano Heredia
Instituto de Estudos Avançados Transdisciplinares, Universidade Federal de Minas Gerais, MG
Medical-Surgical Nursing Department, School of Nursing, University of São Paulo, SP
Epidemiology Department, Public Health School, University of São Paulo, SP
National Institute of General Medical Sciences: R01 GM075091