11 results on '"Vinh Sy Le"'
Search Results
2. Novel Findings From Family-based Exome Sequencing for Children With Biliary Atresia
- Author
-
Lan T.M. Dao, Quynh Anh Tran, Huyen Khanh Nguyen, Vinh Sy Le, Liem Thanh Nguyen, Minh Duy Ngo, Anh Kieu Mai, Ha Thi Nguyen, and Kien Trung Tran
- Subjects
Genetic Markers; Male; Proband; Candidate gene; Molecular biology; Science; Genes, Recessive; Biology; Article; Cohort Studies; Gene Frequency; Biliary Atresia; Genes, X-Linked; Biliary atresia; Exome Sequencing; Genetics; medicine; Humans; Genetic Predisposition to Disease; 1000 Genomes Project; Genetic Association Studies; Exome sequencing; Adaptor Proteins, Signal Transducing; Multidisciplinary; Genetic heterogeneity; Tumor Suppressor Proteins; Medical genetics; Infant, Newborn; Genetic Variation; Infant; medicine.disease; Phosphoric Monoester Hydrolases; Minor allele frequency; Vietnam; Mutation; Medicine; Female; OCRL; Transcription Factors
- Abstract
Biliary atresia (BA) is a progressive inflammation and fibrosis of the biliary tree, characterized by obstruction of bile flow that leads to liver failure, scarring and cirrhosis. This study aimed to explore the elusive etiology of BA by conducting whole exome sequencing (WES) for 41 children with BA and their parents (35 trios, including one family with two BA diagnosed children and five child-mother cases). We exclusively identified and validated a total of 28 variants (17 X-linked, six de novo and five homozygous) in 25 candidate genes from our BA cohort. These variants were among the 10% most deleterious and had a low minor allele frequency in the three databases employed: Kinh Vietnamese (KHV), gnomAD and the 1000 Genomes Project. Interestingly, AMER1, INVS and OCRL variants were repeatedly found in unrelated probands and are reported here for the first time in a BA cohort. Liver specimens and blood samples showed identical variants, suggesting that somatic mutations were unlikely to occur during morphogenesis. In agreement with earlier attempts, this study implicates genetic heterogeneity and non-Mendelian inheritance in BA.
- Published
- 2021
3. A Vietnamese human genetic variation database
- Author
-
Canh D. Nguyen, Linh T. D. Pham, Lan T.M. Dao, Huong Le, Duong Huy Do, Vinh Sy Le, Liem Thanh Nguyen, Ha T. T. Ly, Hoa T. P. Bui, and Kien Trung Tran
- Subjects
Whole genome sequencing; 0303 health sciences; education.field_of_study; Database; Vietnamese; 030305 genetics & heredity; Population; Single-nucleotide polymorphism; Human genetic variation; Biology; computer.software_genre; Genome; language.human_language; 03 medical and health sciences; Genetics; language; Human genome; education; computer; Genetics (clinical); Exome sequencing; 030304 developmental biology
- Abstract
Large scale human genome projects have created tremendous human genome databases for some well-studied populations. Vietnam has about 95 million people (the 14th largest country by population in the world) of which more than 86% are Kinh people. To date, genetic studies for Vietnamese people mostly rely on genetic information from other populations. Building a Vietnamese human genetic variation database is a must for properly interpreting Vietnamese genetic variants. To this end, we sequenced 105 whole genomes and 200 whole exomes of 305 unrelated Kinh Vietnamese (KHV) people. We also included 101 other previously published KHV genomes to build a Vietnamese human genetic variation database of 406 KHV people. The KHV database contains 24.81 million variants (22.47 million single nucleotide polymorphisms (SNPs) and 2.34 million indels) of which 0.71 million variants are novel. It includes more than 99.3% of variants with a frequency of >1% in the KHV population. Noticeably, the KHV database revealed 107 variants reported in the human genome mutation database as pathological mutations with a frequency above 1% in the KHV population. The KHV database (available at https://genomes.vn) would be beneficial for genetic studies and medical applications not only for the Vietnamese population but also for other closely related populations.
- Published
- 2019
4. iK-means: an improvement of the iterative k-means partitioning algorithm
- Author
-
Dong Do Due, Thao Nguyen Thi Phuong, Thang Bui Ngoc, Thu Kim Le, and Vinh Sy Le
- Subjects
Similarity (geometry); Phylogenetic tree; Computer science; k-means clustering; Gene Feature; Invariant (mathematics); Partition (database); Algorithm
- Abstract
The evolutionary processes vary among sites of an alignment, which strongly affects the accuracy of phylogenetic tree reconstruction. Partitioning an alignment into sub-alignments of sites such that the evolutionary processes at sites in the same sub-alignment are highly similar is a proper strategy. Gene features might be used as reasonable indicators to partition an alignment. However, the gene feature information is not always available or efficient. Computational partitioning methods like iterative k-means have been proposed to automatically partition sites into groups based on the similarity of evolutionary rates of sites. Despite obtaining compelling results in terms of AICc and BIC measurements, the k-means method forms a group of all and only invariant sites, which results in biased or wrong phylogenetic trees. In this paper, we improve the k-means algorithm by re-classifying invariant sites into different sub-alignments based on their likelihood values. Experimental results on simulated and empirical DNA datasets showed that the new method, called iK-means, overcame the pitfall of the k-means method and consequently helps improve the quality of the partitioned sub-alignments. We recommend using the iK-means method to improve the accuracy of inferred phylogenetic trees.
- Published
- 2020
5. A protein alignment partitioning method for protein phylogenetic inference
- Author
-
Vinh Sy Le and Thu Kim Le
- Subjects
Set (abstract data type); ComputingMethodologies_PATTERNRECOGNITION; Protein sequencing; Phylogenetic tree; Phylogenetic inference; Computer science; Model selection; A protein; Computational biology; Genome; Gene
- Abstract
Phylogenetic trees inferred from protein sequences are strongly affected by amino acid evolutionary models. Choosing proper models is needed to account for the heterogeneity in evolutionary patterns across sites, especially when analyzing multiple genes or whole genome datasets. Partitioning is a prominent approach that combines sites that have undergone similar evolutionary processes into separate groups with proper models. The partitioning scheme can be defined using structural features of the sequences; however, determining structural features of protein sequences is not always practical. Recently, methods have been proposed to automatically cluster sites into groups based on the rates of sites. The rate of a site is a good indicator; however, it is unable to properly reflect the complex evolutionary processes of sites along the protein sequence. In this paper, we present a new algorithm to automatically determine a partitioning scheme based on the best-fit model of sites, i.e., sites that belong to the same model are classified into the same group. Comparing our proposed method with current methods on a set of empirical protein datasets showed that our method helped to build better trees than the other methods tested. Our method will significantly improve protein phylogenetic inference from multiple gene or whole genome datasets.
- Published
- 2020
6. Response to: A commentary on 'A Vietnamese human genetic variation database'
- Author
-
Liem Thanh Nguyen, Vinh Sy Le, and Kien Trung Tran
- Subjects
0303 health sciences; Polymorphism, Genetic; Database; Vietnamese; 030305 genetics & heredity; Genetic Variation; Human genetic variation; Biology; Southeast asian; computer.software_genre; language.human_language; Southeast asia; 03 medical and health sciences; Asian People; Genetics; language; Humans; East Asia; Settlement (litigation); Relation (history of concept); computer; Genetics (clinical); 030304 developmental biology
- Abstract
This letter is a response to the commentary by Johnson & Do (Johnson and Do 2020) on our paper, entitled “A Vietnamese human genetic variation database” (Vinh et al. 2019). The commentators raised two concerns: first, the relation of Southeast Asian (SEA) and East Asian (EA) groups to African and European groups; second, the history of migration and settlement in Southeast Asia. Our responses clarify both concerns.
- Published
- 2020
7. A novel de novo variant of LAMA2 contributes to merosin deficient congenital muscular dystrophy type 1A: Case report
- Author
-
Liem Thanh Nguyen, Kien Trung Tran, Chinh Duy Vu, and Vinh Sy Le
- Subjects
0301 basic medicine; Proband; de novo; LAMA2 gene; merosin deficient congenital muscular dystrophy type 1A; General Biochemistry, Genetics and Molecular Biology; whole exome sequencing; 03 medical and health sciences; Exon; symbols.namesake; 0302 clinical medicine; Medicine; Missense mutation; General Pharmacology, Toxicology and Pharmaceutics; Muscular dystrophy; Gene; Exome sequencing; Muscle contracture; Genetics; Sanger sequencing; business.industry; General Neuroscience; Articles; General Medicine; medicine.disease; 030104 developmental biology; 030220 oncology & carcinogenesis; symbols; business
- Abstract
Merosin deficient congenital muscular dystrophy type 1A (MDC1A) is caused by defects in the LAMA2 gene. Patients with MDC1A exhibit severe symptoms, including congenital hypotonia, delayed motor development and contractures. The present case report describes a Vietnamese male child with clinical manifestations of delayed motor development, limb-girdle muscular dystrophy, severe scoliosis and white matter abnormality in the brain. Whole exome sequencing (WES) was performed with subsequent validation using Sanger sequencing, and a de novo missense variant (NM_000426.3:c.1964T>C, p.Leu655Pro) and a splice site variant (NG_008678.1:c.3556-13T>A) in the LAMA2 gene of the proband were detected. The missense variant is located in exon 14 and has not been reported previously, to the best of our knowledge, whereas the splice site variant has been previously reported to cause premature termination of transcription in patients with MDC1A. In silico tools predicted that the missense variant was damaging. Phenotype-genotype analysis suggested that this proband was associated with classical early onset MDC1A. The co-existence of a de novo and a heterozygous variant in the LAMA2 gene suggested that the de novo variant contributed to the autosomal recessive manner of the disease. Careful consideration of this event by clinical confirmation of parental carrier status may help to accurately determine the risk of occurrence of this disease in future offspring. Additionally, WES is recommended as a powerful tool to assist in identifying potentially causative variants for heterogeneous diseases such as MDC1A.
- Published
- 2019
8. Building a Specific Amino Acid Substitution Model for Dengue Viruses
- Author
-
Vinh Sy Le, Thu Le Kim, and Cuong Dang Cao
- Subjects
0301 basic medicine; Genetics; chemistry.chemical_classification; Phylogenetic tree; Maximum likelihood; Amino acid substitution; Biology; medicine.disease; Dengue fever; Amino acid; 03 medical and health sciences; Human health; 030104 developmental biology; chemistry; medicine
- Abstract
Phylogenetic trees inferred from protein sequences are strongly affected by amino acid substitution models. Although different amino acid substitution models have been proposed, only a few were estimated for specific species, such as the FLU model for influenza viruses. Dengue virus is among the most dangerous viruses for human health, causing dengue fever in up to 100 million people per year. In this study, we built a specific amino acid substitution model for dengue protein sequences, called DEN. The dengue protein sequences were obtained from the NCBI dengue database and the model was estimated using the maximum likelihood method. Experiments showed that the new model DEN helped to build better phylogenetic trees than other existing models. We strongly recommend that researchers use the DEN model for analyzing dengue protein data.
- Published
- 2018
9. UFBoot2: Improving the Ultrafast Bootstrap Approximation
- Author
-
Arndt von Haeseler, Vinh Sy Le, Bui Quang Minh, Olga Chernomor, and Diep Thi Hoang
- Subjects
0106 biological sciences; 0301 basic medicine; polytomies; Phylogenetic inference; Computer science; Maximum likelihood; Biology; 010603 evolutionary biology; 01 natural sciences; ultrafast bootstrap; model violation; 03 medical and health sciences; phylogenetic inference; Genetics; maximum likelihood; Molecular Biology; Ecology, Evolution, Behavior and Systematics; Phylogeny; Bootstrapping (statistics); Likelihood Functions; Models, Genetic; Software package; Resources; 030104 developmental biology; Algorithm; Software
- Abstract
The standard bootstrap (SBS), despite being computationally intensive, is widely used in maximum likelihood phylogenetic analyses. We recently proposed the ultrafast bootstrap approximation (UFBoot) to reduce computing time while achieving more unbiased branch supports than SBS under mild model violations. UFBoot has been steadily adopted as an efficient alternative to SBS and other bootstrap approaches. Here, we present UFBoot2, which substantially accelerates UFBoot and reduces the risk of overestimating branch supports due to polytomies or severe model violations. Additionally, UFBoot2 provides suitable bootstrap resampling strategies for phylogenomic data. UFBoot2 is 778 and 8.4 times (median) faster than SBS and RAxML rapid bootstrap on tested datasets, respectively. UFBoot2 is implemented in the IQ-TREE software package version 1.6 and freely available at http://www.iqtree.org.
- Published
- 2017
10. Building Ancestral Recombination Graphs for Whole Genomes
- Author
-
Hai Bich Ho, Vinh Sy Le, Quang Si Le, and Thao Thi Phuong Nguyen
- Subjects
0301 basic medicine; Population; Inference; Genome-wide association study; Genomics; Biology; Genome; 03 medical and health sciences; Databases, Genetic; Genetics; Humans; Association mapping; education; Genetic association; Recombination, Genetic; education.field_of_study; Models, Genetic; Applied Mathematics; 030104 developmental biology; Genetics, Population; Evolutionary biology; Recombination; Algorithms; Biotechnology; Genome-Wide Association Study
- Abstract
We propose a heuristic algorithm, called ARG4WG, to build plausible ancestral recombination graphs (ARGs) from thousands of whole genome samples. By using the longest shared end for recombination inference, ARG4WG constructs ARGs with small numbers of recombination events that perform well in association mapping on genome-wide association studies.
- Published
- 2016
11. ReplacementMatrix: a web server for maximum-likelihood estimation of amino acid replacement rate matrices
- Author
-
Quang Si Le, Vinh Sy Le, Cuong Cao Dang, Vincent Lefort, and Olivier Gascuel
- Subjects
Statistics and Probability; Web server; Theoretical computer science; Computer science; computer.software_genre; Biochemistry; Domain (software engineering); Set (abstract data type); Matrix (mathematics); Phylogenetics; Amino Acids; Amino acid replacement; Molecular Biology; Phylogeny; Probability; Internet; Likelihood Functions; Basis (linear algebra); Proteins; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Sequence Alignment; Algorithm; computer; Software
- Abstract
Summary: Amino acid replacement rate matrices are an essential basis of protein studies (e.g. in phylogenetics and alignment). A number of general purpose matrices have been proposed (e.g. JTT, WAG, LG) since the seminal work of Margaret Dayhoff and co-workers. However, it has been shown that matrices specific to certain protein groups (e.g. mitochondrial) or life domains (e.g. viruses) differ significantly from general average matrices, and thus perform better when applied to the data to which they are dedicated. This Web server implements the maximum-likelihood estimation procedure that was used to estimate LG, and provides a number of tools and facilities. Users upload a set of multiple protein alignments from their domain of interest and receive the resulting matrix by email, along with statistics and comparisons with other matrices. A non-parametric bootstrap is performed optionally to assess the variability of replacement rate estimates. Maximum-likelihood trees, inferred using the estimated rate matrix, are also computed optionally for each input alignment. Finely tuned procedures and up-to-date ML software (PhyML 3.0, XRATE) are combined to perform all these heavy calculations on our clusters. Availability: http://www.atgc-montpellier.fr/ReplacementMatrix/ Contact: olivier.gascuel@lirmm.fr Supplementary information: Supplementary data are available at http://www.atgc-montpellier.fr/ReplacementMatrix/
- Published
- 2011
Discovery Service for Jio Institute Digital Library