204 results on '"Shili Lin"'
Search Results
2. Data from MicroRNAs modulate the chemosensitivity of tumor cells
- Author
-
Wolfgang Sadee, John N. Weinstein, Carlo M. Croce, William C. Reinhold, Thomas D. Schmittgen, Chang-Gong Liu, Zunyan Dai, Jong-Kook Park, Shili Lin, Joseph S. Verducci, Ji-Hyun Chung, and Paul E. Blower
- Abstract
MicroRNAs are strongly implicated in such processes as development, carcinogenesis, cell survival, and apoptosis. It is likely, therefore, that they can also modulate sensitivity and resistance to anticancer drugs in substantial ways. To test this hypothesis, we studied the pharmacologic roles of three microRNAs previously implicated in cancer biology (let-7i, mir-16, and mir-21) and also used in silico methods to test pharmacologic microRNA effects more broadly. In the experimental system, we increased the expression of individual microRNAs by transfecting their precursors (which are active) or suppressed the expression by transfection of antisense oligomers. In three NCI-60 human cancer cell lines, a panel of 60 lines used for anticancer drug discovery, we assessed the growth-inhibitory potencies of 14 structurally diverse compounds with known anticancer activities. Changing the cellular levels of let-7i, mir-16, and mir-21 affected the potencies of a number of the anticancer agents by up to 4-fold. The effect was most prominent with mir-21, with 10 of 28 cell-compound pairs showing significant shifts in growth-inhibitory activity. Varying mir-21 levels changed potencies in opposite directions depending on compound class, indicating that different mechanisms determine toxic and protective effects. In silico comparison of drug potencies with microRNA expression profiles across the entire NCI-60 panel revealed that ∼30 microRNAs, including mir-21, show highly significant correlations with numerous anticancer agents. Ten of those microRNAs have already been implicated in cancer biology. Our results support a substantial role for microRNAs in anticancer drug response, suggesting novel potential approaches to the improvement of chemotherapy. [Mol Cancer Ther 2008;7(1):1–9]
- Published
- 2023
3. Data from MicroRNA expression profiles for the NCI-60 cancer cell panel
- Author
-
Wolfgang Sadee, John N. Weinstein, Carlo M. Croce, Eric P. Kaldjian, Philip L. Lorenzi, William Reinhold, Chang-Gong Liu, Zunyan Dai, Ji-Hyun Chung, Jin Zhou, Shili Lin, Joseph S. Verducci, and Paul E. Blower
- Abstract
Advances in the understanding of cancer cell biology and response to drug treatment have benefited from new molecular technologies and methods for integrating information from multiple sources. The NCI-60, a panel of 60 diverse human cancer cell lines, has been used by the National Cancer Institute to screen >100,000 chemical compounds and natural product extracts for anticancer activity. The NCI-60 has also been profiled for mRNA and protein expression, mutational status, chromosomal aberrations, and DNA copy number, generating an unparalleled public resource for integrated chemogenomic studies. Recently, microRNAs have been shown to target particular sets of mRNAs, thereby preventing translation or accelerating mRNA turnover. To complement the existing NCI-60 data sets, we have measured expression levels of microRNAs in the NCI-60 and incorporated the resulting data into the CellMiner program package for integrative analysis. Cell line groupings based on microRNA expression were generally consistent with tissue type and with cell line clustering based on mRNA expression. However, mRNA expression seemed to be somewhat more informative for discriminating among tissue types than was microRNA expression. In addition, we found that there does not seem to be a significant correlation between microRNA expression patterns and those of known target transcripts. Comparison of microRNA expression patterns and compound potency patterns showed significant correlations, suggesting that microRNAs may play a role in chemoresistance. Combined with gene expression and other biological data using multivariate analysis, microRNA expression profiles may provide a critical link for understanding mechanisms involved in chemosensitivity and chemoresistance. [Mol Cancer Ther 2007;6(5):1483–91]
- Published
- 2023
4. Supplementary Data from MicroRNAs modulate the chemosensitivity of tumor cells
- Author
-
Wolfgang Sadee, John N. Weinstein, Carlo M. Croce, William C. Reinhold, Thomas D. Schmittgen, Chang-Gong Liu, Zunyan Dai, Jong-Kook Park, Shili Lin, Joseph S. Verducci, Ji-Hyun Chung, and Paul E. Blower
- Abstract
Supplementary Data from MicroRNAs modulate the chemosensitivity of tumor cells
- Published
- 2023
5. Data from Breast Cancer–Associated Fibroblasts Confer AKT1-Mediated Epigenetic Silencing of Cystatin M in Epithelial Cells
- Author
-
Tim H.-M. Huang, Michael C. Ostrowski, Ann-Lii Cheng, Pearlly S. Yan, Shili Lin, Lisa Asamoto, Dustin Potter, Daniel E. Deatherage, Rulong Shen, Shuying Sun, Sandya Liyanarachchi, Chieh Ti Kuo, Ching-Hung Lin, Tao Zuo, and Huey-Jen L. Lin
- Abstract
The interplay between histone modifications and promoter hypermethylation provides a causative explanation for epigenetic gene silencing in cancer. Less is known about the upstream initiators that direct this process. Here, we report that the Cystatin M (CST6) tumor suppressor gene is concurrently down-regulated with other loci in breast epithelial cells cocultured with cancer-associated fibroblasts (CAF). Promoter hypermethylation of CST6 is associated with aberrant AKT1 activation in epithelial cells, as well as the disabled INPP4B regulator resulting from the suppression by CAFs. Repressive chromatin, marked by trimethyl-H3K27 and dimethyl-H3K9, and de novo DNA methylation are established at the promoter. The findings suggest that microenvironmental stimuli are triggers in this epigenetic cascade, leading to the long-term silencing of CST6 in breast tumors. Our present findings implicate a causal mechanism defining how tumor stromal fibroblasts support neoplastic progression by manipulating the epigenome of mammary epithelial cells. The result also highlights the importance of direct cell-cell contact between epithelial cells and the surrounding fibroblasts that confer this epigenetic perturbation. Because this two-way interaction is anticipated, the described coculture system can be used to determine the effect of epithelial factors on fibroblasts in future studies. [Cancer Res 2008;68(24):10257–66]
- Published
- 2023
6. Supplementary Methods, Figures 1-5, Tables 1-5 from Breast Cancer–Associated Fibroblasts Confer AKT1-Mediated Epigenetic Silencing of Cystatin M in Epithelial Cells
- Author
-
Tim H.-M. Huang, Michael C. Ostrowski, Ann-Lii Cheng, Pearlly S. Yan, Shili Lin, Lisa Asamoto, Dustin Potter, Daniel E. Deatherage, Rulong Shen, Shuying Sun, Sandya Liyanarachchi, Chieh Ti Kuo, Ching-Hung Lin, Tao Zuo, and Huey-Jen L. Lin
- Abstract
Supplementary Methods, Figures 1-5, Tables 1-5 from Breast Cancer–Associated Fibroblasts Confer AKT1-Mediated Epigenetic Silencing of Cystatin M in Epithelial Cells
- Published
- 2023
7. Sequencing-Based DNA Methylation Data
- Author
-
Shili Lin, Denise Scholtens, and Sujay Datta
- Published
- 2022
8. Bioinformatics Methods
- Author
-
Sujay Datta, Denise Scholtens, and Shili Lin
- Published
- 2022
9. Modeling and Analysis of Next-Generation Sequencing Data
- Author
-
Shili Lin, Denise Scholtens, and Sujay Datta
- Published
- 2022
10. Metabolomics Data Preprocessing
- Author
-
Shili Lin, Denise Scholtens, and Sujay Datta
- Published
- 2022
11. Protein-Protein Interactions
- Author
-
Shili Lin, Denise Scholtens, and Sujay Datta
- Published
- 2022
12. Metabolomics Data Analysis
- Author
-
Shili Lin, Denise Scholtens, and Sujay Datta
- Published
- 2022
13. Protein-Protein Interaction Network Analyses
- Author
-
Shili Lin, Denise Scholtens, and Sujay Datta
- Published
- 2022
14. Digital Improvement of Single Cell Hi-C Data
- Author
-
Shili Lin, Denise Scholtens, and Sujay Datta
- Published
- 2022
15. Detection of Imprinting and Maternal Effects
- Author
-
Shili Lin, Denise Scholtens, and Sujay Datta
- Published
- 2022
16. Modeling and Analysis of Spatial Chromatin Interactions
- Author
-
Shili Lin, Denise Scholtens, and Sujay Datta
- Published
- 2022
17. COMPARISON BETWEEN PULMONARY ARTERIAL HYPERTENSION (PAH) RISK ASSESSMENT METHODS, INCLUDING PULMONARY HYPERTENSION OUTCOME RISKS ASSESSMENT (PHORA)
- Author
-
CHARLES FAUVEL, ZILU LIU, SHILI LIN, PRISCILLA CORREA-JAQUE, AMY WEBB, REBECCA R VANDERPOOL, MANREET KANWAR, JIDAPA KRAISANGKA, PUNEET MATHUR, ADAM PERER, ALLEN D EVERETT, and RAYMOND L BENZA
- Subjects
Pulmonary and Respiratory Medicine; Cardiology and Cardiovascular Medicine; Critical Care and Intensive Care Medicine
- Published
- 2022
18. An Adaptive and Robust Test for Microbial Community Analysis
- Author
-
Qingyu Chen, Shili Lin, and Chi Song
- Subjects
Genetics; Molecular Medicine; Genetics (clinical)
- Abstract
In microbiome studies, researchers measure the abundance of each operational taxonomic unit (OTU) and are often interested in testing the association between the microbiota and the clinical outcome while conditioning on certain covariates. Two types of approaches exist for this testing purpose: the OTU-level tests that assess the association between each OTU and the outcome, and the community-level tests that examine the microbial community all together. It is of considerable interest to develop methods that enjoy both the flexibility of OTU-level tests and the biological relevance of community-level tests. We proposed MiAF, a method that adaptively combines p-values from the OTU-level tests to construct a community-level test. By borrowing the flexibility of OTU-level tests, the proposed method has great potential to generate a series of community-level tests that suit a range of different microbiome profiles, while achieving the desirable high statistical power of community-level testing methods. Using a simulation study and real data applications in a smoker throat microbiome study and an HIV patient stool microbiome study, we demonstrated that MiAF has comparable or better power than methods that are specifically designed for community-level tests. The proposed method also provides a natural heuristic taxa selection.
- Published
- 2022
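Note on entry 18: the abstract describes building a community-level test by combining OTU-level p-values. As a minimal illustration of that general idea only, the Python sketch below implements the classical, non-adaptive Fisher combination. It is not the MiAF procedure, whose adaptive scheme is not described in this listing. It assumes independent p-values, and the function name `fisher_combine` is hypothetical.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's classical method.

    Under the null, T = -2 * sum(log p_i) follows a chi-square
    distribution with 2k degrees of freedom (k = number of p-values).
    Returns (T, combined p-value).
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    stat = -2.0 * sum(math.log(p) for p in p_values)

    # Closed-form chi-square survival function for even df = 2k:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, math.exp(-half) * total
```

With a single input the combined p-value equals that input, which gives a quick sanity check on the survival-function formula.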
19. BCurve: Bayesian Curve Credible Bands Approach for the Detection of Differentially Methylated Regions
- Author
-
Chenggong Han, Jincheol Park, and Shili Lin
- Subjects
Bayes Theorem; Genomics; Sequence Analysis, DNA; DNA Methylation; Software
- Abstract
High-throughput assays have been developed to measure DNA methylation, among which bisulfite-based sequencing (BS-seq) and microarray technologies are the most popular for genome-wide profiling. A major goal in DNA methylation analysis is the detection of differentially methylated genomic regions under two different conditions. To accomplish this, many state-of-the-art methods have been proposed in the past few years; only a handful of these methods are capable of analyzing both types of data (BS-seq and microarray), though. On the other hand, covariates, such as sex and age, are known to be potentially influential on DNA methylation; and thus, it would be important to adjust for their effects on differential methylation analysis. In this chapter, we describe a Bayesian curve credible bands approach and the accompanying software, BCurve, for detecting differentially methylated regions for data generated from either microarray or BS-Seq. The unified theme underlying the analysis of these two different types of data is the model that accounts for correlation between DNA methylation in nearby sites, covariates, and between-sample variability. The BCurve R software package also provides tools for simulating both microarray and BS-seq data, which can be useful for facilitating comparisons of methods given the known "gold standard" in the simulated data. We provide detailed description of the main functions in BCurve and demonstrate the utility of the package for analyzing data from both platforms using simulated data from the functions provided in the package. Analyses of two real datasets, one from BS-seq and one from microarray, are also furnished to further illustrate the capability of BCurve.
- Published
- 2022
20. An Adaptive and Robust Method for Multi-trait Analysis of Genome-wide Association Studies Using Summary Statistics
- Author
-
Qiaolan Deng, Chi Song, and Shili Lin
- Subjects
Methodology (stat.ME); FOS: Computer and information sciences; Genetics; Genetics (clinical); Statistics - Methodology
- Abstract
Genome-wide association studies (GWAS) have identified thousands of genetic variants associated with human traits or diseases in the past decade. Nevertheless, much of the heritability of many traits is still unaccounted for. Commonly used single-trait analysis methods are conservative, while multi-trait methods improve statistical power by integrating association evidence across multiple traits. In contrast to individual-level data, GWAS summary statistics are usually publicly available, and thus methods using only summary statistics have greater usage. Although many methods have been developed for joint analysis of multiple traits using summary statistics, there are many issues, including inconsistent performance, computational inefficiency, and numerical problems when considering lots of traits. To address these challenges, we propose a multi-trait adaptive Fisher method for summary statistics (MTAFS), a computationally efficient method with robust power performance. We applied MTAFS to two sets of brain image-derived phenotypes (IDPs) from the UK Biobank, including a set of 58 Volumetric IDPs and a set of 212 Area IDPs. Together with results from a simulation study, MTAFS shows its advantage over existing multi-trait methods, with robust performance across a range of underlying settings. It controls type 1 error well, and can efficiently handle a large number of traits.
- Published
- 2022
21. Human alveolar macrophage response to Mycobacterium tuberculosis: immune characteristics underlying large inter-individual variability
- Author
-
Wolfgang Sadee, Ian H. Cheeseman, Audrey Papp, Maciej Pietrzak, Michal Seweryn, Xiaofei Zhou, Shili Lin, Amanda M. Williams, Eusondia Arnett, Abul K. Azad, and Larry S. Schlesinger
- Abstract
Mycobacterium tuberculosis (M.tb) establishes residence and growth in human alveolar macrophages (AMs). Large inter-individual variation in M.tb-AM interactions is a potential early indicator of TB risk and efficacy of therapies and vaccines. Herein, we systematically analyze interactions of a virulent M.tb strain with freshly isolated human AMs from 28 healthy adult donors, measuring host RNA expression and secreted candidate proteins associated with TB pathogenesis over 72 h. We observe large inter-individual differences in bacterial uptake and growth, with tenfold variation in M.tb load at 72 h, reflected by large variation of gene expression programs. Systems analysis of differential and variable RNA and protein expression identifies TB-associated genes and networks (e.g., IL1B and IDO1). RNA time profiles document early stimulation of M1-type macrophage gene expression followed by emergence of an M2-type profile. The fine-scale resolution of this work enables the separation of genes and networks regulating early M.tb growth dynamics, and development of potential markers of individual susceptibility to M.tb infection and response to therapies.
- Published
- 2022
22. Metagenome-Predicted Growth Rate And Metatranscriptomic Analysis Reveal A Slow Growth Rate Of And High Butyrate Formation By Faecalibacterium Prausnitzii In The Rumen Of Low Methane-Emitting Sheep
- Author
-
Boyang Zhang, Shili Lin, and Zhongtang Yu
- Abstract
Background: Methane emissions from ruminants contribute to global warming and lead to energy loss from the ingested feed. Reducing methane emissions while increasing feed efficiency is one of the most important challenges facing the livestock industry. Previous studies have reported differences in lactate-metabolizing bacteria in the rumen between low-methane yield (LMY) and high-methane yield (HMY) sheep. It was hypothesized that methane emissions might also be related to the growth of certain bacteria. The objective of this study was to investigate the correlation between the growth and metabolism of rumen bacteria and methane emissions in sheep. Results: The growth rates of 21 species-level and 12 genera-level rumen bacterial populations were predicted based on the peak-to-trough ratio of their metagenomic sequences that were generated from two groups of sheep differing in methane yield (LMY vs. HMY). The growth rate of Faecalibacterium prausnitzii was found to be significantly different between the LMY sheep and the HMY sheep. The relative abundance of F. prausnitzii was significantly lower in the LMY sheep than the HMY sheep, whereas the relative abundance of Intestinibaculum porci and Megasphaera elsdenii was significantly higher in the LMY sheep than the HMY sheep. Metatranscriptomic analysis showed that in the LMY sheep the expression of energy-related genes of F. prausnitzii, including those involved in butyrate production, was significantly upregulated compared to the HMY sheep. Conclusions: The current study revealed a negative association between the growth rate of F. prausnitzii in the rumen and methane yield in sheep and between the growth rate of F. prausnitzii and the expression of its genes involved in energy metabolism including butyrate production. Our result also showed enrichment of lactate-producing bacteria (i.e., I. porci) and lactate-utilizing bacteria (i.e., M. elsdenii) in the LMY sheep. Together with the reported metabolic response of F. prausnitzii to lactic acid bacteria, our study corroborates the association between lactate-metabolizing bacteria and methane emissions via promoting butyrate production. M. elsdenii, I. porci, and F. prausnitzii may serve as biomarkers of methane yield from sheep, and possibly other ruminants.
- Published
- 2022
23. Detecting X‐linked common and rare variant effects in family‐based sequencing studies
- Author
-
Asuman S. Turkmen and Shili Lin
- Subjects
Multifactorial Inheritance; education.field_of_study; Models, Genetic; Epidemiology; Population; Genetic Variation; High-Throughput Nucleotide Sequencing; Genome-wide association study; Computational biology; Biology; DNA sequencing; Genes, X-Linked; Missing heritability problem; Genetic variation; Humans; Identification (biology); Heritability of autism; education; Genetics (clinical); Genome-Wide Association Study; Genetic association
- Abstract
The breakthroughs in next generation sequencing have allowed us to access data consisting of both common and rare variants, and in particular to investigate the impact of rare genetic variation on complex diseases. Although rare genetic variants are thought to be important components in explaining genetic mechanisms of many diseases, discovering these variants remains challenging, and most studies are restricted to population-based designs. Further, despite the shift in the field of genome-wide association studies (GWAS) towards studying rare variants due to the "missing heritability" phenomenon, little is known about rare X-linked variants associated with complex diseases. For instance, there is evidence that X-linked genes are highly involved in brain development and cognition when compared with autosomal genes; however, like most GWAS for other complex traits, previous GWAS for mental diseases have provided poor resources to deal with identification of rare variant associations on X-chromosome. In this paper, we address the two issues described above by proposing a method that can be used to test X-linked variants using sequencing data on families. Our method is much more general than existing methods, as it can be applied to detect both common and rare variants, and is applicable to autosomes as well. Our simulation study shows that the method is efficient, and exhibits good operational characteristics. An application to the University of Miami Study on Genetics of Autism and Related Disorders also yielded encouraging results.
- Published
- 2020
24. Detecting rare haplotypes associated with complex diseases using both population and family data: Combined logistic Bayesian Lasso
- Author
-
Xiaofei Zhou, Meng Wang, and Shili Lin
- Subjects
Statistics and Probability; Linkage disequilibrium; Epidemiology; Computer science; Bayesian probability; Population; Computational biology; Polymorphism, Single Nucleotide; 01 natural sciences; Linkage Disequilibrium; 010104 statistics & probability; 03 medical and health sciences; Health Information Management; Lasso (statistics); Humans; 0101 mathematics; education; 030304 developmental biology; Genetic association; 0303 health sciences; education.field_of_study; Models, Genetic; Haplotype; Bayes Theorem; Genetic architecture; Bayesian lasso; Haplotypes; Case-Control Studies
- Abstract
Haplotype-based association methods have been developed to understand the genetic architecture of complex diseases. Compared to single-variant-based methods, haplotype methods are thought to be more biologically relevant, since there are typically multiple non-independent genetic variants involved in complex diseases, and the use of haplotypes implicitly accounts for non-independence caused by linkage disequilibrium. In recent years, with the focus moving from common to rare variants, haplotype-based methods have also evolved accordingly to uncover the roles of rare haplotypes. One particular approach is regularization-based, with the use of Bayesian least absolute shrinkage and selection operator (Lasso) as an example. This type of methods has been developed for either case-control population data (the logistic Bayesian Lasso (LBL)) or family data (family-triad-based logistic Bayesian Lasso (famLBL)). In some situations, both family data and case-control data are available; therefore, it would be a waste of resources if only one of them could be analyzed. To make full usage of available data to increase power, we propose a unified approach that can combine both case-control and family data (combined logistic Bayesian Lasso (cLBL)). Through simulations, we characterized the performance of cLBL and showed the advantage of cLBL over existing methods. We further applied cLBL to the Framingham Heart Study data to demonstrate its utility in real data applications.
- Published
- 2020
25. The Association of ARMC5 with the Renin-Angiotensin-Aldosterone System, Blood Pressure, and Glycemia in African Americans
- Author
-
Joshua J. Joseph, James G. Wilson, Willa A. Hsueh, Xiaofei Zhou, Fabio R. Faucz, Shili Lin, Mihail Zilbermint, Sherita Hill Golden, Maya Lodish, Constantine A. Stratakis, and Annabel Berthon
- Subjects
Blood Glucose; Male; 0301 basic medicine; Endocrinology, Diabetes and Metabolism; Clinical Biochemistry; Blood Pressure; 030204 cardiovascular system & hematology; Biochemistry; Plasma renin activity; Renin-Angiotensin System; chemistry.chemical_compound; 0302 clinical medicine; Endocrinology; Primary aldosteronism; Renin; Prospective Studies; Aldosterone; Aged, 80 and over; Adrenal gland; Fasting; Middle Aged; medicine.anatomical_structure; Female; Adult; medicine.medical_specialty; Polymorphism, Single Nucleotide; Young Adult; 03 medical and health sciences; Internal medicine; Renin–angiotensin system; medicine; Humans; Clinical Research Articles; Aged; Armadillo Domain Proteins; Glycated Hemoglobin; business.industry; Biochemistry (medical); Haplotype; medicine.disease; Black or African American; Cross-Sectional Studies; 030104 developmental biology; Blood pressure; Haplotypes; chemistry; Hemoglobin; business
- Abstract
Context: Armadillo repeat containing 5 (ARMC5) on chromosome 16 is an adrenal gland tumor suppressor gene associated with primary aldosteronism, especially among African Americans (AAs). We examined the association of ARMC5 variants with aldosterone, plasma renin activity (PRA), blood pressure, glucose, and glycosylated hemoglobin A1c (HbA1c) in community-dwelling AAs. Methods: The Jackson Heart Study is a prospective cardiovascular cohort study in AAs with baseline data collection from 2000 to 2004. A kernel machine method was used to perform a single joint test to analyze for an overall association between the phenotypes of interest (aldosterone, PRA, systolic and diastolic blood pressure [SBP, DBP], glucose, and HbA1c) and the ARMC5 single nucleotide variants (SNVs) adjusted for age, sex, BMI, and medications; followed by Bayesian Lasso methodology to identify sets of SNVs in terms of associated haplotypes with specific phenotypes. Results: Among 3223 participants (62% female; mean age 55.6 (SD ± 12.8) years), the average SBP and DBP were 127 and 76 mmHg, respectively. The average fasting plasma glucose and HbA1c were 101 mg/dL and 6.0%, respectively. ARMC5 variants were associated with all 6 phenotypes. Haplotype TCGCC (ch16:31476015-31476093) was negatively associated, whereas haplotype CCCCTTGCG (ch16:31477195-31477460) was positively associated with SBP, DBP, and glucose. Haplotypes GGACG (ch16:31477790-31478013) and ACGCG (ch16:31477834-31478113) were negatively associated with aldosterone and positively associated with HbA1c and glucose, respectively. Haplotype GCGCGAGC (ch16:31471193-ch16:31473597(rs114871627) was positively associated with PRA and negatively associated with HbA1c. Conclusions: ARMC5 variants are associated with aldosterone, PRA, blood pressure, fasting glucose, and HbA1c in community-dwelling AAs, suggesting that germline mutations in ARMC5 may underlie cardiometabolic disease in AAs.
- Published
- 2020
26. Incorporating information from markers in LD with test locus for detecting imprinting and maternal effects
- Author
-
Fangyuan Zhang and Shili Lin
- Subjects
Glutathione Peroxidase; Models, Genetic; Haplotype; Maternal effect; Inference; Robustness (evolution); Locus (genetics); Computational biology; Biology; Article; Linkage Disequilibrium; Genomic Imprinting; Glutathione Peroxidase GPX1; Haplotypes; Genotype; Genetics; Humans; Maternal Inheritance; Autistic Disorder; Imprinting (psychology); Genomic imprinting; Algorithms; Genetics (clinical); Genome-Wide Association Study
- Abstract
Numerous statistical methods have been developed to explore genomic imprinting and maternal effects by identifying parent-of-origin patterns in complex human diseases. However, because most of these methods only use available locus-specific genotype data, it is sometimes impossible for them to infer the distribution of parental origin of a variant allele, especially when some genotypes are missing. In this article, we propose a two-step approach, LIMEhap, to improve upon a recent partial likelihood inference method. In the first step, the distribution of the missing genotypes is inferred through the construction of haplotypes by using information from nearby loci. In the second step, a partial likelihood method is applied to the inferred data. To substantiate the validity of the proposed procedures, we simulated data in a genomic region of gene GPX1. The results show that, by borrowing genetic information from nearby loci, the power of the proposed method can be close to that with complete genotype data at the locus of interest. Since the inference on the genotype distribution is made under the assumption of Hardy-Weinberg Equilibrium (HWE), we further studied the robustness of LIMEhap to violation of HWE. Finally, we demonstrate the utility of LIMEhap by applying it to an autism dataset.
- Published
- 2020
27. AIJ: joint test for simultaneous detection of imprinting and non-imprinting allelic expression imbalance
- Author
-
Fangyuan Zhang, Dao-Peng Chen, and Shili Lin
- Subjects
Male; Heterozygote; Single-nucleotide polymorphism; RNA-Seq; 02 engineering and technology; Computational biology; Allelic Imbalance; Biology; Polymorphism, Single Nucleotide; DNA sequencing; Epigenesis, Genetic; Mice; Species Specificity; 0502 economics and business; 0202 electrical engineering, electronic engineering, information engineering; QA1-939; Animals; Humans; Computer Simulation; Epigenetics; Imprinting (psychology); Allele; Gene; reciprocal cross design; Applied Mathematics; 05 social sciences; Genomics; General Medicine; genomic imprinting; Mice, Inbred C57BL; Computational Mathematics; Phenotype; Gene Expression Regulation; rna-seq; Modeling and Simulation; Female; 020201 artificial intelligence & image processing; General Agricultural and Biological Sciences; Genomic imprinting; 050203 business & management; TP248.13-248.65; Mathematics; allelic expression imbalance; Biotechnology
- Abstract
Epigenetics is the study of heritable changes in gene expression or cellular phenotype caused by mechanisms other than changes in the underlying DNA sequence. Genomic imprinting is an epigenetically regulated process by which imprinted genes are expressed in a parent-of-origin-specific manner. It can be confounded with a phenomenon, allelic expression imbalance (AEI), which, in this paper, refers to asymmetric expression of the two alleles of a heterozygous subject at a single nucleotide polymorphism not caused by imprinting (non-imprinting AEI). Since existing methods in the literature are not amenable to distinguishing imprinting from non-imprinting AEI for data without replicates, we propose AIJ, a joint test for simultaneous detection of imprinting and non-imprinting AEI that accounts for potential confounding using RNA-seq data based on a reciprocal cross design. Through a simulation study, we show that AIJ is more powerful compared to two frequently used methods that do not account for confounding. To illustrate the practical utility of AIJ, we applied the method to a mouse dataset and identified genes with the imprinting effect and/or non-imprinting AEI phenomenon, with some already confirmed in an existing database. The results are also largely consistent with a study on human data for a set of orthologous genes, affirming earlier conclusion in the literature that non-imprinting AEI events are evolutionarily conserved.
- Published
- 2020
28. Examining the rare disease assumption used to justify HWE testing with control samples
- Author
-
Virginia L Ma and Shili Lin
- Subjects
Genotype; Computer science; case-control study; Genome-wide association study; 02 engineering and technology; Disease; rare disease assumption; Polymorphism, Single Nucleotide; Rare Diseases; Gene Frequency; 0502 economics and business; Statistics; QA1-939; Prevalence; 0202 electrical engineering, electronic engineering, information engineering; Humans; Mass Screening; Computer Simulation; 1000 Genomes Project; Control (linguistics); Alleles; Models, Genetic; Genome, Human; Applied Mathematics; 05 social sciences; Genetic variants; Genetic Variation; Genetic data; Bayes Theorem; General Medicine; Computational Mathematics; hardy-weinberg equilibrium; Case-Control Studies; Modeling and Simulation; genome-wide association studies; 020201 artificial intelligence & image processing; 1000 genomes project; General Agricultural and Biological Sciences; Rare disease assumption; TP248.13-248.65; Mathematics; Algorithms; 050203 business & management; Biotechnology; Genome-Wide Association Study; Type I and type II errors
- Abstract
Many statistical methods for analyzing genetic data, such as those used in genome-wide association studies, assume Hardy-Weinberg Equilibrium (HWE). Therefore, to use such methods, one must check whether the HWE assumption is valid. For a case-control study, researchers have recognized that Hardy Weinberg proportions will be distorted if the marker being tested happens to be associated with the disease. To alleviate this problem, many studies carry out HWE testing on controls only. A number of papers in the literature have justified this practice by making the rare disease assumption without providing rigorous theoretical basis for this justification. Even though many of the diseases studied today are common, whether it is justifiable to use controls to test for HWE when the disease is indeed rare remains an outstanding issue. In this study, we address the rare disease assumption as well as potential problems associated with testing for HWE using controls only, regardless of the prevalence of the disease. We carried out theoretical derivations and numerical studies; the latter were performed using simulated genotypes as well as data from the 1000 Genomes Project. The results from our study are striking: the type Ⅰ error can be severely inflated, regardless of whether the disease being investigated is rare or common. This study shows that, based on the common practice of using controls only to test for HWE, many genetic variants will be discarded erroneously, wasting valuable information and hindering the ability to detect disease-associated variants.
- Published
- 2020
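The abstract of the preceding record (HWE testing on controls only) can be illustrated with a small calculation. The sketch below is not the authors' derivation; it only assumes a population in exact HWE and a hypothetical penetrance vector `(f0, f1, f2)`, then uses Bayes' theorem to obtain genotype frequencies among controls and a standard one-degree-of-freedom chi-square statistic. All function names and numeric values are illustrative assumptions.

```python
def hwe_freqs(p):
    """Genotype frequencies (0, 1, 2 risk alleles) under HWE for allele frequency p."""
    q = 1.0 - p
    return (q * q, 2.0 * p * q, p * p)


def control_freqs(p, penetrances):
    """Genotype frequencies among controls (unaffected), by Bayes' theorem.

    penetrances = (f0, f1, f2): assumed P(disease | 0, 1, 2 risk alleles).
    """
    unaffected = [g * (1.0 - f) for g, f in zip(hwe_freqs(p), penetrances)]
    total = sum(unaffected)
    return tuple(u / total for u in unaffected)


def hwe_chisq(counts):
    """Chi-square goodness-of-fit statistic for HWE from genotype counts."""
    n0, n1, n2 = counts
    n = n0 + n1 + n2
    p = (n1 + 2.0 * n2) / (2.0 * n)  # allele frequency estimated from the sample
    expected = [n * g for g in hwe_freqs(p)]
    return sum((o - e) ** 2 / e for o, e in zip(counts, expected))


n_controls = 5000  # hypothetical sample size; expected counts, not a random draw

# Marker unrelated to disease: controls keep the population proportions.
null_marker = control_freqs(0.3, (0.10, 0.10, 0.10))
stat_null = hwe_chisq([n_controls * g for g in null_marker])

# Recessive-type marker: controls depart from HWE although the population does not.
recessive = control_freqs(0.3, (0.05, 0.05, 0.60))
stat_recessive = hwe_chisq([n_controls * g for g in recessive])
```

With these assumed penetrances, `stat_null` is essentially zero while `stat_recessive` far exceeds 3.84, the 5% critical value of a chi-square distribution with one degree of freedom, so a controls-only test would flag a marker whose population genotypes are in HWE.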
29. BCurve: Bayesian Curve Credible Bands Approach for the Detection of Differentially Methylated Regions
- Author
-
Chenggong Han, Jincheol Park, and Shili Lin
- Published
- 2022
30. Abstract 11344: Association of Canonical Pathways with Length of Survival in Pulmonary Arterial Hypertension
- Author
-
Amy Webb, Hyoin An, Priscilla Correa-Jaque, Shili Lin, Zilu Liu, Hemant K Tiwari, Howard Wiener, and Raymond L Benza
- Subjects
Physiology (medical) ,Cardiology and Cardiovascular Medicine - Abstract
Introduction: Pulmonary arterial hypertension (PAH) is a chronic, progressive disease without cure. Treatment can improve outcome, but informed predictions with clinical and genomic measurements can guide treatment and therapy choices. Previous research has focused on identifying clinical variables that could predict future outcomes. The current research aims to find genomic variants that can predict survival time. Methods: Whole genome sequencing was performed on stored samples from 221 PAH patients. Samples were included if they showed Long survival (greater than 7 years) or Short survival (mortality in less than 5 years). Variants were filtered for quality, assigned to genes, and filtered for function and population frequency. Genes were grouped based on Canonical Pathways defined in Ingenuity Pathway Analysis. Results: Patients were 50% IPAH, 81% female, and 97% of European descent, with a mean age of 54 at sample acquisition. Mean follow-up post sample acquisition was 5 years. Mean long and short survival was 8.3 and 2.1 years, respectively. Of pathways containing more than one gene mutated in 3 or more samples, 29 pathways were associated with survival length. Biologically relevant pathways include Pentose Phosphate (p=0.005), IL-22 (p=0.006), Phospholipase C signaling (p=0.007), Endocannabinoid related pathways (p=0.01), and Thioredoxin pathway (p=0.015). A neural network model based on the top pathways was constructed (figure) that predicted Long/Short survival. Conclusions: We identified biologically relevant pathways associated with a Long/Short survival time in PAH patients. Our neural network model for predicting Long/Short survival using the 29 identified pathways has achieved excellent performance.
- Published
- 2021
31. Abstract 13420: Machine Learning for Risk Stratification in Pulmonary Arterial Hypertension - Can It Achieve the Gold Standard?
- Author
-
Jacqueline M Scott, Manreet Kanwar, Zilu Liu, Shili Lin, James Antaki, Adam Perer, and Raymond L Benza
- Subjects
Physiology (medical) ,Cardiology and Cardiovascular Medicine - Abstract
Background: Pulmonary arterial hypertension (PAH) is a fatal and difficult to treat disease due to patient inter-variability. Accurate risk stratification is necessary for guiding treatment, but no PAH risk calculator has achieved "excellent performance" (receiver-operator AUC > 0.8). Conversion of REVEAL 2.0 to a Bayesian network has shown promising results (Pulmonary Hypertension Outcomes Risk Assessment or PHORA). In this study, a new Bayesian network model (PHORA 2.0) was developed with a novel network structure and optimal feature selection. Methods: Patient-level data had been previously aggregated and harmonized across six PAH clinical trials (AMBITION, PATENT-1/2, GRIPHON, SERAPHIN, FREEDOM-EV, ARIES-1/2); all patients were assessed at baseline. Forty-one variables were initially considered based on p-value ranking from previous meta-analyses, availability across trials, and expert opinion. Training data was created by random sampling of 80% of the harmonized dataset, dropping early censored patients (N = 2531), leaving 20% of the data as a test set (N = 626). Continuous variables were discretized through univariate decision trees using 10-fold cross-validation maximizing Brier score. Genetic search selected combinations of features that maximized ranked correlation (Kendall’s tau) with one-year survival, with increasing penalty for redundant features. Feature combinations were evaluated in augmented naïve Bayesian networks; the best model was selected by 10-fold cross-validation on training data. Final performance is reported as performance on the test set. Results: The final model achieved the best cross-validation AUC using 16 variables. Performance on the test set was an AUC = 0.85. The final model outperformed multiple risk calculators at test time (Figure 1). Conclusion: Bayesian network modeling coupled with genetic feature selection has discovered for the first time a one-year mortality model for PAH with excellent performance.
- Published
- 2021
32. scHiCSRS: A Self-Representation Smoothing Method with Gaussian Mixture Model for Imputing single cell Hi-C Data
- Author
-
Qing Xie and Shili Lin
- Subjects
Computer science ,Data quality ,Sampling (statistics) ,Sensitivity (control systems) ,Imputation (statistics) ,Cluster analysis ,Data structure ,Mixture model ,Algorithm ,Smoothing - Abstract
Motivation: Single cell Hi-C techniques make it possible to study cell-to-cell variability in genomic features. However, excess zeros are commonly seen in single cell Hi-C (scHi-C) data, making scHi-C matrices extremely sparse and bringing extra difficulties in downstream analysis. The observed zeros are a combination of two events: structural zeros for which the loci never interact due to underlying biological mechanisms, and dropouts or sampling zeros where the two loci interact but are not captured due to insufficient sequencing depth. Although quality improvement approaches have been proposed as an intermediate step for analyzing scHi-C data, little has been done to address these two types of zeros. We believe that differentiating between structural zeros and dropouts would benefit downstream analysis such as clustering. Results: We propose scHiCSRS, a self-representation smoothing method that improves the data quality, and a Gaussian mixture model that identifies structural zeros among observed zeros. scHiCSRS not only takes spatial dependencies of a scHi-C 2D data structure into account but also borrows information from similar single cells. Through an extensive set of simulation studies, we demonstrate the ability of scHiCSRS to identify structural zeros with high sensitivity and to accurately impute dropout values in sampling zeros. Downstream analyses of three real datasets show that data improved by scHiCSRS yield more accurate clustering of cells than simply using observed data or improved data from several comparison methods. Availability and Implementation: The scHiCSRS R package, together with the processed real and simulated data used in this study, is available on GitHub at https://github.com/sl-lin/scHiCSRS.git. Contact: shili@stat.osu.edu. Supplementary information: Supplementary data are available online.
- Published
- 2021
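The distinction drawn in the preceding abstract between structural zeros and sampling zeros (dropouts) can be made concrete with a toy calculation. The sketch below does not implement scHiCSRS or its Gaussian mixture model; it swaps in a simple zero-inflated Poisson assumption purely for illustration, and the function name, prior, and expected counts are hypothetical.

```python
from math import exp


def prob_structural_zero(prior_structural, expected_count):
    """Posterior P(structural zero | observed count is 0).

    Assumes a zero-inflated Poisson: with probability prior_structural the
    locus pair never interacts; otherwise the count is Poisson with mean
    expected_count, which grows with sequencing depth.
    """
    p_zero_if_interacting = exp(-expected_count)  # Poisson P(count = 0)
    evidence = prior_structural + (1.0 - prior_structural) * p_zero_if_interacting
    return prior_structural / evidence


# Shallow depth: a zero is weak evidence, so the posterior stays near the prior.
shallow = prob_structural_zero(prior_structural=0.2, expected_count=0.1)

# Deep coverage: an interacting pair would rarely show zero, so the posterior rises.
deep = prob_structural_zero(prior_structural=0.2, expected_count=5.0)
```

Under these assumptions, an observed zero says little at low depth and a great deal at high depth, which is why borrowing information from neighboring positions and similar cells, as the abstract describes, is useful for sparse data.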
33. HiCImpute: A Bayesian Hierarchical Model for Identifying Structural Zeros and Enhancing Single Cell Hi-C Data
- Author
-
Victor X. Jin, Chenggong Han, Qing Xie, and Shili Lin
- Subjects
Spatial Analysis ,Ecology ,Computer science ,Bayesian probability ,Inference ,Sampling (statistics) ,Bayes Theorem ,Data structure ,Chromatin ,Chromosomes ,Cellular and Molecular Neuroscience ,Computational Theory and Mathematics ,Modeling and Simulation ,Data quality ,Genetics ,Cluster Analysis ,Bayesian hierarchical modeling ,Imputation (statistics) ,Cluster analysis ,Molecular Biology ,Algorithm ,Ecology, Evolution, Behavior and Systematics - Abstract
Single cell Hi-C techniques enable one to study cell-to-cell variability in chromatin interactions. However, single cell Hi-C (scHi-C) data suffer severely from sparsity, that is, the existence of excess zeros due to insufficient sequencing depth. Complicating things further is the fact that not all zeros are created equal, as some are due to loci truly not interacting because of the underlying biological mechanism (structural zeros), whereas others are indeed due to insufficient sequencing depth (sampling zeros), especially for loci that interact infrequently. Differentiating between structural zeros and sampling zeros is important since correct inference would improve downstream analyses such as clustering and discovery of subtypes. Nevertheless, distinguishing between these two types of zeros has received little attention in the single cell Hi-C literature, where the issue of sparsity has been addressed mainly as a data quality improvement problem. To fill this gap, in this paper, we propose HiCImpute, a Bayesian hierarchical model that goes beyond data quality improvement by also identifying observed zeros that are in fact structural zeros. HiCImpute takes spatial dependencies of the scHi-C 2D data structure into account while also borrowing information from similar single cells and bulk data, when such are available. Through an extensive set of analyses of synthetic and real data, we demonstrate the ability of HiCImpute to identify structural zeros with high sensitivity and to accurately impute dropout values in sampling zeros. Downstream analyses using data improved by HiCImpute yielded much more accurate clustering of cell types compared to using observed data or data improved by several comparison methods. Most significantly, HiCImpute-improved data has led to the identification of subtypes within each of the excitatory neuronal cells of L4 and L5 in the prefrontal cortex.
- Published
- 2021
34. DNA Methylation
- Author
-
Kasper D. Hansen, Kimberly D. Siegmund, and Shili Lin
- Published
- 2019
35. EKG PARAMETERS IN PREDICTING SURVIVALS IN PULMONARY ARTERIAL HYPERTENSION
- Author
-
Allen D. Everett, Zilu Liu, Jidapa Kraisangka, Jacqueline V. Scott, Manreet Kanwar, Faezeh Movahedi, Shili Lin, Raymond L. Benza, Adam Perer, and James F. Antaki
- Subjects
Pulmonary and Respiratory Medicine ,medicine.medical_specialty ,business.industry ,Internal medicine ,medicine ,Cardiology ,Cardiology and Cardiovascular Medicine ,Critical Care and Intensive Care Medicine ,business
- Published
- 2021
36. Are dropout imputation methods for scRNA-seq effective for scHi-C data?
- Author
-
Qing Xie, Chenggong Han, and Shili Lin
- Subjects
0303 health sciences ,Computer science ,Inference ,computer.software_genre ,03 medical and health sciences ,0302 clinical medicine ,RNA, Small Cytoplasmic ,Humans ,Computer Simulation ,RNA-Seq ,Data mining ,Imputation (statistics) ,Single-Cell Analysis ,Cluster analysis ,Molecular Biology ,computer ,Algorithms ,Software ,030217 neurology & neurosurgery ,Method Review ,030304 developmental biology ,Information Systems - Abstract
The prevalence of dropout events is a serious problem for single-cell Hi-C (scHiC) data due to insufficient sequencing depth and data coverage, which brings difficulties in downstream studies such as clustering and structural analysis. Complicating things further is the fact that dropouts are confounded with structural zeros due to underlying properties, leading to observed zeros being a mixture of both types of events. Although a great deal of progress has been made in imputing dropout events for single cell RNA-sequencing (RNA-seq) data, little has been done in identifying structural zeros and imputing dropouts for scHiC data. In this paper, we adapted several methods from the single-cell RNA-seq literature for inference on observed zeros in scHiC data and evaluated their effectiveness. Through an extensive simulation study and real data analysis, we have shown that a couple of the adapted single-cell RNA-seq algorithms can be powerful for correctly identifying structural zeros and accurately imputing dropout values. Downstream analysis using the imputed values showed considerable improvement for clustering cells of the same types together over clustering results before imputation.
- Published
- 2020
37. Author Correction: Temporal dynamic reorganization of 3D chromatin architecture in hormone-induced breast cancer and endocrine resistance
- Author
-
Yufan Zhou, Gary S. Stein, Junbai Wang, Tian Li, Shili Lin, Seth Frietze, Diana L. Gerrard, Mahitha Rajendran, Andrew J. Fritz, Rachel Schiff, Victor X. Jin, Yini Yang, and Xiaoyong Fu
- Subjects
Multidisciplinary ,business.industry ,Science ,Endocrine resistance ,General Physics and Astronomy ,General Chemistry ,medicine.disease ,Bioinformatics ,General Biochemistry, Genetics and Molecular Biology ,Chromatin ,Breast cancer ,Text mining ,Medicine ,lcsh:Q ,lcsh:Science ,business ,Hormone - Abstract
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
- Published
- 2020
38. Interrelationship between ADAMTS13 activity, von Willebrand factor, and complement activation in remission from immune‐mediated thrombotic thrombocytopenic purpura
- Author
-
Chenggong Han, Lauren Jay, Spero R. Cataland, Haiwa Wu, Shangbin Yang, Camila Masias, and Shili Lin
- Subjects
Male ,Purpura, Thrombocytopenic, Idiopathic ,Purpura, Thrombotic Thrombocytopenic ,biology ,business.industry ,Remission Induction ,Thrombotic thrombocytopenic purpura ,ADAMTS13 Protein ,Hematology ,medicine.disease ,Thrombocytopenic purpura ,Adamts13 activity ,Complement system ,Immune system ,Von Willebrand factor ,von Willebrand Factor ,Immunology ,medicine ,biology.protein ,Humans ,Female ,business ,Complement Activation ,Biomarkers
- Published
- 2020
39. The Association of ARMC5 with the Renin-Angiotensin-Aldosterone System, Blood Pressure and Glycemia in African Americans - Supplementary Tables.pdf
- Author
-
Joseph, Joshua, Xiaofei Zhou, Zilbermint, Mihail, Stratakis, Constantine, Faucz, Fabio, Lodish, Maya, Berthon, Annabel, Wilson, James, Hsueh, Willa, Golden, Sherita, and Shili Lin
- Abstract
Supplemental Tables for Joseph et al, "The Association of ARMC5 with the Renin-Angiotensin-Aldosterone System, Blood Pressure and Glycemia in African Americans" Submitted to The Journal of Clinical Endocrinology & Metabolism
- Published
- 2020
40. The Association of ARMC5 with the Renin-Angiotensin-Aldosterone System, Blood Pressure and Glycemia in African Americans - Supplementary Tables
- Author
-
Joseph, Joshua, Xiaofei Zhou, Zilbermint, Mihail, Stratakis, Constantine, Faucz, Fabio, Lodish, Maya, Berthon, Annabel, Wilson, James, Hsueh, Willa, Golden, Sherita, and Shili Lin
- Abstract
Supplemental Tables for Joseph et al, "The Association of ARMC5 with the Renin-Angiotensin-Aldosterone System, Blood Pressure and Glycemia in African Americans" Submitted to The Journal of Clinical Endocrinology & Metabolism
- Published
- 2020
41. sj-pdf-1-smm-10.1177_0962280220927728 - Supplemental material for Detecting rare haplotypes associated with complex diseases using both population and family data: Combined logistic Bayesian Lasso
- Author
-
Xiaofei Zhou, Wang, Meng, and Shili Lin
- Subjects
111099 Nursing not elsewhere classified ,111708 Health and Community Services ,fungi ,160807 Sociological Methodology and Research Methods ,FOS: Health sciences ,FOS: Sociology - Abstract
Supplemental material, sj-pdf-1-smm-10.1177_0962280220927728 for Detecting rare haplotypes associated with complex diseases using both population and family data: Combined logistic Bayesian Lasso by Xiaofei Zhou, Meng Wang and Shili Lin in Statistical Methods in Medical Research
- Published
- 2020
42. sj-pdf-1-smm-10.1177_0962280220927728 - Supplemental material for Detecting rare haplotypes associated with complex diseases using both population and family data: Combined logistic Bayesian Lasso
- Author
-
Xiaofei Zhou, Wang, Meng, and Shili Lin
- Subjects
111099 Nursing not elsewhere classified ,111708 Health and Community Services ,fungi ,160807 Sociological Methodology and Research Methods ,FOS: Health sciences ,FOS: Sociology - Abstract
Supplemental material, sj-pdf-1-smm-10.1177_0962280220927728 for Detecting rare haplotypes associated with complex diseases using both population and family data: Combined logistic Bayesian Lasso by Xiaofei Zhou, Meng Wang and Shili Lin in Statistical Methods in Medical Research
- Published
- 2020
43. Additional file 1 of Modeling and analysis of Hi-C data by HiSIF identifies characteristic promoter-distal loops
- Author
-
Yufan Zhou, Xiaolong Cheng, Yini Yang, Li, Tian, Jingwei Li, Huang, Tim H.-M., Junbai Wang, Shili Lin, and Jin, Victor X.
- Subjects
ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,ComputingMilieux_COMPUTERSANDEDUCATION ,Data_FILES ,ComputerApplications_COMPUTERSINOTHERSYSTEMS - Abstract
Additional file 1. Supplementary methods, figures and tables.
- Published
- 2020
44. Logistic Bayesian LASSO for detecting association combining family and case-control data
- Author
-
William C. L. Stewart, Han Zhang, Xiaofei Zhou, Meng Wang, and Shili Lin
- Subjects
0301 basic medicine ,Disequilibrium ,Haplotype ,lcsh:R ,lcsh:Medicine ,Single-nucleotide polymorphism ,General Medicine ,Quantitative trait locus ,01 natural sciences ,Data type ,General Biochemistry, Genetics and Molecular Biology ,Statistical power ,010104 statistics & probability ,03 medical and health sciences ,030104 developmental biology ,Polymorphism (computer science) ,Statistics ,medicine ,SNP ,lcsh:Q ,0101 mathematics ,medicine.symptom ,lcsh:Science ,Mathematics - Abstract
Because of the limited information from the GAW20 samples when only case-control or trio data are considered, we propose eLBL, an extension of the Logistic Bayesian LASSO (least absolute shrinkage and selection operator) methodology so that both types of data can be analyzed jointly in the hope of obtaining increased statistical power, especially for detecting association between rare haplotypes and complex diseases. The methodology is further extended to account for familial correlation among the case-control individuals and the trios. A 2-step analysis strategy was adopted. In Step 1, a genome-wide single single-nucleotide polymorphism (SNP) search was performed using the Monte Carlo pedigree disequilibrium test (MCPDT) to determine interesting regions for the Adult Treatment Panel (ATP) binary trait. In Step 2, eLBL was applied to haplotype blocks covering the SNPs flagged in Step 1. Several significantly associated haplotypes were identified; most are in blocks contained in protein coding genes that appear to be relevant for metabolic syndrome. The results are further substantiated with a Type I error study and by an additional analysis using the triglyceride measurements directly as a quantitative trait.
- Published
- 2018
45. Indirect effect inference and application to GAW20 data
- Author
-
Liming Li, Tianyuan Lu, Yue-Qing Hu, Chan Wang, and Shili Lin
- Subjects
0301 basic medicine ,lcsh:QH426-470 ,Inference ,Differentially methylated regions ,Single-nucleotide polymorphism ,Computational biology ,Biology ,Polymorphism, Single Nucleotide ,03 medical and health sciences ,Genetics ,Humans ,Hypoglycemic Agents ,SNP ,Epigenetics ,Genetics (clinical) ,Genetic association ,Hypertriglyceridemia ,DNA methylation ,Research ,Genomics ,Indirect effect ,lcsh:Genetics ,030104 developmental biology ,Genetic marker ,Trait ,Genome-Wide Association Study - Abstract
Background: Association studies using a single type of omics data have been successful in identifying disease-associated genetic markers, but the underlying mechanisms are unaddressed. To provide a possible explanation of how these genetic factors affect the disease phenotype, integration of multiple omics data is needed. Results: We propose a novel method, LIPID (likelihood inference proposal for indirect estimation), that uses both single nucleotide polymorphism (SNP) and DNA methylation data jointly to analyze the association between a trait and SNPs. The total effect of SNPs is decomposed into direct and indirect effects, where the indirect effects are the focus of our investigation. Simulation studies show that LIPID performs better in various scenarios than existing methods. Application to the GAW20 data also leads to encouraging results, as the genes identified appear to be biologically relevant to the phenotype studied. Conclusions: The proposed LIPID method is shown to be meritorious in extensive simulations and in real-data analyses.
- Published
- 2018
46. A Family-Based Rare Haplotype Association Method for Quantitative Traits
- Author
-
Swati Biswas, Ananda S. Datta, and Shili Lin
- Subjects
Male ,Mixed model ,0206 medical engineering ,02 engineering and technology ,Computational biology ,Quantitative trait locus ,Biology ,Population stratification ,Polymorphism, Single Nucleotide ,01 natural sciences ,Article ,010104 statistics & probability ,symbols.namesake ,Quantitative Trait, Heritable ,Missing heritability problem ,Genetics ,Humans ,0101 mathematics ,Genetics (clinical) ,Genetic association ,Models, Genetic ,Haplotype ,Bayes Theorem ,Markov chain Monte Carlo ,Heritability ,Pedigree ,Phenotype ,Haplotypes ,Cardiovascular Diseases ,symbols ,Female ,020602 bioinformatics ,Genome-Wide Association Study - Abstract
Background: The variants identified in genome-wide association studies account for only a small fraction of disease heritability. A key to this “missing heritability” is believed to be rare variants. Specifically, we focus on rare haplotype variants (rHTVs). The existing methods for detecting rHTVs are mostly population-based, and as such, are susceptible to population stratification and admixture, leading to an inflated false-positive rate. Family-based methods are more robust in this respect. Methods: We propose a method for detecting rHTVs associated with quantitative traits called family-based quantitative Bayesian LASSO (famQBL). FamQBL can analyze any type of pedigree and is based on a mixed model framework. We regularize the haplotype effects using Bayesian LASSO and estimate the posterior distributions using Markov chain Monte Carlo methods. Results: We conduct simulation studies, including analyses of Genetic Analysis Workshop 18 simulated data, to study the properties of famQBL and compare it with a standard family-based haplotype association test implemented in FBAT (family-based association test) software. We find famQBL to be more powerful than FBAT with well-controlled false-positive rates. We also apply famQBL to the Framingham Heart Study data and detect an rHTV associated with diastolic blood pressure. Conclusion: FamQBL can help uncover rHTVs associated with quantitative traits.
- Published
- 2018
47. LAB PARAMETERS IN PREDICTING SURVIVAL IN PULMONARY ARTERIAL HYPERTENSION
- Author
-
Allen D. Everett, Jacqueline V. Scott, Raymond L. Benza, Jidapa Kraisangka, Zilu Liu, Adam Perer, Manreet Kanwar, Shili Lin, James F. Antaki, and Faezeh Movahedi
- Subjects
Pulmonary and Respiratory Medicine ,medicine.medical_specialty ,business.industry ,Internal medicine ,Cardiology ,medicine ,Cardiology and Cardiovascular Medicine ,Critical Care and Intensive Care Medicine ,business
- Published
- 2021
48. Testing for Associations of Opposite Directionality in a Heterogeneous Population
- Author
-
Jie Ding, Shili Lin, and Fangyuan Zhang
- Subjects
0301 basic medicine ,Statistics and Probability ,Gene regulatory network ,Sample (statistics) ,Type (model theory) ,01 natural sciences ,Biochemistry, Genetics and Molecular Biology (miscellaneous) ,Correlation ,010104 statistics & probability ,03 medical and health sciences ,Identification (information) ,030104 developmental biology ,Cross entropy ,Ranking ,Statistics ,Directionality ,0101 mathematics ,Mathematics - Abstract
In gene networks, it is possible that the patterns of gene co-expression may exist only in a subset of the sample. In studies of relationships between genotypes and expressions of genes over multiple tissues, there may be associations in some tissues but not in the others. Despite the importance of the problem in genomic applications, it is challenging to identify relationships between two variables when the correlation may only exist in a subset of the sample. The situation becomes even less tractable when there exist two subsets in which correlations are in opposite directions. By ranking subset relationships according to Kendall’s tau, a tau-path can be derived to facilitate the identification of correlated subsets, if such subsets exist. However, the current tau-path methodology only considers the situation in which there is association in a subsample; the more complex scenario depicting the existence of two subsets with opposite directionality of associations was not addressed. Further, existing algorithms for finding tau-paths may be suboptimal given their greedy nature. In this paper, we extend the tau-path methodology to accommodate the situation in which the sample may be drawn from a heterogeneous population composed of subpopulations portraying positive and negative associations. We also propose the use of a cross entropy Monte Carlo procedure to obtain an optimal tau-path, CEMC$_{tp}$. The algorithm not only can provide simultaneous detection of positive and negative correlations in the same sample, but also can lead to the identification of subsamples that provide evidence for the detected associations. An extensive simulation study shows the aptness of CEMC$_{tp}$ for detecting associations under various scenarios. Compared with two standard tests for detecting associations, CEMC$_{tp}$ is seen to be more powerful when there are indeed complex subset associations with well-controlled type-I error rates. We applied CEMC$_{tp}$ to the NCI-60 gene expression data to illustrate its utility for uncovering network relationships that were missed with standard methods.
- Published
- 2017
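The motivation in the preceding abstract, that associations of opposite direction in two subpopulations can be invisible in the pooled sample, can be checked with a few lines of code. The sketch below computes Kendall's tau-a directly from its definition on made-up data; it is not the tau-path or CEMC$_{tp}$ procedure, and the data values are hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant pairs - discordant pairs) / all pairs."""
    pairs = list(combinations(range(len(x)), 2))
    score = 0
    for i, j in pairs:
        product = (x[i] - x[j]) * (y[i] - y[j])
        score += (product > 0) - (product < 0)  # +1 concordant, -1 discordant, 0 tie
    return score / len(pairs)


# Hypothetical subpopulation with a perfect positive association.
x_pos, y_pos = [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]
# Hypothetical subpopulation with a perfect negative association.
x_neg, y_neg = [1, 2, 3, 4, 5], [5, 4, 3, 2, 1]

tau_pos = kendall_tau(x_pos, y_pos)
tau_neg = kendall_tau(x_neg, y_neg)
tau_pooled = kendall_tau(x_pos + x_neg, y_pos + y_neg)
```

In this constructed example each subpopulation has a tau of magnitude 1, yet the pooled tau is exactly 0, so a single whole-sample test would find no association. That is the situation a subset-aware method is meant to address.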
49. Are rare variants really independent?
- Author
-
Asuman S. Turkmen and Shili Lin
- Subjects
0301 basic medicine ,Linkage disequilibrium ,Genotype ,Chromosomes, Human, Pair 21 ,Epidemiology ,Genetic Variation ,Genome-wide association study ,Computational biology ,030105 genetics & heredity ,Polychoric correlation ,Biology ,Polymorphism, Single Nucleotide ,Linkage Disequilibrium ,03 medical and health sciences ,030104 developmental biology ,Gene Frequency ,Humans ,Computer Simulation ,1000 Genomes Project ,Association mapping ,Categorical variable ,Genetics (clinical) ,Genome-Wide Association Study ,Genetic association ,Statistical hypothesis testing - Abstract
Recent advances in genotyping with high-density markers allow researchers access to genomic variants including rare ones. Linkage disequilibrium (LD) is widely used to provide insight into evolutionary history. It is also the basis for association mapping in humans and other species. Better understanding of the genomic LD structure may lead to better-informed statistical tests that can improve the power of association studies. Although rare variant associations with common diseases (RVCD) have been extensively studied recently, there is very limited understanding of, and even controversy about, LD structures among rare variants and between rare and common variants. In fact, many popular RVCD tests make the assumption that rare variants are independent. In this report, we show that two commonly used LD measures are not capable of detecting LD when rare variants are involved. We present this argument from two perspectives, both the LD measures themselves and the computational issues associated with them. To address these issues, we propose an alternative LD measure, the polychoric correlation, that was originally designed for detecting associations among categorical variables. Using simulated as well as the 1000 Genomes data, we explore the performances of LD measures in detail and discuss their implications in association studies.
- Published
- 2017
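One reason LD measures can behave poorly with rare variants, as discussed in the preceding abstract, is that their attainable range depends on allele frequencies. The sketch below computes the textbook quantities D, D′, and r² from haplotype frequencies. The abstract does not name the two measures it examines, so treating r² and D′ as relevant here is an assumption, and the frequencies are hypothetical.

```python
def ld_measures(p_a, p_b, p_ab):
    """Return (D, D', r^2) for two biallelic loci.

    p_a, p_b: frequencies of alleles A and B; p_ab: frequency of haplotype AB.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical rare allele A (1%) that always sits on haplotypes carrying
# the common allele B (40%): the strongest dependence these margins allow.
d, d_prime, r2 = ld_measures(p_a=0.01, p_b=0.40, p_ab=0.01)
```

Here D′ reaches 1 while r² stays below 0.02, even though the rare allele never occurs without B. A threshold on r² alone would therefore treat the pair as nearly independent, which is consistent with the abstract's concern about assuming independence among rare variants.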
50. Relapse Prediction Model for Immune-Mediated Thrombotic Thrombocytopenic Purpura
- Author
-
Alcinda Flowers, Spero R. Cataland, Camila Masias, Haiwa Wu, Shangbin Yang, Senthil Sukumar, Krista Carter, Chenggong Han, and Shili Lin
- Subjects
Immune system ,business.industry ,Immunology ,Thrombotic thrombocytopenic purpura ,medicine ,Cell Biology ,Hematology ,medicine.disease ,business ,Biochemistry - Abstract
Background: Immune-mediated thrombotic thrombocytopenic purpura (iTTP) is defined by thrombocytopenia and microangiopathic hemolytic anemia caused by severely deficient ADAMTS13 activity ( Patients/Methods: This analysis utilized samples from patients enrolled in the Ohio State University TMA registry beginning in 2003 until 2014. Patients were followed every 3 months to monitor for relapse and for the development of long-term complications. Clinical data (CBC, chemistry, LDH, ADAMTS13 biomarkers) were obtained on each visit to confirm remission in addition to research samples. Clinical and demographic data, in addition to biomarkers of complement activation that were measured on banked samples, were used to develop a model to quantify the risk for relapse in the following 3 months. A Lasso logistic regression model (Tibshirani et al 1996) with relapse in the following 3 months as the response variable, fitted using the R package glmnet (https://cran.r-project.org/web/packages/glmnet/index.html), was used to identify those variables most predictive of relapse from the 17 studied (Table 1). The data from all subjects were split into training data (75%) and testing data (25%) to develop and subsequently test the performance of the model. The final predictors were then standardized to develop the final model and the program to predict the risk of relapse (%) in the following 3 months. Results: Data from a total of 131 patient encounters from 42 patients in a clinical remission were included in the study to develop the statistical model (Table 2). 31 (75.6%) were White and 10 (24.4%) were African American. The average age was 42 (18-69), and 31 (75.6%) were women. The median number of encounters from each patient was 2 (range, 1 to 10). From these 131 encounters, 39 relapses occurred in 20 patients over the next 3 months following their clinic visit. The 39 relapse encounters were compared to the 92 encounters where no relapse occurred to develop the model. The performance of both the 11-factor and 6-factor models is shown in Table 3. Given that its AUC, sensitivity, and specificity for predicting relapse were comparable to those of the 11-factor model, the 6-factor model was judged to be more practical because it uses fewer variables. Both the 11-factor and 6-factor models performed better than the ADAMTS13 activity alone. In this model, the presence of ULVWF multimers, increased levels of LDH, complement activation as measured by C3a (log transformed), a lower ADAMTS13 activity (log transformed), ADAMTS13 antigen (log transformed), and younger age increased the risk for relapse. These variables would be entered into the model with the result being read out as a percent risk for relapse in the following 3 months as described in the hypothetical examples in Table 4. Conclusion: Utilizing a model that includes a combination of biomarkers in asymptomatic patients in remission from iTTP provides more accurate identification of patients at increased risk of relapse in the next 3 months, when compared to using the ADAMTS13 activity alone. This model would allow physicians to initiate preemptive therapy with rituximab in patients at the greatest risk for relapse, and potentially avoid therapy in patients whose risk may be lower than what would be predicted by the ADAMTS13 activity alone. Disclosures: Cataland: Ablynx/Sanofi: Consultancy, Research Funding; Alexion: Consultancy, Research Funding.
- Published
- 2020