26 results on '"Sun, Zhifu"'
Search Results
2. Differences in olfactory habituation between orthonasal and retronasal pathways
- Author
- Xiao, Wei, Sun, Zhifu, Yan, Xiaoguang, Gao, Xing, Lv, Qianwen, and Wei, Yongxiang
- Published
- 2021
- Full Text
- View/download PDF
3. Association between ALS and retroviruses: evidence from bioinformatics analysis
- Author
- Klein, Jon P., Sun, Zhifu, and Staff, Nathan P.
- Published
- 2019
- Full Text
- View/download PDF
4. Predict drug sensitivity of cancer cells with pathway activity inference
- Author
- Wang, Xuewei, Sun, Zhifu, Zimmermann, Michael T., Bugrim, Andrej, and Kocher, Jean-Pierre
- Published
- 2019
- Full Text
- View/download PDF
5. Indel sensitive and comprehensive variant/mutation detection from RNA sequencing data for precision medicine
- Author
- Prodduturi, Naresh, Bhagwate, Aditya, Kocher, Jean-Pierre A., and Sun, Zhifu
- Published
- 2018
- Full Text
- View/download PDF
6. Estrogen receptor-beta sensitizes breast cancer cells to the anti-estrogenic actions of endoxifen
- Author
- Wu, Xianglin, Subramaniam, Malayannan, Grygo, Sarah B, Sun, Zhifu, Negron, Vivian, Lingle, Wilma L, Goetz, Matthew P, Ingle, James N, Spelsberg, Thomas C, and Hawse, John R
- Published
- 2011
- Full Text
- View/download PDF
7. Can gene expression profiling predict survival for patients with squamous cell carcinoma of the lung?
- Author
- Endo Chiaki, Kosari Farhad, Aubry Marie-Christine, Yang Ping, Sun Zhifu, Molina Julian, and Vasmatzis George
- Subjects
- Male, Lung Neoplasms, Research, Gene Expression Profiling, lcsh:Neoplasms. Tumors. Oncology. Including cancer and carcinogens, Prognosis, lcsh:RC254-282, Survival Analysis, Case-Control Studies, Carcinoma, Squamous Cell, Cluster Analysis, Humans, Female, Oligonucleotide Array Sequence Analysis
- Abstract
Background: Lung cancer remains the leading cause of cancer death worldwide. Patients with similar lung cancer may experience quite different clinical outcomes. Reliable molecular prognostic markers are needed to characterize the disparity. In order to identify the genes responsible for the aggressiveness of squamous cell carcinoma of the lung, we applied DNA microarray technology to a case control study. Fifteen patients with surgically treated stage I squamous cell lung cancer were selected. Ten were one-to-one matched on tumour size and grade, age, gender, and smoking status; five died of lung cancer recurrence within 24 months (high-aggressive group), and five survived more than 54 months after surgery (low-aggressive group). Five additional tissues were included as test samples. Unsupervised and supervised approaches were used to explore the relationship among samples and identify differentially expressed genes. We also evaluated the gene markers' accuracy in segregating samples to their respective group. Functional gene networks for the significant genes were retrieved, and their association with survival was tested. Results: Unsupervised clustering did not group tumours based on survival experience. At p < 0.05, 294 and 246 differentially expressed genes for matched and unmatched analysis respectively were identified between the low and high aggressive groups. Linear discriminant analysis was performed on all samples using the 27 top unique genes, and the results showed an overall accuracy rate of 80%. Tests on the association of 24 gene networks with study outcome showed that 7 were highly correlated with the survival time of the lung cancer patients. Conclusion: The overall gene expression pattern between the high and low aggressive squamous cell carcinomas of the lung did not differ significantly with the control of confounding factors. A small subset of genes or genes in specific pathways may be responsible for the aggressive nature of a tumour and could potentially serve as panels of prognostic markers for stage I squamous cell lung cancer.
- Published
- 2004
8. Conserved recurrent gene mutations correlate with pathway deregulation and clinical outcomes of lung adenocarcinoma in never-smokers.
- Author
- Sun, Zhifu, Wang, Liang, Eckloff, Bruce W., Bo Deng, Yi Wang, Wampfler, Jason A., Jang, JinSung, Wieben, Eric D., Jen, Jin, You, Ming, and Yang, Ping
- Subjects
- *LUNG cancer, *MESSENGER RNA, *ADENOCARCINOMA, *LIFE sciences, *GENOTYPE-environment interaction, *DNA, *GENETICS, *PHYSIOLOGY
- Abstract
Background: Novel and targetable mutations are needed for improved understanding and treatment of lung cancer in never-smokers. Methods: Twenty-seven lung adenocarcinomas from never-smokers were sequenced by both exome and mRNA-seq with respective normal tissues. Somatic mutations were detected and compared with pathway deregulation, tumor phenotypes and clinical outcomes. Results: Although somatic mutations in DNA or mRNA ranged from hundreds to thousands in each tumor, the overlap mutations between the two were only a few to a couple of hundreds. The number of somatic mutations from either DNA or mRNA was not significantly associated with clinical variables; however, the number of overlap mutations was associated with cancer subtype. These overlap mutants were preferentially expressed in mRNA with consistently higher allele frequency in mRNA than in DNA. Ten genes (EGFR, TP53, KRAS, RPS6KB2, ATXN2, DHX9, PTPN13, SP1, SPTAN1 and MYOF) had recurrent mutations and these mutations were highly correlated with pathway deregulation and patient survival. Conclusions: The recurrent mutations present in both DNA and RNA are likely the driver for tumor biology, pathway deregulation and clinical outcomes. The information may be used for patient stratification and therapeutic target development. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
9. NUCLIZE for quantifying epigenome: generating histone modification data at single-nucleosome resolution using genuine nucleosome positions.
- Author
- Zheng, Daoshan, Trynda, Justyna, Sun, Zhifu, and Li, Zhaoyu
- Subjects
- EPIGENOMICS, HISTONES, PHYSIOLOGICAL control systems, MODIFICATIONS, GENETIC code, CHROMATIN, DATA mapping
- Abstract
Background: Defining histone modification at single-nucleosome resolution provides accurate epigenomic information in individual nucleosomes. However, most of the histone modification data deposited in current databases, such as ENCODE and Roadmap, have low resolution with peaks of several kilo-base pairs (kb), which is due to the technical defects of regular ChIP-Seq technology. Results: To generate histone modification data at single-nucleosome resolution, we developed a novel approach, NUCLIZE, using synergistic analyses of histone modification data from ChIP-Seq and high-resolution nucleosome mapping data from native MNase-Seq. With this approach, we generated quantitative epigenomics data of single and multivalent histone modification marks in each nucleosome. We found that the dominant trivalent histone mark (H3K4me3/H3K9ac/H3K27ac) and others showed defined and specific patterns near each TSS, indicating potential epigenetic codes regulating gene transcription. Conclusions: Single-nucleosome histone modification data render epigenomic data quantitative, which is essential for investigating dynamic changes of epigenetic regulation in the biological process or for functional epigenomics studies. Thus, NUCLIZE turns current epigenomic mapping studies into genuine functional epigenomics studies with quantitative epigenomic data. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
10. Pharmacologic reversion of epigenetic silencing of the PRKD1 promoter blocks breast tumor cell invasion and metastasis.
- Author
- Borges, Sahra, Döppler, Heike, Perez, Edith A, Andorfer, Cathy A, Sun, Zhifu, Anastasiadis, Panos Z, Thompson, E, Geiger, Xochiquetzal J, and Storz, Peter
- Abstract
Introduction: DNA methylation-induced silencing of genes encoding tumor suppressors is common in many types of cancer, but little is known about how such epigenetic silencing can contribute to tumor metastasis. The PRKD1 gene encodes protein kinase D1 (PKD1), a serine/threonine kinase that is expressed in cells of the normal mammary gland, where it maintains the epithelial phenotype by preventing epithelial-to-mesenchymal transition. Methods: The status of PRKD1 promoter methylation was analyzed by reduced representation bisulfite deep sequencing, methylation-specific PCR (MSP-PCR) and in situ MSP-PCR in invasive and noninvasive breast cancer lines, as well as in humans in 34 cases of "normal" tissue, 22 cases of ductal carcinoma in situ, 22 cases of estrogen receptor positive, HER2-negative (ER+/HER2-) invasive lobular carcinoma, 43 cases of ER+/HER2- invasive ductal carcinoma (IDC), 93 cases of HER2+ IDC and 96 cases of triple-negative IDC. A reexpression strategy using the DNA methyltransferase inhibitor decitabine was used in vitro in MDA-MB-231 cells as well as in vivo in a tumor xenograft model and measured by RT-PCR, immunoblotting and immunohistochemistry. The effect of PKD1 reexpression on cell invasion was analyzed in vitro by transwell invasion assay. Tumor growth and metastasis were monitored in vivo using the IVIS Spectrum Pre-clinical In Vivo Imaging System. Results: Herein we show that the gene promoter of PRKD1 is aberrantly methylated and silenced in its expression in invasive breast cancer cells and during breast tumor progression, increasing with the aggressiveness of tumors. Using an animal model, we show that reversion of PRKD1 promoter methylation with the DNA methyltransferase inhibitor decitabine restores PKD1 expression and blocks tumor spread and metastasis to the lung in a PKD1-dependent fashion. Conclusions: Our data suggest that the status of epigenetic regulation of the PRKD1 promoter can provide valid information on the invasiveness of breast tumors and therefore could serve as an early diagnostic marker. Moreover, targeted upregulation of PKD1 expression may be used as a therapeutic approach to reverse the invasive phenotype of breast cancer cells. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
11. Genetic association with overall survival of taxane-treated lung cancer patients - a genome-wide association study in human lymphoblastoid cell lines followed by a clinical association study.
- Author
- Niu, Nifang, Schaid, Daniel J, Abo, Ryan P, Kalari, Krishna, Fridley, Brooke L, Feng, Qiping, Jenkins, Gregory, Batzler, Anthony, Brisbin, Abra G, Cunningham, Julie M, Li, Liang, Sun, Zhifu, Yang, Ping, and Wang, Liewei
- Abstract
Background: Taxane is one of the first line treatments of lung cancer. In order to identify novel single nucleotide polymorphisms (SNPs) that might contribute to taxane response, we performed a genome-wide association study (GWAS) for two taxanes, paclitaxel and docetaxel, using 276 lymphoblastoid cell lines (LCLs), followed by genotyping of top candidate SNPs in 874 lung cancer patient samples treated with paclitaxel. Methods: GWAS was performed using 1.3 million SNPs and taxane cytotoxicity IC50 values for 276 LCLs. The association of selected SNPs with overall survival in 76 small or 798 non-small cell lung cancer (SCLC, NSCLC) patients was analyzed by Cox regression model, followed by integrated SNP-microRNA-expression association analysis in LCLs and siRNA screening of candidate genes in SCLC (H196) and NSCLC (A549) cell lines. Results: 147 and 180 SNPs were associated with paclitaxel or docetaxel IC50s with p-values <10^-4 in the LCLs, respectively. Genotyping of 153 candidate SNPs in 874 lung cancer patient samples identified 8 SNPs (p-value < 0.05) associated with either SCLC or NSCLC patient overall survival. Knockdown of PIP4K2A, CCT5, CMBL, EXO1, KMO and OPN3, genes within 200 kb up-/downstream of the 3 SNPs that were associated with SCLC overall survival (rs1778335, rs2662411 and rs7519667), significantly desensitized H196 to paclitaxel. SNPs rs2662411 and rs1778335 were associated with mRNA expression of CMBL or PIP4K2A through microRNA (miRNA) hsa-miR-584 or hsa-miR-1468. Conclusions: GWAS in an LCL model system, joined with clinical translational and functional studies, might help us identify genetic variations associated with overall survival of lung cancer patients treated with paclitaxel. [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF
12. Erratum to: Conserved recurrent gene mutations correlate with pathway deregulation and clinical outcomes of lung adenocarcinoma in never-smokers.
- Author
- Sun Z, Wang L, Eckloff BW, Deng B, Wang Y, Wampfler JA, Jang J, Wieben ED, Jen J, You M, and Yang P
- Published
- 2017
- Full Text
- View/download PDF
13. Targeted alignment and end repair elimination increase alignment and methylation measure accuracy for reduced representation bisulfite sequencing data.
- Author
- Baheti S, Kanwar R, Goelzenleuchter M, Kocher JP, Beutler AS, and Sun Z
- Subjects
- Algorithms, CpG Islands, Humans, Polymorphism, Single Nucleotide, Sequence Alignment, Cytosine, DNA Methylation, Epigenesis, Genetic, Genomics methods, Sequence Analysis, DNA methods
- Abstract
Background: DNA methylation is an important epigenetic modification involved in many biological processes. Reduced representation bisulfite sequencing (RRBS) is a cost-effective method for studying DNA methylation at single base resolution. Although several tools are available for RRBS data processing and analysis, it is not clear which strategy performs the best and there has not been much attention to the contamination issue from artificial cytosines incorporated during the end repair step of library preparation. To address these issues, we describe a new method, Targeted Alignment and Artificial Cytosine Elimination for RRBS (TRACE-RRBS), which aligns bisulfite sequence reads to MSP1 digitally digested reference and specifically removes the end repair cytosines. We compared this approach on a simulated and a real dataset with 7 other RRBS analysis tools and Illumina 450 K microarray platform. Results: TRACE-RRBS aligns sequence reads to the small fraction of the genome that the RRBS protocol targets and was demonstrated as the fastest, most sensitive and specific tool for the simulated dataset. For the real dataset, TRACE-RRBS took about the same time as RRBSMAP, a third to a sixth of time needed for BISMARK and NOVOALIGN. TRACE-RRBS aligned more reads uniquely than other tools and achieved the highest correlation with 450 k microarray data. The end repair artificial cytosine removal increased correlation between nearby CpGs and accuracy of methylation quantification. Conclusions: TRACE-RRBS is a fast and more accurate tool for RRBS data analysis. It is freely available for academic use at http://bioinformaticstools.mayo.edu/.
- Published
- 2016
- Full Text
- View/download PDF
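The abstract of result 13 mentions aligning reads to a digitally digested reference. As a rough illustration of what an in-silico digest can look like, here is a minimal Python sketch. It is not the TRACE-RRBS code: the function name `mspi_fragments` and the 40-220 bp size window are assumptions made for this example. It relies only on the fact that the MspI enzyme (written MSP1 in the abstract) recognizes CCGG and cuts between the two cytosines.

```python
def mspi_fragments(sequence, min_len=40, max_len=220):
    """Return (start, end, fragment) tuples for MspI-to-MspI fragments.

    Hypothetical helper for illustration only. The size window is an
    assumed size-selection range, not a value taken from the abstract.
    """
    seq = sequence.upper()
    cuts = []
    pos = seq.find("CCGG")
    while pos != -1:
        cuts.append(pos + 1)              # cut falls after the first C (C^CGG)
        pos = seq.find("CCGG", pos + 1)   # also catches overlapping sites
    fragments = []
    # Keep only fragments bounded by a cut site at both ends.
    for start, end in zip(cuts, cuts[1:]):
        if min_len <= end - start <= max_len:
            fragments.append((start, end, seq[start:end]))
    return fragments


if __name__ == "__main__":
    toy_reference = "AAACCGG" + "T" * 10 + "CCGG" + "AAA"
    for start, end, frag in mspi_fragments(toy_reference, min_len=10, max_len=50):
        print(start, end, frag)
```

Each retained fragment starts with CGG and ends with C, which is the signature of a reference cut at both ends; a reduced reference of this kind is much smaller than the whole genome.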
14. HiChIP: a high-throughput pipeline for integrative analysis of ChIP-Seq data.
- Author
- Yan H, Evans J, Kalmbach M, Moore R, Middha S, Luban S, Wang L, Bhagwate A, Li Y, Sun Z, Chen X, and Kocher JP
- Subjects
- Animals, Binding Sites, Chromosome Mapping, Data Interpretation, Statistical, Humans, Mice, Molecular Sequence Annotation, Transcription Factors metabolism, Chromatin Immunoprecipitation methods, Genomics methods, High-Throughput Nucleotide Sequencing methods, Software
- Abstract
Background: Chromatin immunoprecipitation (ChIP) followed by next-generation sequencing (ChIP-Seq) has been widely used to identify genomic loci of transcription factor (TF) binding and histone modifications. ChIP-Seq data analysis involves multiple steps from read mapping and peak calling to data integration and interpretation. It remains challenging and time-consuming to process large amounts of ChIP-Seq data derived from different antibodies or experimental designs using the same approach. To address this challenge, there is a need for a comprehensive analysis pipeline with flexible settings to accelerate the utilization of this powerful technology in epigenetics research. Results: We have developed a highly integrative pipeline, termed HiChIP for systematic analysis of ChIP-Seq data. HiChIP incorporates several open source software packages selected based on internal assessments and published comparisons. It also includes a set of tools developed in-house. This workflow enables the analysis of both paired-end and single-end ChIP-Seq reads, with or without replicates for the characterization and annotation of both punctate and diffuse binding sites. The main functionality of HiChIP includes: (a) read quality checking; (b) read mapping and filtering; (c) peak calling and peak consistency analysis; and (d) result visualization. In addition, this pipeline contains modules for generating binding profiles over selected genomic features, de novo motif finding from transcription factor (TF) binding sites and functional annotation of peak associated genes. Conclusions: HiChIP is a comprehensive analysis pipeline that can be configured to analyze ChIP-Seq data derived from varying antibodies and experiment designs. Using public ChIP-Seq data we demonstrate that HiChIP is a fast and reliable pipeline for processing large amounts of ChIP-Seq data.
- Published
- 2014
- Full Text
- View/download PDF
15. Clinical biomarkers of pulmonary carcinoid tumors in never smokers via profiling miRNA and target mRNA.
- Author
- Deng B, Molina J, Aubry MC, Sun Z, Wang L, Eckloff BW, Vasmatzis G, You M, Wieben ED, Jen J, Wigle DA, and Yang P
- Abstract
Background: miRNAs play key regulatory roles in cellular pathological processes. We aimed to identify clinically meaningful biomarkers in pulmonary carcinoid tumors (PCTs), a member of neuroendocrine neoplasms, via profiling miRNAs and mRNAs. Results: From the total of 1145 miRNAs, we obtained 16 and 17 miRNAs that showed positive and negative fold changes (FCs, tumors vs. normal tissues) in the top 1% differentially expressed miRNAs, respectively. We uncovered the target genes that were predicted by at least two prediction tools and overlapped by at least one-half of the top miRNAs, which yielded 44 genes (FC<-2) and 56 genes (FC>2), respectively. Higher expressions of CREB5, PTPRB and COL4A3 predicted favorable disease free survival (Hazard ratio: 0.03, 0.19 and 0.36; P value: 0.03, 0.03 and 0.08). Additionally, 79 mutated genes have been found in nine PCTs where TP53 was the only repeated mutation. Conclusion: We identified that the expressions of three genes have clinical implications in PCTs. The biological functions of these biomarkers warrant further studies.
- Published
- 2014
- Full Text
- View/download PDF
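The target-gene filter described in the abstract of result 15 (genes predicted by at least two tools and shared by at least one-half of the top miRNAs) can be read as a two-stage set filter. The sketch below is one possible reading, not the authors' procedure: the function name `consensus_targets`, the input layout, and the tool and gene labels in the usage example are all hypothetical.

```python
from collections import defaultdict


def consensus_targets(predictions, min_tools=2, min_mirna_fraction=0.5):
    """Select genes supported by enough tools for enough miRNAs.

    predictions: {mirna: {tool_name: set_of_predicted_genes}}
    Assumed interpretation: a gene counts as a target of a miRNA when at
    least `min_tools` tools predict it, and it is kept when that holds
    for at least `min_mirna_fraction` of the miRNAs supplied.
    """
    mirnas_per_gene = defaultdict(set)
    for mirna, by_tool in predictions.items():
        tool_votes = defaultdict(int)
        for genes in by_tool.values():
            for gene in genes:
                tool_votes[gene] += 1
        for gene, votes in tool_votes.items():
            if votes >= min_tools:
                mirnas_per_gene[gene].add(mirna)
    needed = min_mirna_fraction * len(predictions)
    return {g for g, mirnas in mirnas_per_gene.items() if len(mirnas) >= needed}


if __name__ == "__main__":
    # Made-up labels purely for demonstration.
    toy = {
        "miR-a": {"toolA": {"G1", "G2"}, "toolB": {"G1"}, "toolC": {"G2", "G3"}},
        "miR-b": {"toolA": {"G1"}, "toolB": {"G1", "G3"}, "toolC": {"G3"}},
    }
    print(sorted(consensus_targets(toy, min_mirna_fraction=1.0)))
```

Raising `min_mirna_fraction` tightens the list toward genes that are shared targets of most of the selected miRNAs.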
16. CAP-miRSeq: a comprehensive analysis pipeline for microRNA sequencing data.
- Author
- Sun Z, Evans J, Bhagwate A, Middha S, Bockol M, Yan H, and Kocher JP
- Subjects
- Carcinoma, Renal Cell genetics, High-Throughput Nucleotide Sequencing, Humans, Internet, MCF-7 Cells, Reproducibility of Results, Software, User-Computer Interface, Computational Biology methods, MicroRNAs genetics, Sequence Analysis, RNA methods
- Abstract
Background: miRNAs play a key role in normal physiology and various diseases. miRNA profiling through next generation sequencing (miRNA-seq) has become the main platform for biological research and biomarker discovery. However, analyzing miRNA sequencing data is challenging as it needs a significant amount of computational resources and bioinformatics expertise. Several web based analytical tools have been developed but they are limited to processing one or a pair of samples at a time and are not suitable for a large scale study. Lack of flexibility and reliability of these web applications are also common issues. Results: We developed a Comprehensive Analysis Pipeline for microRNA Sequencing data (CAP-miRSeq) that integrates read pre-processing, alignment, mature/precursor/novel miRNA detection and quantification, data visualization, variant detection in miRNA coding region, and more flexible differential expression analysis between experimental conditions. According to computational infrastructure, users can install the package locally or deploy it in Amazon Cloud to run samples sequentially or in parallel for a large number of samples for speedy analyses. In either case, summary and expression reports for all samples are generated for easier quality assessment and downstream analyses. Using well characterized data, we demonstrated the pipeline's superior performances, flexibility, and practical use in research and biomarker discovery. Conclusions: CAP-miRSeq is a powerful and flexible tool for users to process and analyze miRNA-seq data scalable from a few to hundreds of samples. The results are presented in a convenient way for investigators or analysts to conduct further investigation and discovery.
- Published
- 2014
- Full Text
- View/download PDF
17. Characterization of human plasma-derived exosomal RNAs by deep sequencing.
- Author
- Huang X, Yuan T, Tschannen M, Sun Z, Jacob H, Du M, Liang M, Dittmar RL, Liu Y, Liang M, Kohli M, Thibodeau SN, Boardman L, and Wang L
- Subjects
- Base Sequence, Blood Donors, Chromosome Mapping, Extracellular Space genetics, Humans, MicroRNAs chemistry, MicroRNAs genetics, RNA Stability, Transcriptome, Exosomes genetics, High-Throughput Nucleotide Sequencing, Plasma cytology, Sequence Analysis, RNA
- Abstract
Background: Exosomes, endosome-derived membrane microvesicles, contain specific RNA transcripts that are thought to be involved in cell-cell communication. These RNA transcripts have great potential as disease biomarkers. To characterize exosomal RNA profiles systemically, we performed RNA sequencing analysis using three human plasma samples and evaluated the efficacies of small RNA library preparation protocols from three manufacturers. In all we evaluated 14 libraries (7 replicates). Results: From the 14 size-selected sequencing libraries, we obtained a total of 101.8 million raw single-end reads, an average of about 7.27 million reads per library. Sequence analysis showed that there was a diverse collection of the exosomal RNA species among which microRNAs (miRNAs) were the most abundant, making up over 42.32% of all raw reads and 76.20% of all mappable reads. At the current read depth, 593 miRNAs were detectable. The five most common miRNAs (miR-99a-5p, miR-128, miR-124-3p, miR-22-3p, and miR-99b-5p) collectively accounted for 48.99% of all mappable miRNA sequences. MiRNA target gene enrichment analysis suggested that the highly abundant miRNAs may play an important role in biological functions such as protein phosphorylation, RNA splicing, chromosomal abnormality, and angiogenesis. From the unknown RNA sequences, we predicted 185 potential miRNA candidates. Furthermore, we detected significant fractions of other RNA species including ribosomal RNA (9.16% of all mappable counts), long non-coding RNA (3.36%), piwi-interacting RNA (1.31%), transfer RNA (1.24%), small nuclear RNA (0.18%), and small nucleolar RNA (0.01%); fragments of coding sequence (1.36%), 5' untranslated region (0.21%), and 3' untranslated region (0.54%) were also present. In addition to the RNA composition of the libraries, we found that the three tested commercial kits generated a sufficient number of DNA fragments for sequencing but each had significant bias toward capturing specific RNAs. Conclusions: This study demonstrated that a wide variety of RNA species are embedded in the circulating vesicles. To our knowledge, this is the first report that applied deep sequencing to discover and characterize profiles of plasma-derived exosomal RNAs. Further characterization of these extracellular RNAs in diverse human populations will provide reference profiles and open new doors for the development of blood-based biomarkers for human diseases.
- Published
- 2013
- Full Text
- View/download PDF
18. Sequence analysis of Epstein-Barr virus EBNA-2 gene coding amino acid 148-487 in nasopharyngeal and gastric carcinomas.
- Author
- Wang X, Wang Y, Wu G, Chao Y, Sun Z, and Luo B
- Subjects
- Amino Acid Sequence, Amino Acid Substitution, Humans, Molecular Sequence Data, Mutation, Nasopharyngeal Carcinoma, Polymorphism, Genetic, Protein Structure, Tertiary, Sequence Analysis, DNA, Carcinoma virology, Epstein-Barr Virus Nuclear Antigens chemistry, Epstein-Barr Virus Nuclear Antigens genetics, Nasopharyngeal Neoplasms virology, Stomach Neoplasms virology, Viral Proteins chemistry, Viral Proteins genetics
- Abstract
Background: The Epstein-Barr virus (EBV) nuclear antigen 2 (EBNA-2) plays a key role in the B-cell growth transformation by initiating and maintaining the proliferation of infected B-cell upon EBV infection in vitro. Most studies about EBNA-2 have focused on its functions yet little is known about its intertypic polymorphisms. Results: Coding region for amino acid (aa) 148-487 of the EBNA-2 gene was sequenced in 25 EBV-associated gastric carcinomas (EBVaGCs), 56 nasopharyngeal carcinomas (NPCs) and 32 throat washings (TWs) from healthy donors in Northern China. Three variations (g48991t, c48998a, t49613a) were detected in all of the samples (113/113, 100%). EBNA-2 could be classified into four distinct subtypes: E2-A, E2-B, E2-C and E2-D based on the deletion status of three aa (294Q, 357K and 358G). Subtypes E2-A and E2-C were detected in 56/113 (49.6%), 38/113 (33.6%) samples, respectively. E2-A was observed more in EBVaGCs samples and subtype E2-D was only detected in the NPC samples. Variation analysis in EBNA-2 functional domains: the TAD residue (I438L) and the NLS residues (E476G, P484H and I486T), which are located in the carboxyl terminus of the EBNA-2 gene, were only detected in NPC samples. Conclusions: The subtypes E2-A and E2-C were the dominant genotypes of the EBNA-2 gene in Northern China. The subtype E2-D may be associated with the tumorigenesis of NPC. The NPC isolates were prone to harbor more mutations than the other two groups in the functional domains.
- Published
- 2012
- Full Text
- View/download PDF
19. Batch effect correction for genome-wide methylation data with Illumina Infinium platform.
- Author
- Sun Z, Chai HS, Wu Y, White WM, Donkena KV, Klein CJ, Garovic VD, Therneau TM, and Kocher JP
- Subjects
- Adult, CpG Islands genetics, Humans, Male, Reproducibility of Results, DNA Methylation genetics, Databases, Genetic, Genome, Human genetics, Oligonucleotide Array Sequence Analysis methods
- Abstract
Background: Genome-wide methylation profiling has led to more comprehensive insights into gene regulation mechanisms and potential therapeutic targets. Illumina Human Methylation BeadChip is one of the most commonly used genome-wide methylation platforms. Similar to other microarray experiments, methylation data is susceptible to various technical artifacts, particularly batch effects. To date, little attention has been given to issues related to normalization and batch effect correction for this kind of data. Methods: We evaluated three common normalization approaches and investigated their performance in batch effect removal using three datasets with different degrees of batch effects generated from HumanMethylation27 platform: quantile normalization at average β value (QNβ); two step quantile normalization at probe signals implemented in "lumi" package of R (lumi); and quantile normalization of A and B signal separately (ABnorm). Subsequent Empirical Bayes (EB) batch adjustment was also evaluated. Results: Each normalization could remove a portion of batch effects and their effectiveness differed depending on the severity of batch effects in a dataset. For the dataset with minor batch effects (Dataset 1), normalization alone appeared adequate and "lumi" showed the best performance. However, all methods left substantial batch effects intact in the datasets with obvious batch effects and further correction was necessary. Without any correction, 50 and 66 percent of CpGs were associated with batch effects in Dataset 2 and 3, respectively. After QNβ, lumi or ABnorm, the number of CpGs associated with batch effects were reduced to 24, 32, and 26 percent for Dataset 2; and 37, 46, and 35 percent for Dataset 3, respectively. Additional EB correction effectively removed such remaining non-biological effects. More importantly, the two-step procedure almost tripled the numbers of CpGs associated with the outcome of interest for the two datasets. Conclusion: Genome-wide methylation data from Infinium Methylation BeadChip can be susceptible to batch effects with profound impacts on downstream analyses and conclusions. Normalization can reduce part but not all batch effects. EB correction along with normalization is recommended for effective batch effect removal.
- Published
- 2011
- Full Text
- View/download PDF
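Quantile normalization, which the abstract of result 19 applies to average β values (QNβ), can be illustrated in a few lines of Python. This is a generic textbook version, offered only as a sketch: it is not the "lumi" implementation or the authors' exact procedure, the function name `quantile_normalize` is made up for this example, and tied values are simply ranked in their original order.

```python
def quantile_normalize(columns):
    """Quantile-normalize samples so they share one value distribution.

    columns: list of equal-length lists, one list of values per sample.
    Each value is replaced by the mean, across samples, of the values
    holding the same rank. Ties are not averaged (a simplification).
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the r-th smallest value over all samples.
    reference = [sum(col[r] for col in sorted_cols) / len(columns) for r in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # indices from low to high
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    # Two toy samples with three probes each (made-up numbers).
    samples = [[0.10, 0.50, 0.90], [0.30, 0.20, 0.70]]
    for col in quantile_normalize(samples):
        print([round(v, 3) for v in col])
```

After this step every sample has the same set of values and only the ranking of probes differs between samples. As the abstract notes, such normalization may remove only part of a batch effect, so a separate batch adjustment can still be needed.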
20. Sequence analysis of the Epstein-Barr virus (EBV) BRLF1 gene in nasopharyngeal and gastric carcinomas.
- Author
- Jia Y, Wang Y, Chao Y, Jing Y, Sun Z, and Luo B
- Subjects
- Cluster Analysis, DNA, Viral chemistry, DNA, Viral genetics, Genotype, Herpesvirus 4, Human isolation & purification, Humans, Phylogeny, Sequence Analysis, DNA, Carcinoma virology, Epstein-Barr Virus Infections virology, Herpesvirus 4, Human genetics, Immediate-Early Proteins genetics, Pharyngeal Neoplasms virology, Polymorphism, Genetic, Stomach Neoplasms virology, Trans-Activators genetics
- Abstract
Background: Epstein-Barr virus (EBV) has a biphasic infection cycle consisting of a latent and a lytic replicative phase. The product of immediate-early gene BRLF1, Rta, is able to disrupt the latency phase in epithelial cells and certain B-cell lines. The protein Rta is a frequent target of the EBV-induced cytotoxic T cell response. In spite of our good understanding of this protein, little is known about the gene polymorphism of BRLF1. Results: BRLF1 gene was successfully amplified in 34 EBV-associated gastric carcinomas (EBVaGCs), 57 nasopharyngeal carcinomas (NPCs) and 28 throat washings (TWs) samples from healthy donors followed by PCR-direct sequencing. Fourteen loci were found to be affected by amino acid changes, 17 loci by silent nucleotide changes. According to the phylogenetic tree, 5 distinct subtypes of BRLF1 were identified, and 2 subtypes BR1-A and BR1-C were detected in 42.9% (51/119), 42.0% (50/119) of samples, respectively. The distribution of these 2 subtypes among 3 types of specimens was significantly different. The subtype BR1-A preferentially existed in healthy donors, while BR1-C was seen more in biopsies of NPC. A silent mutation A/G was detected in all the isolates. Among 3 functional domains, the dimerization domain of Rta showed a stably conserved sequence, while DNA binding and transactivation domains were detected to have multiple mutations. Three of 16 CTL epitopes, NAA, QKE and ERP, were affected by amino acid changes. Epitope ERP was relatively conserved; epitopes NAA and QKE harbored more mutations. Conclusions: This first detailed investigation of sequence variations in BRLF1 gene has identified 5 distinct subtypes. Two subtypes BR1-A and BR1-C are the dominant genotypes of BRLF1. The subtype BR1-C is more frequent in NPCs, while BR1-A preferentially presents in healthy donors. BR1-C may be associated with the tumorigenesis of NPC.
- Published
- 2010
- Full Text
- View/download PDF
21. Impact of sample acquisition and linear amplification on gene expression profiling of lung adenocarcinoma: laser capture micro-dissection cell-sampling versus bulk tissue-sampling.
- Author
- Klee EW, Erdogan S, Tillmans L, Kosari F, Sun Z, Wigle DA, Yang P, Aubry MC, and Vasmatzis G
- Abstract
Background: The methods used for sample selection and processing can have a strong influence on the expression values obtained through microarray profiling. Laser capture microdissection (LCM) provides higher specificity in the selection of target cells compared to traditional bulk tissue selection methods, but at an increased processing cost. The benefit gained from the higher tissue specificity realized through LCM sampling is evaluated in this study through a comparison of microarray expression profiles obtained from same-samples using bulk and LCM processing. Methods: Expression data from ten lung adenocarcinoma samples and six adjacent normal samples were acquired using LCM and bulk sampling methods. Expression values were evaluated for correlation between sample processing methods, as well as for bias introduced by the additional linear amplification required for LCM sample profiling. Results: The direct comparison of expression values obtained from the bulk and LCM sampled datasets reveals a large number of probesets with significantly varied expression. Many of these variations were shown to be related to bias arising from the process of linear amplification, which is required for LCM sample preparation. A comparison of differentially expressed genes (cancer vs. normal) selected in the bulk and LCM datasets also showed substantial differences. There were more than twice as many down-regulated probesets identified in the LCM data than identified in the bulk data. Controlling for the previously identified amplification bias did not have a substantial impact on the differences identified in the differentially expressed probesets found in the bulk and LCM samples. Conclusion: LCM-coupled microarray expression profiling was shown to uniquely identify a large number of differentially expressed probesets not otherwise found using bulk tissue sampling. The information gain realized from the LCM sampling was limited to differential analysis, as the absolute expression values obtained for some probesets using this study's protocol were biased during the second round of amplification. Consequently, LCM may enable investigators to obtain additional information in microarray studies not easily found using bulk tissue samples, but it is of critical importance that potential amplification biases are controlled for.
- Published
- 2009
22. Linkage analysis using principal components of gene expression data.
- Author
-
Atkinson EJ, Fridley BL, Goode EL, McDonnell SK, Liu-Mares W, Rabe KG, Sun Z, Slager SL, and de Andrade M
- Abstract
The goal of this paper is to investigate the effect of using principal components as a data reduction method for expression data in linkage analysis. We used 45 probes normalized using the Affymetrix Global Scaling that had evidence of high heritability to estimate the first 10 principal components (PC). A genome-wide linkage scan was performed on the 45 expression values and the 10 PCs using 2272 single-nucleotide polymorphisms. Our conclusions were: 1) PC analyses under-performed the single-probe analysis for known signals; 2) the PC that best reproduced the single-probe analysis was primarily composed of that probe; 3) no new signals were detected in the PC analysis; 4) no new pleiotropic effects were detected in the PC analysis.
- Published
- 2007
23. Analysis of variation in NF-kappaB genes and expression levels of NF-kappaB-regulated molecules.
- Author
-
Liu-Mares W, Sun Z, Bamlet WR, Atkinson EJ, Fridley BL, Slager SL, de Andrade M, and Goode EL
- Abstract
The nuclear factor-kappaB (NF-kappaB) family of transcription factors regulates the expression of a variety of genes involved in apoptosis and immune response. We examined relationships between genotypes at five NF-kappaB subunits (NFKB1, NFKB2, REL, RELA, and RELB) and variable expression levels of 15 NF-kappaB regulated proteins with heritability greater than 0.40: BCL2A1, BIRC2, CD40, CD44, CD80, CFLAR, CR2, FAS, ICAM1, IL15, IRF1, JUNB, MYC, SLC2A5, and VCAM1. SNP genotypes and expression phenotypes from pedigrees of Utah residents with ancestry from northern and western Europe were provided by Genetic Analysis Workshop 15 and supplemented with additional genotype data from the International HapMap Consortium. We conducted association, linkage, and family-based association analyses between each candidate gene and the 15 heritable expression phenotypes. We observed consistent results in association and linkage analyses of the NFKB1 region (encoding p50) and levels of FAS and IRF1 expression. FAS is a cell surface protein that also belongs to the TNF-receptor family; signals through FAS are able to induce apoptosis. IRF1 is a member of the interferon regulatory transcription factor family, which has been shown to regulate apoptosis and tumor-suppression. Analyses in the REL region (encoding c-Rel) revealed linkage and association with CD40 phenotype. CD40 proteins belong to the tumor necrosis factor (TNF)-receptor family, which mediates a broad variety of immune and inflammatory responses. We conclude that variation in the genes encoding p50 and c-Rel may play a role in NF-kappaB-related transcription of FAS, IRF1, and CD40.
- Published
- 2007
24. Comparison of tagging single-nucleotide polymorphism methods in association analyses.
- Author
-
Goode EL, Fridley BL, Sun Z, Atkinson EJ, Nord AS, McDonnell SK, Jarvik GP, de Andrade M, and Slager SL
- Abstract
Several methods to identify tagging single-nucleotide polymorphisms (SNPs) are in common use for genetic epidemiologic studies; however, there may be loss of information when using only a subset of SNPs. We sought to compare the ability of commonly used pairwise, multimarker, and haplotype-based tagging SNP selection methods to detect known associations with quantitative expression phenotypes. Using data from HapMap release 21 on unrelated Utah residents with ancestors from northern and western Europe (CEPH-Utah, CEU), we selected tagging SNPs in five chromosomal regions using ldSelect, Tagger, and TagSNPs. We found that SNP subsets did not substantially overlap, and that the use of trio data did not greatly impact SNP selection. We then tested associations between HapMap genotypes and expression phenotypes on 28 CEU individuals as part of Genetic Analysis Workshop 15. Relative to the use of all SNPs (n = 210 SNPs across all regions), most subset methods were able to detect single-SNP and haplotype associations. Generally, pairwise selection approaches worked extremely well, relative to use of all SNPs, with marked reductions in the number of SNPs required. Haplotype-based approaches, which had identified smaller SNP subsets, missed associations in some regions. We conclude that the optimal tagging SNP method depends on the true model of the genetic association (i.e., whether a SNP or haplotype is responsible); unfortunately, this is often unknown at the time of SNP selection. Additional evaluations using empirical and simulated data are needed.
- Published
- 2007
25. The genetics of gene expression: comparison of linkage scans using two phenotype normalization methods.
- Author
-
de Andrade M, Atkinson EJ, Fridley BL, Goode EL, McDonnell S, Liu-Mares W, Rabe KG, Sun Z, and Slager SL
- Abstract
The goal of this paper is to investigate the effects of normalization procedures for expression data on linkage results. We selected the two most commonly used expression data extraction and normalization methods, Affymetrix global scaling and dChip invariant. After applying these two methods in 3554 expression phenotypes, we identified 45 phenotypes that were more likely to be genetic for either normalization procedure. A genome-wide linkage scan was performed on these expression values (45 phenotypes x 2 normalizations) using 2272 SNPs. Our results showed that: 1) the dChip normalization might inflate the LOD scores because the dChip normalization yielded LOD scores > 3.0 30% more frequently than the Affy normalization, and 2) the difference in LODs between the normalizations were not correlated with their heritabilities. In summary, we conclude, as have other published reports, that normalization methods play an important role in the linkage results, and that some significant linkage signals might be due to a specific normalization method.
- Published
- 2007
26. Can gene expression profiling predict survival for patients with squamous cell carcinoma of the lung?
- Author
-
Sun Z, Yang P, Aubry MC, Kosari F, Endo C, Molina J, and Vasmatzis G
- Subjects
- Carcinoma, Squamous Cell/diagnosis; Carcinoma, Squamous Cell/genetics; Case-Control Studies; Cluster Analysis; Female; Humans; Lung Neoplasms/diagnosis; Lung Neoplasms/genetics; Male; Oligonucleotide Array Sequence Analysis; Prognosis; Survival Analysis; Carcinoma, Squamous Cell/mortality; Gene Expression Profiling; Lung Neoplasms/mortality
- Abstract
Background: Lung cancer remains to be the leading cause of cancer death worldwide. Patients with similar lung cancer may experience quite different clinical outcomes. Reliable molecular prognostic markers are needed to characterize the disparity. In order to identify the genes responsible for the aggressiveness of squamous cell carcinoma of the lung, we applied DNA microarray technology to a case control study. Fifteen patients with surgically treated stage I squamous cell lung cancer were selected. Ten were one-to-one matched on tumour size and grade, age, gender, and smoking status; five died of lung cancer recurrence within 24 months (high-aggressive group), and five survived more than 54 months after surgery (low-aggressive group). Five additional tissues were included as test samples. Unsupervised and supervised approaches were used to explore the relationship among samples and identify differentially expressed genes. We also evaluated the gene markers' accuracy in segregating samples to their respective group. Functional gene networks for the significant genes were retrieved, and their association with survival was tested. Results: Unsupervised clustering did not group tumours based on survival experience. At p < 0.05, 294 and 246 differentially expressed genes for matched and unmatched analysis respectively were identified between the low and high aggressive groups. Linear discriminant analysis was performed on all samples using the 27 top unique genes, and the results showed an overall accuracy rate of 80%. Tests on the association of 24 gene networks with study outcome showed that 7 were highly correlated with the survival time of the lung cancer patients. Conclusion: The overall gene expression pattern between the high and low aggressive squamous cell carcinomas of the lung did not differ significantly with the control of confounding factors. A small subset of genes or genes in specific pathways may be responsible for the aggressive nature of a tumour and could potentially serve as panels of prognostic markers for stage I squamous cell lung cancer.
- Published
- 2004