Comparative Analysis of Transformation Methods for Gene Expression Profiles in Breast Cancer Datasets

Authors :
Shigeto Seno
Shinzaburo Noguchi
Hideo Matsuda
Yoshiaki Sota
Yoichi Takenaka
Source :
BIBE
Publication Year :
2016
Publisher :
IEEE, 2016.

Abstract

Gene expression profiling has been increasingly used in clinical practice. Integrating expression data across multiple experiments provides better insight into the heterogeneity of the biology being examined. However, batch effects arising from platform or laboratory differences remain a barrier to systematically analyzing data across different datasets. Several methods (such as ComBat) have been proposed to remove batch effects, but these methods often make assumptions about an ideal distribution of the underlying data. Difficulties might be expected when comparing datasets that have fundamentally different (dataset-dependent) distributions; for example, clinical datasets are often collected from patient samples with various disease stages or conditions. We therefore compared several mathematical transformations across many datasets, including the nonparametric Z scaling transformation method (NPZ) that we have proposed for clinical use. We selected 2,813 patients with available information on estrogen receptor (ER) status or human epidermal growth factor receptor 2 (HER2) status from 24 Affymetrix HG-U133 (GPL96) or Affymetrix HG-U133 Plus 2.0 (GPL570) datasets in the Gene Expression Omnibus database. The microarray expression data were processed with one of four methods: Raw (background correction and log transformation only), Microarray Suite 5.0 (MAS5), frozen robust multiarray analysis (fRMA), or radius minimax (RMX). The normalized data were then transformed with one of five options: no transformation (untransformed) or one of four single-array-based transformations (RANK, Z, NPZ, or YuGene). Finally, we compared the ER and HER2 statuses assessed by immunohistochemical (IHC) staining with mRNA expression. We found that applying a single-array-based transformation in addition to normalization improved the concordance rates with IHC staining.
We demonstrated the influence of transformation by using breast cancer samples and showed that adding single-array-based transformations to microarray expression data resulted in stronger correlations with IHC staining.
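
As a rough illustration of what "single-array-based" means here, the sketch below applies a rank transformation and a Z transformation to one array at a time, using only values from that array, and then scores agreement between an expression-based call and IHC status. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the scaling of ranks to [0, 1], the use of the population standard deviation, and the fixed cutoff are all choices made for this example. NPZ and YuGene are omitted because the abstract does not define them.

```python
from statistics import mean, pstdev


def rank_transform(values):
    """Replace each value by its within-array rank, scaled to [0, 1].

    Tied values share the average of their ranks. The [0, 1] scaling is an
    assumption of this sketch, not something stated in the abstract.
    """
    n = len(values)
    if n < 2:
        return [0.0] * n
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return [r / (n - 1) for r in ranks]


def z_transform(values):
    """Center one array on its own mean and scale by its own standard deviation."""
    mu = mean(values)
    sd = pstdev(values)  # population SD; a sample SD would also be defensible
    if sd == 0:
        raise ValueError("array has zero variance")
    return [(v - mu) / sd for v in values]


def concordance_rate(scores, ihc_positive, cutoff):
    """Fraction of samples whose expression-based call matches the IHC status.

    `cutoff` is a hypothetical threshold chosen by the caller.
    """
    calls = [s >= cutoff for s in scores]
    agree = sum(call == bool(truth) for call, truth in zip(calls, ihc_positive))
    return agree / len(scores)


# Toy values invented for illustration only; they are not data from the study.
one_array = [10.0, 30.0, 20.0]
print(rank_transform(one_array))
print(z_transform(one_array))
print(concordance_rate([0.1, 0.9, 0.6, 0.2], [False, True, False, False], 0.5))
```

Because each function reads only the array it is given, a transformed sample does not change when other samples or datasets are added, which is the property that distinguishes these transformations from batch-level corrections such as ComBat.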

Details

Database :
OpenAIRE
Journal :
2016 IEEE 16th International Conference on Bioinformatics and Bioengineering (BIBE)
Accession number :
edsair.doi...........bdded895ff0f8d36825e457c309f92d2
Full Text :
https://doi.org/10.1109/bibe.2016.51