Evaluating reliability of tree-patterns in extreme-K categorical samples problems.
- Source :
- Journal of Statistical Computation & Simulation; Dec 2021, Vol. 91 Issue 18, p3828-3849, 22p
- Publication Year :
- 2021
Abstract
- Exploratory Data Analysis (EDA) approaches are adopted to address the difficult extreme-K categorical sample problem. Due to observed data's categorical nature, all comparisons among populations are performed by comparing their distributions in the form of a histogram with symbolic bins. A distance measure is designed to evaluate the discrepancy between two symbol-based histograms to facilitate Hierarchical Clustering (HC) algorithms. The resultant binary HC-tree then serves as the basis for our EDA task of discovering tree-patterns of interest. Since each population-leaf's location within a binary HC-tree's geometry is expressed through a binary code sequence, a binary code segment characterizes all commonly shared tree-patterns for all members. We then generate a large ensemble of mimicries of the observed dataset based on multinomial distributions and construct a large ensemble of binary HC-trees. Upon each identified tree-pattern which we determined based on the observed dataset, we evaluate its reliability and uncertainty through two histograms. [ABSTRACT FROM AUTHOR]
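The workflow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not specify their distance measure or linkage rule. The sketch assumes total variation distance between normalized histograms, average-linkage agglomerative clustering, and a single summary number (the fraction of multinomial mimicries in which a chosen group of populations reappears as a complete subtree) in place of the paper's two histograms. All function names and the toy counts are hypothetical.

```python
import random
from itertools import combinations

def tv_distance(p, q):
    """Total variation distance between two count vectors over the same bins
    (an assumed stand-in for the paper's histogram distance)."""
    sp, sq = sum(p), sum(q)
    return 0.5 * sum(abs(a / sp - b / sq) for a, b in zip(p, q))

def hc_clades(counts):
    """Average-linkage agglomerative clustering on the populations in `counts`.
    Returns the set of clades (frozensets of leaf names) created by the merges,
    i.e. the internal nodes of the resulting binary tree."""
    clusters = [frozenset([name]) for name in counts]
    clades = set()

    def linkage(a, b):
        total = sum(tv_distance(counts[x], counts[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Merge the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        clades.add(a | b)
    return clades

def mimic(counts, rng):
    """Draw one mimicry: resample each population from a multinomial with its
    observed bin proportions and its observed sample size."""
    out = {}
    for name, row in counts.items():
        draws = rng.choices(range(len(row)), weights=row, k=sum(row))
        out[name] = [draws.count(i) for i in range(len(row))]
    return out

def clade_reliability(counts, clade, n_mimics=200, seed=0):
    """Fraction of mimicry trees in which `clade` appears as a complete subtree."""
    rng = random.Random(seed)
    target = frozenset(clade)
    hits = sum(target in hc_clades(mimic(counts, rng)) for _ in range(n_mimics))
    return hits / n_mimics

# Toy data: four populations, three symbolic bins (made-up counts).
observed = {
    "A": [40, 5, 5],
    "B": [38, 6, 6],
    "C": [5, 40, 5],
    "D": [6, 38, 6],
}
print(clade_reliability(observed, {"A", "B"}))
```

In the paper's terms, a clade such as {A, B} corresponds to leaves sharing a common binary code segment in the HC-tree. The brute-force pair search is only suitable for a handful of populations; a library routine would be the natural choice for a genuinely extreme-K setting.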
Details
- Language :
- English
- ISSN :
- 0094-9655
- Volume :
- 91
- Issue :
- 18
- Database :
- Complementary Index
- Journal :
- Journal of Statistical Computation & Simulation
- Publication Type :
- Academic Journal
- Accession number :
- 153842749
- Full Text :
- https://doi.org/10.1080/00949655.2021.1951266