32 results for "Mozharovskyi, Pavlo"
Search Results
2. Approximate computation of projection depths
- Author
-
Dyckerhoff, Rainer, Mozharovskyi, Pavlo, and Nagy, Stanislav
- Published
- 2021
- Full Text
- View/download PDF
3. On Exact Computation of Tukey Depth Central Regions.
- Author
-
Fojtík, Vít, Laketa, Petra, Mozharovskyi, Pavlo, and Nagy, Stanislav
- Subjects
POINT set theory, ALGORITHMS, COMPUTATIONAL geometry, C++, QUANTILES, K-means clustering
- Abstract
The Tukey (or halfspace) depth extends nonparametric methods toward multivariate data. The multivariate analogues of the quantiles are the central regions of the Tukey depth, defined as sets of points in the d-dimensional space whose Tukey depth exceeds given thresholds k. We address the problem of fast and exact computation of those central regions. First, we analyze an efficient Algorithm (A) from Liu, Mosler, and Mozharovskyi, and prove that it yields exact results in dimension d = 2, or for a low threshold k in arbitrary dimension. We provide examples where Algorithm (A) fails to recover the exact Tukey depth region for d > 2, and propose a modification that is guaranteed to be exact. We express the problem of computing the exact central region in its dual formulation, and use that viewpoint to demonstrate that further substantial improvements to our algorithm are unlikely. An efficient C++ implementation of our exact algorithm is freely available in the R package TukeyRegion. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
4. Statistical Process Monitoring of Artificial Neural Networks.
- Author
-
Malinovskaya, Anna, Mozharovskyi, Pavlo, and Otto, Philipp
- Subjects
ARTIFICIAL neural networks, QUALITY control charts, ARTIFICIAL intelligence, SUPERVISED learning, MACHINE learning
- Abstract
The rapid advancement of models based on artificial intelligence demands innovative monitoring techniques that can operate in real time with low computational costs. In machine learning, especially if we consider artificial neural networks (ANNs), the models are often trained in a supervised manner. Consequently, the learned relationship between the input and the output must remain valid during the model's deployment. If this stationarity assumption holds, we can conclude that the ANN provides accurate predictions. Otherwise, the retraining or rebuilding of the model is required. We propose considering the latent feature representation of the data (called "embedding") generated by the ANN to determine the time when the data stream starts being nonstationary. In particular, we monitor embeddings by applying multivariate control charts based on data depth calculation and normalized ranks. The performance of the introduced method is compared with benchmark approaches for various ANN architectures and different underlying data formats. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
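The monitoring scheme summarized in the abstract above can be sketched generically. This is a minimal illustration, not the authors' implementation: it assumes Mahalanobis depth as a stand-in depth notion, a 0.05 control limit chosen for the example, and function names that are ours.

```python
import numpy as np

def mahalanobis_depth(points, reference):
    """Depth D(x) = 1 / (1 + squared Mahalanobis distance of x
    to the mean/covariance of the reference sample)."""
    mu = reference.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(reference, rowvar=False))
    diff = np.atleast_2d(points) - mu
    return 1.0 / (1.0 + np.einsum('ij,jk,ik->i', diff, S_inv, diff))

def normalized_ranks(new_points, reference):
    """Normalized rank of each new embedding's depth among the
    reference depths; values near 0 flag outlying embeddings."""
    ref_depths = mahalanobis_depth(reference, reference)
    new_depths = mahalanobis_depth(new_points, reference)
    m = len(ref_depths)
    return np.array([(ref_depths <= d).sum() / (m + 1) for d in new_depths])

# In-control reference embeddings vs. a shifted (nonstationary) batch.
rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=(500, 4))
shifted = rng.normal(3.0, 1.0, size=(10, 4))
ranks = normalized_ranks(shifted, reference)
alarms = ranks < 0.05          # points falling below the control limit
```

A control chart would plot the ranks of incoming embeddings over time and signal once they fall below the limit; any properly defined depth function can replace the Mahalanobis one here.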
5. Exact computation of the halfspace depth
- Author
-
Dyckerhoff, Rainer and Mozharovskyi, Pavlo
- Published
- 2016
- Full Text
- View/download PDF
6. On exact computation of Tukey depth central regions
- Author
-
Fojtík, Vít, Laketa, Petra, Mozharovskyi, Pavlo, and Nagy, Stanislav
- Subjects
FOS: Computer and information sciences, 62-08, 62H12, 62G05, Statistics - Computation, Computation (stat.CO)
- Abstract
The Tukey (or halfspace) depth extends nonparametric methods toward multivariate data. The multivariate analogues of the quantiles are the central regions of the Tukey depth, defined as sets of points in the $d$-dimensional space whose Tukey depth exceeds given thresholds $k$. We address the problem of fast and exact computation of those central regions. First, we analyse an efficient Algorithm A from Liu et al. (2019), and prove that it yields exact results in dimension $d=2$, or for a low threshold $k$ in arbitrary dimension. We provide examples where Algorithm A fails to recover the exact Tukey depth region for $d>2$, and propose a modification that is guaranteed to be exact. We express the problem of computing the exact central region in its dual formulation, and use that viewpoint to demonstrate that further substantial improvements to our algorithm are unlikely. An efficient C++ implementation of our exact algorithm is freely available in the R package TukeyRegion.
- Published
- 2022
7. Classifying real-world data with the DDα-procedure
- Author
-
Mozharovskyi, Pavlo, Mosler, Karl, and Lange, Tatjana
- Published
- 2015
- Full Text
- View/download PDF
8. Statistical Depth Functions for Ranking Distributions: Definitions, Statistical Learning and Applications
- Author
-
Goibert, Morgane, Clémençon, Stéphan, Irurozki, Ekhine, and Mozharovskyi, Pavlo
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
- Abstract
The concept of median/consensus has been widely investigated in order to provide a statistical summary of ranking data, i.e. realizations of a random permutation $\Sigma$ of a finite set, $\{1,\; \ldots,\; n\}$ with $n\geq 1$, say. As it sheds light onto only one aspect of $\Sigma$'s distribution $P$, it may neglect other informative features. It is the purpose of this paper to define analogs of quantiles, ranks and statistical procedures based on such quantities for the analysis of ranking data by means of a metric-based notion of depth function on the symmetric group. Overcoming the absence of vector space structure on $\mathfrak{S}_n$, the latter defines a center-outward ordering of the permutations in the support of $P$ and extends the classic metric-based formulation of consensus ranking (medians corresponding then to the deepest permutations). The axiomatic properties that ranking depths should ideally possess are listed, while computational and generalization issues are studied at length. Beyond the theoretical analysis carried out, the relevance of the novel concepts and methods introduced for a wide variety of statistical tasks is also supported by numerous numerical experiments.
- Published
- 2022
9. A Framework to Learn with Interpretation
- Author
-
Parekh, Jayneel, Mozharovskyi, Pavlo, and d'Alché-Buc, Florence (Signal, Statistique et Apprentissage (S2A), Laboratoire Traitement et Communication de l'Information (LTCI), Département Images, Données, Signal (IDS), Télécom Paris, Institut Polytechnique de Paris (IP Paris); funded by the DSAIDIS chair and ANR-20-CE23-0028 LIMPID (AAPG2020))
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), [INFO]Computer Science [cs], Machine Learning (cs.LG)
- Abstract
To tackle interpretability in deep learning, we present a novel framework to jointly learn a predictive model and its associated interpretation model. The interpreter provides both local and global interpretability about the predictive model in terms of human-understandable high-level attribute functions, with minimal loss of accuracy. This is achieved by a dedicated architecture and well-chosen regularization penalties. We seek a small-size dictionary of high-level attribute functions that take as inputs the outputs of selected hidden layers and whose outputs feed a linear classifier. We impose strong conciseness on the activation of attributes with an entropy-based criterion while enforcing fidelity to both inputs and outputs of the predictive model. A detailed pipeline to visualize the learnt features is also developed. Moreover, besides generating interpretable models by design, our approach can be specialized to provide post-hoc interpretations for a pre-trained neural network. We validate our approach against several state-of-the-art methods on multiple datasets and show its efficacy on both kinds of tasks.
- Published
- 2021
10. Fast nonparametric classification based on data depth
- Author
-
Lange, Tatjana, Mosler, Karl, and Mozharovskyi, Pavlo
- Published
- 2014
- Full Text
- View/download PDF
11. Affine-Invariant Integrated Rank-Weighted Depth: Definition, Properties and Finite Sample Analysis
- Author
-
Staerman, Guillaume, Mozharovskyi, Pavlo, and Clémençon, Stéphan
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
- Abstract
Because it determines a center-outward ordering of observations in $\mathbb{R}^d$ with $d\geq 2$, the concept of statistical depth makes it possible to define quantiles and ranks for multivariate data and use them for various statistical tasks (e.g. inference, hypothesis testing). Whereas many depth functions have been proposed ad hoc in the literature since the seminal contribution of Tukey (1975), not all of them possess the properties desirable to emulate the notion of quantile function for univariate probability distributions. In this paper, we propose an extension of the integrated rank-weighted statistical depth (IRW depth in abbreviated form) originally introduced in earlier work, modified in order to satisfy the property of affine invariance, thus fulfilling all four key axioms listed in the nomenclature elaborated by Zuo and Serfling (2000). The variant we propose, referred to as the Affine-Invariant IRW depth (AI-IRW in short), involves the covariance/precision matrices of the (supposedly square integrable) $d$-dimensional random vector $X$ under study, in order to take into account the directions along which $X$ is most variable to assign a depth value to any point $x\in \mathbb{R}^d$. The accuracy of the sampling version of the AI-IRW depth is investigated from a nonasymptotic perspective. Namely, a concentration result for the statistical counterpart of the AI-IRW depth is proved. Beyond the theoretical analysis carried out, applications to anomaly detection are considered and numerical results are displayed, providing strong empirical evidence of the relevance of the depth function we propose here.
- Published
- 2021
12. Youthful and age-related matreotypes predict drugs promoting longevity
- Author
-
Statzer, Cyril, Jongsma, Elisabeth, Liu, Sean X., Dakhovnik, Alexander, Wandrey, Franziska, Mozharovskyi, Pavlo, Zülli, Fred, and Ewald, Collin Y.
- Subjects
Pharmacology, Matrisome, Aging, CMap, Longevity, Drug repurposing, Collagen, Extracellular matrix, GTEx, Geroprotector
- Abstract
The identification and validation of drugs that promote health during aging (‘geroprotectors’) is key to the retardation or prevention of chronic age-related diseases. Here we found that most of the established pro-longevity compounds shown to extend lifespan in model organisms also alter extracellular matrix gene expression (i.e., matrisome) in human cell lines. To harness this novel observation, we used age-stratified human transcriptomes to define the age-related matreotype, which represents the matrisome gene expression pattern associated with age. Using a ‘youthful’ matreotype, we screened in silico for geroprotective drug candidates. To validate drug candidates, we developed a novel tool using prolonged collagen expression as a non-invasive and in-vivo surrogate marker for C. elegans longevity. With this reporter, we were able to eliminate false positive drug candidates and determine the appropriate dose for extending the lifespan of C. elegans. We improved drug uptake for one of our predicted compounds, genistein, and reconciled previous contradictory reports of its effects on longevity. We identified and validated new compounds, tretinoin, chondroitin sulfate, and hyaluronic acid, for their ability to restore age-related decline of collagen homeostasis and increase lifespan. Thus, our innovative drug screening approach - employing extracellular matrix homeostasis - facilitates the discovery of pharmacological interventions promoting healthy aging. (bioRxiv preprint)
- Published
- 2021
- Full Text
- View/download PDF
13. When OT meets MoM: Robust estimation of Wasserstein Distance
- Author
-
Staerman, Guillaume, Laforgue, Pierre, Mozharovskyi, Pavlo, and d'Alché-Buc, Florence (Département Images, Données, Signal (IDS), Signal, Statistique et Apprentissage (S2A), Laboratoire Traitement et Communication de l'Information (LTCI), Télécom Paris, Institut Mines-Télécom [Paris] (IMT), Centre National de la Recherche Scientifique (CNRS))
- Subjects
FOS: Computer and information sciences, [STAT]Statistics [stat], Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), [INFO]Computer Science [cs], [MATH]Mathematics [math], Machine Learning (cs.LG)
- Abstract
Rooted in optimal transport, the Wasserstein distance has gained importance in machine learning due to its appealing geometrical properties and the increasing availability of efficient approximations. In this work, we consider the problem of estimating the Wasserstein distance between two probability distributions when observations are polluted by outliers. To that end, we investigate how to leverage Medians of Means (MoM) estimators to robustify the estimation of the Wasserstein distance. Exploiting the dual Kantorovich formulation of the Wasserstein distance, we introduce and discuss novel MoM-based robust estimators whose consistency is studied under a data contamination model and for which convergence rates are provided. These MoM estimators make Wasserstein Generative Adversarial Networks (WGANs) robust to outliers, as witnessed by an empirical study on two benchmarks, CIFAR10 and Fashion MNIST. Finally, we discuss how to combine MoM with the entropy-regularized approximation of the Wasserstein distance and propose a simple MoM-based re-weighting scheme that could be used in conjunction with the Sinkhorn algorithm.
- Published
- 2020
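The Medians-of-Means building block that the abstract above uses to robustify Wasserstein estimation can be sketched on its own, in the simplest case of a univariate mean. This is a minimal illustration; the block count, names, and random block assignment are our choices, not the paper's estimator.

```python
import numpy as np

def median_of_means(x, n_blocks, seed=0):
    """Split the sample into n_blocks random blocks, average each
    block, and return the median of the block means. The median
    discards blocks contaminated by gross outliers."""
    rng = np.random.default_rng(seed)
    x = rng.permutation(np.asarray(x, dtype=float))
    blocks = np.array_split(x, n_blocks)
    return float(np.median([b.mean() for b in blocks]))

rng = np.random.default_rng(1)
clean = rng.normal(0.0, 1.0, size=1000)
polluted = np.concatenate([clean, np.full(20, 1e6)])   # ~2% gross outliers
naive = polluted.mean()                            # dragged far from 0
robust = median_of_means(polluted, n_blocks=100)   # stays near 0
```

With 100 blocks, the 20 outliers can contaminate at most 20 block means, so the median is still computed over majority-clean blocks; this is the mechanism the paper transfers to the dual formulation of the Wasserstein distance.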
14. The Area of the Convex Hull of Sampled Curves: a Robust Functional Statistical Depth Measure
- Author
-
Staerman, Guillaume, Mozharovskyi, Pavlo, and Clémençon, Stéphan (Signal, Statistique et Apprentissage (S2A), Laboratoire Traitement et Communication de l'Information (LTCI), Département Images, Données, Signal (IDS), Télécom Paris, Institut Polytechnique de Paris (IP Paris))
- Subjects
[STAT]Statistics [stat], [INFO]Computer Science [cs], [MATH]Mathematics [math], ComputingMilieux_MISCELLANEOUS
- Published
- 2020
15. Choosing among notions of multivariate depth statistics
- Author
-
Mosler, Karl and Mozharovskyi, Pavlo
- Subjects
Methodology (stat.ME), FOS: Computer and information sciences, Statistics and Probability, General Mathematics, Primary 62H05, 62H30, secondary 62-07, Statistics, Probability and Uncertainty, Statistics - Methodology
- Abstract
Classical multivariate statistics measures the outlyingness of a point by its Mahalanobis distance from the mean, which is based on the mean and the covariance matrix of the data. A multivariate depth function is a function which, given a point and a distribution in d-space, measures centrality by a number between 0 and 1, while satisfying certain postulates regarding invariance, monotonicity, convexity and continuity. Accordingly, numerous notions of multivariate depth have been proposed in the literature, some of which are also robust against extremely outlying data. The departure from classical Mahalanobis distance does not come without cost. There is a trade-off between invariance, robustness and computational feasibility. In the last few years, efficient exact algorithms as well as approximate ones have been constructed and made available in R-packages. Consequently, in practical applications the choice of a depth statistic is no longer restricted to one or two notions due to computational limits; rather, often several notions are feasible, among which the researcher has to decide. The article debates theoretical and practical aspects of this choice, including invariance and uniqueness, robustness and computational feasibility. Complexity and speed of exact algorithms are compared. The accuracy of approximate approaches like the random Tukey depth is discussed, as well as the application to large and high-dimensional data. Extensions to local and functional depths and connections to regression depth are briefly addressed.
- Published
- 2020
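The "random Tukey depth" approximation mentioned in the abstract above admits a very short sketch: project the sample onto random directions and take the smallest one-sided fraction. This is an illustrative Monte Carlo version (function name and defaults are ours); by construction it upper-bounds the exact halfspace depth and improves as the number of directions grows.

```python
import numpy as np

def random_tukey_depth(z, X, n_dir=1000, seed=0):
    """Approximate halfspace (Tukey) depth of point z w.r.t. sample X:
    the minimum, over random unit directions u, of the fraction of
    observations whose projection on u lies on one side of z's
    projection."""
    rng = np.random.default_rng(seed)
    U = rng.normal(size=(n_dir, X.shape[1]))
    U /= np.linalg.norm(U, axis=1, keepdims=True)   # unit directions
    proj_X = X @ U.T                # (n, n_dir) projected sample
    proj_z = U @ z                  # (n_dir,) projections of z
    upper = (proj_X >= proj_z).mean(axis=0)   # fraction on one side
    lower = (proj_X <= proj_z).mean(axis=0)   # ... and on the other
    return float(min(upper.min(), lower.min()))

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
deep = random_tukey_depth(X.mean(axis=0), X)    # central point: near 1/2
shallow = random_tukey_depth(np.array([10.0, 0.0, 0.0]), X)  # outlier: near 0
```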
16. Functional Isolation Forest
- Author
-
Staerman, Guillaume, Mozharovskyi, Pavlo, Clémençon, Stephan, and d'Alché-Buc, Florence (Signal, Statistique et Apprentissage (S2A), Laboratoire Traitement et Communication de l'Information (LTCI), Département Images, Données, Signal (IDS), Télécom Paris, Institut Polytechnique de Paris (IP Paris))
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, [STAT.ML]Statistics [stat]/Machine Learning [stat.ML], Statistics - Machine Learning, isolation forest, Machine Learning (stat.ML), Anomaly detection, unsupervised learning, Machine Learning (cs.LG), functional data analysis
- Abstract
For the purpose of monitoring the behavior of complex infrastructures (e.g. aircraft, transport or energy networks), high-rate sensors are deployed to capture multivariate data, generally unlabeled, in quasi-continuous time, so as to quickly detect anomalies that may jeopardize the smooth operation of the system of interest. The statistical analysis of such massive data of functional nature raises many challenging methodological questions. The primary goal of this paper is to extend the popular Isolation Forest (IF) approach to anomaly detection, originally dedicated to finite-dimensional observations, to functional data. The major difficulty lies in the wide variety of topological structures that may equip a space of functions and the great variety of patterns that may characterize abnormal curves. We address the issue of (randomly) splitting the functional space in a flexible manner in order to isolate progressively any trajectory from the others, a key ingredient to the efficiency of the algorithm. Beyond a detailed description of the algorithm, computational complexity and stability issues are investigated at length. From the scoring function measuring the degree of abnormality of an observation provided by the proposed variant of the IF algorithm, a functional statistical depth function is defined and discussed, as well as a multivariate functional extension. Numerical experiments provide strong empirical evidence of the accuracy of the extension proposed.
- Published
- 2019
17. Depth for Curve Data and Applications.
- Author
-
de Micheaux, Pierre Lafaye, Mozharovskyi, Pavlo, and Vimond, Myriam
- Subjects
DIFFUSION tensor imaging, STATISTICS, PROBABILITY measures, COMPUTER-assisted image analysis (Medicine), HANDWRITING recognition (Computer science), NONPARAMETRIC statistics
- Abstract
In 1975, John W. Tukey defined statistical data depth as a function that determines the centrality of an arbitrary point with respect to a data cloud or to a probability measure. During the last decades, this seminal idea of data depth evolved into a powerful tool that has proved useful in various fields of science. Recently, extending the notion of data depth to the functional setting attracted a lot of attention among theoretical and applied statisticians. We go further and suggest a notion of data depth suitable for data represented as curves, or trajectories, which is independent of the parameterization. We show that our curve depth satisfies theoretical requirements of general depth functions that are meaningful for trajectories. We apply our methodology to diffusion tensor brain images and also to pattern recognition of handwritten digits and letters. Supplementary materials for this article are available online. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
18. Composite marginal likelihood estimation of spatial autoregressive probit models feasible in very large samples
- Author
-
Mozharovskyi, Pavlo and Vogler, Jan
- Published
- 2016
- Full Text
- View/download PDF
19. Nonparametric Imputation by Data Depth.
- Author
-
Mozharovskyi, Pavlo, Josse, Julie, and Husson, François
- Subjects
DATA distribution, FORECASTING, DATA
- Abstract
We present a single imputation method for missing values which borrows the idea of data depth—a measure of centrality defined for an arbitrary point of a space with respect to a probability distribution or data cloud. The method consists in the iterative maximization of the depth of each observation with missing values, and can be employed with any properly defined statistical depth function. For each single iteration, imputation reverts to optimization of quadratic, linear, or quasiconcave functions that are solved analytically by linear programming or the Nelder–Mead method. As it accounts for the underlying data topology, the procedure is distribution free, allows imputation close to the data geometry, can make prediction in situations where local imputation (k-nearest neighbors, random forest) cannot, and has attractive robustness and asymptotic properties under elliptical symmetry. It is shown that a special case—when using the Mahalanobis depth—has a direct connection to well-known methods for the multivariate normal model, such as iterated regression and regularized PCA. The methodology is extended to multiple imputation for data stemming from an elliptically symmetric distribution. Simulation and real data studies show good results compared with existing popular alternatives. The method has been implemented as an R-package. Supplementary materials for the article are available online. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
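The Mahalanobis-depth special case mentioned in the abstract above reduces to iterated (conditional-mean) regression imputation, which can be sketched as follows. A minimal illustration of that special case only; names and the fixed iteration count are ours, and this is not the R-package implementation.

```python
import numpy as np

def impute_mahalanobis(X, n_iter=50):
    """Fill NaNs with the conditional mean of the missing coordinates
    given the observed ones, under the running mean/covariance
    estimate; for the Mahalanobis depth this conditional mean is the
    depth-maximizing imputation."""
    X = np.asarray(X, dtype=float).copy()
    miss = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X[miss] = np.take(col_means, np.nonzero(miss)[1])   # mean start
    for _ in range(n_iter):
        mu = X.mean(axis=0)
        S = np.cov(X, rowvar=False)
        for i in np.nonzero(miss.any(axis=1))[0]:
            m, o = miss[i], ~miss[i]
            reg = S[np.ix_(m, o)] @ np.linalg.inv(S[np.ix_(o, o)])
            X[i, m] = mu[m] + reg @ (X[i, o] - mu[o])   # conditional mean
    return X

# Strongly correlated columns: the hidden entry is well predicted.
rng = np.random.default_rng(2)
z = rng.normal(size=(300, 1))
data = np.hstack([z, z + 0.1 * rng.normal(size=(300, 1)),
                  rng.normal(size=(300, 1))])
truth = data[0, 1]
data[0, 1] = np.nan
imputed = impute_mahalanobis(data)
```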
20. Statistical inference for the Russell measure of technical efficiency.
- Author
-
Badunenko, Oleg and Mozharovskyi, Pavlo
- Subjects
INFERENTIAL statistics, DATA envelopment analysis
- Abstract
Data envelopment analysis (DEA) has become a popular approach to nonparametric efficiency measurement. Statistical inference using bootstrap methods is readily available for the radial DEA estimator; however, it is missing for the Russell measure, the nonradial DEA estimator. We propose a bootstrap-based procedure for making statistical inference about the individual Russell measures of technical efficiency. We perform simulations to examine finite sample properties of the proposed estimator. Finally, we present an empirical study using the proposed bootstrap procedure. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
21. Tukey depth: linear programming and applications
- Author
-
Mozharovskyi, Pavlo (Institut de Recherche Mathématique de Rennes (IRMAR), Université de Rennes, CNRS; Laboratoire de Mathématiques Appliquées Agrocampus (LMA2), AGROCAMPUS OUEST)
- Subjects
FOS: Computer and information sciences, Breadth-first search algorithm, [MATH.MATH-ST]Mathematics [math]/Statistics [math.ST], cone segmentation, exact computation, Tukey depth, linear programming, simplex algorithm, Statistics - Computation, Computation (stat.CO)
- Abstract
Determining the representativeness of a point within a data cloud has recently become a desirable task in multivariate analysis. The concept of a statistical depth function, which reflects centrality of an arbitrary point, appears to be useful and has been studied intensively during the last decades. Here the issue of exact computation of the classical Tukey data depth is addressed. The paper suggests an algorithm that exploits the connection between the Tukey depth and linear separability and is based on iterative application of linear programming. The algorithm further develops the idea of the cone segmentation of the Euclidean space and allows for efficient implementation due to the special search structure. The presentation is complemented by relationships to similar concepts and examples of application.
- Published
- 2016
22. Classifying real-world data with the DDalpha-procedure
- Author
-
Mozharovskyi, Pavlo, Mosler, Karl, Lange, Tatjana, Universität zu Köln, and Hochschule Merseburg
- Subjects
[STAT.AP]Statistics [stat]/Applications [stat.AP], ComputingMilieux_MISCELLANEOUS
- Published
- 2015
- Full Text
- View/download PDF
23. Fast Computation of Tukey Trimmed Regions and Median in Dimension p > 2.
- Author
-
Liu, Xiaohui, Mosler, Karl, and Mozharovskyi, Pavlo
- Subjects
DIMENSIONS, COMPUTATIONAL geometry, MULTIVARIATE analysis, CENTROID, POINT set theory, ALGORITHMS
- Abstract
Given data in ℝ^p, a Tukey κ-trimmed region is the set of all points that have at least Tukey depth κ w.r.t. the data. As they are visual, affine equivariant and robust, Tukey regions are useful tools in nonparametric multivariate analysis. While these regions are easily defined and interpreted, their practical use in applications has been impeded so far by the lack of efficient computational procedures in dimension p > 2. We construct two novel algorithms to compute a Tukey κ-trimmed region, a naïve one and a more sophisticated one that is much faster than known algorithms. Further, a strict bound on the number of facets of a Tukey region is derived. In a large simulation study the novel fast algorithm is compared with the naïve one, which is slower and by construction exact, yielding in every case the same correct results. Finally, the approach is extended to an algorithm that calculates the innermost Tukey region and its barycenter, the Tukey median. Supplementary materials for this article are available online. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
24. Fast computation of Tukey trimmed regions and median in dimension $p>2$
- Author
-
Liu, Xiaohui, Mosler, Karl, and Mozharovskyi, Pavlo
- Subjects
Statistics::Theory, Mathematics::Logic, 62F10, 62F35, Statistics::Methodology, Mathematics::General Topology, Statistics - Computation, Statistics::Computation
- Abstract
Given data in $\mathbb{R}^{p}$, a Tukey $\kappa$-trimmed region is the set of all points that have at least Tukey depth $\kappa$ w.r.t. the data. As they are visual, affine equivariant and robust, Tukey regions are useful tools in nonparametric multivariate analysis. While these regions are easily defined and interpreted, their practical use in applications has been impeded so far by the lack of efficient computational procedures in dimension $p > 2$. We construct two novel algorithms to compute a Tukey $\kappa$-trimmed region, a naïve one and a more sophisticated one that is much faster than known algorithms. Further, a strict bound on the number of facets of a Tukey region is derived. In a large simulation study the novel fast algorithm is compared with the naïve one, which is slower and by construction exact, yielding in every case the same correct results. Finally, the approach is extended to an algorithm that calculates the innermost Tukey region and its barycenter, the Tukey median.
- Published
- 2014
25. Classifying real-world data with the $DD\alpha$-procedure
- Author
-
Mozharovskyi, Pavlo, Mosler, Karl, and Lange, Tatjana
- Subjects
Statistics - Applications ,Statistics - Methodology - Abstract
The $DD\alpha$-classifier, a fast and very robust nonparametric procedure, is described and applied to fifty classification problems regarding a broad spectrum of real-world data. The procedure first transforms the data from their original property space into a depth space, which is a low-dimensional unit cube, and then separates them by a projective invariant procedure, called $\alpha$-procedure. To each data point the transformation assigns its depth values with respect to the given classes. Several alternative depth notions (spatial depth, Mahalanobis depth, projection depth, and Tukey depth, the latter two being approximated by univariate projections) are used in the procedure, and compared regarding their average error rates. With the Tukey depth, which fits the distributions' shape best and is most robust, 'outsiders', that is data points having zero depth in all classes, need an additional treatment for classification. Evidence is also given about the dimension of the extended feature space needed for linear separation. The $DD\alpha$-procedure is available as an R-package.
- Published
- 2014
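The depth-space transformation at the heart of the $DD\alpha$-procedure can be sketched generically: map each observation to its vector of depths with respect to the training classes, then classify in the unit cube. The sketch below substitutes Mahalanobis depth and a simple maximum-depth rule for the $\alpha$-procedure, so it illustrates the DD-plot idea rather than the actual classifier; all names are ours.

```python
import numpy as np

def mahalanobis_depth(points, reference):
    """D(x) = 1 / (1 + squared Mahalanobis distance to `reference`)."""
    mu = reference.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(reference, rowvar=False))
    diff = np.atleast_2d(points) - mu
    return 1.0 / (1.0 + np.einsum('ij,jk,ik->i', diff, S_inv, diff))

def dd_transform(points, classes):
    """Map points from the property space into the depth space
    [0, 1]^q, one depth coordinate per training class."""
    return np.column_stack([mahalanobis_depth(points, c) for c in classes])

# Two Gaussian classes; a maximum-depth rule stands in for the
# alpha-procedure's separating rule in depth space.
rng = np.random.default_rng(0)
class0 = rng.normal(0.0, 1.0, size=(200, 2))
class1 = rng.normal(4.0, 1.0, size=(200, 2))
depths = dd_transform(np.array([[0.2, -0.1], [3.9, 4.2]]), [class0, class1])
labels = depths.argmax(axis=1)
```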
26. Fast nonparametric classification based on data depth
- Author
-
Lange, Tatjana, Mosler, Karl, and Mozharovskyi, Pavlo
- Subjects
Nonparametric method, Alpha-procedure, pattern recognition, ddc:330, DD-plot, misclassification rate, zonoid depth, cluster analysis, supervised learning, theory
- Abstract
A new procedure, called DDα-procedure, is developed to solve the problem of classifying d-dimensional objects into q ≥ 2 classes. The procedure is completely nonparametric; it uses q-dimensional depth plots and a very efficient algorithm for discrimination analysis in the depth space [0, 1]^q. Specifically, the depth is the zonoid depth, and the algorithm is the α-procedure. In case of more than two classes several binary classifications are performed and a majority rule is applied. Special treatments are discussed for outsiders, that is, data having zero depth vector. The DDα-classifier is applied to simulated as well as real data, and the results are compared with those of similar procedures that have been recently proposed. In most cases the new procedure has comparable error rates, but is much faster than other classification approaches, including the SVM.
- Published
- 2012
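The majority rule mentioned in the abstract above can be sketched briefly. Under a simplifying assumption, the binary classifier here is a stand-in (nearest class mean) rather than the zonoid-depth α-procedure of the paper; what the sketch shows is only the voting scheme: run all pairwise binary classifications and assign the label that wins the most duels.

```python
# Sketch of the majority rule for q > 2 classes: every pair of classes
# votes via a binary classifier; the most-voted label wins. The binary
# rule (nearest class mean) is a placeholder, not the paper's method.
from collections import Counter
from itertools import combinations
import numpy as np

def binary_vote(x, mean_i, mean_j, label_i, label_j):
    """Placeholder binary classifier: pick the closer class mean."""
    if np.linalg.norm(x - mean_i) <= np.linalg.norm(x - mean_j):
        return label_i
    return label_j

def majority_classify(x, class_means):
    """Aggregate all pairwise binary decisions by majority vote."""
    votes = Counter()
    for (li, mi), (lj, mj) in combinations(class_means.items(), 2):
        votes[binary_vote(x, mi, mj, li, lj)] += 1
    return votes.most_common(1)[0][0]

means = {"a": np.array([0.0, 0.0]),
         "b": np.array([4.0, 0.0]),
         "c": np.array([0.0, 4.0])}
label = majority_classify(np.array([0.5, 0.2]), means)   # nearest to "a"
```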
27. Fast DD-classification of functional data.
- Author
-
Mosler, Karl and Mozharovskyi, Pavlo
- Subjects
SUPERVISED learning ,BLOWING up (Algebraic geometry) ,NONPARAMETRIC statistics ,SIMULATION methods & models ,BAYES' estimation ,MATHEMATICAL models - Abstract
A fast nonparametric procedure for classifying functional data is introduced. It consists of a two-step transformation of the original data plus a classifier operating on a low-dimensional space. The functional data are first mapped into a finite-dimensional location-slope space and then transformed by a multivariate depth function into the DD-plot, which is a subset of the unit square. This transformation yields a new notion of depth for functional data. Three alternative depth functions are employed for this, as well as two rules for the final classification in $$[0,1]^2$$. The resulting classifier has to be cross-validated over a small range of parameters only, which is restricted by a Vapnik-Chervonenkis bound. The entire methodology does not involve smoothing techniques, is completely nonparametric, and achieves Bayes optimality under standard distributional settings. It is robust, efficiently computable, and has been implemented in an R environment. Applicability of the new approach is demonstrated by simulations as well as by a benchmark study. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
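The location-slope mapping in the abstract above can be illustrated in its coarsest form. As a simplifying assumption, each (discretised) curve is reduced to just two numbers, its average level and its average slope; the paper's construction is richer, using integrals of the curve and its derivative over several subintervals.

```python
# Coarsest version of a location-slope map: a sampled curve x(t) on a
# uniform grid is summarised by (mean level, mean slope). Illustrative
# simplification of the finite-dimensional embedding described above.
import numpy as np

def location_slope(t, x):
    """Map a curve sampled on a uniform grid to (mean level, mean slope)."""
    location = float(np.mean(x))                      # average value of the curve
    slope = float((x[-1] - x[0]) / (t[-1] - t[0]))    # average derivative
    return location, slope

t = np.linspace(0.0, 1.0, 101)
loc, slo = location_slope(t, 2.0 * t + 1.0)           # the line x(t) = 2t + 1
# For this line the average level over [0, 1] is 2 and the slope is 2.
```

Once every curve is reduced to such a point, any multivariate depth function (as in the multivariate DDα-procedure) can be applied to the embedded sample, which is exactly the two-step construction the abstract describes.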
28. Nonparametric frontier analysis using Stata.
- Author
-
Badunenko, Oleg and Mozharovskyi, Pavlo
- Subjects
- *
STOCHASTIC frontier analysis , *INFERENTIAL statistics , *LINEAR programming - Abstract
In this article, we describe five new Stata commands that fit and provide statistical inference in nonparametric frontier models. The tenonradial and teradial commands fit data envelopment models in which nonradial and radial technical efficiency measures are computed (Färe, 1998, Fundamentals of Production Theory; Färe and Lovell, 1978, Journal of Economic Theory 19: 150-162; Färe, Grosskopf, and Lovell, 1994a, Production Frontiers). Technical efficiency measures are obtained by solving linear programming problems. The teradialbc, nptestind, and nptestrts commands provide tools for making statistical inference regarding radial technical efficiency measures (Simar and Wilson, 1998, Management Science 44: 49-61; 2000, Journal of Applied Statistics 27: 779-802; 2002, European Journal of Operational Research 139: 115-132). We provide a brief overview of nonparametric efficiency measurement, and we describe the syntax and options of the new commands. Additionally, we provide an example showing the capabilities of the new commands. Finally, we perform a small empirical study of productivity growth. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
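The linear programming problems behind radial efficiency measures, mentioned in the abstract above, can be sketched outside Stata. This is an illustrative textbook formulation, not the tenonradial/teradial commands: the input-oriented radial (Farrell) efficiency of one unit under constant returns to scale is the optimal value θ of min θ subject to Yλ ≥ y₀, Xλ ≤ θx₀, λ ≥ 0.

```python
# Textbook DEA sketch: radial, input-oriented technical efficiency of a
# single unit under constant returns to scale, solved as a small LP.
# Illustrative only; not the Stata commands described in the article.
import numpy as np
from scipy.optimize import linprog

def radial_efficiency(X, Y, x0, y0):
    """X: (n_units, n_inputs), Y: (n_units, n_outputs). Returns theta."""
    n = X.shape[0]
    c = np.r_[1.0, np.zeros(n)]                      # minimise theta
    # output constraints  -Y^T lam <= -y0   (i.e. Y^T lam >= y0)
    A_out = np.hstack([np.zeros((Y.shape[1], 1)), -Y.T])
    # input constraints   X^T lam - theta * x0 <= 0
    A_in = np.hstack([-x0.reshape(-1, 1), X.T])
    res = linprog(c,
                  A_ub=np.vstack([A_out, A_in]),
                  b_ub=np.r_[-y0, np.zeros(x0.size)],
                  bounds=[(None, None)] + [(0.0, None)] * n)
    return res.x[0]

X = np.array([[2.0], [4.0], [8.0]])                  # one input, three units
Y = np.array([[1.0], [2.0], [3.0]])                  # one output
theta = radial_efficiency(X, Y, X[2], Y[2])          # evaluate the third unit
# The CRS frontier is y = 0.5 x, so unit 3 (input 8, output 3) could
# produce its output with input 6; its radial efficiency is 6/8 = 0.75.
```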
29. DDα-Classification of Asymmetric and Fat-Tailed Data.
- Author
-
Lange, Tatjana, Mosler, Karl, and Mozharovskyi, Pavlo
- Published
- 2014
- Full Text
- View/download PDF
30. The Alpha-Procedure: A Nonparametric Invariant Method for Automatic Classification of Multi-Dimensional Objects.
- Author
-
Lange, Tatjana and Mozharovskyi, Pavlo
- Published
- 2014
- Full Text
- View/download PDF
31. Classifying real-world data with the $DD\alpha$-procedure.
- Author
-
Mozharovskyi, Pavlo, Mosler, Karl, and Lange, Tatjana
- Abstract
The $DD\alpha$-classifier, a fast and very robust nonparametric procedure, is described and applied to fifty classification problems covering a broad spectrum of real-world data. The procedure first transforms the data from their original property space into a depth space, a low-dimensional unit cube, and then separates them by a projective invariant procedure called the $\alpha$-procedure. The transformation assigns to each data point its depth values with respect to the given classes. Several alternative depth notions (spatial depth, Mahalanobis depth, projection depth, and Tukey depth, the latter two approximated by univariate projections) are used in the procedure and compared with regard to their average error rates. With the Tukey depth, which fits the distributions' shape best and is most robust, 'outsiders', that is, data points having zero depth in all classes, appear; they need an additional treatment for classification. Evidence is also given about the dimension of the extended feature space needed for linear separation. The $DD\alpha$-procedure is available as an R package. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
32. The alpha-procedure: a nonparametric invariant method for automatic classification of multi-dimensional objects
- Author
-
Tatjana Lange, Pavlo Mozharovskyi, Hochschule Merseburg, Institut Polytechnique de Paris (IP Paris), Département Images, Données, Signal (IDS), Télécom ParisTech, Signal, Statistique et Apprentissage (S2A), Laboratoire Traitement et Communication de l'Information (LTCI), Institut Mines-Télécom [Paris] (IMT)-Télécom Paris-Institut Mines-Télécom [Paris] (IMT)-Télécom Paris, and Mozharovskyi, Pavlo
- Subjects
Multivariate statistics ,021103 operations research ,[STAT.ME] Statistics [stat]/Methodology [stat.ME] ,Computer science ,business.industry ,Feature vector ,0211 other engineering and technologies ,Nonparametric statistics ,Pattern recognition ,02 engineering and technology ,Linear discriminant analysis ,01 natural sciences ,Linear subspace ,010104 statistics & probability ,Hyperplane ,Multi dimensional ,Artificial intelligence ,0101 mathematics ,Invariant (mathematics) ,business ,[STAT.ME]Statistics [stat]/Methodology [stat.ME] ,ComputingMilieux_MISCELLANEOUS - Abstract
A procedure, called the α-procedure, for the efficient automatic classification of multivariate data is described. It is based on a geometric representation of two learning classes in a proper multi-dimensional rectifying feature space and the stepwise construction of a separating hyperplane in that space. The dimension of the space, i.e. the number of features necessary for a successful classification, is determined step by step using two-dimensional repères (linear subspaces). In each step, a repère and a feature are constructed in such a way that they yield maximal discriminating power. Throughout the procedure the invariant, which is the object's affiliation with a class, is preserved.
- Published
- 2012
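The stepwise idea in the abstract above can be given a loose sketch. This is not the published algorithm; as an assumption, each step simply pairs the current discriminating feature with one original coordinate in a two-dimensional plane (a repère) and searches over angles for the projection whose midpoint-threshold rule misclassifies the fewest training points.

```python
# Loose sketch of stepwise construction in two-dimensional reperes:
# combine the current feature with one coordinate and pick the angle
# whose 1-D projection best separates two classes. Illustrative only.
import numpy as np

def best_direction(f, g, labels, n_angles=180):
    """Return the projection of the 2-D repere (f, g) with fewest errors."""
    best, best_err = f, np.inf
    for phi in np.linspace(0.0, np.pi, n_angles, endpoint=False):
        proj = np.cos(phi) * f + np.sin(phi) * g
        # threshold at the midpoint of the two class means on this axis
        thr = (proj[labels == 0].mean() + proj[labels == 1].mean()) / 2.0
        pred = (proj > thr).astype(int)
        # min(.) handles the arbitrary orientation of the threshold rule
        err = min(np.mean(pred != labels), np.mean(pred == labels))
        if err < best_err:
            best, best_err = proj, err
    return best, best_err

rng = np.random.default_rng(1)
A = rng.normal([0.0, 0.0], 0.5, size=(100, 2))       # class 0
B = rng.normal([2.0, 2.0], 0.5, size=(100, 2))       # class 1
X = np.vstack([A, B])
y = np.r_[np.zeros(100, int), np.ones(100, int)]

# step 1: start from the first coordinate; step 2: refine with the second
feat, _ = best_direction(X[:, 0], X[:, 1], y)
_, err = best_direction(feat, X[:, 1], y)            # training error after two steps
```

Each step here preserves the invariant the abstract mentions: points keep their class labels while only the discriminating feature is rebuilt.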