Author: "She, Yiyuan" / Database: OpenAIRE - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"She, Yiyuan"' showing total 7 results

1. Supervised Multivariate Learning with Simultaneous Feature Auto-grouping and Dimension Reduction

Author: She, Yiyuan, Shen, Jiahui, and Zhang, Chao
Subjects: Statistics and Probability, Methodology (stat.ME), FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, FOS: Mathematics, Mathematics - Statistics Theory, Machine Learning (stat.ML), Statistics Theory (math.ST), Statistics, Probability and Uncertainty, Statistics - Methodology, Machine Learning (cs.LG)
Abstract: Modern high-dimensional methods often adopt the ‘bet on sparsity’ principle, while in supervised multivariate learning statisticians may face ‘dense’ problems with a large number of nonzero coefficients. This paper proposes a novel clustered reduced-rank learning (CRL) framework that imposes two joint matrix regularizations to automatically group the features in constructing predictive factors. CRL is more interpretable than low-rank modelling and relaxes the stringent sparsity assumption in variable selection. In this paper, new information-theoretical limits are presented to reveal the intrinsic cost of seeking for clusters, as well as the blessing from dimensionality in multivariate learning. Moreover, an efficient optimization algorithm is developed, which performs subspace learning and clustering with guaranteed convergence. The obtained fixed-point estimators, although not necessarily globally optimal, enjoy the desired statistical accuracy beyond the standard likelihood setup under some regularity conditions. Moreover, a new kind of information criterion, as well as its scale-free form, is proposed for cluster and rank selection, and has a rigorous theoretical support without assuming an infinite sample size. Extensive simulations and real-data experiments demonstrate the statistical accuracy and interpretability of the proposed method.
Published: 2021

2. Analysis of Generalized Bregman Surrogate Algorithms for Nonsmooth Nonconvex Statistical Learning

Author: She, Yiyuan, Wang, Zhifeng, and Jin, Jiuwu
Subjects: Statistics and Probability, FOS: Computer and information sciences, Statistics - Machine Learning, Optimization and Control (math.OC), FOS: Mathematics, Mathematics - Statistics Theory, Machine Learning (stat.ML), Statistics Theory (math.ST), Statistics, Probability and Uncertainty, Mathematics - Optimization and Control, Statistics - Computation, Computation (stat.CO)
Abstract: Modern statistical applications often involve minimizing an objective function that may be nonsmooth and/or nonconvex. This paper focuses on a broad Bregman-surrogate algorithm framework including the local linear approximation, mirror descent, iterative thresholding, DC programming and many others as particular instances. The recharacterization via generalized Bregman functions enables us to construct suitable error measures and establish global convergence rates for nonconvex and nonsmooth objectives in possibly high dimensions. For sparse learning problems with a composite objective, under some regularity conditions, the obtained estimators as the surrogate's fixed points, though not necessarily local minimizers, enjoy provable statistical guarantees, and the sequence of iterates can be shown to approach the statistical truth within the desired accuracy geometrically fast. The paper also studies how to design adaptive momentum based accelerations without assuming convexity or smoothness by carefully controlling stepsize and relaxation parameters.
Published: 2021

3. Robust Reduced Rank Regression

Author: She, Yiyuan and Chen, Kun
Subjects: Methodology (stat.ME), FOS: Computer and information sciences, FOS: Mathematics, Mathematics - Statistics Theory, Applications (stat.AP), Statistics Theory (math.ST), Statistics - Applications, Statistics - Methodology
Abstract: In high-dimensional multivariate regression problems, enforcing low rank in the coefficient matrix offers effective dimension reduction, which greatly facilitates parameter estimation and model interpretation. However, commonly-used reduced-rank methods are sensitive to data corruption, as the low-rank dependence structure between response variables and predictors is easily distorted by outliers. We propose a robust reduced-rank regression approach for joint modeling and outlier detection. The problem is formulated as a regularized multivariate regression with a sparse mean-shift parametrization, which generalizes and unifies some popular robust multivariate methods. An efficient thresholding-based iterative procedure is developed for optimization. We show that the algorithm is guaranteed to converge, and the coordinatewise minimum point produced is statistically accurate under regularity conditions. Our theoretical investigations focus on nonasymptotic robust analysis, which demonstrates that joint rank reduction and outlier detection leads to improved prediction accuracy. In particular, we show that redescending $\psi$-functions can essentially attain the minimax optimal error rate, and in some less challenging problems convex regularization guarantees the same low error rate. The performance of the proposed method is examined by simulation studies and real data examples.
Published: 2015

4. On the Finite-Sample Analysis of $\Theta$-estimators

Author: She, Yiyuan
Subjects: FOS: Mathematics, Statistics Theory (math.ST)
Abstract: In large-scale modern data analysis, first-order optimization methods are usually favored to obtain sparse estimators in high dimensions. This paper performs theoretical analysis of a class of iterative thresholding based estimators defined in this way. Oracle inequalities are built to show the nearly minimax rate optimality of such estimators under a new type of regularity conditions. Moreover, the sequence of iterates is found to be able to approach the statistical truth within the best statistical accuracy geometrically fast. Our results also reveal different benefits brought by convex and nonconvex types of shrinkage.
Published: 2015

5. Approximating Higher-Order Distances Using Random Projections

Author: Li, Ping, Mahoney, Michael W., and She, Yiyuan
Subjects: FOS: Computer and information sciences, Computer Science - Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: We provide a simple method and relevant theoretical analysis for efficiently estimating higher-order lp distances. While the analysis mainly focuses on l4, our methodology extends naturally to p = 6, 8, 10, ... (i.e., when p is even). Distance-based methods are popular in machine learning. In large-scale applications, storing, computing, and retrieving the distances can be both space and time prohibitive. Efficient algorithms exist for estimating lp distances if 0 < p ≤ 2. The task for p > 2 is known to be difficult. Our work partially fills this gap. Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010).
Published: 2012

6. Sparse regression with exact clustering

Author: She, Yiyuan
Subjects: Statistics and Probability, Statistics::Machine Learning, ComputingMethodologies_PATTERNRECOGNITION, 62J07, MathematicsofComputing_NUMERICALANALYSIS, thresholding, Statistics, Probability and Uncertainty, lasso, Sparsity, 62H30, clustering
Abstract: This paper studies a generic sparse regression problem with a customizable sparsity pattern matrix, motivated by, but not limited to, a supervised gene clustering problem in microarray data analysis. The clustered lasso method is proposed with the l1-type penalties imposed on both the coefficients and their pairwise differences. Somewhat surprisingly, it behaves differently than the lasso or the fused lasso – the exact clustering effect expected from the l1 penalization is rarely seen in applications. An asymptotic study is performed to investigate the power and limitations of the l1-penalty in sparse regression. We propose to combine data-augmentation and weights to improve the l1 technique. To address the computational issues in high dimensions, we successfully generalize a popular iterative algorithm both in practice and in theory and propose an ‘annealing’ algorithm applicable to generic sparse regressions (including the fused/clustered lasso). Some effective accelerating techniques are further investigated to boost the convergence. The accelerated annealing (AA) algorithm, involving only matrix multiplications and thresholdings, can handle a large design matrix as well as a large sparsity pattern matrix.
Published: 2010

7. Block TERM factorization of block matrices

Author: Hao Peng-wei and She Yiyuan
Subjects: Algebra, Discrete mathematics, Integer matrix, Unimodular matrix, General Computer Science, Matrix function, ComputingMethodologies_SYMBOLICANDALGEBRAICMANIPULATION, Prime factor, Block matrix, Quadratic sieve, Mathematics, Matrix decomposition, Integer (computer science)
Abstract: Reversible integer mapping (or integer transform) is a useful way to realize lossless coding, and this technique has been used for multi-component image compression in the new international image compression standard JPEG 2000. For any nonsingular linear transform of finite dimension, its integer transform can be implemented by factorizing the transform matrix into 3 triangular elementary reversible matrices (TERMs) or a series of single-row elementary reversible matrices (SERMs). To speed up and parallelize integer transforms, we study block TERM and SERM factorizations in this paper. First, to guarantee flexible scaling manners, the classical determinant (det) is generalized to a matrix function, DET, which is shown to have many important properties analogous to those of det. Then based on DET, a generic block TERM factorization, BLUS, is presented for any nonsingular block matrix. Our conclusions can cover the early optimal point factorizations and provide an efficient way to implement integer transforms for large matrices.
Published: 2004
