Author: "Pal, Soumyabrata" / Publication Year Range: Last 10 years - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Pal, Soumyabrata"' showing total 44 results


1. Near-Optimal Streaming Heavy-Tailed Statistical Estimation with Clipped SGD

Author: Das, Aniket, Nagaraj, Dheeraj, Pal, Soumyabrata, Suggala, Arun, and Varshney, Prateek
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: We consider the problem of high-dimensional heavy-tailed statistical estimation in the streaming setting, which is much harder than the traditional batch setting due to memory constraints. We cast this problem as stochastic convex optimization with heavy-tailed stochastic gradients, and prove that the widely used Clipped-SGD algorithm attains near-optimal sub-Gaussian statistical rates whenever the second moment of the stochastic gradient noise is finite. More precisely, with $T$ samples, we show that Clipped-SGD, for smooth and strongly convex objectives, achieves an error of $\sqrt{\frac{\mathsf{Tr}(\Sigma)+\sqrt{\mathsf{Tr}(\Sigma)\|\Sigma\|_2}\log(\frac{\log(T)}{\delta})}{T}}$ with probability $1-\delta$, where $\Sigma$ is the covariance of the clipped gradient. Note that the fluctuations (depending on $\frac{1}{\delta}$) are of lower order than the term $\mathsf{Tr}(\Sigma)$. This improves upon the current best rate of $\sqrt{\frac{\mathsf{Tr}(\Sigma)\log(\frac{1}{\delta})}{T}}$ for Clipped-SGD, known only for smooth and strongly convex objectives. Our results also extend to smooth convex and Lipschitz convex objectives. Key to our result is a novel iterative refinement strategy for martingale concentration, improving upon the PAC-Bayes approach of Catoni and Giulini., Comment: Accepted at NeurIPS 2024
Published: 2024
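
Illustrative sketch for entry 1 (not taken from the paper): the abstract casts streaming estimation as stochastic optimization solved with Clipped-SGD. The snippet below applies gradient-norm clipping to streaming mean estimation, using the objective $\frac{1}{2}\mathbb{E}\|x - Z\|^2$, whose stochastic gradient at $x$ is $x - z$. The function names, the $1/t$ step size, and the clipping threshold are assumptions chosen for illustration; the paper's actual parameter choices are not stated in the abstract.

```python
import math


def clip(vec, threshold):
    """Rescale vec so that its Euclidean norm is at most threshold."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= threshold or norm == 0.0:
        return list(vec)
    scale = threshold / norm
    return [v * scale for v in vec]


def clipped_sgd_mean(stream, dim, threshold, step=lambda t: 1.0 / t):
    """One pass over the stream, keeping only the current iterate in memory.

    Assumed objective: 0.5 * E||x - Z||^2, so the stochastic gradient
    for a sample z is (x - z). Step size and threshold are hypothetical.
    """
    x = [0.0] * dim
    for t, z in enumerate(stream, start=1):
        grad = [xi - zi for xi, zi in zip(x, z)]
        clipped = clip(grad, threshold)
        eta = step(t)
        x = [xi - eta * gi for xi, gi in zip(x, clipped)]
    return x
```

With a threshold that never binds and step size $1/t$, the iterate is the running sample mean; a binding threshold limits how far any single heavy-tailed sample can move the estimate.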

2. Annotation Efficiency: Identifying Hard Samples via Blocked Sparse Linear Bandits

Author: Jain, Adit, Pal, Soumyabrata, Choudhary, Sunav, Narayanam, Ramasuri, and Krishnamurthy, Vikram
Subjects: Computer Science - Machine Learning
Abstract: This paper considers the problem of annotating datapoints using an expert with only a few annotation rounds in a label-scarce setting. We propose soliciting reliable feedback on difficulty in annotating a datapoint from the expert in addition to ground truth label. Existing literature in active learning or coreset selection turns out to be less relevant to our setting since they presume the existence of a reliable trained model, which is absent in the label-scarce regime. However, the literature on coreset selection emphasizes the presence of difficult data points in the training set to perform supervised learning in downstream tasks (Mindermann et al., 2022). Therefore, for a given fixed annotation budget of $\mathsf{T}$ rounds, we model the sequential decision-making problem of which (difficult) datapoints to choose for annotation in a sparse linear bandits framework with the constraint that no arm can be pulled more than once (blocking constraint). With mild assumptions on the datapoints, our (computationally efficient) Explore-Then-Commit algorithm BSLB achieves a regret guarantee of $\widetilde{\mathsf{O}}(k^{\frac{1}{3}} \mathsf{T}^{\frac{2}{3}} +k^{-\frac{1}{2}} \beta_k + k^{-\frac{1}{12}} \beta_k^{\frac{1}{2}}\mathsf{T}^{\frac{5}{6}})$ where the unknown parameter vector has tail magnitude $\beta_k$ at sparsity level $k$. To this end, we show offline statistical guarantees of Lasso estimator with mild Restricted Eigenvalue (RE) condition that is also robust to sparsity. Finally, we propose a meta-algorithm C-BSLB that does not need knowledge of the optimal sparsity parameters at a no-regret cost. We demonstrate the efficacy of our BSLB algorithm for annotation in the label-scarce setting for an image classification task on the PASCAL-VOC dataset, where we use real-world annotation difficulty scores., Comment: 31 Pages
Published: 2024

3. FiRST: Finetuning Router-Selective Transformers for Input-Adaptive Latency Reduction

Author: Jain, Akriti, Sharma, Saransh, Mukherjee, Koyel, and Pal, Soumyabrata
Subjects: Computer Science - Computation and Language
Abstract: Auto-regressive Large Language Models (LLMs) demonstrate remarkable performance across domains such as vision and language processing. However, due to sequential processing through a stack of transformer layers, autoregressive decoding faces significant computation/latency challenges, particularly in resource constrained environments like mobile and edge devices. Existing approaches in literature that aim to improve latency via skipping layers have two distinct flavors - 1) Early exit 2) Input-agnostic heuristics where tokens exit at pre-determined layers irrespective of input sequence. Both the above strategies have limitations - the former cannot be applied to handle KV Caching necessary for speed-ups in modern frameworks and the latter does not capture the variation in layer importance across tasks or more generally, across input sequences. To address both limitations, we propose FIRST, an algorithm that reduces inference latency by using layer-specific routers to select a subset of transformer layers adaptively for each input sequence - the prompt (during prefill stage) decides which layers will be skipped during decoding. FIRST preserves compatibility with KV caching enabling faster inference while being quality-aware. FIRST is model-agnostic and can be easily enabled on any pre-trained LLM. We further improve performance by incorporating LoRA adapters for fine-tuning on external datasets, enhancing task-specific accuracy while maintaining latency benefits. Our approach reveals that input adaptivity is critical - indeed, different task-specific middle layers play a crucial role in evolving hidden representations depending on task. Extensive experiments show that FIRST significantly reduces latency while retaining competitive performance (as compared to baselines), making our approach an efficient solution for LLM deployment in low-resource environments., Comment: 17 pages, 6 figures, Submitted to ICLR 2025
Published: 2024

4. Online Matrix Completion: A Collaborative Approach with Hott Items

Author: Baby, Dheeraj and Pal, Soumyabrata
Subjects: Computer Science - Machine Learning, Computer Science - Information Retrieval, Statistics - Machine Learning
Abstract: We investigate the low rank matrix completion problem in an online setting with ${M}$ users, ${N}$ items, ${T}$ rounds, and an unknown rank-$r$ reward matrix ${R}\in \mathbb{R}^{{M}\times {N}}$. This problem has been well-studied in the literature and has several applications in practice. In each round, we recommend ${S}$ carefully chosen distinct items to every user and observe noisy rewards. In the regime where ${M},{N} \gg {T}$, we propose two distinct computationally efficient algorithms for recommending items to users and analyze them under the benign \emph{hott items} assumption. 1) First, for ${S}=1$, under additional incoherence/smoothness assumptions on ${R}$, we propose the phased algorithm \textsc{PhasedClusterElim}. Our algorithm obtains a near-optimal per-user regret of $\tilde{O}({N}{M}^{-1}(\Delta^{-1}+\Delta_{{hott}}^{-2}))$ where $\Delta_{{hott}},\Delta$ are problem-dependent gap parameters with $\Delta_{{hott}} \gg \Delta$ almost always. 2) Second, we consider a simplified setting with ${S}=r$ where we make significantly milder assumptions on ${R}$. Here, we introduce another phased algorithm, \textsc{DeterminantElim}, to derive a regret guarantee of $\widetilde{O}({N}{M}^{-1/r}\Delta_{{det}}^{-1})$ where $\Delta_{{det}}$ is another problem-dependent gap. Both algorithms crucially use collaboration among users to jointly eliminate sub-optimal items for groups of users successively in phases, but with distinctive and novel approaches., Comment: Appeared at the Forty-first International Conference on Machine Learning, 2024
Published: 2024

5. Blocked Collaborative Bandits: Online Collaborative Filtering with Per-Item Budget Constraints

Author: Pal, Soumyabrata, Suggala, Arun Sai, Shanmugam, Karthikeyan, and Jain, Prateek
Subjects: Computer Science - Information Retrieval, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We consider the problem of \emph{blocked} collaborative bandits where there are multiple users, each with an associated multi-armed bandit problem. These users are grouped into \emph{latent} clusters such that the mean reward vectors of users within the same cluster are identical. Our goal is to design algorithms that maximize the cumulative reward accrued by all the users over time, under the \emph{constraint} that no arm of a user is pulled more than $\mathsf{B}$ times. This problem has been originally considered by \cite{Bresler:2014}, and designing regret-optimal algorithms for it has since remained an open problem. In this work, we propose an algorithm called \texttt{B-LATTICE} (Blocked Latent bAndiTs via maTrIx ComplEtion) that collaborates across users, while simultaneously satisfying the budget constraints, to maximize their cumulative rewards. Theoretically, under certain reasonable assumptions on the latent structure, with $\mathsf{M}$ users, $\mathsf{N}$ arms, $\mathsf{T}$ rounds per user, and $\mathsf{C}=O(1)$ latent clusters, \texttt{B-LATTICE} achieves a per-user regret of $\widetilde{O}(\sqrt{\mathsf{T}(1 + \mathsf{N}\mathsf{M}^{-1})})$ under a budget constraint of $\mathsf{B}=\Theta(\log \mathsf{T})$. These are the first sub-linear regret bounds for this problem, and match the minimax regret bounds when $\mathsf{B}=\mathsf{T}$. Empirically, we demonstrate that our algorithm has superior performance over baselines even when $\mathsf{B}=1$. \texttt{B-LATTICE} runs in phases where in each phase it clusters users into groups and collaborates across users within a group to quickly learn their reward models., Comment: 44 pages, To Appear in NeurIPS 2023
Published: 2023

6. Optimal Algorithms for Latent Bandits with Cluster Structure

Author: Pal, Soumyabrata, Suggala, Arun Sai, Shanmugam, Karthikeyan, and Jain, Prateek
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We consider the problem of latent bandits with cluster structure where there are multiple users, each with an associated multi-armed bandit problem. These users are grouped into \emph{latent} clusters such that the mean reward vectors of users within the same cluster are identical. At each round, a user, selected uniformly at random, pulls an arm and observes a corresponding noisy reward. The goal of the users is to maximize their cumulative rewards. This problem is central to practical recommendation systems and has received wide attention of late \cite{gentile2014online, maillard2014latent}. Now, if each user acts independently, then they would have to explore each arm independently and a regret of $\Omega(\sqrt{\mathsf{MNT}})$ is unavoidable, where $\mathsf{M}, \mathsf{N}$ are the number of arms and users, respectively. Instead, we propose LATTICE (Latent bAndiTs via maTrIx ComplEtion) which allows exploitation of the latent cluster structure to provide the minimax optimal regret of $\widetilde{O}(\sqrt{(\mathsf{M}+\mathsf{N})\mathsf{T}})$, when the number of clusters is $\widetilde{O}(1)$. This is the first algorithm to guarantee such strong regret bound. LATTICE is based on a careful exploitation of arm information within a cluster while simultaneously clustering users. Furthermore, it is computationally efficient and requires only $O(\log{\mathsf{T}})$ calls to an offline matrix completion oracle across all $\mathsf{T}$ rounds., Comment: 48 pages. Accepted to AISTATS 2023. Added Experiments
Published: 2023

7. Improved Support Recovery in Universal One-bit Compressed Sensing

Author: Matsumoto, Namiko, Mazumdar, Arya, and Pal, Soumyabrata
Subjects: Computer Science - Information Theory, Computer Science - Discrete Mathematics, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: One-bit compressed sensing (1bCS) is an extremely quantized signal acquisition method that has been proposed and studied rigorously in the past decade. In 1bCS, linear samples of a high dimensional signal are quantized to only one bit per sample (sign of the measurement). Assuming the original signal vector to be sparse, existing results in 1bCS either aim to find the support of the vector, or approximate the signal allowing a small error. The focus of this paper is support recovery, which often also computationally facilitates approximate signal recovery. A {\em universal} measurement matrix for 1bCS refers to one set of measurements that work for all sparse signals. With universality, it is known that $\tilde{\Theta}(k^2)$ 1bCS measurements are necessary and sufficient for support recovery (where $k$ denotes the sparsity). To improve the dependence on sparsity from quadratic to linear, in this work we propose approximate support recovery (allowing $\epsilon>0$ proportion of errors), and superset recovery (allowing $\epsilon$ proportion of false positives). We show that the first type of recovery is possible with $\tilde{O}(k/\epsilon)$ measurements, while the latter type of recovery, more challenging, is possible with $\tilde{O}(\max\{k/\epsilon,k^{3/2}\})$ measurements. We also show that in both cases $\Omega(k/\epsilon)$ measurements would be necessary for universal recovery. Improved results are possible if we consider universal recovery within a restricted class of signals, such as rational signals, or signals with bounded dynamic range. In both cases superset recovery is possible with only $\tilde{O}(k/\epsilon)$ measurements. Other results on universal but approximate support recovery are also provided in this paper. All of our main recovery algorithms are simple and polynomial-time., Comment: 26 pages, no figures. This paper is an extended and improved version of arXiv:2107.09091 (accepted to ITCS 2022)
Published: 2022

8. Sample-Efficient Personalization: Modeling User Parameters as Low Rank Plus Sparse Components

Author: Pal, Soumyabrata, Varshney, Prateek, Jain, Prateek, Thakurta, Abhradeep Guha, Madan, Gagan, Aggarwal, Gaurav, Shenoy, Pradeep, and Srivastava, Gaurav
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: Personalization of machine learning (ML) predictions for individual users/domains/enterprises is critical for practical recommendation systems. Standard personalization approaches involve learning a user/domain specific embedding that is fed into a fixed global model which can be limiting. On the other hand, personalizing/fine-tuning model itself for each user/domain -- a.k.a meta-learning -- has high storage/infrastructure cost. Moreover, rigorous theoretical studies of scalable personalization approaches have been very limited. To address the above issues, we propose a novel meta-learning style approach that models network weights as a sum of low-rank and sparse components. This captures common information from multiple individuals/users together in the low-rank part while sparse part captures user-specific idiosyncrasies. We then study the framework in the linear setting, where the problem reduces to that of estimating the sum of a rank-$r$ and a $k$-column sparse matrix using a small number of linear measurements. We propose a computationally efficient alternating minimization method with iterative hard thresholding -- AMHT-LRS -- to learn the low-rank and sparse part. Theoretically, for the realizable Gaussian data setting, we show that AMHT-LRS solves the problem efficiently with nearly optimal sample complexity. Finally, a significant challenge in personalization is ensuring privacy of each user's sensitive data. We alleviate this problem by proposing a differentially private variant of our method that also is equipped with strong generalization guarantees., Comment: 104 pages, 7 figures, 2 Tables
Published: 2022

9. Online Low Rank Matrix Completion

Author: Jain, Prateek and Pal, Soumyabrata
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We study the problem of {\em online} low-rank matrix completion with $\mathsf{M}$ users, $\mathsf{N}$ items and $\mathsf{T}$ rounds. In each round, the algorithm recommends one item per user, for which it gets a (noisy) reward sampled from a low-rank user-item preference matrix. The goal is to design a method with sub-linear regret (in $\mathsf{T}$) and nearly optimal dependence on $\mathsf{M}$ and $\mathsf{N}$. The problem can be easily mapped to the standard multi-armed bandit problem where each item is an {\em independent} arm, but that leads to poor regret as the correlation between arms and users is not exploited. On the other hand, exploiting the low-rank structure of reward matrix is challenging due to non-convexity of the low-rank manifold. We first demonstrate that the low-rank structure can be exploited using a simple explore-then-commit (ETC) approach that ensures a regret of $O(\mathsf{polylog} (\mathsf{M}+\mathsf{N}) \mathsf{T}^{2/3})$. That is, roughly only $\mathsf{polylog} (\mathsf{M}+\mathsf{N})$ item recommendations are required per user to get a non-trivial solution. We then improve our result for the rank-$1$ setting which in itself is quite challenging and encapsulates some of the key issues. Here, we propose \textsc{OCTAL} (Online Collaborative filTering using iterAtive user cLustering) that guarantees nearly optimal regret of $O(\mathsf{polylog} (\mathsf{M}+\mathsf{N}) \mathsf{T}^{1/2})$. OCTAL is based on a novel technique of clustering users that allows iterative elimination of items and leads to a nearly optimal minimax rate., Comment: 37 pages, 7 figures (Accepted at ICLR 2023)
Published: 2022

10. Community Recovery in the Geometric Block Model

Author: Galhotra, Sainyam, Mazumdar, Arya, Pal, Soumyabrata, and Saha, Barna
Subjects: Computer Science - Social and Information Networks, Computer Science - Discrete Mathematics, Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: To capture the inherent geometric features of many community detection problems, we propose to use a new random graph model of communities that we call a Geometric Block Model. The geometric block model builds on the random geometric graphs (Gilbert, 1961), one of the basic models of random graphs for spatial networks, in the same way that the well-studied stochastic block model builds on the Erd\H{o}s-R\'{e}nyi random graphs. It is also a natural extension of random community models inspired by the recent theoretical and practical advancements in community detection. To analyze the geometric block model, we first provide new connectivity results for random annulus graphs which are generalizations of random geometric graphs. The connectivity properties of geometric graphs have been studied since their introduction, and analyzing them has been more difficult than their Erd\H{o}s-R\'{e}nyi counterparts due to correlated edge formation. We then use the connectivity results of random annulus graphs to provide necessary and sufficient conditions for efficient recovery of communities for the geometric block model. We show that a simple triangle-counting algorithm to detect communities in the geometric block model is near-optimal. For this we consider the following two regimes of graph density. In the regime where the average degree of the graph grows logarithmically with the number of vertices, we show that our algorithm performs extremely well, both theoretically and practically. In contrast, the triangle-counting algorithm is far from being optimum for the stochastic block model in the logarithmic degree regime. We simulate our results on both real and synthetic datasets to show superior performance of both the new model as well as our algorithm., Comment: 53 pages, 18 figures. Accepted at the Journal of Machine Learning Research (JMLR). Shorter versions accepted in AAAI 2018 (see arXiv:1709.05510) and RANDOM 2019 (see arXiv:1804.05013).
arXiv admin note: text overlap with arXiv:1804.05013
Published: 2022
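
Illustrative sketch for entry 10 (not taken from the paper): the abstract mentions a simple triangle-counting algorithm for community detection. One way such a procedure could look is to keep an edge only when its endpoints share enough common neighbors (each common neighbor closes a triangle on that edge) and then read communities off the connected components of the kept edges. The single lower threshold and all function names here are simplifying assumptions; the abstract does not give the paper's decision rule or thresholds.

```python
def common_neighbors(adj, u, v):
    """Number of triangles that contain the edge (u, v)."""
    return len(adj[u] & adj[v])


def filter_edges(adj, threshold):
    """Keep edges whose endpoints share at least `threshold` neighbors.

    `threshold` is a hypothetical tuning parameter.
    """
    kept = set()
    for u in adj:
        for v in adj[u]:
            if u < v and common_neighbors(adj, u, v) >= threshold:
                kept.add((u, v))
    return kept


def components(nodes, edges):
    """Connected components of the graph (nodes, edges), found by DFS."""
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, result = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(neighbors[n] - comp)
        seen |= comp
        result.append(comp)
    return result
```

On two triangles joined by a single bridge edge, a threshold of 1 drops the bridge (it lies on no triangle) and the two triangles come out as separate components.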

11. On Learning Mixture of Linear Regressions in the Non-Realizable Setting

Author: Ghosh, Avishek, Mazumdar, Arya, Pal, Soumyabrata, and Sen, Rajat
Subjects: Statistics - Machine Learning, Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: While mixture of linear regressions (MLR) is a well-studied topic, prior works usually do not analyze such models for prediction error. In fact, {\em prediction} and {\em loss} are not well-defined in the context of mixtures. In this paper, first we show that MLR can be used for prediction where instead of predicting a label, the model predicts a list of values (also known as {\em list-decoding}). The list size is equal to the number of components in the mixture, and the loss function is defined to be the minimum among the losses incurred by all the component models. We show that with this definition, a solution of the empirical risk minimization (ERM) achieves small probability of prediction error. This begs for an algorithm to minimize the empirical risk for MLR, which is known to be computationally hard. Prior algorithmic works in MLR focus on the {\em realizable} setting, i.e., recovery of parameters when data is probabilistically generated by a mixed linear (noisy) model. In this paper we show that a version of the popular alternating minimization (AM) algorithm finds the best fit lines in a dataset even when a realizable model is not assumed, under some regularity conditions on the dataset and the initial points, and thereby provides a solution for the ERM. We further provide an algorithm that runs in polynomial time in the number of datapoints, and recovers a good approximation of the best fit lines. The two algorithms are experimentally compared., Comment: To appear in ICML 2022
Published: 2022
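
Illustrative sketch for entry 11 (not taken from the paper): the abstract defines prediction for a mixture of linear regressions as a list of values, one per component, with the loss being the minimum loss over the components. The snippet below evaluates that min-loss and the corresponding empirical risk for a given set of linear models. Squared loss and the function names are assumptions made for illustration.

```python
def list_predict(models, x):
    """Return one prediction per component model (a list of values)."""
    return [sum(w * xi for w, xi in zip(model, x)) for model in models]


def min_loss(models, x, y):
    """Loss of a list prediction: the smallest squared error in the list.

    Squared error is an assumed choice of per-component loss.
    """
    return min((p - y) ** 2 for p in list_predict(models, x))


def empirical_risk(models, data):
    """Average min-loss over a dataset of (x, y) pairs."""
    return sum(min_loss(models, x, y) for x, y in data) / len(data)
```

A data point contributes zero loss as soon as any one component predicts its label exactly, which is why the empirical risk can be small even when no single line fits all the data.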

12. Support Recovery in Mixture Models with Sparse Parameters

Author: Mazumdar, Arya and Pal, Soumyabrata
Subjects: Computer Science - Machine Learning, Computer Science - Information Theory, Statistics - Machine Learning
Abstract: Mixture models are widely used to fit complex and multimodal datasets. In this paper we study mixtures with high dimensional sparse latent parameter vectors and consider the problem of support recovery of those vectors. While parameter learning in mixture models is well-studied, the sparsity constraint remains relatively unexplored. Sparsity of parameter vectors is a natural constraint in a variety of settings, and support recovery is a major step towards parameter estimation. We provide efficient algorithms for support recovery that have a logarithmic sample complexity dependence on the dimensionality of the latent space. Our algorithms are quite general, namely they are applicable to 1) mixtures of many different canonical distributions including Uniform, Poisson, Laplace, Gaussians, etc. 2) Mixtures of linear regressions and linear classifiers with Gaussian covariates under different assumptions on the unknown parameters. In most of these settings, our results are the first guarantees on the problem while in the rest, our results provide improvements on existing works., Comment: 55 pages, Shorter version titled "On Learning Mixture Models with Sparse Parameters" accepted at AISTATS 2022
Published: 2022

13. Random Subgraph Detection Using Queries

Author: Huleihel, Wasim, Mazumdar, Arya, and Pal, Soumyabrata
Subjects: Computer Science - Data Structures and Algorithms, Computer Science - Information Theory, Computer Science - Machine Learning, Mathematics - Statistics Theory
Abstract: The planted densest subgraph detection problem refers to the task of testing whether in a given (random) graph there is a subgraph that is unusually dense. Specifically, we observe an undirected and unweighted graph on $n$ vertices. Under the null hypothesis, the graph is a realization of an Erd\H{o}s-R\'{e}nyi graph with edge probability (or, density) $q$. Under the alternative, there is a subgraph on $k$ vertices with edge probability $p>q$. The statistical as well as the computational barriers of this problem are well-understood for a wide range of the edge parameters $p$ and $q$. In this paper, we consider a natural variant of the above problem, where one can only observe a relatively small part of the graph using adaptive edge queries. For this model, we determine the number of queries necessary and sufficient (accompanied with a quasi-polynomial optimal algorithm) for detecting the presence of the planted subgraph. We also propose a polynomial-time algorithm which is able to detect the planted subgraph, albeit with more queries compared to the above lower bound. We conjecture that in the leftover regime, no polynomial-time algorithms exist. Our results resolve two open questions posed in the past literature., Comment: 27 pages
Published: 2021

14. Lower Bounds on the Total Variation Distance Between Mixtures of Two Gaussians

Author: Davies, Sami, Mazumdar, Arya, Pal, Soumyabrata, and Rashtchian, Cyrus
Subjects: Mathematics - Probability, Computer Science - Information Theory, Computer Science - Machine Learning, Mathematics - Statistics Theory
Abstract: Mixtures of high dimensional Gaussian distributions have been studied extensively in statistics and learning theory. While the total variation distance appears naturally in the sample complexity of distribution learning, it is analytically difficult to obtain tight lower bounds for mixtures. Exploiting a connection between total variation distance and the characteristic function of the mixture, we provide fairly tight functional approximations. This enables us to derive new lower bounds on the total variation distance between pairs of two-component Gaussian mixtures that have a shared covariance matrix., Comment: 22 pages, 1 figure; Accepted to ALT 2022
Published: 2021

15. Support Recovery in Universal One-bit Compressed Sensing

Author: Mazumdar, Arya and Pal, Soumyabrata
Subjects: Computer Science - Information Theory, Computer Science - Discrete Mathematics, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: One-bit compressed sensing (1bCS) is an extreme-quantized signal acquisition method that has been intermittently studied in the past decade. In 1bCS, linear samples of a high dimensional signal are quantized to only one bit per sample (sign of the measurement). The extreme quantization makes it an interesting case study of the more general single-index or generalized linear models. At the same time it can also be thought of as a `design' version of learning a binary linear classifier or halfspace-learning. Assuming the original signal vector to be sparse, existing results in 1bCS either aim to find the support of the vector, or approximate the signal within an $\epsilon$-ball. The focus of this paper is support recovery, which often also computationally facilitates approximate signal recovery. A \emph{universal} measurement matrix for 1bCS refers to one set of measurements that work \emph{for all} sparse signals. With universality, it is known that $\tilde{\Theta}(k^2)$ 1bCS measurements are necessary and sufficient for support recovery (where $k$ denotes the sparsity). In this work, we show that it is possible to universally recover the support with a small number of false positives with $\tilde{O}(k^{3/2})$ measurements. If the dynamic range of the signal vector is known, then with a different technique, this result can be improved to only $\tilde{O}(k)$ measurements. Other results on universal but approximate support recovery are also provided in this paper. All of our main recovery algorithms are simple and polynomial-time., Comment: 20 pages
Published: 2021
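
Illustrative sketch for entry 15 (not taken from the paper): the abstract describes 1bCS as keeping only the sign of each linear measurement of a sparse signal. The snippet below shows that measurement model and the support of a vector, with hypothetical helper names. It does not implement any recovery algorithm from the paper, and the convention of mapping a zero inner product to +1 is an assumption.

```python
def sign(value):
    """One-bit quantizer; zero is mapped to +1 by assumption."""
    return 1 if value >= 0 else -1


def one_bit_measurements(matrix, signal):
    """Return sign(<row, signal>) for every row of the measurement matrix."""
    return [
        sign(sum(a * x for a, x in zip(row, signal)))
        for row in matrix
    ]


def support(signal):
    """Indices of the nonzero coordinates of the signal."""
    return {i for i, x in enumerate(signal) if x != 0}
```

Because only signs are kept, any positive rescaling of the signal yields the same measurements, so the magnitude is lost while the support remains a meaningful target for recovery.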

16. Support Recovery of Sparse Signals from a Mixture of Linear Measurements

Author: Gandikota, Venkata, Mazumdar, Arya, and Pal, Soumyabrata
Subjects: Statistics - Machine Learning, Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: Recovery of support of a sparse vector from simple measurements is a widely-studied problem, considered under the frameworks of compressed sensing, 1-bit compressed sensing, and more general single index models. We consider generalizations of this problem: mixtures of linear regressions, and mixtures of linear classifiers, where the goal is to recover supports of multiple sparse vectors using only a small number of possibly noisy linear, and 1-bit measurements respectively. The key challenge is that the measurements from different vectors are randomly mixed. Both of these problems have also received attention recently. In mixtures of linear classifiers, the observations correspond to the side of queried hyperplane a random unknown vector lies in, whereas in mixtures of linear regressions we observe the projection of a random unknown vector on the queried hyperplane. The primary step in recovering the unknown vectors from the mixture is to first identify the support of all the individual component vectors. In this work, we study the number of measurements sufficient for recovering the supports of all the component vectors in a mixture in both these models. We provide algorithms that use a number of measurements polynomial in $k, \log n$ and quasi-polynomial in $\ell$, to recover the support of all the $\ell$ unknown vectors in the mixture with high probability when each individual component is a $k$-sparse $n$-dimensional vector., Comment: 27 pages, Accepted in NeurIPS 2021
Published: 2021

17. Fuzzy Clustering with Similarity Queries

Author: Huleihel, Wasim, Mazumdar, Arya, and Pal, Soumyabrata
Subjects: Computer Science - Machine Learning, Computer Science - Data Structures and Algorithms, Statistics - Machine Learning
Abstract: The fuzzy or soft $k$-means objective is a popular generalization of the well-known $k$-means problem, extending the clustering capability of the $k$-means to datasets that are uncertain, vague, and otherwise hard to cluster. In this paper, we propose a semi-supervised active clustering framework, where the learner is allowed to interact with an oracle (domain expert), asking for the similarity between a certain set of chosen items. We study the query and computational complexities of clustering in this framework. We prove that having a few of such similarity queries enables one to get a polynomial-time approximation algorithm to an otherwise conjecturally NP-hard problem. In particular, we provide algorithms for fuzzy clustering in this setting that ask $O(\mathsf{poly}(k)\log n)$ similarity queries and run in polynomial time, where $n$ is the number of items. The fuzzy $k$-means objective is nonconvex, with $k$-means as a special case, and is equivalent to some other generic nonconvex problem such as non-negative matrix factorization. The ubiquitous Lloyd-type algorithms (or alternating minimization algorithms) can get stuck at a local minimum. Our results show that by making a few similarity queries, the problem becomes easier to solve. Finally, we test our algorithms over real-world datasets, showing their effectiveness in real-world applications., Comment: 42 pages, 7 figures (Accepted to NeurIPS 2021)
Published: 2021

18. Learning User Preferences in Non-Stationary Environments

Author: Huleihel, Wasim, Pal, Soumyabrata, and Shayevitz, Ofer
Subjects: Computer Science - Machine Learning, Computer Science - Information Retrieval
Abstract: Recommendation systems often use online collaborative filtering (CF) algorithms to identify items a given user likes over time, based on ratings that this user and a large number of other users have provided in the past. This problem has been studied extensively when users' preferences do not change over time (static case), an assumption that is often violated in practical settings. In this paper, we introduce a novel model for online non-stationary recommendation systems which allows for temporal uncertainties in the users' preferences. For this model, we propose a user-based CF algorithm, and provide a theoretical analysis of its achievable reward. Compared to related non-stationary multi-armed bandit literature, the main fundamental difficulty in our model lies in the fact that variations in the preferences of a certain user may affect the recommendations for other users severely. We also test our algorithm over real-world datasets, showing its effectiveness in real-world applications. One of the main surprising observations in our experiments is the fact that our algorithm outperforms other static algorithms even when preferences do not change over time. This hints toward the general conclusion that in practice, dynamic algorithms, such as the one we propose, might be beneficial even in stationary environments., Comment: 31 pages, 3 figures
Published: 2021

19. Recovery of sparse linear classifiers from mixture of responses

Author: Gandikota, Venkata, Mazumdar, Arya, and Pal, Soumyabrata
Subjects: Statistics - Machine Learning, Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: In the problem of learning a mixture of linear classifiers, the aim is to learn a collection of hyperplanes from a sequence of binary responses. Each response is a result of querying with a vector and indicates the side of a randomly chosen hyperplane from the collection the query vector belongs to. This model provides a rich representation of heterogeneous data with categorical labels and has only been studied in some special settings. We look at a hitherto unstudied problem of query complexity upper bound of recovering all the hyperplanes, especially for the case when the hyperplanes are sparse. This setting is a natural generalization of the extreme quantization problem known as 1-bit compressed sensing. Suppose we have a set of $\ell$ unknown $k$-sparse vectors. We can query the set with another vector $\boldsymbol{a}$, to obtain the sign of the inner product of $\boldsymbol{a}$ and a randomly chosen vector from the $\ell$-set. How many queries are sufficient to identify all the $\ell$ unknown vectors? This question is significantly more challenging than both the basic 1-bit compressed sensing problem (i.e., $\ell=1$ case) and the analogous regression problem (where the value instead of the sign is provided). We provide rigorous query complexity results (with efficient algorithms) for this problem., Comment: 31 pages, 2 figures (To Appear at NeurIPS 2020)
Published: 2020

20. Recovery of Sparse Signals from a Mixture of Linear Samples

Author: Mazumdar, Arya and Pal, Soumyabrata
Subjects: Statistics - Machine Learning, Computer Science - Data Structures and Algorithms, Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: Mixture of linear regressions is a popular learning theoretic model that is used widely to represent heterogeneous data. In the simplest form, this model assumes that the labels are generated from either of two different linear models and mixed together. Recent works of Yin et al. and Krishnamurthy et al., 2019, focus on an experimental design setting of model recovery for this problem. It is assumed that the features can be designed and queried to obtain their labels. When queried, an oracle randomly selects one of the two different sparse linear models and generates a label accordingly. How many such oracle queries are needed to recover both of the models simultaneously? This question can also be thought of as a generalization of the well-known compressed sensing problem (Candès and Tao, 2005, Donoho, 2006). In this work, we address this query complexity problem and provide efficient algorithms that improve on the previously best known results., Comment: International Conference on Machine Learning (ICML), 2020. (26 pages, 3 figures)
Published: 2020

21. Algebraic and Analytic Approaches for Parameter Learning in Mixture Models

Author: Krishnamurthy, Akshay, Mazumdar, Arya, McGregor, Andrew, and Pal, Soumyabrata
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We present two different approaches for parameter learning in several mixture models in one dimension. Our first approach uses complex-analytic methods and applies to Gaussian mixtures with shared variance, binomial mixtures with shared success probability, and Poisson mixtures, among others. An example result is that $\exp(O(N^{1/3}))$ samples suffice to exactly learn a mixture of $k
Published: 2020

22. Sample Complexity of Learning Mixtures of Sparse Linear Regressions

Author: Krishnamurthy, Akshay, Mazumdar, Arya, McGregor, Andrew, and Pal, Soumyabrata
Subjects: Computer Science - Machine Learning, Computer Science - Information Theory, Statistics - Machine Learning
Abstract: In the problem of learning mixtures of linear regressions, the goal is to learn a collection of signal vectors from a sequence of (possibly noisy) linear measurements, where each measurement is evaluated on an unknown signal drawn uniformly from this collection. This setting is quite expressive and has been studied both in terms of practical applications and for the sake of establishing theoretical guarantees. In this paper, we consider the case where the signal vectors are sparse; this generalizes the popular compressed sensing paradigm. We improve upon the state-of-the-art results as follows: In the noisy case, we resolve an open question of Yin et al. (IEEE Transactions on Information Theory, 2019) by showing how to handle collections of more than two vectors and present the first robust reconstruction algorithm, i.e., if the signals are not perfectly sparse, we still learn a good sparse approximation of the signals. In the noiseless case, as well as in the noisy case, we show how to circumvent the need for a restrictive assumption required in the previous work. Our techniques are quite different from those in the previous work: for the noiseless case, we rely on a property of sparse polynomials and for the noisy case, we provide new connections to learning Gaussian mixtures and use ideas from the theory of error-correcting codes., Comment: NeurIPS 2019
Published: 2019

23. Same-Cluster Querying for Overlapping Clusters

Author: Huleihel, Wasim, Mazumdar, Arya, Médard, Muriel, and Pal, Soumyabrata
Subjects: Computer Science - Machine Learning, Computer Science - Data Structures and Algorithms, Computer Science - Information Theory, Statistics - Machine Learning
Abstract: Overlapping clusters are common in models of many practical data-segmentation applications. Suppose we are given $n$ elements to be clustered into $k$ possibly overlapping clusters, and an oracle that can interactively answer queries of the form "do elements $u$ and $v$ belong to the same cluster?" The goal is to recover the clusters with minimum number of such queries. This problem has been of recent interest for the case of disjoint clusters. In this paper, we look at the more practical scenario of overlapping clusters, and provide upper bounds (with algorithms) on the sufficient number of queries. We provide algorithmic results under both arbitrary (worst-case) and statistical modeling assumptions. Our algorithms are parameter free, efficient, and work in the presence of random noise. We also derive information-theoretic lower bounds on the number of queries needed, proving that our algorithms are order optimal. Finally, we test our algorithms over both synthetic and real-world data, showing their practicality and effectiveness., Comment: 43 pages, accepted at NeurIPS'19
Published: 2019

24. Trace Reconstruction: Generalized and Parameterized

Author: Krishnamurthy, Akshay, Mazumdar, Arya, McGregor, Andrew, and Pal, Soumyabrata
Subjects: Computer Science - Data Structures and Algorithms, Computer Science - Information Theory
Abstract: In the beautifully simple-to-state problem of trace reconstruction, the goal is to reconstruct an unknown binary string $x$ given random "traces" of $x$ where each trace is generated by deleting each coordinate of $x$ independently with probability $p<1$. The problem is well studied both when the unknown string is arbitrary and when it is chosen uniformly at random. For both settings, there is still an exponential gap between upper and lower sample complexity bounds and our understanding of the problem is still surprisingly limited. In this paper, we consider natural parameterizations and generalizations of this problem in an effort to attain a deeper and more comprehensive understanding. Perhaps our most surprising results are: 1) We prove that $\exp(O(n^{1/4} \sqrt{\log n}))$ traces suffice for reconstructing arbitrary matrices. In the matrix version of the problem, each row and column of an unknown $\sqrt{n}\times \sqrt{n}$ matrix is deleted independently with probability $p$. Our result contrasts with the best known results for sequence reconstruction where the best known upper bound is $\exp(O(n^{1/3}))$. 2) An optimal result for random matrix reconstruction: we show that $\Theta(\log n)$ traces are necessary and sufficient. This is in contrast to the problem for random sequences where there is a super-logarithmic lower bound and the best known upper bound is $\exp(O(\log^{1/3} n))$. 3) We show that $\exp(O(k^{1/3}\log^{2/3} n))$ traces suffice to reconstruct $k$-sparse strings, providing an improvement over the best known sequence reconstruction results when $k = o(n/\log^2 n)$. 4) We show that $\textrm{poly}(n)$ traces suffice if $x$ is $k$-sparse and we additionally have a "separation" promise, specifically that the indices of 1's in $x$ all differ by $\Omega(k \log n)$.
Published: 2019

25. Semisupervised Clustering by Queries and Locally Encodable Source Coding

Author: Mazumdar, Arya and Pal, Soumyabrata
Subjects: Statistics - Machine Learning, Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: Source coding is the canonical problem of data compression in information theory. In a locally encodable source coding, each compressed bit depends on only a few bits of the input. In this paper, we show that a recently popular model of semi-supervised clustering is equivalent to locally encodable source coding. In this model, the task is to perform multiclass labeling of unlabeled elements. At the beginning, we can ask in parallel a set of simple queries to an oracle who provides (possibly erroneous) binary answers to the queries. The queries cannot involve more than two (or a fixed constant number of) elements. Now the labeling of all the elements (or clustering) must be performed based on the noisy query answers. The goal is to recover all the correct labelings while minimizing the number of such queries. The equivalence to locally encodable source codes leads us to find lower bounds on the number of queries required in a variety of scenarios. We provide querying schemes based on pairwise 'same cluster' queries and pairwise AND queries, and show provable performance guarantees for each of the schemes., Comment: 16 pages, 11 figures. Some of the results of this paper have appeared in the proceedings of the 2017 Conference on Neural Information Processing Systems (NeurIPS 2017)
Published: 2019

26. High Dimensional Discrete Integration over the Hypergrid

Author: Maity, Raj Kumar, Mazumdar, Arya, and Pal, Soumyabrata
Subjects: Computer Science - Data Structures and Algorithms, Computer Science - Computational Complexity, Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: Recently Ermon et al. (2013) pioneered a way to practically compute approximations to large scale counting or discrete integration problems by using random hashes. The hashes are used to reduce the counting problem into many separate discrete optimization problems. The optimization problems can then be solved by an NP-oracle such as commercial SAT solvers or integer linear programming (ILP) solvers. In particular, Ermon et al. showed that if the domain of integration is $\{0,1\}^n$ then it is possible to obtain a solution within a factor of $16$ of the optimal (a 16-approximation) by this technique. In many crucial counting tasks, such as computation of the partition function of the ferromagnetic Potts model, the domain of integration is naturally $\{0,1,\dots, q-1\}^n, q>2$, the hypergrid. The straightforward extension of Ermon et al.'s method allows a $q^2$-approximation for this problem. For large values of $q$, this is undesirable. In this paper, we show an improved technique to obtain an approximation factor of $4+O(1/q^2)$ to this problem. We are able to achieve this by using an idea of optimization over multiple bins of the hash functions, which can be easily implemented by inequality constraints, or even in an unconstrained way. Also, the burden on the NP-oracle is not increased by our method (an ILP solver can still be used). We provide experimental simulation results to support the theoretical guarantees of our algorithms.
Published: 2018

27. Connectivity in Random Annulus Graphs and the Geometric Block Model

Author: Galhotra, Sainyam, Mazumdar, Arya, Pal, Soumyabrata, and Saha, Barna
Subjects: Computer Science - Discrete Mathematics, Computer Science - Data Structures and Algorithms, Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: We provide new connectivity results for vertex-random graphs or random annulus graphs, which are significant generalizations of random geometric graphs. Random geometric graphs (RGG) are one of the most basic models of random graphs for spatial networks proposed by Gilbert in 1961, shortly after the introduction of the Erdős-Rényi random graphs. They resemble social networks in many ways (e.g. by spontaneously creating clusters of nodes with high modularity). The connectivity properties of RGG have been studied since its introduction, and analyzing them has been significantly harder than their Erdős-Rényi counterparts due to correlated edge formation. Our next contribution is in using the connectivity of random annulus graphs to provide necessary and sufficient conditions for efficient recovery of communities for the geometric block model (GBM). The GBM is a probabilistic model for community detection defined over an RGG in a similar spirit as the popular stochastic block model, which is defined over an Erdős-Rényi random graph. The geometric block model inherits the transitivity properties of RGGs and thus models communities better than a stochastic block model. However, analyzing them requires fresh perspectives as all prior tools fail due to correlation in edge formation. We provide a simple and efficient algorithm that can recover communities in GBM exactly with high probability in the regime of connectivity.
Published: 2018

28. The Geometric Block Model

Author: Galhotra, Sainyam, Mazumdar, Arya, Pal, Soumyabrata, and Saha, Barna
Subjects: Computer Science - Social and Information Networks, Computer Science - Data Structures and Algorithms, Statistics - Machine Learning, E.1
Abstract: To capture the inherent geometric features of many community detection problems, we propose to use a new random graph model of communities that we call a Geometric Block Model. The geometric block model generalizes the random geometric graphs in the same way that the well-studied stochastic block model generalizes the Erdős-Rényi random graphs. It is also a natural extension of random community models inspired by the recent theoretical and practical advancement in community detection. While being a topic of fundamental theoretical interest, our main contribution is to show that many practical community structures are better explained by the geometric block model. We also show that a simple triangle-counting algorithm to detect communities in the geometric block model is near-optimal. Indeed, even in the regime where the average degree of the graph grows only logarithmically with the number of vertices (sparse-graph), we show that this algorithm performs extremely well, both theoretically and practically. In contrast, the triangle-counting algorithm is far from being optimum for the stochastic block model. We simulate our results on both real and synthetic datasets to show superior performance of both the new model as well as our algorithm., Comment: A shorter version of this paper has appeared in 32nd AAAI Conference on Artificial Intelligence. The AAAI proceedings version as well as the previous version in arXiv contained some errors that have been corrected in this version
Published: 2017

29. Polynomials and Second Order Linear Recurrences

Author: Pal, Soumyabrata and Venkatesan, Shankar M.
Subjects: Mathematics - Combinatorics
Abstract: One of the most interesting results of the last century was the proof completed by Matijasevich that computably enumerable sets are precisely the diophantine sets [MRDP Theorem, 9], thus settling, based on previously developed machinery, Hilbert's question whether there exists a general algorithm for checking the solvability in integers of any diophantine equation. In this paper we describe techniques to prove the nonexistence of polynomials in two variables for some simple generalizations of the Fibonacci sequence (explicit diophantine representation of Fibonacci numbers were known from Jones' polynomial whose positive values have the same range as that of Fibonacci numbers), and we believe similar techniques exist for the primes. In this paper we mainly show the following results: (1) using one of the many techniques known for solving the Pell's equation, namely the solution in an extended number system, we prove the existence and explicitly find the polynomials for the recurrences of the form $e(n)=ae(n-1)+e(n-2)$ with starting values of 0 and 1 in particular, and for any arbitrary starting values, in the process defining a concept of fundamental starting numbers, (2) we prove a few identities that seem to be quite interesting and useful, (3) we use these identities in a novel way to generate systems of equations of certain rank deficiency using which we disprove for the first time the existence of any polynomial in 2 variables for the generalized recurrence of the form $e(n)=ae(n-1)+be(n-2)$
Published: 2016

30. Prime Power Divisibility, Periodicity and Other Properties of Some Second Order Recurrences

Author: Pal, Soumyabrata and Venkatesan, Shankar M.
Subjects: Mathematics - Combinatorics
Abstract: Wall published a paper in 1960 on the Fibonacci sequence where he derived many results concerning the period and prime power divisibility modulo m. His periodicity results have been generalized to second order linear recurrences. Here we study the sequences generated by such recurrences, with starting values of {0,1}: among other things, we derive new prime power divisibility results, derive the period by new methods, establish new identities, show derivations involving powers of matrices generated by these general recurrences, etc., Comment: 11 pages
Published: 2015

31. Improved Support Recovery in Universal 1-bit Compressed Sensing

Author: Matsumoto, Namiko, Mazumdar, Arya, and Pal, Soumyabrata
Abstract: One-bit compressed sensing (1bCS) is an extremely quantized signal acquisition method that has been proposed and studied rigorously in the past decade. In 1bCS, linear samples of a high dimensional signal are quantized to only one bit per sample (sign of the measurement). The extreme quantization makes it an interesting case study of the more general single-index or generalized linear models. At the same time it can also be thought of as a ‘design’ version of learning a binary linear classifier or halfspace-learning. Assuming the original signal vector to be sparse, existing results in 1bCS either aim to find the support of the vector, or approximate the signal allowing a small error. The focus of this paper is support recovery, which often also computationally facilitates approximate signal recovery. A universal measurement matrix for 1bCS refers to one set of measurements that work for all sparse signals. With universality, it is known that $\tilde{\Theta}(k^{2})$ 1bCS measurements are necessary and sufficient for support recovery (where $k$ denotes the sparsity). To improve the dependence on sparsity from quadratic to linear, in this work we propose approximate support recovery (allowing $\epsilon>0$ proportion of errors), and superset recovery (allowing $\epsilon$ proportion of false positives). We show that the first type of recovery is possible with $\tilde{O}(k/\epsilon)$ measurements, while the latter type of recovery, more challenging, is possible with $\tilde{O}(\max\{k/\epsilon, k^{3/2}\})$ measurements. We also show that in both cases $\Omega(k/\epsilon)$ measurements would be necessary for universal recovery. Improved results are possible if we consider universal recovery within a restricted class of signals, such as rational signals, or signals with bounded dynamic range. In both cases superset recovery is possible with only $\tilde{O}(k/\epsilon)$ measurements. Other results on universal but approximate support recovery are also provided in this paper. All of our main recovery algorithms are simple and polynomial-time.
Published: 2024
Full Text: View/download PDF

32. Improved Support Recovery in Universal One-bit Compressed Sensing

Author: Matsumoto, Namiko, Mazumdar, Arya, and Pal, Soumyabrata
Published: 2023
Full Text: View/download PDF

33. Large-scale Model Personalization via Low Rank and Sparse decomposition

Author: Pal, Soumyabrata, Varshney, Prateek, Jain, Prateek, Thakurta, Abhradeep Guha, Madan, Gagan, Aggarwal, Gaurav, Shenoy, Pradeep, and Srivastava, Gaurav
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Cryptography and Security, Optimization and Control (math.OC), Statistics - Machine Learning, FOS: Mathematics, Machine Learning (stat.ML), Cryptography and Security (cs.CR), Mathematics - Optimization and Control, Machine Learning (cs.LG)
Abstract: Personalization of machine learning (ML) predictions for individual users/domains/enterprises is critical for practical recommendation style systems. Standard personalization approaches involve learning a user/domain specific embedding that is fed into a fixed global model, which can be limiting. On the other hand, personalizing/fine-tuning the model itself for each user/domain -- a.k.a. meta-learning -- has high storage/infrastructure cost. We propose a novel meta-learning style approach that models network weights as a sum of low-rank and sparse matrices. This captures common information from multiple individuals/users together in the low-rank part while the sparse part captures user-specific idiosyncrasies. Furthermore, the framework is up to two orders of magnitude more scalable (in terms of storage/infrastructure cost) than user-specific finetuning of the model. We then study the framework in the linear setting, where the problem reduces to that of estimating the sum of a rank-$r$ and a $k$-column sparse matrix using a small number of linear measurements. We propose an alternating minimization method with iterative hard thresholding -- AMHT-LRS -- to learn the low-rank and sparse parts. For the realizable, Gaussian data setting, we show that AMHT-LRS solves the problem efficiently with nearly optimal samples. A significant challenge in personalization is ensuring privacy of each user's sensitive data. We alleviate this problem by proposing a differentially private variant of our method that also is equipped with strong generalization guarantees. Finally, on multiple standard recommendation datasets, we demonstrate that our approach allows personalized models to obtain superior performance in the sparse data regime., Comment: 100 pages, 8 figures, 2 tables
Published: 2022

34. Support Recovery in Universal One-Bit Compressed Sensing

Author: Mazumdar, Arya and Pal, Soumyabrata
Published: 2022
Full Text: View/download PDF

35. Mixture Models in Machine Learning

Author: Pal, Soumyabrata
Abstract: Modeling with mixtures is a powerful method in the statistical toolkit that can be used for representing the presence of sub-populations within an overall population. In many applications ranging from financial models to genetics, a mixture model is used to fit the data. The primary difficulty in learning mixture models is that the observed data set does not identify the sub-population to which an individual observation belongs. Despite being studied for more than a century, the theoretical guarantees of mixture models remain unknown for several important settings. In this thesis, we look at three groups of problems. The first part is aimed at estimating the parameters of a mixture of simple distributions. We ask the following question: How many samples are necessary and sufficient to learn the latent parameters? We propose several approaches for this problem that include complex analytic tools to connect statistical distances between pairs of mixtures with the characteristic function. We show sufficient sample complexity guarantees for mixtures of popular distributions (including Gaussian, Poisson and Geometric). For many distributions, our results provide the first sample complexity guarantees for parameter estimation in the corresponding mixture. Using these techniques, we also provide improved lower bounds on the Total Variation distance between Gaussian mixtures with two components and demonstrate new results in some sequence reconstruction problems. In the second part, we study Mixtures of Sparse Linear Regressions where the goal is to learn the best set of linear relationships between the scalar responses (i.e., labels) and the explanatory variables (i.e., features). We focus on a scenario where a learner is able to choose the features to get the labels. To tackle the high dimensionality of data, we further assume that the linear maps are also "sparse", i.e., have only a few prominent features among many. For this setting, we devise algorithms with sub-linear (as a function of the dimension) sample complexity guarantees that are also robust to noise. In the final part, we study Mixtures of Sparse Linear Classifiers in the same setting as above. Given a set of features and the binary labels, the objective of this task is to find a set of hyperplanes in the space of features such that for any (feature, label) pair, there exists a hyperplane in the set that justifies the mapping. We devise efficient algorithms with sub-linear sample complexity guarantees for learning the unknown hyperplanes under similar sparsity assumptions as above. To that end, we propose several novel techniques that include tensor decomposition methods and combinatorial designs.
Published: 2022
Full Text: View/download PDF

36. Trace Reconstruction: Generalized and Parameterized

Author: Krishnamurthy, Akshay, Mazumdar, Arya, McGregor, Andrew, and Pal, Soumyabrata
Published: 2021
Full Text: View/download PDF

37. Semisupervised Clustering by Queries and Locally Encodable Source Coding

Author: Mazumdar, Arya and Pal, Soumyabrata
Published: 2021
Full Text: View/download PDF

38. Search Result Diversification with Guarantee of Topic Proportionality

Author: Sarwar, Sheikh Muhammad, Addanki, Raghavendra, Montazeralghaem, Ali, Pal, Soumyabrata, and Allan, James
Published: 2020
Full Text: View/download PDF

39. Connectivity of Random Annulus Graphs and the Geometric Block Model

Author: Galhotra, Sainyam, Mazumdar, Arya, Pal, Soumyabrata, and Saha, Barna
Subjects: 000 Computer science, knowledge, general works, Computer Science
Abstract: Random geometric graph (Gilbert, 1961) is a basic model of random graphs for spatial networks proposed shortly after the introduction of the Erdős-Rényi random graphs. The geometric block model (GBM) is a probabilistic model for community detection defined over random geometric graphs (RGG) similar in spirit to the popular stochastic block model which is defined over Erdős-Rényi random graphs. The GBM naturally inherits many desirable properties of RGGs such as transitivity ("friends having common friends") and has been shown to model many real-world networks better than the stochastic block model. Analyzing the properties of a GBM requires new tools and perspectives to handle correlation in edge formation. In this paper, we study the necessary and sufficient conditions for community recovery over GBM in the connectivity regime. We provide efficient algorithms that recover the communities exactly with high probability and match the lower bound within a small constant factor. This requires us to prove new connectivity results for vertex-random graphs or random annulus graphs which are natural generalizations of random geometric graphs. A vertex-random graph is a model of random graphs where the randomness lies in the vertices as opposed to an Erdős-Rényi random graph where the randomness lies in the edges. A vertex-random graph G(n, [r_1, r_2]), 0
Published: 2019
Full Text: View/download PDF

40. Trace Reconstruction: Generalized and Parameterized

Author: Krishnamurthy, Akshay, Mazumdar, Arya, McGregor, Andrew, and Pal, Soumyabrata
Abstract: In the beautifully simple-to-state problem of trace reconstruction, the goal is to reconstruct an unknown binary string x given random "traces" of x where each trace is generated by deleting each coordinate of x independently with probability p<1. The problem is well studied both when the unknown string is arbitrary and when it is chosen uniformly at random. For both settings, there is still an exponential gap between upper and lower sample complexity bounds and our understanding of the problem is still surprisingly limited. In this paper, we consider natural parameterizations and generalizations of this problem in an effort to attain a deeper and more comprehensive understanding. Perhaps our most surprising results are: 1) We prove that exp(O(n^(1/4) sqrt{log n})) traces suffice for reconstructing arbitrary matrices. In the matrix version of the problem, each row and column of an unknown sqrt{n} x sqrt{n} matrix is deleted independently with probability p. Our result contrasts with the best known results for sequence reconstruction where the best known upper bound is exp(O(n^(1/3))). 2) An optimal result for random matrix reconstruction: we show that Theta(log n) traces are necessary and sufficient. This is in contrast to the problem for random sequences where there is a super-logarithmic lower bound and the best known upper bound is exp(O(log^(1/3) n)). 3) We show that exp(O(k^(1/3) log^(2/3) n)) traces suffice to reconstruct k-sparse strings, providing an improvement over the best known sequence reconstruction results when k = o(n/log^2 n). 4) We show that poly(n) traces suffice if x is k-sparse and we additionally have a "separation" promise, specifically that the indices of 1's in x all differ by Omega(k log n).
Published: 2019
Full Text: View/download PDF

41. Connectivity of Random Annulus Graphs and the Geometric Block Model

Author: Galhotra, Sainyam, Mazumdar, Arya, Pal, Soumyabrata, and Saha, Barna
Abstract: Random geometric graph (Gilbert, 1961) is a basic model of random graphs for spatial networks proposed shortly after the introduction of the Erdős-Rényi random graphs. The geometric block model (GBM) is a probabilistic model for community detection defined over random geometric graphs (RGG) similar in spirit to the popular stochastic block model which is defined over Erdős-Rényi random graphs. The GBM naturally inherits many desirable properties of RGGs such as transitivity ("friends having common friends") and has been shown to model many real-world networks better than the stochastic block model. Analyzing the properties of a GBM requires new tools and perspectives to handle correlation in edge formation. In this paper, we study the necessary and sufficient conditions for community recovery over GBM in the connectivity regime. We provide efficient algorithms that recover the communities exactly with high probability and match the lower bound within a small constant factor. This requires us to prove new connectivity results for vertex-random graphs or random annulus graphs which are natural generalizations of random geometric graphs. A vertex-random graph is a model of random graphs where the randomness lies in the vertices as opposed to an Erdős-Rényi random graph where the randomness lies in the edges. A vertex-random graph G(n, [r_1, r_2]), 0 <=r_1
Published: 2019
Full Text: View/download PDF

42. The Geometric Block Model

Author: Galhotra, Sainyam, Mazumdar, Arya, Pal, Soumyabrata, and Saha, Barna
Subjects: Computer Science - Social and Information Networks, Computer Science - Data Structures and Algorithms, Statistics - Machine Learning, FOS: Computer and information sciences, General Medicine, E.1
Abstract: To capture the inherent geometric features of many community detection problems, we propose to use a new random graph model of communities that we call a Geometric Block Model. The geometric block model generalizes the random geometric graphs in the same way that the well-studied stochastic block model generalizes the Erdős-Rényi random graphs. It is also a natural extension of random community models inspired by the recent theoretical and practical advancement in community detection. While being a topic of fundamental theoretical interest, our main contribution is to show that many practical community structures are better explained by the geometric block model. We also show that a simple triangle-counting algorithm to detect communities in the geometric block model is near-optimal. Indeed, even in the regime where the average degree of the graph grows only logarithmically with the number of vertices (sparse-graph), we show that this algorithm performs extremely well, both theoretically and practically. In contrast, the triangle-counting algorithm is far from being optimum for the stochastic block model. We simulate our results on both real and synthetic datasets to show superior performance of both the new model as well as our algorithm., Comment: A shorter version of this paper has appeared in 32nd AAAI Conference on Artificial Intelligence. The AAAI proceedings version as well as the previous version in arXiv contained some errors that have been corrected in this version
Published: 2018

43. The Geometric Block Model and Applications

Author: Galhotra, Sainyam, Pal, Soumyabrata, Mazumdar, Arya, and Saha, Barna
Published: 2018
Full Text: View/download PDF

44. Community Recovery in the Geometric Block Model

Author: Galhotra, Sainyam, Mazumdar, Arya, Pal, Soumyabrata, and Saha, Barna
Subjects: Geometric modeling, Random graphs, Triangles, Stochastic models, Social problems
Abstract: To capture the inherent geometric features of many community detection problems, we propose to use a new random graph model of communities that we call a Geometric Block Model. The geometric block model builds on the random geometric graphs (Gilbert, 1961), one of the basic models of random graphs for spatial networks, in the same way that the well-studied stochastic block model builds on the Erdős-Rényi random graphs. It is also a natural extension of random community models inspired by the recent theoretical and practical advancements in community detection. To analyze the geometric block model, we first provide new connectivity results for random annulus graphs which are generalizations of random geometric graphs. The connectivity properties of geometric graphs have been studied since their introduction, and analyzing them has been more difficult than their Erdős-Rényi counterparts due to correlated edge formation. We then use the connectivity results of random annulus graphs to provide necessary and sufficient conditions for efficient recovery of communities for the geometric block model. We show that a simple triangle-counting algorithm to detect communities in the geometric block model is near-optimal. For this we consider the following two regimes of graph density. In the regime where the average degree of the graph grows logarithmically with the number of vertices, we show that our algorithm performs extremely well, both theoretically and practically. In contrast, the triangle-counting algorithm is far from being optimum for the stochastic block model in the logarithmic degree regime. We simulate our results on both real and synthetic datasets to show superior performance of both the new model as well as our algorithm.
Published: 2023
