323 results for "Mathieu, Claire"
Search Results
2. A Detailed Analysis of the SpaceSaving$\pm$ Family of Algorithms with Bounded Deletions
- Author
- Zhao, Fuheng, Agrawal, Divyakant, Abbadi, Amr El, Mathieu, Claire, Metwally, Ahmed, and de Rougemont, Michel
- Subjects
Computer Science - Databases, Computer Science - Data Structures and Algorithms
- Abstract
In this paper, we present an advanced analysis of near-optimal deterministic algorithms that use a small space budget to solve frequency estimation, heavy hitters, frequent items, and top-k approximation in the bounded-deletion model. We define the family of SpaceSaving$\pm$ algorithms and explain why the original SpaceSaving$\pm$ algorithm only works when insertions and deletions are not interleaved. Next, we introduce the new DoubleSpaceSaving$\pm$ and IntegratedSpaceSaving$\pm$ algorithms and prove their correctness. They share similar characteristics, and both extend the popular space-efficient SpaceSaving algorithm. However, the two algorithms represent different trade-offs: DoubleSpaceSaving$\pm$ distributes the operations over two independent summaries, while IntegratedSpaceSaving$\pm$ fully synchronizes deletions with insertions. Since data streams are often skewed, we present an improved analysis of these two algorithms and show that their errors do not depend on the hot items, but only on the cold and warm items. We also demonstrate how to achieve a relative error guarantee under mild assumptions. Moreover, we establish that both algorithms satisfy the important mergeability property, which is desirable in distributed settings.
- Published
- 2023
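The family of algorithms in this paper extends the classic SpaceSaving summary. As a point of reference, here is a minimal sketch of that base algorithm (insertions only, no deletions); the $\pm$ variants analyzed in the paper build bounded-deletion support on top of this idea.

```python
def space_saving(stream, m):
    """Classic SpaceSaving (Metwally et al. 2005): maintain at most m counters.

    On an item already monitored, increment its counter. Otherwise, if a
    counter slot is free, start monitoring the item at count 1; else evict
    the item with the minimum count and reuse its counter, incremented by
    one. Each estimate overshoots the true frequency by at most N/m.
    """
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < m:
            counters[x] = 1
        else:
            victim = min(counters, key=counters.get)
            count = counters.pop(victim)
            counters[x] = count + 1
    return counters
```

A useful invariant visible in the sketch: every stream item increments exactly one counter, so the counter values always sum to the stream length.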
3. Testing frequency distributions in a stream
- Author
- Mathieu, Claire and de Rougemont, Michel
- Subjects
Computer Science - Data Structures and Algorithms
- Abstract
We study how to verify specific frequency distributions when we observe a stream of $N$ data items taken from a universe of $n$ distinct items. We introduce the \emph{relative Fr\'echet distance} to compare two frequency functions in a homogeneous manner. We consider two streaming models: insertions only and sliding windows. We present a Tester for a certain class of functions, which decides if $f$ is close to $g$ or if $f$ is far from $g$ with high probability, when $f$ is given and $g$ is defined by a stream. If $f$ is uniform, we show a space lower bound of $\Omega(n)$. If $f$ decreases fast enough, we only use space $O(\log^2 n\cdot \log\log n)$. The analysis relies on the SpaceSaving algorithm \cite{MAE2005,Z22} and on sampling the stream., Comment: 28 pages, 4 figures
- Published
- 2023
4. Apportionment with parity constraints
- Author
- Mathieu, Claire and Verdugo, Victor
- Published
- 2024
- Full Text
- View/download PDF
5. Approximating Maximum Integral Multiflows on Bounded Genus Graphs
- Author
- Huang, Chien-Chung, Mari, Mathieu, Mathieu, Claire, and Vygen, Jens
- Published
- 2023
- Full Text
- View/download PDF
6. An Approximation Algorithm for Distance-Constrained Vehicle Routing on Trees
- Author
- Dufay, Marc, Mathieu, Claire, and Zhou, Hang
- Subjects
Computer Science - Data Structures and Algorithms
- Abstract
In the Distance-constrained Vehicle Routing Problem (DVRP), we are given a graph with integer edge weights, a depot, a set of $n$ terminals, and a distance constraint $D$. The goal is to find a minimum number of tours starting and ending at the depot such that those tours together cover all the terminals and the length of each tour is at most $D$. The DVRP on trees is of independent interest, because it is equivalent to the virtual machine packing problem on trees studied by Sindelar et al. [SPAA'11]. We design a simple and natural approximation algorithm for the tree DVRP, parameterized by $\varepsilon >0$. We show that its approximation ratio is $\alpha + \varepsilon$, where $\alpha \approx 1.691$, and in addition, that our analysis is essentially tight. The running time is polynomial in $n$ and $D$. The approximation ratio improves on the ratio of 2 due to Nagarajan and Ravi [Networks'12]. The main novelty of this paper lies in the analysis of the algorithm. It relies on a reduction from the tree DVRP to the bounded space online bin packing problem via a new notion of reduced length.
- Published
- 2022
7. Unsplittable Euclidean Capacitated Vehicle Routing: A $(2+\epsilon)$-Approximation Algorithm
- Author
- Grandoni, Fabrizio, Mathieu, Claire, and Zhou, Hang
- Subjects
Computer Science - Data Structures and Algorithms
- Abstract
In the unsplittable capacitated vehicle routing problem, we are given a metric space with a vertex called depot and a set of vertices called terminals. Each terminal is associated with a positive demand between 0 and 1. The goal is to find a minimum length collection of tours starting and ending at the depot such that the demand of each terminal is covered by a single tour (i.e., the demand cannot be split), and the total demand of the terminals in each tour does not exceed the capacity of 1. Our main result is a polynomial-time $(2+\epsilon)$-approximation algorithm for this problem in the two-dimensional Euclidean plane, i.e., for the special case where the terminals and the depot are associated with points in the Euclidean plane and their distances are defined accordingly. This improves on recent work by Blauth, Traub, and Vygen [IPCO'21] and Friggstad, Mousavi, Rahgoshay, and Salavatipour [IPCO'22].
- Published
- 2022
8. Correlation Clustering and Two-Edge-Connected Augmentation for Planar Graphs
- Author
- Klein, Philip N., Mathieu, Claire, and Zhou, Hang
- Published
- 2023
- Full Text
- View/download PDF
9. A Tight $(1.5+\epsilon)$-Approximation for Unsplittable Capacitated Vehicle Routing on Trees
- Author
- Mathieu, Claire and Zhou, Hang
- Subjects
Computer Science - Data Structures and Algorithms
- Abstract
In the unsplittable capacitated vehicle routing problem (UCVRP) on trees, we are given a rooted tree with edge weights and a subset of vertices of the tree called terminals. Each terminal is associated with a positive demand between 0 and 1. The goal is to find a minimum length collection of tours starting and ending at the root of the tree such that the demand of each terminal is covered by a single tour (i.e., the demand cannot be split), and the total demand of the terminals in each tour does not exceed the capacity of 1. For the special case when all terminals have equal demands, a long line of research culminated in a quasi-polynomial time approximation scheme [Jayaprakash and Salavatipour, SODA 2022] and a polynomial time approximation scheme [Mathieu and Zhou, ICALP 2022]. In this work, we study the general case when the terminals have arbitrary demands. Our main contribution is a polynomial time $(1.5+\epsilon)$-approximation algorithm for the UCVRP on trees. This is the first improvement upon the 2-approximation algorithm of more than 30 years ago [Labb\'e, Laporte, and Mercure, Operations Research, 1991]. Our approximation ratio is essentially best possible, since it is NP-hard to approximate the UCVRP on trees to better than a 1.5 factor.
- Published
- 2022
10. A Simple Algorithm for Graph Reconstruction
- Author
- Mathieu, Claire and Zhou, Hang
- Subjects
Computer Science - Data Structures and Algorithms
- Abstract
How efficiently can we find an unknown graph using distance queries between its vertices? We assume that the unknown graph is connected, unweighted, and has bounded degree. The goal is to find every edge in the graph. This problem admits a reconstruction algorithm based on multi-phase Voronoi-cell decomposition and using $\tilde O(n^{3/2})$ distance queries. In our work, we analyze a simple reconstruction algorithm. We show that, on random $\Delta$-regular graphs, our algorithm uses $\tilde O(n)$ distance queries. As by-products, we can reconstruct those graphs using $O(\log^2 n)$ queries to an all-distances oracle or $\tilde O(n)$ queries to a betweenness oracle, and we bound the metric dimension of those graphs by $\log^2 n$. Our reconstruction algorithm has a very simple structure, and is highly parallelizable. On general graphs of bounded degree, our reconstruction algorithm has subquadratic query complexity.
- Published
- 2021
- Full Text
- View/download PDF
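For context on the query model, a trivial baseline queries every pair and keeps the pairs at distance exactly 1; it always works but costs $\binom{n}{2}$ queries, which is the quadratic bound the paper's $\tilde O(n)$ result on random regular graphs improves on. A minimal sketch, where `dist_query` stands in for the distance oracle:

```python
def reconstruct_edges(n, dist_query):
    """Naive reconstruction of an unweighted graph on vertices 0..n-1:
    the edges are exactly the vertex pairs at distance 1, so querying
    all n(n-1)/2 pairs recovers the graph."""
    return {(u, v) for u in range(n) for v in range(u + 1, n)
            if dist_query(u, v) == 1}
```

On a path graph, for instance, the oracle is just the absolute difference of vertex labels.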
11. A PTAS for Capacitated Vehicle Routing on Trees
- Author
- Mathieu, Claire and Zhou, Hang
- Subjects
Computer Science - Data Structures and Algorithms
- Abstract
We give a polynomial time approximation scheme (PTAS) for the unit demand capacitated vehicle routing problem (CVRP) on trees, for the entire range of the tour capacity. The result extends to the splittable CVRP., Comment: Accepted for publication at ICALP 2022
- Published
- 2021
12. Constrained School Choice with Incomplete Information
- Author
- Gimbert, Hugo, Mathieu, Claire, and Mauras, Simon
- Subjects
Computer Science - Computer Science and Game Theory, Economics - Theoretical Economics
- Abstract
School choice is the two-sided matching market where students (on one side) are to be matched with schools (on the other side) based on their mutual preferences. The classical algorithm to solve this problem is the celebrated deferred acceptance procedure, proposed by Gale and Shapley. After both sides have revealed their mutual preferences, the algorithm computes an optimal stable matching. Most often in practice, notably when the process is implemented by a national clearinghouse and thousands of schools enter the market, there is a quota on the number of applications that a student can submit: students have to perform a partial revelation of their preferences, based on partial information on the market. We model this situation by drawing each student type from a publicly known distribution and study Nash equilibria of the corresponding Bayesian game. We focus on symmetric equilibria, in which all students play the same strategy. We show existence of these equilibria in the general case, and provide two algorithms to compute such equilibria under additional assumptions, including the case where schools have identical preferences over students.
- Published
- 2021
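The deferred acceptance procedure of Gale and Shapley mentioned in the abstract can be sketched in a few lines (student-proposing, unit school capacities; the data layout and names are illustrative, not taken from the paper):

```python
from collections import deque

def deferred_acceptance(student_prefs, school_prefs):
    """Student-proposing deferred acceptance (Gale-Shapley), capacity 1.

    student_prefs[s]: list of schools in s's preference order.
    school_prefs[c]: list of students in c's preference order.
    Returns a stable matching {student: school}.
    """
    rank = {c: {s: i for i, s in enumerate(prefs)}
            for c, prefs in school_prefs.items()}
    next_choice = {s: 0 for s in student_prefs}
    held = {}                      # school -> student tentatively held
    free = deque(student_prefs)
    while free:
        s = free.popleft()
        if next_choice[s] >= len(student_prefs[s]):
            continue               # s exhausted their list; stays unmatched
        c = student_prefs[s][next_choice[s]]
        next_choice[s] += 1
        if c not in held:
            held[c] = s            # c tentatively accepts s
        elif rank[c][s] < rank[c][held[c]]:
            free.append(held[c])   # c trades up; previous student re-proposes
            held[c] = s
        else:
            free.append(s)         # c rejects s; s tries the next school
    return {s: c for c, s in held.items()}
```

The constrained version studied in the paper corresponds to truncating each `student_prefs[s]` to the quota before running this procedure.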
13. Probabilistic Analysis of Euclidean Capacitated Vehicle Routing
- Author
- Mathieu, Claire and Zhou, Hang
- Subjects
Computer Science - Data Structures and Algorithms
- Abstract
We give a probabilistic analysis of the unit-demand Euclidean capacitated vehicle routing problem in the random setting, where the input distribution consists of $n$ unit-demand customers modeled as independent, identically distributed uniform random points in the two-dimensional plane. The objective is to visit every customer using a set of routes of minimum total length, such that each route visits at most $k$ customers, where $k$ is the capacity of a vehicle. All of the following results are in the random setting and hold asymptotically almost surely. The best known polynomial-time approximation for this problem is the iterated tour partitioning (ITP) algorithm, introduced in 1985 by Haimovich and Rinnooy Kan. They showed that the ITP algorithm is near-optimal when $k$ is either $o(\sqrt{n})$ or $\omega(\sqrt{n})$, and they asked whether the ITP algorithm was also effective in the intermediate range. In this work, we show that when $k=\sqrt{n}$, the ITP algorithm is at best a $(1+c_0)$-approximation for some positive constant $c_0$. On the other hand, the approximation ratio of the ITP algorithm was known to be at most $0.995+\alpha$ due to Bompadre, Dror, and Orlin, where $\alpha$ is the approximation ratio of an algorithm for the traveling salesman problem. In this work, we improve the upper bound on the approximation ratio of the ITP algorithm to $0.915+\alpha$. Our analysis is based on a new lower bound on the optimal cost for the metric capacitated vehicle routing problem, which may be of independent interest.
- Published
- 2021
- Full Text
- View/download PDF
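The ITP scheme analyzed above is simple to state: take a TSP tour of the customers, cut it into consecutive segments of at most $k$ customers, and serve each segment by a route from the depot. A sketch under the simplifying assumptions that a tour is already given and that we cut at a fixed offset (the full algorithm also optimizes over the $k$ cyclic starting offsets):

```python
import math

def iterated_tour_partitioning(depot, tour, k):
    """Split a given TSP tour (list of 2D points) into routes of at most k
    customers each, every route starting and ending at the depot; return
    the total length of the resulting routes."""
    total = 0.0
    for i in range(0, len(tour), k):
        seg = tour[i:i + k]
        total += math.dist(depot, seg[0])                    # out to segment
        total += sum(math.dist(seg[j], seg[j + 1])           # along segment
                     for j in range(len(seg) - 1))
        total += math.dist(seg[-1], depot)                   # back to depot
    return total
```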
14. Apportionment with Parity Constraints
- Author
- Mathieu, Claire and Verdugo, Victor
- Subjects
Mathematics - Optimization and Control, Computer Science - Computers and Society
- Abstract
In the classic apportionment problem the goal is to decide how many seats of a parliament should be allocated to each party as a result of an election. The divisor methods provide a way of solving this problem by defining a notion of proportionality guided by some rounding rule. Motivated by recent challenges in the context of electoral apportionment, we consider the question of how to allocate the seats of a parliament under parity constraints between candidate types (e.g., an equal number of men and women elected) while at the same time satisfying party proportionality. We consider two different approaches for this problem. The first mechanism, which follows a greedy approach, corresponds to a recent mechanism used in the 2021 Chilean Constitutional Convention election. We analyze this mechanism from a theoretical point of view. The second mechanism follows the idea of biproportionality introduced by Balinski and Demange [Math. Program. 1989, Math. Oper. Res. 1989]. In contrast with the classic biproportional method by Balinski and Demange, this mechanism is ruled by two levels of proportionality: proportionality is satisfied at the level of parties by means of a divisor method, and then biproportionality is used to decide the number of candidates allocated to each type and party. We provide a theoretical analysis of this mechanism, making progress on the theoretical understanding of methods with two levels of proportionality. A typical benchmark used in the context of two-dimensional apportionment is the fair share (a.k.a. matrix scaling), which corresponds to an ideal fractional biproportional solution. We provide lower bounds on the distance between these two types of solutions, and we explore their consequences in the context of two-dimensional apportionment.
- Published
- 2021
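As a concrete baseline for the divisor methods the abstract refers to, here is a sketch of the D'Hondt (Jefferson) method, one standard divisor method: each seat goes to the party currently maximizing votes divided by (seats won + 1). The paper's mechanisms add parity constraints on top of proportionality of this kind.

```python
import heapq

def dhondt(votes, seats):
    """D'Hondt divisor method: repeatedly award the next seat to the party
    with the largest current quotient votes / (seats_won + 1)."""
    alloc = {p: 0 for p in votes}
    heap = [(-v, p) for p, v in votes.items()]  # max-heap via negated quotients
    heapq.heapify(heap)
    for _ in range(seats):
        _, p = heapq.heappop(heap)
        alloc[p] += 1
        heapq.heappush(heap, (-votes[p] / (alloc[p] + 1), p))
    return alloc
```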
15. Competitive Data-Structure Dynamization
- Author
- Mathieu, Claire, Rajaraman, Rajmohan, Young, Neal E., and Yousefi, Arman
- Subjects
Computer Science - Data Structures and Algorithms, Computer Science - Databases, 68W27, 68P15, 68R05, F.1.2, H.2.4
- Abstract
Data-structure dynamization is a general approach for making static data structures dynamic. It is used extensively in geometric settings and in the guise of so-called merge (or compaction) policies in big-data databases such as Google Bigtable and LevelDB (our focus). Previous theoretical work is based on worst-case analyses for uniform inputs -- insertions of one item at a time and constant read rate. In practice, merge policies must not only handle batch insertions and varying read/write ratios, they can take advantage of such non-uniformity to reduce cost on a per-input basis. To model this, we initiate the study of data-structure dynamization through the lens of competitive analysis, via two new online set-cover problems. For each, the input is a sequence of disjoint sets of weighted items. The sets are revealed one at a time. The algorithm must respond to each with a set cover that covers all items revealed so far. It obtains the cover incrementally from the previous cover by adding one or more sets and optionally removing existing sets. For each new set the algorithm incurs build cost equal to the weight of the items in the set. In the first problem the objective is to minimize total build cost plus total query cost, where the algorithm incurs a query cost at each time $t$ equal to the current cover size. In the second problem, the objective is to minimize the build cost while keeping the query cost from exceeding $k$ (a given parameter) at any time. We give deterministic online algorithms for both variants, with competitive ratios of $\Theta(\log^* n)$ and $k$, respectively. The latter ratio is optimal for the second variant., Comment: Conference version in SODA (2021). Journal version in ACM TALG (accepted June 2024)
- Published
- 2020
- Full Text
- View/download PDF
16. Large Very Dense Subgraphs in a Stream of Edges
- Author
- Mathieu, Claire and de Rougemont, Michel
- Subjects
Computer Science - Data Structures and Algorithms, Computer Science - Computational Complexity
- Abstract
We study the detection and the reconstruction of a large very dense subgraph in a social graph with $n$ nodes and $m$ edges given as a stream of edges, when the graph follows a power law degree distribution, in the regime when $m=O(n \log n)$. A subgraph $S$ is very dense if it has $\Omega(|S|^2)$ edges. We uniformly sample the edges with a Reservoir of size $k=O(\sqrt{n} \log n)$. Our detection algorithm checks whether the Reservoir has a giant component. We show that if the graph contains a very dense subgraph of size $\Omega(\sqrt{n})$, then the detection algorithm is almost surely correct. On the other hand, a random graph that follows a power law degree distribution almost surely has no large very dense subgraph, and the detection algorithm is almost surely correct. We define a new model of random graphs which follow a power law degree distribution and have large very dense subgraphs. We then show that on this class of random graphs we can reconstruct a good approximation of the very dense subgraph with high probability. We generalize these results to dynamic graphs defined by sliding windows in a stream of edges., Comment: 32 pages
- Published
- 2020
- Full Text
- View/download PDF
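Uniform edge sampling with a Reservoir, the first step of the detection algorithm above, can be sketched with the standard Algorithm R (the giant-component check on the sampled subgraph is omitted):

```python
import random

def reservoir_sample(stream, k, seed=None):
    """Uniform reservoir sampling: keep the first k items; thereafter,
    replace a uniformly random slot with probability k/i for the i-th item.
    Every stream item ends up in the reservoir with probability k/n."""
    rng = random.Random(seed)
    reservoir = []
    for i, x in enumerate(stream, start=1):
        if i <= k:
            reservoir.append(x)
        else:
            j = rng.randrange(i)   # uniform in 0..i-1
            if j < k:
                reservoir[j] = x
    return reservoir
```

In the paper's setting the stream items are edges and $k=O(\sqrt{n} \log n)$.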
17. Approximating maximum integral multiflows on bounded genus graphs
- Author
- Huang, Chien-Chung, Mari, Mathieu, Mathieu, Claire, and Vygen, Jens
- Subjects
Computer Science - Data Structures and Algorithms, Computer Science - Computational Geometry, Computer Science - Discrete Mathematics, 57M15, 05C38, 05C62, 90C27, 90C35
- Abstract
We devise the first constant-factor approximation algorithm for finding an integral multi-commodity flow of maximum total value for instances where the supply graph together with the demand edges can be embedded on an orientable surface of bounded genus. This extends recent results for planar instances. Our techniques include an uncrossing algorithm, which is significantly more difficult than in the planar case, a partition of the cycles in the support of an LP solution into free homotopy classes, and a new rounding procedure for freely homotopic non-separating cycles.
- Published
- 2020
- Full Text
- View/download PDF
18. An Approximation Algorithm for Fully Planar Edge-Disjoint Paths
- Author
- Huang, Chien-Chung, Mari, Mathieu, Mathieu, Claire, Schewior, Kevin, and Vygen, Jens
- Subjects
Computer Science - Data Structures and Algorithms, Computer Science - Discrete Mathematics, Mathematics - Combinatorics
- Abstract
We devise a constant-factor approximation algorithm for the maximization version of the edge-disjoint paths problem if the supply graph together with the demand edges form a planar graph. By planar duality this is equivalent to packing cuts in a planar graph such that each cut contains exactly one demand edge. We also show that the natural linear programming relaxations have constant integrality gap, yielding an approximate max-multiflow min-multicut theorem.
- Published
- 2020
- Full Text
- View/download PDF
19. Two-Sided Matching Markets with Correlated Random Preferences
- Author
- Gimbert, Hugo, Mathieu, Claire, and Mauras, Simon
- Subjects
Computer Science - Data Structures and Algorithms, Computer Science - Computers and Society, Computer Science - Discrete Mathematics, Computer Science - Computer Science and Game Theory
- Abstract
Stable matching in a community consisting of men and women is a classical combinatorial problem that has been the subject of intense theoretical and empirical study since its introduction in 1962 in a seminal paper by Gale and Shapley, who designed the celebrated "deferred acceptance" algorithm for the problem. In the input, each participant ranks participants of the opposite type, so the input consists of a collection of permutations, representing the preference lists. A bipartite matching is unstable if some man-woman pair is blocking: both strictly prefer each other to their partner in the matching. Stability is an important economics concept in matching markets from the viewpoint of manipulability. The unicity of a stable matching implies non-manipulability, and near-unicity implies limited manipulability, so these are mathematical properties related to the quality of stable matching algorithms. This paper is a theoretical study of the effect of correlations on the approximate manipulability of stable matching algorithms. Our approach is to go beyond worst case, assuming that some of the input preference lists are drawn from a distribution. Our model encompasses a discrete probabilistic process inspired by a popularity model introduced by Immorlica and Mahdian, which provides a way to capture correlation between preference lists. Approximate manipulability is approached from several angles: when all stable partners of a person have approximately the same rank; or when most persons have a unique stable partner. Another quantity of interest is a person's number of stable partners. Our results aim to paint a picture of the manipulability of stable matchings in a "beyond worst case" setting.
- Published
- 2019
- Full Text
- View/download PDF
20. How to aggregate Top-lists: Approximation algorithms via scores and average ranks
- Author
- Mathieu, Claire and Mauras, Simon
- Subjects
Computer Science - Data Structures and Algorithms, Computer Science - Information Retrieval
- Abstract
A top-list is a possibly incomplete ranking of elements: only a subset of the elements are ranked, with all unranked elements tied for last. Top-list aggregation, a generalization of the well-known rank aggregation problem, takes as input a collection of top-lists and aggregates them into a single complete ranking, aiming to minimize the number of upsets (pairs ranked in opposite order in the input and in the output). In this paper, we give simple approximation algorithms for top-list aggregation.
* We generalize the footrule algorithm for rank aggregation.
* Using inspiration from approval voting, we define the score of an element as the frequency with which it is ranked, i.e. appears in an input top-list. We reinterpret Ailon's RepeatChoice algorithm for top-list aggregation using the score of an element and its average rank given that it is ranked.
* Using average ranks, we generalize and analyze Borda's algorithm for rank aggregation.
* We design a simple 2-phase variant of the Generalized Borda's algorithm, roughly sorting by scores and breaking ties by average ranks.
* We then design another 2-phase variant in which, in order to break ties, we use, as a black box, the Mathieu-Schudy PTAS for rank aggregation, yielding a PTAS for top-list aggregation.
* Finally, we discuss the special case in which all input lists have constant length., Comment: To appear in SODA'20
- Published
- 2018
- Full Text
- View/download PDF
21. Instance-Optimality in the Noisy Value-and Comparison-Model --- Accept, Accept, Strong Accept: Which Papers get in?
- Author
- Cohen-Addad, Vincent, Mallmann-Trenn, Frederik, and Mathieu, Claire
- Subjects
Computer Science - Data Structures and Algorithms, Computer Science - Databases, Computer Science - Machine Learning
- Abstract
Motivated by crowdsourced computation, peer-grading, and recommendation systems, Braverman, Mao and Weinberg [STOC'16] studied the \emph{query} and \emph{round} complexity of fundamental problems such as finding the maximum (\textsc{max}), finding all elements above a certain value (\textsc{threshold-$v$}), or computing the top-$k$ elements (\textsc{Top}-$k$) in a noisy environment. For example, consider the task of selecting papers for a conference. This task is challenging due to the crowdsourcing nature of peer reviews: the results of reviews are noisy and it is necessary to parallelize the review process as much as possible. We study the noisy value model and the noisy comparison model: In the \emph{noisy value model}, a reviewer is asked to evaluate a single element: "What is the value of paper $i$?" (e.g., accept). In the \emph{noisy comparison model} (introduced in the seminal work of Feige, Peleg, Raghavan and Upfal [SICOMP'94]), a reviewer is asked to do a pairwise comparison: "Is paper $i$ better than paper $j$?" In this paper, we show optimal worst-case query complexity for the \textsc{max}, \textsc{threshold-$v$}, and \textsc{Top}-$k$ problems. For \textsc{max} and \textsc{Top}-$k$, we obtain optimal worst-case upper and lower bounds on the round vs query complexity in both models. For \textsc{threshold}-$v$, we obtain optimal query complexity and nearly-optimal round complexity (where $k$ is the size of the output) for both models. We then go beyond the worst case and address the question of the importance of knowledge of the instance by providing, for a large range of parameters, instance-optimal algorithms with respect to the query complexity. Furthermore, we show that the value model is strictly easier than the comparison model.
- Published
- 2018
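In the noisy comparison model, confidence is raised by repeating a comparison and taking a majority vote, with the error decaying exponentially in the number of repetitions (a Chernoff bound). A small sketch with a hypothetical noisy oracle standing in for a reviewer:

```python
import random

def noisy_less(i, j, p=1/3, rng=random):
    """Hypothetical noisy oracle: reports whether i < j, but flips the
    answer with probability p (the model's constant failure probability)."""
    truth = i < j
    return truth if rng.random() > p else not truth

def confident_less(i, j, reps, p=1/3, rng=random):
    """Boost a constant-error comparison by majority vote over `reps`
    independent repetitions of the noisy oracle."""
    yes = sum(noisy_less(i, j, p, rng) for _ in range(reps))
    return yes * 2 > reps
```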
22. Polarization dependent chemistry of ferroelectric BaTiO3 (001) domains
- Author
- Mi, Yanyu, Geneste, Gregory, Rault, Julien, Mathieu, Claire, Pancotti, Alexandre, and Barrett, Nicholas
- Subjects
Condensed Matter - Materials Science, Physics - Chemical Physics
- Abstract
Recent works suggest that the surface chemistry, in particular the presence of oxygen vacancies, can affect the polarization in a ferroelectric material. This should, in turn, influence the domain ordering driven by the need to screen the depolarizing field. Here we show using density functional theory that the presence of oxygen vacancies at the surface of BaTiO3 (001) preferentially stabilizes an inward-pointing polarization, P-. Mirror electron microscopy measurements of the domain ordering confirm the theoretical results., Comment: 11 pages, 6 figures
- Published
- 2018
- Full Text
- View/download PDF
23. Skyline Computation with Noisy Comparisons
- Author
- Groz, Benoît, Mallmann-Trenn, Frederik, Mathieu, Claire, and Verdugo, Victor
- Subjects
Computer Science - Data Structures and Algorithms
- Abstract
Given a set of $n$ points in a $d$-dimensional space, we seek to compute the skyline, i.e., those points that are not strictly dominated by any other point, using few comparisons between elements. We adopt the noisy comparison model [FRPU94] where comparisons fail with constant probability and confidence can be increased through independent repetitions of a comparison. In this model motivated by Crowdsourcing applications, Groz & Milo [GM15] show three bounds on the query complexity for the skyline problem. We improve significantly on that state of the art and provide two output-sensitive algorithms computing the skyline with respective query complexity $O(nd\log (dk/\delta))$ and $O(ndk\log (k/\delta))$ where $k$ is the size of the skyline and $\delta$ the expected probability that our algorithm fails to return the correct answer. These results are tight for low dimensions.
- Published
- 2017
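For reference, the noise-free version of the problem is straightforward; the paper's difficulty lies entirely in the comparisons failing with constant probability. A quadratic noise-free sketch of the skyline itself:

```python
def skyline(points):
    """Keep the points not strictly dominated by any other point, where q
    dominates p if q >= p in every coordinate and q > p in at least one."""
    def dominated(p, q):
        return (all(b >= a for a, b in zip(p, q))
                and any(b > a for a, b in zip(p, q)))
    return [p for p in points
            if not any(dominated(p, q) for q in points)]
```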
24. Two-Sided Matching Markets with Strongly Correlated Preferences
- Author
- Gimbert, Hugo, Mathieu, Claire, and Mauras, Simon
- Published
- 2021
- Full Text
- View/download PDF
25. Dynamic clustering to minimize the sum of radii
- Author
- Henzinger, Monika, Leniowski, Dariusz, and Mathieu, Claire
- Subjects
Computer Science - Data Structures and Algorithms, G.1.6
- Abstract
In this paper, we study the problem of opening centers to cluster a set of clients in a metric space so as to minimize the sum of the costs of the centers and of the cluster radii, in a dynamic environment where clients arrive and depart, and the solution must be updated efficiently while remaining competitive with respect to the current optimal solution. We call this the dynamic sum-of-radii clustering problem. We present a data structure that maintains a solution whose cost is within a constant factor of the cost of an optimal solution in metric spaces with bounded doubling dimension and whose worst-case update time is logarithmic in the parameters of the problem., Comment: 10 pages, ESA 2017
- Published
- 2017
26. Hierarchical Clustering: Objective Functions and Algorithms
- Author
- Cohen-Addad, Vincent, Kanade, Varun, Mallmann-Trenn, Frederik, and Mathieu, Claire
- Subjects
Computer Science - Data Structures and Algorithms, Computer Science - Learning
- Abstract
Hierarchical clustering is a recursive partitioning of a dataset into clusters at an increasingly finer granularity. Motivated by the fact that most work on hierarchical clustering was based on providing algorithms, rather than optimizing a specific objective, Dasgupta framed similarity-based hierarchical clustering as a combinatorial optimization problem, where a `good' hierarchical clustering is one that minimizes some cost function. He showed that this cost function has certain desirable properties. We take an axiomatic approach to defining `good' objective functions for both similarity- and dissimilarity-based hierarchical clustering. We characterize a set of "admissible" objective functions (that includes Dasgupta's) that have the property that when the input admits a `natural' hierarchical clustering, it has an optimal value. Equipped with a suitable objective function, we analyze the performance of practical algorithms, as well as develop better algorithms. For similarity-based hierarchical clustering, Dasgupta showed that the divisive sparsest-cut approach achieves an $O(\log^{3/2} n)$-approximation. We give a refined analysis of the algorithm and show that it in fact achieves an $O(\sqrt{\log n})$-approximation (Charikar and Chatziafratis independently proved the same bound). This improves upon the LP-based $O(\log n)$-approximation of Roy and Pokutta. For dissimilarity-based hierarchical clustering, we show that the classic average-linkage algorithm gives a factor-2 approximation, and provide a simple and better algorithm that gives a factor-3/2 approximation. Finally, we consider a `beyond-worst-case' scenario through a generalisation of the stochastic block model for hierarchical clustering. We show that Dasgupta's cost function has desirable properties for these inputs and we provide a simple $(1+o(1))$-approximation in this setting.
- Published
- 2017
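The classic average-linkage algorithm analyzed in the paper can be sketched as follows (a quadratic-ish toy implementation; the merge tree is returned as nested tuples of input indices):

```python
def average_linkage(points, dist):
    """Agglomerative average-linkage: repeatedly merge the two clusters
    with the smallest average pairwise dissimilarity."""
    clusters = [(frozenset([i]), i) for i in range(len(points))]
    def avg(a, b):
        return (sum(dist(points[i], points[j]) for i in a for j in b)
                / (len(a) * len(b)))
    while len(clusters) > 1:
        ia, ib = min(
            ((x, y) for x in range(len(clusters))
                    for y in range(x + 1, len(clusters))),
            key=lambda pair: avg(clusters[pair[0]][0], clusters[pair[1]][0]))
        (sa, ta), (sb, tb) = clusters[ia], clusters[ib]
        clusters = [c for k, c in enumerate(clusters) if k not in (ia, ib)]
        clusters.append((sa | sb, (ta, tb)))   # record the merge in the tree
    return clusters[0][1]
```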
27. Optimization of Bootstrapping in Circuits
- Author
- Benhamouda, Fabrice, Lepoint, Tancrède, Mathieu, Claire, and Zhou, Hang
- Subjects
Computer Science - Data Structures and Algorithms, Computer Science - Cryptography and Security
- Abstract
In 2009, Gentry proposed the first Fully Homomorphic Encryption (FHE) scheme, an extremely powerful cryptographic primitive that makes it possible to perform computations, i.e., to evaluate circuits, on encrypted data without decrypting them first. This has many applications, in particular in cloud computing. In all currently known FHE schemes, encryptions are associated to some (non-negative integer) noise level, and at each evaluation of an AND gate, the noise level increases. This is problematic because decryption can only work if the noise level stays below some maximum level $L$ at every gate of the circuit. To ensure that property, it is possible to perform an operation called \emph{bootstrapping} to reduce the noise level. However, bootstrapping is time-consuming and has been identified as a critical operation. This motivates a new problem in discrete optimization, that of choosing where in the circuit to perform bootstrapping operations so as to control the noise level; the goal is to minimize the number of bootstrappings in circuits. In this paper, we formally define the \emph{bootstrap problem}, we design a polynomial-time $L$-approximation algorithm using a novel method of rounding of a linear program, and we show a matching hardness result: $(L-\epsilon)$-inapproximability for any $\epsilon>0$.
- Published
- 2016
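For intuition about the bootstrap problem, here is a sketch for the special case of a chain of AND gates, under the noise model described in the abstract. The conventions that noise starts at 1 and that a bootstrap resets it to 1 are assumptions made here for illustration; the general circuit case is what requires the paper's LP rounding:

```python
def min_bootstraps_chain(num_gates, L, start_noise=1):
    """Greedy bootstrap placement on a chain of AND gates.

    Noise model (from the abstract): each AND gate raises the noise
    level by 1, decryption needs noise <= L at every gate, and a
    bootstrap resets the noise (here: back to 1, an assumed convention).
    On a chain, bootstrapping as late as possible is optimal.
    """
    noise, bootstraps = start_noise, 0
    for _ in range(num_gates):
        if noise + 1 > L:        # would exceed the ceiling: bootstrap first
            noise = 1
            bootstraps += 1
        noise += 1
    return bootstraps
```

With a chain of 10 AND gates and ceiling L = 3, the greedy bootstraps every second gate after the first two, for 4 bootstrappings in total.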
28. Skyline Computation with Noisy Comparisons
- Author
-
Groz, Benoît, Mallmann-Trenn, Frederik, Mathieu, Claire, Verdugo, Victor, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Gąsieniec, Leszek, editor, Klasing, Ralf, editor, and Radzik, Tomasz, editor
- Published
- 2020
- Full Text
- View/download PDF
29. Local search yields approximation schemes for k-means and k-median in Euclidean and minor-free metrics
- Author
-
Cohen-Addad, Vincent, Klein, Philip N., and Mathieu, Claire
- Subjects
Computer Science - Data Structures and Algorithms ,Computer Science - Computational Geometry - Abstract
We give the first polynomial-time approximation schemes (PTASs) for the following problems: (1) uniform facility location in edge-weighted planar graphs; (2) $k$-median and $k$-means in edge-weighted planar graphs; (3) $k$-means in Euclidean spaces of bounded dimension. Our first and second results extend to minor-closed families of graphs. All our results extend to cost functions that are the $p$-th power of the shortest-path distance. The algorithm is local search where the local neighborhood of a solution $S$ consists of all solutions obtained from $S$ by removing and adding $1/\epsilon^{O(1)}$ centers.
- Published
- 2016
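The local-search template the abstract describes can be sketched in its simplest form, single swaps for $k$-median. The paper's approximation schemes use neighborhoods of size $1/\epsilon^{O(1)}$, so this single-swap version is an illustrative baseline, not the full algorithm:

```python
def local_search_kmedian(points, k, dist):
    """Single-swap local search for k-median.

    Starts from an arbitrary set of k centers and repeatedly applies
    any swap (drop one center, add one non-center) that lowers the
    total connection cost, until no single swap improves.
    """
    def cost(centers):
        return sum(min(dist(p, c) for c in centers) for p in points)

    centers = set(points[:k])
    improved = True
    while improved:
        improved = False
        for c_out in list(centers):
            for c_in in points:
                if c_in in centers:
                    continue
                cand = (centers - {c_out}) | {c_in}
                if cost(cand) < cost(centers):
                    centers, improved = cand, True
                    break            # restart the scan after any improvement
            if improved:
                break
    return centers

# Two well-separated 1-D clusters, k = 2: local search finds one
# center per cluster.
pts = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
centers = local_search_kmedian(pts, 2, lambda a, b: abs(a - b))
```

Each improving swap strictly decreases the cost and there are finitely many center sets, so the loop terminates at a local optimum.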
30. The Unreasonable Success of Local Search: Geometric Optimization
- Author
-
Cohen-Addad, Vincent and Mathieu, Claire
- Subjects
Computer Science - Computational Geometry ,Computer Science - Data Structures and Algorithms - Abstract
What is the effectiveness of local search algorithms for geometric problems in the plane? We prove that local search with neighborhoods of magnitude $1/\epsilon^c$ is an approximation scheme for the following problems in the Euclidean plane: TSP with random inputs, Steiner tree with random inputs, facility location (with worst case inputs), and bicriteria $k$-median (also with worst case inputs). The randomness assumption is necessary for TSP.
- Published
- 2014
31. Mixed preferential attachment model: Homophily and minorities in social networks
- Author
-
Avin, Chen, Daltrophe, Hadassa, Keller, Barbara, Lotker, Zvi, Mathieu, Claire, Peleg, David, and Pignolet, Yvonne-Anne
- Published
- 2020
- Full Text
- View/download PDF
32. Dynamic Clustering to Minimize the Sum of Radii
- Author
-
Henzinger, Monika, Leniowski, Dariusz, and Mathieu, Claire
- Published
- 2020
- Full Text
- View/download PDF
33. Bigtable Merge Compaction
- Author
-
Mathieu, Claire, Staelin, Carl, Young, Neal E., and Yousefi, Arman
- Subjects
Computer Science - Data Structures and Algorithms ,68W27, 68P15, 68R05 ,F.1.2 ,H.2.4 - Abstract
NoSQL databases are widely used for massive data storage and real-time web applications. Yet important aspects of these data structures are not well understood. For example, NoSQL databases write most of their data to a collection of files on disk, while periodically compacting subsets of these files. A compaction policy must choose which files to compact, and when to compact them, without knowing the future workload. Although these choices can affect computational efficiency by orders of magnitude, existing literature lacks tools for designing and analyzing online compaction policies; policies are now chosen largely by trial and error. Here we introduce tools for the design and analysis of compaction policies for Google Bigtable, propose new policies, give average-case and worst-case competitive analyses, and present preliminary empirical benchmarks.
- Published
- 2014
34. Lower bounds for testing digraph connectivity with one-pass streaming algorithms
- Author
-
Borradaile, Glencora, Mathieu, Claire, and Migler, Theresa
- Subjects
Computer Science - Data Structures and Algorithms ,Computer Science - Computational Complexity - Abstract
In this note, we show that three graph properties (strong connectivity, acyclicity, and reachability from a vertex $s$ to all vertices) each require a working memory of $\Omega(\epsilon m)$ on a graph with $m$ edges to be determined correctly with probability greater than $(1+\epsilon)/2$., Comment: Added some references to previous work, removed the part of the result that was already known before, and changed the label of the result from "Theorem" to "Lemma"
- Published
- 2014
35. Facility Location in Evolving Metrics
- Author
-
Eisenstat, David, Mathieu, Claire, and Schabanel, Nicolas
- Subjects
Computer Science - Social and Information Networks ,Computer Science - Data Structures and Algorithms - Abstract
Understanding the dynamics of evolving social or infrastructure networks is a challenge in applied areas such as epidemiology, viral marketing, or urban planning. During the past decade, data has been collected on such networks but has yet to be fully analyzed. We propose to use information on the dynamics of the data to find stable partitions of the network into groups. For that purpose, we introduce a time-dependent, dynamic version of the facility location problem, that includes a switching cost when a client's assignment changes from one facility to another. This might provide a better representation of an evolving network, emphasizing the abrupt change of relationships between subjects rather than the continuous evolution of the underlying network. We show that in realistic examples this model indeed yields better-fitting solutions than optimizing every snapshot independently. We present an $O(\log nT)$-approximation algorithm and a matching hardness result, where $n$ is the number of clients and $T$ the number of time steps. We also give another algorithm with approximation ratio $O(\log nT)$ for the variant where one pays at each time step (leasing) for each open facility.
- Published
- 2014
36. Energy-efficient algorithms for non-preemptive speed-scaling
- Author
-
Cohen-Addad, Vincent, Li, Zhentao, Mathieu, Claire, and Milis, Ioannis
- Subjects
Computer Science - Data Structures and Algorithms - Abstract
We improve complexity bounds for energy-efficient speed scheduling problems for both the single processor and multi-processor cases. Energy conservation has become a major concern, so revisiting traditional scheduling problems to take into account the energy consumption has been part of the agenda of the scheduling community for the past few years. We consider the energy minimizing speed scaling problem introduced by Yao et al. where we wish to schedule a set of jobs, each with a release date, deadline and work volume, on a set of identical processors. The processors may change speed as a function of time, and the energy a processor consumes is the $\alpha$th power of its speed. The objective is then to find a feasible schedule which minimizes the total energy used. We show that in the setting with an arbitrary number of processors where all work volumes are equal, there is a $2(1+\varepsilon)(5(1+\varepsilon))^{\alpha -1}\tilde{B}_{\alpha}=O_{\alpha}(1)$ approximation algorithm, where $\tilde{B}_{\alpha}$ is the generalized Bell number. This is the first constant factor algorithm for this problem. This algorithm extends to general unequal processor-dependent work volumes, up to losing a factor of $(\frac{(1+r)r}{2})^{\alpha}$ in the approximation, where $r$ is the maximum ratio between two work volumes. We then show this latter problem is APX-hard, even in the special case when all release dates and deadlines are equal and $r$ is 4. In the single processor case, we introduce a new linear programming formulation of speed scaling and prove that its integrality gap is at most $12^{\alpha -1}$. As a corollary, we obtain a $(12(1+\varepsilon))^{\alpha -1}$ approximation algorithm where there is a single processor, improving on the previous best bound of $2^{\alpha-1}(1+\varepsilon)^{\alpha}\tilde{B}_{\alpha}$ when $\alpha \ge 25$.
- Published
- 2014
37. Near-Linear Query Complexity for Graph Inference
- Author
-
Kannan, Sampath, Mathieu, Claire, and Zhou, Hang
- Subjects
Computer Science - Data Structures and Algorithms - Abstract
How efficiently can we find an unknown graph using distance or shortest path queries between its vertices? Let $G = (V,E)$ be an unweighted, connected graph of bounded degree. The edge set $E$ is initially unknown, and the graph can be accessed using a \emph{distance oracle}, which receives a pair of vertices $(u,v)$ and returns the distance between $u$ and $v$. In the \emph{verification} problem, we are given a hypothetical graph $\hat G = (V,\hat E)$ and want to check whether $G$ is equal to $\hat G$. We analyze a natural greedy algorithm and prove that it uses $n^{1+o(1)}$ distance queries. In the more difficult \emph{reconstruction} problem, $\hat G$ is not given, and the goal is to find the graph $G$. If the graph can be accessed using a \emph{shortest path oracle}, which returns not just the distance but an actual shortest path between $u$ and $v$, we show that extending the idea of greedy gives a reconstruction algorithm that uses $n^{1+o(1)}$ shortest path queries. When the graph has bounded treewidth, we further bound the query complexity of the greedy algorithms for both problems by $\tilde O(n)$. When the graph is chordal, we provide a randomized algorithm for reconstruction using $\tilde O(n)$ distance queries.
- Published
- 2014
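A naive baseline for the verification problem makes one oracle query per vertex pair, $\Theta(n^2)$ queries in total; the paper's greedy verifier reduces this to $n^{1+o(1)}$. The sketch below builds a toy distance oracle from a known hidden graph via BFS; the harness and graphs are invented for illustration, not the paper's setup:

```python
from collections import deque

def bfs_distances(adj, s):
    """All shortest-path distances from s in an unweighted graph."""
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def verify(adj_hat, oracle, vertices):
    """Check G == G_hat by comparing every pairwise distance.

    This naive verifier spends Theta(n^2) oracle queries; the paper's
    greedy verifier needs only n^(1+o(1)) for bounded-degree graphs.
    """
    for s in vertices:
        dist_hat = bfs_distances(adj_hat, s)
        for t in vertices:
            if oracle(s, t) != dist_hat.get(t, float("inf")):
                return False
    return True

# Hidden graph G: a 4-cycle. The oracle answers from G's true distances.
G = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
oracle = lambda u, v: bfs_distances(G, u)[v]

path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}     # a wrong hypothesis
ok = verify(G, oracle, list(G))
bad = verify(path, oracle, list(G))
```

The wrong hypothesis is caught because the path puts vertices 0 and 3 at distance 3 while the oracle reports distance 1.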
38. Competitive Data-Structure Dynamization
- Author
-
Mathieu, Claire, primary, Rajaraman, Rajmohan, additional, Young, Neal E., additional, and Yousefi, Arman, additional
- Published
- 2021
- Full Text
- View/download PDF
39. Two-Sided Matching Markets with Strongly Correlated Preferences
- Author
-
Gimbert, Hugo, primary, Mathieu, Claire, additional, and Mauras, Simon, additional
- Published
- 2021
- Full Text
- View/download PDF
40. First-Come-First-Served for Online Slot Allocation and Huffman Coding
- Author
-
Khare, Monik, Mathieu, Claire, and Young, Neal E.
- Subjects
Computer Science - Data Structures and Algorithms ,Computer Science - Information Theory ,68W40, 68Q87 ,F.1.2 ,F.2.0 ,H.1.1 - Abstract
Can one choose a good Huffman code on the fly, without knowing the underlying distribution? Online Slot Allocation (OSA) models this and similar problems: There are n slots, each with a known cost. There are n items. Requests for items are drawn i.i.d. from a fixed but hidden probability distribution p. After each request, if the item, i, was not previously requested, then the algorithm (knowing the slot costs and the requests so far, but not p) must place the item in some vacant slot j(i). The goal is to minimize the sum, over the items, of the probability of the item times the cost of its assigned slot. The optimal offline algorithm is trivial: put the most probable item in the cheapest slot, the second most probable item in the second cheapest slot, etc. The optimal online algorithm is First Come First Served (FCFS): put the first requested item in the cheapest slot, the second (distinct) requested item in the second cheapest slot, etc. The optimal competitive ratios for any online algorithm are 1+H(n-1) ~ ln n for general costs and 2 for concave costs. For logarithmic costs, the ratio is, asymptotically, 1: FCFS gives cost opt + O(log opt). For Huffman coding, FCFS yields an online algorithm (one that allocates codewords on demand, without knowing the underlying probability distribution) that guarantees asymptotically optimal cost: at most opt + 2 log(1+opt) + 2., Comment: ACM-SIAM Symposium on Discrete Algorithms (SODA) 2014
- Published
- 2013
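The FCFS rule itself is easy to state in code; this sketch (with invented slot costs and requests) assigns the i-th distinct requested item to the i-th cheapest slot:

```python
def fcfs_allocate(requests, slot_costs):
    """First Come First Served: the i-th distinct item requested
    gets the i-th cheapest slot (costs known up front, p hidden)."""
    order = sorted(range(len(slot_costs)), key=lambda j: slot_costs[j])
    placement, seen = {}, 0
    for item in requests:
        if item not in placement:
            placement[item] = order[seen]
            seen += 1
    return placement

# Toy run: 3 slots with costs 1, 5, 2; requests reveal items b, a, c in
# that order, so b gets the cheapest slot (index 0), a the next
# cheapest (index 2), and c the most expensive (index 1).
costs = [1, 5, 2]
placement = fcfs_allocate(["b", "a", "b", "c"], costs)
```

The offline optimum would instead sort items by their (hidden) probabilities; FCFS uses arrival order as a proxy for probability, which is what the competitive analysis in the abstract quantifies.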
41. Graph Reconstruction via Distance Oracles
- Author
-
Mathieu, Claire and Zhou, Hang
- Subjects
Computer Science - Data Structures and Algorithms - Abstract
We study the problem of reconstructing a hidden graph given access to a distance oracle. We design randomized algorithms for the following problems: reconstruction of a degree bounded graph with query complexity $\tilde{O}(n^{3/2})$; reconstruction of a degree bounded outerplanar graph with query complexity $\tilde{O}(n)$; and near-optimal approximate reconstruction of a general graph.
- Published
- 2013
42. A polynomial-time approximation scheme for Euclidean Steiner forest
- Author
-
Borradaile, Glencora, Klein, Philip, and Mathieu, Claire
- Subjects
Computer Science - Computational Geometry ,Computer Science - Data Structures and Algorithms - Abstract
We give a randomized O(n polylog n)-time approximation scheme for the Steiner forest problem in the Euclidean plane. For every fixed eps > 0 and given n terminals in the plane with connection requests between some pairs of terminals, our scheme finds a (1 + eps)-approximation to the minimum-length forest that connects every requested pair of terminals., Comment: This version is more recent than that appearing in the FOCS proceedings. The partition step has been corrected and the overall presentation has been clarified and formalized. This paper has been accepted to TALG
- Published
- 2013
43. The min mean-weight cycle in a random network
- Author
-
Mathieu, Claire and Wilson, David B.
- Subjects
Mathematics - Probability ,Computer Science - Data Structures and Algorithms ,05C80, 68Q87 - Abstract
The mean weight of a cycle in an edge-weighted graph is the sum of the cycle's edge weights divided by the cycle's length. We study the minimum mean-weight cycle on the complete graph on n vertices, with random i.i.d. edge weights drawn from an exponential distribution with mean 1. We show that the probability of the min mean weight being at most c/n tends to a limiting function of c which is analytic for c<=1/e, discontinuous at c=1/e, and equal to 1 for c>1/e. We further show that if the min mean weight is <=1/(en), then the length of the relevant cycle is Theta_p(1) (i.e., it has a limiting probability distribution which does not scale with n), but that if the min mean weight is >1/(en), then the relevant cycle almost always has mean weight (1+o(1))/(en) and length at least (2/pi^2-o(1)) log^2 n log log n., Comment: 21 pages, 1 figure
- Published
- 2012
- Full Text
- View/download PDF
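The paper studies the min mean-weight cycle probabilistically; on a concrete graph, that quantity is classically computed with Karp's dynamic program, a standard tool that is not part of this paper. A sketch:

```python
def min_mean_cycle(n, edges):
    """Karp's algorithm: minimum mean weight of a cycle.

    edges: list of (u, v, w) on vertices 0..n-1. Returns None if the
    graph is acyclic. F[k][v] = min weight of a walk with exactly k
    edges ending at v (starting anywhere: all F[0][v] = 0, the usual
    trick for graphs that are not strongly connected). Then
    min mean cycle = min over v of max over k of (F[n][v]-F[k][v])/(n-k).
    """
    INF = float("inf")
    F = [[INF] * n for _ in range(n + 1)]
    F[0] = [0.0] * n
    for k in range(1, n + 1):
        for u, v, w in edges:
            if F[k - 1][u] + w < F[k][v]:
                F[k][v] = F[k - 1][u] + w
    best = None
    for v in range(n):
        if F[n][v] == INF:
            continue
        worst = max((F[n][v] - F[k][v]) / (n - k)
                    for k in range(n) if F[k][v] < INF)
        if best is None or worst < best:
            best = worst
    return best

# Two cycles: 0<->1 with mean weight 1, and 0->2->0 with mean 2.5.
edges = [(0, 1, 1.0), (1, 0, 1.0), (0, 2, 1.0), (2, 0, 4.0)]
mmc = min_mean_cycle(3, edges)
```

The run takes O(nm) time, which is fine for the small instances one would use to sanity-check the paper's asymptotics experimentally.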
44. Maximum Matching in Semi-Streaming with Few Passes
- Author
-
Konrad, Christian, Magniez, Frédéric, and Mathieu, Claire
- Subjects
Computer Science - Data Structures and Algorithms - Abstract
In the semi-streaming model, an algorithm receives a stream of edges of a graph in arbitrary order and uses a memory of size $O(n \mbox{ polylog } n)$, where $n$ is the number of vertices of a graph. In this work, we present semi-streaming algorithms that perform one or two passes over the input stream for maximum matching with no restrictions on the input graph, and for the important special case of bipartite graphs that we refer to as maximum bipartite matching (MBM). The Greedy matching algorithm performs one pass over the input and outputs a $1/2$ approximation. Whether there is a better one-pass algorithm has been an open question since the appearance of the first paper on streaming algorithms for matching problems in 2005 [Feigenbaum et al., SODA 2005]. We make the following progress on this problem: In the one-pass setting, we show that there is a deterministic semi-streaming algorithm for MBM with expected approximation factor $1/2+0.005$, assuming that edges arrive one by one in (uniform) random order. We extend this algorithm to general graphs, and we obtain a $1/2+0.003$ approximation. In the two-pass setting, we do not require the random arrival order assumption (the edge stream is in arbitrary order). We present a simple randomized two-pass semi-streaming algorithm for MBM with expected approximation factor $1/2 + 0.019$. Furthermore, we discuss a more involved deterministic two-pass semi-streaming algorithm for MBM with approximation factor $1/2 + 0.019$ and a generalization of this algorithm to general graphs with approximation factor $1/2 + 0.0071$., Comment: Algorithms for general graphs have been added
- Published
- 2011
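The 1/2-approximate Greedy baseline mentioned in the abstract is one line of logic: keep a streamed edge iff both endpoints are currently unmatched. The path example below (invented) shows why 1/2 is tight for Greedy:

```python
def greedy_matching(edge_stream):
    """One-pass greedy: keep an edge iff both endpoints are free.

    Classical 1/2-approximation baseline that the paper's one- and
    two-pass algorithms improve on.
    """
    matched, matching = set(), []
    for u, v in edge_stream:
        if u not in matched and v not in matched:
            matched.update((u, v))
            matching.append((u, v))
    return matching

# Path a-b-c-d streamed middle edge first: greedy keeps only (b, c),
# while the optimum matching {(a, b), (c, d)} has two edges.
m = greedy_matching([("b", "c"), ("a", "b"), ("c", "d")])
```

Since the adversary controls arrival order, beating 1/2 in one pass requires extra assumptions such as the random arrival order used in the paper.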
45. An efficient polynomial-time approximation scheme for Steiner forest in planar graphs
- Author
-
Eisenstat, David, Klein, Philip, and Mathieu, Claire
- Subjects
Computer Science - Data Structures and Algorithms - Abstract
We give an $O(n \log^3 n)$ approximation scheme for Steiner forest in planar graphs, improving on the previous approximation scheme for this problem, which runs in $O(n^{f(\epsilon)})$ time., Comment: added material on balanced branch decompositions; fixed theorem references
- Published
- 2011
46. Integrality Gaps of Linear and Semi-definite Programming Relaxations for Knapsack
- Author
-
Karlin, Anna R., Mathieu, Claire, and Nguyen, C. Thach
- Subjects
Computer Science - Computational Complexity ,Computer Science - Discrete Mathematics ,Mathematics - Optimization and Control - Abstract
In this paper, we study the integrality gap of the Knapsack linear program in the Sherali-Adams and Lasserre hierarchies. First, we show that an integrality gap of 2 - {\epsilon} persists up to a linear number of rounds of Sherali-Adams, despite the fact that Knapsack admits a fully polynomial time approximation scheme [27,33]. Second, we show that the Lasserre hierarchy closes the gap quickly. Specifically, after t rounds of Lasserre, the integrality gap decreases to t/(t - 1). To the best of our knowledge, this is the first positive result that uses more than a small number of rounds in the Lasserre hierarchy. Our proof uses a decomposition theorem for the Lasserre hierarchy, which may be of independent interest., Comment: 15 pages
- Published
- 2010
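The 2 - epsilon integrality gap of the basic Knapsack LP already shows up on two unit-value items of size just over half the capacity. The sketch below compares the fractional (greedy-by-density) and integral optima on such an instance; this is a standard textbook example, not taken from the paper:

```python
from itertools import combinations

def knapsack_lp(values, sizes, cap):
    """Greedy-by-density fractional optimum of the knapsack LP."""
    items = sorted(range(len(values)),
                   key=lambda i: values[i] / sizes[i], reverse=True)
    total, room = 0.0, float(cap)
    for i in items:
        take = min(1.0, room / sizes[i])
        total += take * values[i]
        room -= take * sizes[i]
        if room <= 0:
            break
    return total

def knapsack_ip(values, sizes, cap):
    """Brute-force integral optimum (fine for tiny instances)."""
    n = len(values)
    best = 0
    for r in range(n + 1):
        for S in combinations(range(n), r):
            if sum(sizes[i] for i in S) <= cap:
                best = max(best, sum(values[i] for i in S))
    return best

# Gap-2 family: unit values, sizes just over half the capacity.
# LP packs almost two items fractionally; any integral solution
# fits only one, so the ratio approaches 2 as sizes approach cap/2.
values, sizes, cap = [1, 1], [0.51, 0.51], 1.0
lp, ip = knapsack_lp(values, sizes, cap), knapsack_ip(values, sizes, cap)
```

The abstract's point is that Sherali-Adams barely improves on this basic gap, while a few Lasserre rounds shrink it to t/(t - 1).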
47. Online Correlation Clustering
- Author
-
Mathieu, Claire, Sankur, Ocan, and Schudy, Warren
- Subjects
Computer Science - Data Structures and Algorithms ,F.2.2 - Abstract
We study the online clustering problem where data items arrive in an online fashion. The algorithm maintains a clustering of data items into similarity classes. Upon arrival of v, the relation between v and previously arrived items is revealed, so that for each u we are told whether v is similar to u. The algorithm can create a new cluster for v and merge existing clusters. When the objective is to minimize disagreements between the clustering and the input, we prove that a natural greedy algorithm is O(n)-competitive, and this is optimal. When the objective is to maximize agreements between the clustering and the input, we prove that the greedy algorithm is .5-competitive and that no online algorithm can be better than .834-competitive; we show that it is possible to do better than 1/2 by exhibiting a randomized algorithm with competitive ratio .5+c for a small positive fixed constant c., Comment: 12 pages, 1 figure
- Published
- 2010
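One way to make the online setting concrete is the following greedy, which places each arriving item in the existing cluster with the best net similarity score, or in a fresh singleton if no cluster scores positively. This is a guess at a "natural greedy" in the spirit of the abstract, not necessarily the exact rule the paper analyzes:

```python
def greedy_cluster(items, similar):
    """Online greedy clustering.

    Upon arrival of v, put v in the existing cluster maximizing
    (number of similar members) - (number of dissimilar members),
    or in a fresh singleton if no cluster gives a positive score.
    """
    clusters = []
    for v in items:
        best, best_score = None, 0
        for c in clusters:
            score = sum(1 if similar(v, u) else -1 for u in c)
            if score > best_score:
                best, best_score = c, score
        if best is None:
            clusters.append([v])
        else:
            best.append(v)
    return clusters

# Two groups of mutually similar items, {1,2,3} and {4,5}, with no
# similarity across groups: the greedy recovers both groups.
sim = lambda u, v: (u <= 3) == (v <= 3)
result = greedy_cluster([1, 2, 3, 4, 5], sim)
```

On inputs without such clean cluster structure, this greedy can be forced into many disagreements, which is the intuition behind the O(n) competitive ratio.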
48. Maximizing profit using recommender systems
- Author
-
Das, Aparna, Mathieu, Claire, and Ricketts, Daniel
- Subjects
Computer Science - Computers and Society ,Computer Science - Artificial Intelligence ,Computer Science - Information Retrieval - Abstract
Traditional recommendation systems make recommendations based solely on the customer's past purchases, product ratings, and demographic data, without considering the profitability of the items being recommended. In this work we study the question of how a vendor can directly incorporate the profitability of items into its recommender so as to maximize its expected profit while still providing accurate recommendations. Our approach uses the output of any traditional recommender system and adjusts it according to item profitabilities. Our approach is parameterized so the vendor can control how much the profit-incorporating recommendations may deviate from the traditional ones. We study our approach under two settings and show that it achieves approximately 22% more profit than traditional recommendations.
- Published
- 2009
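A simple way to parameterize the trade-off the abstract describes is a linear blend of normalized relevance and profit scores. This stand-in formula and the toy numbers below are illustrative assumptions, not the paper's actual adjustment:

```python
def profit_adjusted_scores(relevance, profit, alpha):
    """Blend relevance with profitability.

    score_i = (1 - alpha) * relevance_i + alpha * profit_i, with both
    inputs normalized to [0, 1]. alpha = 0 reproduces the traditional
    recommender's ranking; larger alpha lets recommendations deviate
    more toward profitable items. This linear blend is an assumed
    stand-in for the paper's parameterized adjustment.
    """
    norm = lambda xs: [x / max(xs) for x in xs]
    r, p = norm(relevance), norm(profit)
    return [(1 - alpha) * ri + alpha * pi for ri, pi in zip(r, p)]

relevance = [0.9, 0.8, 0.3]        # traditional recommender's scores
profit = [1.0, 5.0, 4.0]           # vendor profit per item
scores = profit_adjusted_scores(relevance, profit, 0.5)
ranked = sorted(range(3), key=lambda i: scores[i], reverse=True)
```

With alpha = 0.5 the highly profitable second item overtakes the most relevant one, while the barely relevant third item still ranks last.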
49. First Come First Served for Online Slot Allocation and Huffman Coding
- Author
-
Khare, Monik, Mathieu, Claire, and Young, Neal E
- Subjects
cs.DS ,cs.IT ,math.IT ,68W40 ,68Q87 ,F.1.2 ,F.2.0 ,H.1.1 - Abstract
Can one choose a good Huffman code on the fly, without knowing the underlying distribution? Online Slot Allocation (OSA) models this and similar problems: There are n slots, each with a known cost. There are n items. Requests for items are drawn i.i.d. from a fixed but hidden probability distribution p. After each request, if the item, i, was not previously requested, then the algorithm (knowing the slot costs and the requests so far, but not p) must place the item in some vacant slot j(i), at cost p(i)c(j(i)). The goal is to minimize the total cost. The optimal offline algorithm is trivial: put the most probable item in the cheapest slot, the second most probable item in the second cheapest slot, etc. The optimal online algorithm is First Come First Served (FCFS): put the first requested item in the cheapest slot, the second (distinct) requested item in the second cheapest slot, etc. The optimal competitive ratios for any online algorithm are 1 + H(n-1) ~ ln n for general costs and 2 for concave costs. For logarithmic costs, the ratio is, asymptotically, 1: FCFS gives cost OPT + O(log OPT). For Huffman coding, FCFS yields an online algorithm (one that allocates codewords on demand, without knowing the underlying probability distribution) that guarantees asymptotically optimal cost: at most OPT + 2 log_2(1 + OPT) + 2. Copyright © 2014 by the Society for Industrial and Applied Mathematics.
- Published
- 2014
50. A quasi-polynomial time approximation scheme for Euclidean capacitated vehicle routing
- Author
-
Das, Aparna and Mathieu, Claire
- Subjects
Computer Science - Discrete Mathematics ,Computer Science - Data Structures and Algorithms - Abstract
In the capacitated vehicle routing problem, introduced by Dantzig and Ramser in 1959, we are given the locations of n customers and a depot, along with a vehicle of capacity k, and wish to find a minimum length collection of tours, each starting from the depot and visiting at most k customers, whose union covers all the customers. We give a quasi-polynomial time approximation scheme for the setting where the customers and the depot are on the plane, and distances are given by the Euclidean metric.
- Published
- 2008