Author: "Dan Feldman" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Dan Feldman"' showing total 133 results

Start Over Author "Dan Feldman"

133 results on '"Dan Feldman"'

101. k-Means+++: Outliers-Resistant Clustering

Author: Dan Feldman, Liat Rozenberg, and Adiel Statman
Subjects: 0209 industrial biotechnology, lcsh:T55.4-60.8, Scale (descriptive set theory), 02 engineering and technology, numerical_analysis_optimization, lcsh:QA75.5-76.95, Theoretical Computer Science, Combinatorics, Reduction (complexity), 020901 industrial engineering & automation, Approximation error, 0202 electrical engineering, electronic engineering, information engineering, lcsh:Industrial engineering. Management engineering, Cluster analysis, approximation, Mathematics, Numerical Analysis, outliers, k-means clustering, algebra_number_theory, Computational Mathematics, Metric space, Computational Theory and Mathematics, Outlier, 020201 artificial intelligence & image processing, lcsh:Electronic computers. Computer science, Constant (mathematics), clustering
Abstract: The k-means problem is to compute a set of k centers (points) that minimizes the sum of squared distances to a given set of n points in a metric space. Arguably, the most common algorithm to solve it is k-means++ which is easy to implement and provides a provably small approximation error in time that is linear in n. We generalize k-means++ to support outliers in two sense (simultaneously): (i) nonmetric spaces, e.g., M-estimators, where the distance dist(p,x) between a point p and a center x is replaced by mindist(p,x),c for an appropriate constant c that may depend on the scale of the input. (ii) k-means clustering with m&ge, 1 outliers, i.e., where the m farthest points from any given k centers are excluded from the total sum of distances. This is by using a simple reduction to the (k+m)-means clustering (with no outliers).
Published: 2020

102. Real-time EEG classification via coresets for BCI applications

Author: Eitan Netzer, Dan Feldman, and Alex Frid
Subjects: Signal Processing (eess.SP), FOS: Computer and information sciences, Computer Science - Machine Learning, 0209 industrial biotechnology, Computer science, Pipeline (computing), Interface (computing), Feature extraction, 02 engineering and technology, Machine Learning (cs.LG), 020901 industrial engineering & automation, Artificial Intelligence, Computer Science - Data Structures and Algorithms, FOS: Electrical engineering, electronic engineering, information engineering, 0202 electrical engineering, electronic engineering, information engineering, Data Structures and Algorithms (cs.DS), Electrical Engineering and Systems Science - Signal Processing, Electrical and Electronic Engineering, Brain–computer interface, business.industry, Pattern recognition, Computational geometry, Automatic summarization, Control and Systems Engineering, 020201 artificial intelligence & image processing, Artificial intelligence, Coreset, business
Abstract: A brain-computer interface (BCI) based on the motor imagery (MI) paradigm translates one's motor intention into a control signal by classifying the Electroencephalogram (EEG) signal of different tasks. However, most existing systems either (i) use a high-quality algorithm to train the data off-line and run only classification in real-time, since the off-line algorithm is too slow, or (ii) use low-quality heuristics that are sufficiently fast for real-time training but introduces relatively large classification error. In this work, we propose a novel processing pipeline that allows real-time and parallel learning of EEG signals using high-quality but possibly inefficient algorithms. This is done by forging a link between BCI and core-sets, a technique that originated in computational geometry for handling streaming data via data summarization. We suggest an algorithm that maintains the representation such coreset tailored to handle the EEG signal which enables: (i) real time and continuous computation of the Common Spatial Pattern (CSP) feature extraction method on a coreset representation of the signal (instead on the signal itself) , (ii) improvement of the CSP algorithm efficiency with provable guarantees by applying CSP algorithm on the coreset, and (iii) real time addition of the data trials (EEG data windows) to the coreset. For simplicity, we focus on the CSP algorithm, which is a classic algorithm. Nevertheless, we expect that our coreset will be extended to other algorithms in future papers. In the experimental results we show that our system can indeed learn EEG signals in real-time for example a 64 channels setup with hundreds of time samples per second. Full open source is provided to reproduce the experiment and in the hope that it will be used and extended to more coresets and BCI applications in the future.
Published: 2020

103. Magnesium Deficiency and Minimal Hepatic Encephalopathy among Patients with Compensated Liver Cirrhosis

Author: Keren, Cohen-Hagai, Dan, Feldman, Tirza, Turani-Feldman, Ruth, Hadary, Shilo, Lotan, and Yona, Kitay-Cohen
Subjects: Liver Cirrhosis, Male, Double-Blind Method, Hepatic Encephalopathy, Humans, Female, Prospective Studies, Middle Aged, Neuropsychological Tests, Cognition Disorders, Magnesium Oxide, Magnesium Deficiency
Abstract: Magnesium is an essential intracellular cation. Magnesium deficiency is common in the general population and its prevalence among patients with cirrhosis is even higher. Correlation between serum levels and total body content is poor because most magnesium is intracellular. Minimal hepatic encephalopathy is a subclinical phase of hepatic encephalopathy with no overt symptoms. Cognitive exams can reveal minor changes in coordination, attention, and visuomotor function, whereas language and verbal intelligence are usually relatively spared.To assess the correlation between intracellular and serum magnesium levels and minimal hepatic encephalopathy.Outpatients with a diagnosis of compensated liver cirrhosis were enrolled in this randomized, double-blinded study. Patients were recruited for the study from November 2013 to January 2014, and were randomly assigned to a control (placebo) or an interventional (treated with magnesium oxide) group. Serum and intracellular magnesium levels were measured at enrollment and at the end of the study. Cognitive function was assessed by a specialized occupational therapist.Forty-two patients met the inclusion criteria, 29 of whom were included in this study. Among these, 83% had abnormal cognitive exam results compatible with minimal hepatic encephalopathy. While only 10% had hypomagnesemia, 33.3% had low levels of intracellular magnesium. Initial intracellular and serum magnesium levels positively correlated with cognitive performance.Magnesium deficiency is common among patients with compensated liver cirrhosis. We found an association between magnesium deficiency and impairment in several cognitive function tests. This finding suggests involvement of magnesium in the pathophysiology of minimal hepatic encephalopathy.
Published: 2018

104. Finding Patterns in Signals Using Lossy Text Compression

Author: Sagi Lotan, Liat Rozenberg, and Dan Feldman
Subjects: Reverse engineering, Computer science, 02 engineering and technology, Lossy compression, computer.software_genre, run-length, Theoretical Computer Science, Encoding (memory), 0202 electrical engineering, electronic engineering, information engineering, Code (cryptography), Pattern matching, data compression, robotics, Lossless compression, Numerical Analysis, signals, 020206 networking & telecommunications, Computational Mathematics, Computational Theory and Mathematics, periods, ComputerSystemsOrganization_MISCELLANEOUS, 020201 artificial intelligence & image processing, RRLE, Communications protocol, computer, Algorithm, Data compression
Abstract: Whether the source is autonomous car, robotic vacuum cleaner, or a quadcopter, signals from sensors tend to have some hidden patterns that repeat themselves. For example, typical GPS traces from a smartphone contain periodic trajectories such as &ldquo, home, work, home, work, ⋯&rdquo, Our goal in this study was to automatically reverse engineer such signals, identify their periodicity, and then use it to compress and de-noise these signals. To do so, we present a novel method of using algorithms from the field of pattern matching and text compression to represent the &ldquo, language&rdquo, in such signals. Common text compression algorithms are less tailored to handle such strings. Moreover, they are lossless, and cannot be used to recover noisy signals. To this end, we define the recursive run-length encoding (RRLE) method, which is a generalization of the well known run-length encoding (RLE) method. Then, we suggest lossy and lossless algorithms to compress and de-noise such signals. Unlike previous results, running time and optimality guarantees are proved for each algorithm. Experimental results on synthetic and real data sets are provided. We demonstrate our system by showing how it can be used to turn commercial micro air-vehicles into autonomous robots. This is by reverse engineering their unpublished communication protocols and using a laptop or on-board micro-computer to control them. Our open source code may be useful for both the community of millions of toy robots users, as well as for researchers that may extend it for further protocols.
Published: 2019

105. Aligning Points to Lines: Provable Approximations

Author: Ibrahim Jubran and Dan Feldman
Subjects: Polynomial (hyperelastic model), Discrete mathematics, Computational Geometry (cs.CG), FOS: Computer and information sciences, Computer Science - Machine Learning, Computer science, Approximation algorithm, Context (language use), Machine Learning (stat.ML), Computational geometry, Computer Science Applications, Machine Learning (cs.LG), Computational Theory and Mathematics, Statistics - Machine Learning, Euclidean geometry, Line (geometry), Convex optimization, Piecewise, Computer Science - Computational Geometry, Information Systems
Abstract: We suggest a new optimization technique for minimizing the sum $\sum_{i=1}^n f_i(x)$ of $n$ non-convex real functions that satisfy a property that we call piecewise log-Lipschitz. This is by forging links between techniques in computational geometry, combinatorics and convex optimization. As an example application, we provide the first constant-factor approximation algorithms whose running-time is polynomial in $n$ for the fundamental problem of \emph{Points-to-Lines alignment}: Given $n$ points $p_1,\cdots,p_n$ and $n$ lines $\ell_1,\cdots,\ell_n$ on the plane and $z>0$, compute the matching $\pi:[n]\to[n]$ and alignment (rotation matrix $R$ and a translation vector $t$) that minimize the sum of Euclidean distances $\sum_{i=1}^n \mathrm{dist}(Rp_i-t,\ell_{\pi(i)})^z$ between each point to its corresponding line. This problem is non-trivial even if $z=1$ and the matching $\pi$ is given. If $\pi$ is given, the running time of our algorithms is $O(n^3)$, and even near-linear in $n$ using core-sets that support: streaming, dynamic, and distributed parallel computations in poly-logarithmic update time. Generalizations for handling e.g. outliers or pseudo-distances such as $M$-estimators for the problem are also provided. Experimental results and open source code show that our provable algorithms improve existing heuristics also in practice. A companion demonstration video in the context of Augmented Reality shows how such algorithms may be used in real-time systems.
Published: 2018
Full Text: View/download PDF

106. Secure $k$-ish Nearest Neighbors Classifier

Author: Daniela Rus, Hayim Shaul, and Dan Feldman
Subjects: Computational Geometry (cs.CG), FOS: Computer and information sciences, Computer Science - Cryptography and Security, secure machine learning, Computer science, Gaussian, homomorphic encryption, 0211 other engineering and technologies, 02 engineering and technology, Encryption, k-nearest neighbors algorithm, symbols.namesake, Computer Science - Data Structures and Algorithms, 0202 electrical engineering, electronic engineering, information engineering, Data Structures and Algorithms (cs.DS), Communication complexity, General Environmental Science, Ethics, 021110 strategic, defence & security studies, Statistical distance, business.industry, Homomorphic encryption, Pattern recognition, QA75.5-76.95, BJ1-1725, ComputingMethodologies_PATTERNRECOGNITION, classification, Electronic computers. Computer science, Scalability, symbols, General Earth and Planetary Sciences, Computer Science - Computational Geometry, 020201 artificial intelligence & image processing, Artificial intelligence, business, Classifier (UML), Cryptography and Security (cs.CR)
Abstract: The k-nearest neighbors (kNN) classifier predicts a class of a query, q, by taking the majority class of its k neighbors in an existing (already classified) database, S. In secure kNN, q and S are owned by two different parties and q is classified without sharing data. In this work we present a classifier based on kNN, that is more efficient to implement with homomorphic encryption (HE). The efficiency of our classifier comes from a relaxation we make to consider κ nearest neighbors for κ ≈k with probability that increases as the statistical distance between Gaussian and the distribution of the distances from q to S decreases. We call our classifier k-ish Nearest Neighbors (k-ish NN). For the implementation we introduce double-blinded coin-toss where the bias and output of the toss are encrypted. We use it to approximate the average and variance of the distances from q to S in a scalable circuit whose depth is independent of |S|. We believe these to be of independent interest. We implemented our classifier in an open source library based on HElib and tested it on a breast tumor database. Our classifier has accuracy and running time comparable to current state of the art (non-HE) MPC solution that have better running time but worse communication complexity. It also has communication complexity similar to naive HE implementation that have worse running time.
Published: 2018
Full Text: View/download PDF

107. iDiary

Author: Cynthia Sung, Dan Feldman, Daniela Rus, and Andrew Sugaya
Subjects: Polynomial, Computer Networks and Communications, Computer science, business.industry, Parallel algorithm, computer.software_genre, GPS signals, Global Positioning System, Data mining, User interface, Projection (set theory), Coreset, business, computer, Semantic compression
Abstract: This article describes iDiary, a system that takes as input GPS data streams generated by users’ phones and turns them into textual descriptions of the trajectories. The system features a user interface similar to Google Search that allows users to type text queries on their activities (e.g., “Where did I buy books?”) and receive textual answers based on their GPS signals. iDiary uses novel algorithms for semantic compression and trajectory clustering of massive GPS signals in parallel to compute the critical locations of a user. We encode these problems as follows. The k-segment mean is a k -piecewise linear function that minimizes the regression distance to the signal. The ( k,m )- segment mean has an additional constraint that the projection of the k segments on R d consists of only m ≤ k segments. A coreset for this problem is a smart compression of the input signal that allows computation of a (1+ε)-approximation to its k -segment or ( k,m )-segment mean in O ( n log n ) time for arbitrary constants ε, k , and m . We use coresets to obtain a parallel algorithm that scans the signal in one pass, using space and update time per point that is polynomial in log n . Using an external database, we then map these locations to textual descriptions and activities so that we can apply text mining techniques on the resulting data (e.g., LSA or transportation mode recognition). We provide experimental results for both the system and algorithms and compare them to existing commercial and academic state of the art. This is the first GPS system that enables text-searchable activities from GPS data.
Published: 2015

108. HEDONIC PRICING WHEN HOUSING IS ENDOGENOUS: THE VALUE OF ACCESS TO THE TRANS-ISRAEL HIGHWAY*

Author: Dan Feldman, Daniel Felsenstein, and Michael Beenstock
Subjects: 05 social sciences, 0211 other engineering and technologies, Hedonic pricing, 021107 urban & regional planning, 02 engineering and technology, Environmental Science (miscellaneous), Development, Housing construction, Microeconomics, 0502 economics and business, Value (economics), Economics, 050207 economics, Panel data
Abstract: Standard hedonic house pricing assumes that house prices are independent of the intangible to be priced. A methodology is proposed in which the supply as well as the demand for housing depends on the intangible. The methodology is applied to value access to the Trans-Israel Highway (TIH). Using spatial panel data (2002–2008) we show that TIH had two effects on the housing market. It increased house prices in locations with greater access to TIH, and it affected housing construction. Standard hedonic pricing would have underestimated the value of access because it ignores the effects of housing construction on the intangible to be priced. House prices began to increase three years before TIH was inaugurated, but housing construction did not anticipate the inauguration of TIH.
Published: 2015

109. Automatic alignment of geographic features in contemporary vector data and historical maps

Author: Craig A. Knoblock, Yao-Yi Chiang, Vinil Jain, Weiwei Duan, Stefan Leyk, Dan Feldman, and Johannes H. Uhl
Subjects: Correctness, Training set, Geographic feature, Computer science, business.industry, 0211 other engineering and technologies, Feature recognition, Pattern recognition, 02 engineering and technology, Convolutional neural network, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Computer vision, Historical maps, Artificial intelligence, Focus (optics), business, Completeness (statistics), 021101 geological & geomatics engineering
Abstract: With large amounts of digital map archives becoming available, the capability to automatically extracting information from historical maps is important for many domains that require long-term geographic data, such as understanding the development of the landscape and human activities. In the previous work, we built a system to automatically recognize geographic features in historical maps using Convolutional Neural Networks (CNN). Our system uses contemporary vector data to automatically label examples of the geographic feature of interest in historical maps as training samples for the CNN model. The alignment between the vector data and geographic features in maps controls if the system can generate representative training samples, which has a significant impact on recognition performance of the system. Due to the large number of training data that the CNN model needs and tens of thousands of maps needed to be processed in an archive, manually aligning the vector data to each map in an archive is not practical. In this paper, we present an algorithm that automatically aligns vector data with geographic features in historical maps. Existing alignment approaches focus on road features and imagery and are difficult to generalize for other geographic features. Our algorithm aligns various types of geographic features in document images with the corresponding vector data. In the experiment, our alignment algorithm increased the correctness and completeness of the extracted railroad and river vector data for about 100% and 20%, respectively. For the performance of feature recognition, the aligned vector data had a 100% improvement on the precision while maintained a similar recall.
Published: 2017

110. Cloud computing for big data analytics in the Process Control Industry

Author: George Georgoulas, Miguel Castano, E. Goldin, Dan Feldman, and George Nikolakopoulos
Subjects: 0301 basic medicine, 030102 biochemistry & molecular biology, Process (engineering), business.industry, Computer science, Big data, Cloud computing, Control Engineering, Optimal control, 7. Clean energy, Data science, Industrial engineering, Field (computer science), Data-driven, 03 medical and health sciences, Software, Reglerteknik, Process control, business
Abstract: The aim of this article is to present an example of a novel cloud computing infrastructure for big data analytics in the Process Control Industry. Latest innovations in the field of Process Analyzer Techniques (PAT), big data and wireless technologies have created a new environment in which almost all stages of the industrial process can be recorded and utilized, not only for safety, but also for real time optimization. Based on analysis of historical sensor data, machine learning based optimization models can be developed and deployed in real time closed control loops. However, still the local implementation of those systems requires a huge investment in hardware and software, as a direct result of the big data nature of sensors data being recorded continuously. The current technological advancements in cloud computing for big data processing, open new opportunities for the industry, while acting as an enabler for a significant reduction in costs, making the technology available to plants of all sizes. The main contribution of this article stems from the presentation for a fist time ever of a pilot cloud based architecture for the application of a data driven modeling and optimal control configuration for the field of Process Control. As it will be presented, these developments have been carried in close relationship with the process industry and pave a way for a generalized application of the cloud based approaches, towards the future of Industry 4.0 Integrated Process Control based on Distributed In-Situ Sensors into Raw Material and Energy Feedstock, DISIRE
Published: 2017

111. Duodenal Adenocarcinoma: A Rare Cause of Cholangitis

Author: Amir Mari, Yael Kopelman, Dan Feldman, Oren Gal, and Fadi Abu Backer
Subjects: medicine.medical_specialty, education.field_of_study, Bile duct, business.industry, Population, medicine.disease, Major duodenal papilla, medicine.anatomical_structure, Biliary tract, medicine, Duodenum, Duodenal adenocarcinoma, Radiology, Upper gastrointestinal bleeding, Duodenal cancer, business, education
Abstract: Duodenal adenocarcinoma is very rare among the general population. The diagnosis may be delayed until advanced stages, due to the subtle and nonspecific clinical manifestations of that rare pathology. Abdominal pain, upper gastrointestinal bleeding, weight loss and biliary obstruction may be the main patient’s complaints. We present a very interesting case of an old patient with dementia, hospitalized with a clinical, laboratory and imaging state consistent with cholangitis. Conservative therapy with antibiotics and an urgent ERCP was held, during the procedure, the major papilla could not be identified due to distorted anatomy of the second and third parts of the duodenum. Torsion like appearance of the duodenum was observed. Consequentially, the patient biliary tract was drained by inserting an internal– external drain percutaneously. Following the external drainage, a successful gastroscopy was done, with successful exploration of the proximal duodenum, revealing the true cause of the bile duct obstruction; a large pedunculated polypoid mass (approx. 30 mm), in proximate to the major papilla was found as well as the distal pigtailed plastic stent with was inserted as mentioned during angiography. The mass diagnosed as duodenal adenocarcinoma in pathology. This unique case describes presentation of an aggressive rare duodenal cancer, mimicking biliary cholangitis distorting the local anatomy. Endoscopic exploration became feasible due to primary percutaneous drainage.
Published: 2017

112. Position Estimation of Moving Objects: Practical Provable Approximation

Author: Jeryes Danial, Ariel Hutterer, and Dan Feldman
Subjects: 0209 industrial biotechnology, Control and Optimization, Matching (graph theory), Computer science, 0206 medical engineering, Biomedical Engineering, 02 engineering and technology, Set (abstract data type), 020901 industrial engineering & automation, Artificial Intelligence, Position (vector), Point (geometry), Cluster analysis, business.industry, Mechanical Engineering, Approximation algorithm, Tracking system, Point set registration, 020601 biomedical engineering, Computer Science Applications, Human-Computer Interaction, Control and Systems Engineering, Computer Vision and Pattern Recognition, business, Algorithm, Rotation (mathematics)
Abstract: We consider the problem of matching a pair of point sets, each consists of $k$ clusters, where each cluster in the first set is arbitrarily translated with additional noise, resulting in a cluster of the second point set. The goal is to compute $k$ translations and a matching that minimizes the sum of squared distances between corresponding pairs of points. This is a fundamental problem for tracking systems (e.g., OptiTrack or Vicon) where the user registers $k$ objects (rigid bodies) by attaching a set of markers to each object. Based on the position of these markers in real time, the system estimates the position of the moving objects by simultaneously clustering, matching, and transforming the $n$ observed markers to the $k$ objects. Similarly, an autonomous robot equipped with a camera may estimate its position by tracking $n$ visual features from $k$ recognized objects. The result can be used as a seeding clustering for existing algorithms, e.g., to compute the optimal rotation on each cluster. We suggest the first provable algorithm for solving this point matching problem. Unlike common heuristics, it yields a constant factor approximation for the global optimum in expected $O(n^2\log n)$ time. We validate our theoretical results with experimental results using low cost ( $ ) “toy” quadcopters that are safe and lawful for indoor navigation due to their $ g weight. Comparisons to existing algorithms and commercial system are provided, together with open source code and a demonstration video.
Published: 2019

113. Learning Big (Image) Data via Coresets for Dictionaries

Author: Nir Sochen, Micha Feigin, and Dan Feldman
Subjects: Statistics and Probability, K-SVD, Computer science, business.industry, Applied Mathematics, Pattern recognition, Image processing, Condensed Matter Physics, Computational geometry, Image (mathematics), Approximation error, Modeling and Simulation, Geometry and Topology, Computer Vision and Pattern Recognition, Artificial intelligence, Representation (mathematics), Linear combination, Coreset, business
Abstract: Signal and image processing have seen an explosion of interest in the last few years in a new form of signal/image characterization via the concept of sparsity with respect to a dictionary. An active field of research is dictionary learning: the representation of a given large set of vectors (e.g. signals or images) as linear combinations of only few vectors (patterns). To further reduce the size of the representation, the combinations are usually required to be sparse, i.e., each signal is a linear combination of only a small number of patterns. This paper suggests a new computational approach to the problem of dictionary learning, known in computational geometry as coresets. A coreset for dictionary learning is a small smart non-uniform sample from the input signals such that the quality of any given dictionary with respect to the input can be approximated via the coreset. In particular, the optimal dictionary for the input can be approximated by learning the coreset. Since the coreset is small, the learning is faster. Moreover, using merge-and-reduce, the coreset can be constructed for streaming signals that do not fit in memory and can also be computed in parallel. We apply our coresets for dictionary learning of images using the K-SVD algorithm and bound their size and approximation error analytically. Our simulations demonstrate gain factors of up to 60 in computational time with the same, and even better, performance. We also demonstrate our ability to perform computations on larger patches and high-definition images, where the traditional approach breaks down.
Published: 2013

114. Testing for Unit Roots and Cointegration in Spatial Cross-Section Data

Author: Daniel Felsenstein, Michael Beenstock, and Dan Feldman
Subjects: Cointegration, Geography, Planning and Development, Statistics, Monte Carlo method, Mathematical analysis, Earth and Planetary Sciences (miscellaneous), Unit root, Statistics, Probability and Uncertainty, Space (mathematics), General Economics, Econometrics and Finance, Unit (ring theory), Mathematics
Abstract: Spatial impulses are derived for SAR models containing a spatial unit root. Analytical solutions are obtained for lateral space where the number of spatial units tends to infinity. Numerical solutions are obtained for finite regular lattices where edge-effects are shown to influence spatial impulses, and for irregular lattices. Monte Carlo simulation methods are used to compute critical values for spatial unit root tests in SAR models estimated from spatial cross-section data for regular and irregular lattices. We also compute critical SAC values for spatial cointegration tests for cross-section data that happen to be spatially nonstationary. We show that parameter estimates in spatially cointegrated models are ‘superconsistent’. RESUME On derive des impulsions spatiales de modeles SAR contenant une racine unite spatiale. On obtient des solutions analytiques pour l'espace lateral lorsque le nombre d'unites spatiales tend vers l'infini. On obtient des solutions numeriques pour des reseaux regulier...
Published: 2012

115. More Constraints, Smaller Coresets

Author: Dan Feldman and Tamir Tassa
Subjects: Matrix (mathematics), Theoretical computer science, Dimension (vector space), Low-rank approximation, Coreset, Row, Time complexity, Algorithm, Non-negative matrix factorization, Mathematics, Matrix decomposition
Abstract: We suggest a generic data reduction technique with provable guarantees for computing the low rank approximation of a matrix under some $ellz error, and constrained factorizations, such as the Non-negative Matrix Factorization (NMF). Our main algorithm reduces a given n x d matrix into a small, e-dependent, weighted subset C of its rows (known as a coreset), whose size is independent of both n and d. We then prove that applying existing algorithms on the resulting coreset can be turned into (1+e)-approximations for the original (large) input matrix. In particular, we provide the first linear time approximation scheme (LTAS) for the rank-one NMF.The coreset C can be computed in parallel and using only one pass over a possibly unbounded stream of row vectors. In this sense we improve the result in [4] (Best paper of STOC 2013). Moreover, since C is a subset of these rows, its construction time, as well as its sparsity (number of non-zeroes entries) and the sparsity of the resulting low rank approximation depend on the maximum sparsity of an input row, and not on the actual dimension d. In this sense, we improve the result of Libery [21](Best paper of KDD 2013) and answer affirmably, and in a more general setting, his open question of computing such a coreset. Source code is provided for reproducing the experiments and integration with existing and future algorithms.
Published: 2015

116. Coresets for visual summarization with applications to loop closure

Author: Guy Rosman, Mikhail Volkov, John W. Fisher, Dan Feldman, Daniela Rus, Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, Volkov, Mikhail, Rosman, Guy, Feldman, Dan, Fisher III, John W., and Rus, Daniela L.
Subjects: business.industry, Computer science, Video tracking, Robot, Computer vision, Artificial intelligence, Image segmentation, business, Coreset, Representation (mathematics), Automatic summarization, Visualization, Video compression picture types
Abstract: In continuously operating robotic systems, efficient representation of the previously seen camera feed is crucial. Using a highly efficient compression coreset method, we formulate a new method for hierarchical retrieval of frames from large video streams collected online by a moving robot. We demonstrate how to utilize the resulting structure for efficient loop-closure by a novel sampling approach that is adaptive to the structure of the video. The same structure also allows us to create a highly-effective search tool for large-scale videos, which we demonstrate in this paper. We show the efficiency of proposed approaches for retrieval and loop closure on standard datasets, and on a large-scale video from a mobile camera., Hon Hai/Foxconn International Holdings Ltd., Lincoln Laboratory, Singapore-MIT Alliance for Research and Technology Center (Future Urban Mobility Project), United States. Office of Naval Research. Multidisciplinary University Research Initiative (Grant N00014-09-1-1051)
Published: 2015

117. Fleye on the car

Author: Andew Barry, Soliman Nasser, Guy Rosman, Guy Peled, Marek Doniec, Mikhail Volkov, Dan Feldman, and Daniela Rus
Subjects: Quadcopter, Data collection, Computer science, business.industry, Big data, Cloud computing, Collision, Computer security, computer.software_genre, Drone, Traffic congestion, State (computer science), business, computer
Abstract: Vehicle-based vision algorithms, such as the collision alert systems [4], are able to interpret a scene in real-time and provide drivers with immediate feedback. However, such technologies are based on cameras on the car, limited to the vicinity of the car, severely limiting their potential. They cannot find empty parking slots, bypass traffic jams, or warn about dangers outside the car's immediate surrounding. An intelligent driving system augmented with additional sensors and network inputs may significantly reduce the number of accidents, improve traffic congestion, and care for the safety and quality of people's lives. We propose an open-code system, called Fleye, that consists of an autonomous drone (nano quadrotor) that carries a radio camera and flies few meters in front and above the car. The streaming video is transmitted in real time from the quadcopter to Amazon's EC2 cloud together with information about the driver, the drone, and the car's state. The output is then transmitted to the "smart glasses" of the driver. The control of the drone, as well as the sensor data collection from the driver, is done by low cost (
Published: 2015

118. The Describer's Nightmare: Touching Form in Colson Whitehead's John Henry Days.

Author: DAN FELDMAN, EZRA
Subjects: *AGNOSTICISM, *TAXONOMY, *THEORY of knowledge
Abstract: The form and formlessness of histories, regions, races, ballads, fictions, lists, characters, and mountains are among the topics of concern in Colson Whitehead's John Henry Days, and they pose a real challenge to conveying what this novel is like. Caroline Levine's Forms: Whole, Rhythm, Hierarchy, Network advocates considering forms in terms of their affordances, "the potential uses or actions latent in materials and designs" (6). This depends, however, on the reliable identification of forms in a particular text, activity, ormaterial; and the critical response to John Henry Days gives us evidence that, while we can analyze forms the novel deploys and contains, it remains a challenge to identify the novel's form as a whole. In a different vein, Heather Love'swork ondescription is explicitly concernedwith "forms of analysis," but not with the analysis of formper se. The present examination of JohnHenry Days attempts to bridge such valuable conversations about form and description. This article argues that as John Henry Days grapples with describing forms that constantly remake themselves, it takes a position akin to science and technology studies scholar Michael Lynch's theoretical agnosticism with respect to capital-O Ontology. Refusing anything like a full-blowntheory of form, John HenryDays both practices and advocates provisional taxonomy--touching and moving on--as a way of knowing its ever-changingmaterial. This article's analysis of the describer's nightmare is thus a case study for Lynch's claim that "particular descriptions--including descriptions of ontologies--can make sense, apparently even to others who do not share our grand theories". [ABSTRACT FROM AUTHOR]
Published: 2019

119. Long-Distance Hiking

Author: Dan Feldman and Dan Feldman
Abstract: The how-to book for long-distance hikers who want to finish.
Published: 2013

120. Smallest enclosing ball for probabilistic data

Author: Christian Sohler, Alexander Munteanu, and Dan Feldman
Subjects: Combinatorics, Discrete mathematics, TheoryofComputation_ANALYSISOFALGORITHMSANDPROBLEMCOMPLEXITY, Probabilistic logic, Probability distribution, Ball (mathematics), Random variable, Time complexity, Mathematics
Abstract: This paper deals with computing the smallest enclosing ball of a set of points subject to probabilistic data. In our setting, any of the n points may not or may occur at one of finitely many locations, following its own discrete probability distribution. The objective is therefore considered to be a random variable and we aim at finding a center minimizing the expected maximum distance to the points according to their distributions. Our main contribution presented in this paper is the first polynomial time (1 + ϵ)-approximation algorithm for the probabilistic smallest enclosing ball problem with extensions to the streaming setting.
Published: 2014

121. K-robots clustering of moving sensors using coresets

Author: Stephanie Gil, Daniela Rus, Brian J. Julian, Dan Feldman, Ross A. Knepper, Lincoln Laboratory, Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology. School of Engineering, Feldman, Dan, Gil, Stephanie, Knepper, Ross A., Julian, Brian John, and Rus, Daniela L.
Subjects: Set (abstract data type), Polynomial, Optimization problem, Computational complexity theory, Computer science, Server, Distributed computing, Mobile robot, Cluster analysis, Coreset, Algorithm
Abstract: We present an approach to position k servers (e.g. mobile robots) to provide a service to n independently moving clients; for example, in mobile ad-hoc networking applications where inter-agent distances need to be minimized, connectivity constraints exist between servers, and no a priori knowledge of the clients' motion can be assumed. Our primary contribution is an algorithm to compute and maintain a small representative set, called a kinematic coreset, of the n moving clients.We prove that, in any given moment, the maximum distance between the clients and any set of k servers is approximated by the coreset up to a factor of (1 ± ε), where ε > 0 is an arbitrarily small constant. We prove that both the size of our coreset and its update time is polynomial in k log(n)/ε. Although our optimization problem is NP-hard (i.e., takes time exponential in the number of servers to solve), solving it on the small coreset instead of the original clients results in a tractable controller. The approach is validated in a small scale hardware experiment using robot servers and human clients, and in a large scale numerical simulation using thousands of clients., Micro Autonomous Consortium Systems and Technology (United States. Army Research Laboratory (Grant W911NF-08-2-0004)), United States. Air Force (Contract FA8721-05-C-0002)
Published: 2013

122. The single pixel GPS: learning big data signals from tiny coresets

Author: Cynthia Sung, Daniela Rus, Dan Feldman, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology. School of Engineering, Feldman, Dan, Sung, Cynthia Rueyi, and Rus, Daniela L.
Subjects: Polynomial, Matching (graph theory), Computer science, Compression (functional analysis), Parallel algorithm, Coreset, Projection (set theory), Cluster analysis, Time complexity, Algorithm
Abstract: We present algorithms for simplifying and clustering patterns from sensors such as GPS, LiDAR, and other devices that can produce high-dimensional signals. The algorithms are suitable for handling very large (e.g. terabytes) streaming data and can be run in parallel on networks or clouds. Applications include compression, denoising, activity recognition, road matching, and map generation. We encode these problems as (k, m)-segment mean problems. Formally, we provide (1 + ε)-approximations to the k-segment and (k, m)-segment mean of a d-dimensional discrete-time signal. The k-segment mean is a k-piecewise linear function that minimizes the regression distance to the signal. The (k,m)-segment mean has an additional constraint that the projection of the k segments on R[superscript d] consists of only m ≤ k segments. Existing algorithms for these problems take O(kn[superscript 2]) and n[superscript O(mk)] time respectively and O(kn[superscript 2]) space, where n is the length of the signal. Our main tool is a new coreset for discrete-time signals. The coreset is a smart compression of the input signal that allows computation of a (1 + ε)-approximation to the k-segment or (k,m)-segment mean in O(n log n) time for arbitrary constants ε,k, and m. We use coresets to obtain a parallel algorithm that scans the signal in one pass, using space and update time per point that is polynomial in log n. We provide empirical evaluations of the quality of our coreset and experimental results that show how our coreset boosts both inefficient optimal algorithms and existing heuristics. We demonstrate our results for extracting signals from GPS traces. However, the results are more general and applicable to other types of sensors., United States. Office of Naval Research (Grant ONR-MURI Award N00014-09-1-1051), United States. Office of Naval Research (Grant ONR-MURI Award N00014-09-1-1031), Singapore-MIT Alliance for Research and Technology, Google (Firm)
Published: 2012

123. Trajectory clustering for motion prediction

Author: Cynthia Sung, Dan Feldman, and Daniela Rus
Subjects: business.industry, Computer science, Approximation algorithm, Mobile robot, Context (language use), Motion (physics), Obstacle avoidance, Trajectory, Computer vision, Motion planning, Artificial intelligence, Interception, business, Hidden Markov model, Cluster analysis
Abstract: We investigate a data-driven approach to robotic path planning and analyze its performance in the context of interception tasks. Trajectories of moving objects often contain repeated patterns of motion, and learning those patterns can yield interception paths that succeed more often. We therefore propose an original trajectory clustering algorithm for extracting motion patterns from trajectory data and demonstrate its effectiveness over the more common clustering approach of using k-means. We use the results to build a Hidden Markov Model of a target's motion and predict movement. Our simulations show that these predictions lead to more effective interception. The results of this work have potential applications in coordination of multi-robot systems, tracking and surveillance tasks, and dynamic obstacle avoidance. © 2012 IEEE.
Published: 2012

124. From High Definition Image to Low Space Optimization

Author: Micha Feigin, Nir Sochen, and Dan Feldman
Subjects: K-SVD, Pixel, Computer science, business.industry, Image processing, Pattern recognition, Sparse approximation, Artificial intelligence, Linear combination, Coreset, business, Field (computer science), Image (mathematics)
Abstract: Signal and image processing have seen in the last few years an explosion of interest in a new form of signal/image characterization via the concept of sparsity with respect to a dictionary. An active field of research is dictionary learning: Given a large amount of example signals/images one would like to learn a dictionary with much fewer atoms than examples on one hand, and much more atoms than pixels on the other hand. The dictionary is constructed such that the examples are sparse on that dictionary i.e each image is a linear combination of small number of atoms. This paper suggests a new computational approach to the problem of dictionary learning. We show that smart non-uniform sampling, via the recently introduced method of coresets, achieves excellent results, with controlled deviation from the optimal dictionary. We represent dictionary learning for sparse representation of images as a geometric problem, and illustrate the coreset technique by using it together with the K---SVD method. Our simulations demonstrate gain factor of up to 60 in computational time with the same, and even better, performance. We also demonstrate our ability to perform computations on larger patches and high-definition images, where the traditional approach breaks down.
Published: 2012

125. A Unified Framework for Approximating and Clustering Data

Author: Michael Langberg and Dan Feldman
Subjects: Discrete mathematics, FOS: Computer and information sciences, Optimization problem, Function (mathematics), Machine Learning (cs.LG), Combinatorics, VC dimension, Computer Science - Learning, Tuple, Coreset, Cluster analysis, Time complexity, Subspace topology, Mathematics
Abstract: Given a set F of n positive functions over a ground set X, we consider the problem of computing x* that minimizes the expression ∑f ∈ Ff(x), over x ∈ X. A typical application is shape fitting, where we wish to approximate a set P of n elements (say, points) by a shape x from a (possibly infinite) family X of shapes. Here, each point p ∈ P corresponds to a function f such that f(x) is the distance from p to x, and we seek a shape x that minimizes the sum of distances from each point in P. In the k-clustering variant, each x\in X is a tuple of k shapes, and f(x) is the distance from p to its closest shape in x.Our main result is a unified framework for constructing coresets and approximate clustering for such general sets of functions. To achieve our results, we forge a link between the classic and well defined notion of e-approximations from the theory of PAC Learning and VC dimension, to the relatively new (and not so consistent) paradigm of coresets, which are some kind of "compressed representation" of the input set F. Using traditional techniques, a coreset usually implies an LTAS (linear time approximation scheme) for the corresponding optimization problem, which can be computed in parallel, via one pass over the data, and using only polylogarithmic space (i.e, in the streaming model). For several function families F for which coresets are known not to exist, or the corresponding (approximate) optimization problems are hard, our framework yields bicriteria approximations, or coresets that are large, but contained in a low-dimensional space.We demonstrate our unified framework by applying it on projective clustering problems. We obtain new coreset constructions and significantly smaller coresets, over the ones that appeared in the literature during the past years, for problems such as: k-Median [Har-Peled and Mazumdar,STOC'04], [Chen, SODA'06], [Langberg and Schulman, SODA'10]; k-Line median [Feldman, Fiat and Sharir, FOCS'06], [Deshpande and Varadarajan, STOC'07]; Projective clustering [Deshpande et al., SODA'06] [Deshpande and Varadarajan, STOC'07]; Linear lp regression [Clarkson, Woodruff, STOC'09 ]; Low-rank approximation [Sarlos, FOCS'06]; Subspace approximation [Shyamalkumar and Varadarajan, SODA'07], [Feldman, Monemizadeh, Sohler and Woodruff, SODA'10], [Deshpande, Tulsiani, and Vishnoi, SODA'11].The running times of the corresponding optimization problems are also significantly improved. We show how to generalize the results of our framework for squared distances (as in k-mean), distances to the qth power, and deterministic constructions.
Published: 2011
Full Text: View/download PDF

126. Private coresets

Author: Amos Fiat, Haim Kaplan, Kobbi Nissim, and Dan Feldman
Subjects: Combinatorics, Set (abstract data type), Strategic dominance, Computer science, Efficient algorithm, Point set, Multiplicative function, Differential privacy, Single point, Coreset
Abstract: A coreset of a point set P is a small weighted set of points that captures some geometric properties of $P$. Coresets have found use in a vast host of geometric settings. We forge a link between coresets, and differentially private sanitizations that can answer any number of queries without compromising privacy. We define the notion of private coresets, which are simultaneously both coresets and differentially private, and show how they may be constructed. We first show that the existence of a small coreset with low generalized sensitivity (i.e., replacing a single point in the original point set slightly affects the quality of the coreset) implies (in an inefficient manner) the existence of a private coreset for the same queries. This greatly extends the works of Blum, Ligett, and Roth [STOC 2008] and McSherry and Talwar [FOCS 2007]. We also give an efficient algorithm to compute private coresets for k-median and k-mean queries in Red, immediately implying efficient differentially private sanitizations for such queries. Following McSherry and Talwar, this construction also gives efficient coalition proof (approximately dominant strategy) mechanisms for location problems. Unlike coresets which only have a multiplicative approximation factor, we prove that private coresets must have an additive error. We present a new technique for showing lower bounds on this error.
Published: 2009

127. Using All Sky Cameras to determine cloud statistics for the Thirty Meter Telescope candidate sites

Author: P. Gillett, Edison Bustos, Brooke Gregory, Warren Skidmore, J. Vasquez, Dan Feldman, Reed Riddle, David Walker, J. Seguel, M. Schöck, Robert Blum, Sebastian Els, Eugene A. Magnier, Tony Travouillon, Stepp, Larry M., and Gilmozzi, Roberto
Subjects: Photometry (optics), Computer science, business.industry, Sky, media_common.quotation_subject, Statistics, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Cloud computing, business, Thirty Meter Telescope, Remote sensing, media_common
Abstract: All Sky Cameras were deployed at all Thirty Meter Telescope (TMT) candidate sites. The images gathered by these cameras were used to assess the cloud statistics for each site. We describe two methods that were developed to do this, a manual method based on inspection of blue and red movies, and an automated method based on photometric analysis of the images.
Published: 2008

128. Bi-criteria linear-time approximations for generalized k-mean/median/center

Author: Micha Sharir, Amos Fiat, Dan Feldman, and Danny Segev
Subjects: Combinatorics, Set (abstract data type), Discrete mathematics, Simple (abstract algebra), k-means clustering, Value (computer science), Approximation algorithm, Center (group theory), Binary logarithm, Time complexity, Mathematics
Abstract: We consider the problem of approximating a set P of n points in Rd by a collection of j-dimensional flats, andextensions thereof, under the standard median / mean / centermeasures, in which we wish to minimize, respectively, the sum of thedistances from each point of P to its nearest flat, the sum of thesquares of these distances, or the maximal such distance.Such problems cannot be approximated unless P=NP but do allowbi-criteria approximations where one allows some leeway in both the numberof flats and the quality of the objective function.We give a very simple bi-criteria approximation algorithm, which producesat most α(k,j,n) = (k j log n)O(j) flats, which exceeds the optimalobjective value for any k j-dimensional flats by a factor of nomore than β(j)= 2O(j). Given this bi-criteria approximation, wecan use it to reduce the approximation factor arbitrarily, at the costof increasing the number of flats. Our algorithm hasmany advantages over previous work, in that it is muchmore widely applicable (wider set of objective functions and classes ofclusters) and much more efficient -- reducing the running time bound from O(n Poly(k,j)) to nd · (jk)O(j).Our algorithm is randomized and successful with probability 1/2(easily boosted to probabilities arbitrary close to 1).
Published: 2007

129. A PTAS for k-means clustering based on weak coresets

Author: Morteza Monemizadeh, Dan Feldman, and Christian Sohler
Subjects: Combinatorics, Discrete mathematics, Euclidean space, Point set, k-means clustering, Partition (number theory), Coreset, Cluster analysis, Mathematics, Running time
Abstract: Given a point set P ⊆ Rd the k-means clustering problem is to find a set C=(c1,...,ck) of k points and a partition of P into k clusters C1,...,Ck such that the sum of squared errors ∑i=1k ∑p ∈ Ci |p -ci |22 is minimized. For given centers this cost function is minimized byassigning points to the nearest center.The k-means cost function is probably the most widely used cost function in the area of clustering.In this paper we show that every unweighted point set P has a weak (e, k)-coreset of size Poly(k,1/e) for the k-means clustering problem, i.e. its size is independent of the cardinality |P| of the point set and the dimension d of the Euclidean space Rd. A weak coreset is a weighted set S ⊆ P together with a set T such that T contains a (1+e)-approximation for the optimal cluster centers from P and for every set of kcenters from T the cost of the centers for S is a (1±e)-approximation of the cost for P.We apply our weak coreset to obtain a PTAS for the k-means clustering problem with running time O(nkd + d · Poly(k/e) + 2O(k/e)).
Published: 2007

130. Criticism after Critique: Aesthetics, Literature, and the Political.

Author: Dan Feldman, Ezra
Subjects: *CRITICISM (Philosophy), *LITERARY criticism, *NONFICTION
Published: 2016

131. An Empirical Template Library of Stellar Spectra for a Wide Range of Spectral Classes, Luminosity Classes, and Metallicities Using SDSS BOSS Spectra.

Author: Aurora Y. Kesseli, Andrew A. West, Mark Veyette, Brandon Harrison, Dan Feldman, and John J. Bochanski
Published: 2017
Full Text: View/download PDF

132. LIZARD.

Author: DAN FELDMAN, EZRA
Subjects: LIZARD (Poem), FELDMAN, Ezra Dan
Published: 2017

133. BLUE COHOSH.

Author: DAN FELDMAN, EZRA
Subjects: BLUE Cohosh (Poem), FELDMAN, Ezra Dan
Published: 2017

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

133 results on '"Dan Feldman"'

101. k-Means+++: Outliers-Resistant Clustering

102. Real-time EEG classification via coresets for BCI applications

103. Magnesium Deficiency and Minimal Hepatic Encephalopathy among Patients with Compensated Liver Cirrhosis

104. Finding Patterns in Signals Using Lossy Text Compression

105. Aligning Points to Lines: Provable Approximations

106. Secure $k$-ish Nearest Neighbors Classifier

107. iDiary

108. HEDONIC PRICING WHEN HOUSING IS ENDOGENOUS: THE VALUE OF ACCESS TO THE TRANS-ISRAEL HIGHWAY*

109. Automatic alignment of geographic features in contemporary vector data and historical maps

110. Cloud computing for big data analytics in the Process Control Industry

111. Duodenal Adenocarcinoma: A Rare Cause of Cholangitis

112. Position Estimation of Moving Objects: Practical Provable Approximation

113. Learning Big (Image) Data via Coresets for Dictionaries

114. Testing for Unit Roots and Cointegration in Spatial Cross-Section Data

115. More Constraints, Smaller Coresets

116. Coresets for visual summarization with applications to loop closure

117. Fleye on the car

118. The Describer's Nightmare: Touching Form in Colson Whitehead's John Henry Days.

119. Long-Distance Hiking

120. Smallest enclosing ball for probabilistic data

121. K-robots clustering of moving sensors using coresets

122. The single pixel GPS: learning big data signals from tiny coresets

123. Trajectory clustering for motion prediction

124. From High Definition Image to Low Space Optimization

125. A Unified Framework for Approximating and Clustering Data

126. Private coresets

127. Using All Sky Cameras to determine cloud statistics for the Thirty Meter Telescope candidate sites

128. Bi-criteria linear-time approximations for generalized k-mean/median/center

129. A PTAS for k-means clustering based on weak coresets

130. Criticism after Critique: Aesthetics, Literature, and the Political.

131. An Empirical Template Library of Stellar Spectra for a Wide Range of Spectral Classes, Luminosity Classes, and Metallicities Using SDSS BOSS Spectra.

132. LIZARD.

133. BLUE COHOSH.

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

133 results on '"Dan Feldman"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources