Author: "Benczúr, András" / Publication Type: Magazines - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Benczúr, András"' showing total 9 results

Start Over Author "Benczúr, András" Publication Type Magazines

9 results on '"Benczúr, András"'

1. The applicability of on-line contextual calibration to a neural network based monocular collision avoidance system on a UAV⁎⁎Research is supported by 2018-1.2.1-NKP-00008: Exploring the Mathematical Foundations of Artificial Intelligence and by MTA Premium Postdoctoral Grant 2018. This paper was also supported by the Janos Bolyai Research Scholarship of the Hungarian Academy of Sciences and the Higher Education Institutional Excellence Program.

Author: Hiba, Antal, Aleksziev, Rita, Pázmán, Koppány, Bauer, Péter, Benczúr, András, Zarándy, Ákos, and Daróczy, Bálint
Abstract: Contextual calibration for object detection is a technique where a pretrained network collects attractive false positives during a calibration phase and use this calibration data for further training. This paper investigates the applicability of this method to a vision based onboard sense and avoid system, which requires intruder aircraft detection in camera images. Various landscape and sky backgrounds were generated by Unreal4 3D engine for calibration tests. Contextual calibration is a promising candidate for handling extreme situations which are not covered well in the training data.
Published: 2019
Full Text: View/download PDF

2. Robust Decentralized Low-Rank Matrix Decomposition

Author: Hegedűs, István, Berta, Árpád, Kocsis, Levente, Benczúr, András A., and Jelasity, Márk
Abstract: Low-rank matrix approximation is an important tool in data mining with a wide range of applications, including recommender systems, clustering, and identifying topics in documents. When the matrix to be approximated originates from a large distributed system, such as a network of mobile phones or smart meters, a challenging problem arises due to the strongly conflicting yet essential requirements of efficiency, robustness, and privacy preservation. We argue that although collecting sensitive data in a centralized fashion may be efficient, it is not an option when considering privacy and efficiency at the same time. Thus, we do not allow any sensitive data to leave the nodes of the network. The local information at each node (personal attributes, documents, media ratings, etc.) defines one row in the matrix. This means that all computations have to be performed at the edge of the network. Known parallel methods that respect the locality constraint, such as synchronized parallel gradient search or distributed iterative methods, require synchronized rounds or have inherent issues with load balancing, and thus they are not robust to failure. Our distributed stochastic gradient descent algorithm overcomes these limitations. During the execution, any sensitive information remains local, whereas the global features (e.g., the factor model of movies) converge to the correct value at all nodes. We present a theoretical derivation and a thorough experimental evaluation of our algorithm. We demonstrate that the convergence speed of our method is competitive while not relying on synchronization and being robust to extreme and realistic failure scenarios. To demonstrate the feasibility of our approach, we present trace-based simulations, real smartphone user behavior analysis, and tests over real movie recommender system data.
Published: 2016
Full Text: View/download PDF

3. Towards a Normal Form and a Query Language for Extended Relations Defined by Regular Expressions

Author: Benczúr, András and Szabó, Gyula
Abstract: This paper introduces a generalized data base concept that unites relational and semi structured data models. As an important theoretical result we could find a quadratic decision algorithm for the implication problem of functional and join dependencies defined on the united data model. As practical contribution we presented a normal form for the new data model as a tool for data base design. With our novel representations of regular expressions, a more effective searching method could be developed. XML elements are described by XML schema languages such as a DTD or an XML Schema definition. The instances of these elements are semi-structured tuples. A semi-structured tuple is an ordered list of (attribute: value) pairs. We may think of a semi-structured tuple as a sentence of a formal language, where the values are the terminal symbols and the attribute names are the non-terminal symbols. In the authors' former work (Szabó and Benczúr, 2015) they introduced the notion of the extended tuple as a sentence from a regular language generated by a grammar where the non-terminal symbols of the grammar are the attribute names of the tuple. Sets of extended tuples are the extended relations. The authors then introduced the dual language, which generates the tuple types allowed to occur in extended relations. They defined functional dependencies (regular FD - RFD) over extended relations. In this paper they rephrase the RFD concept by directly using regular expressions over attribute names to define extended tuples. By the help of a special vertex labeled graph associated to regular expressions the specification of substring selection for the projection operation can be defined. The normalization for regular schemas is more complex than it is in the relational model, because the schema of an extended relation can contain an infinite number of tuple types. However, the authors can define selection, projection and join operations on extended relations too, so a lossless-join decomposition can be performed. They extended their previous model to deal with XML schema indicators too, e.g., with numerical constraints. They added line and set constructors too, in order to extend their model with more general projection and selection operators. This model establishes a query language with table join functionality for collected XML element data.
Published: 2016
Full Text: View/download PDF

4. Temporal influence over the Last.fm social network

Author: Pálovics, Róbert and Benczúr, András
Abstract: In a previous result, we showed that the influence of social contacts spreads information about new artists through the Last.fm social network. We successfully decomposed influence from effects of trends, global popularity, and homophily or shared environment of friends. In this paper, we present our new experiments that use a mathematically sound formula for defining and measuring the influence in the network. We provide new baseline and influence models and evaluation measures, both batch and online, for real-time recommendations with very strong temporal aspects. Our experiments are carried over the 2-year “scrobble” history of 70,000 Last.fm users. In our results, we formally define and distil the effect of social influence. In addition, we provide new models and evaluation measures for real-time recommendations with very strong temporal aspects.
Published: 2015
Full Text: View/download PDF

5. Modeling information systems from the viewpoint of active documents

Author: Molnár, Bálint and Benczúr, András
Abstract: The development of document handling by organizations at the level of business processes and business information systems leads to the phenomenon that the majority of documents and their contents remain in semi-structured format and definite minorities of documents are directly mapped onto structured databases. The rapidly evolving technology on the database field provides the opportunity to manage directly the semi-structured documents in line with the requirements of business processes. A continuum of possible document formats may exist in business environments. The documents can be categorized by the organization of the underlying data collections and according to the necessity and capability of data included in documents as whether wholly or partially is to be structured. The most modern database technology yields tools for handling and retrieving data making use of semi-structured and unstructured data. Our proposed approach (1) on one hand provides a theoretical frameworkfor modeling IS to shape into some structure, (2) on the other hand yields guidance for design methodto employ it for practical application with the extension of specific elementary models. The proposed modeling method that places the emphasis onto the documentsand their symbiotic life with processeshelps in understanding the behavior of most modern information systems. As a result, the combination of document-centric modeling and the enterprise architecture approach gives an opportunity for a unified modelingapproach that keeps an eye on Conway’s thesis that states that software architecture congruent with the structure of development team, and this statement may be paraphrased that the overall document structure may reflect the structure of the specific organization.
Published: 2015
Full Text: View/download PDF

6. The Classification Power of Web Features

Author: Erdélyi, Miklós, Benczúr, András A., Daróczy, Bálint, Garzó, András, Kiss, Tamás, and Siklósi, Dávid
Abstract: AbstractIn this article we give a comprehensive overview of features devised for web spam detection and investigate how much various classes, some requiring very high computational effort, add to the classification accuracy.We collect and handle a large number of features based on recent advances in web spam filtering, including temporal ones; in particular, we analyze the strength and sensitivity of linkage change.We propose new, temporal link-similarity-based features and show how to compute them efficiently on large graphs.We show that machine learning techniques, including ensemble selection, LogitBoost, and random forest significantly improve accuracy.We conclude that, with appropriate learning techniques, a simple and computationally inexpensive feature subset outperforms all previous results published so far on our dataset and can be further improved only slightly by computationally expensive features.We test our method on three major publicly available datasets: the Web Spam Challenge 2008 dataset WEBSPAM-UK2007, the ECML/PKDD Discovery Challenge dataset DC2010, and the Waterloo Spam Rankings for ClueWeb09.Our classifier ensemble sets the strongest classification benchmark compared to participants of the Web Spam and ECML/PKDD Discovery Challenges as well as the TREC Web track.To foster research in the area, we make several feature sets and source codes public,1https://datamining.sztaki.hu/en/download/web-spam-resourcesincluding the temporal features of eight .ukcrawl snapshots that include WEBSPAM-UK2007 as well as the Web Spam Challenge features for the labeled part of ClueWeb09.
Published: 2014
Full Text: View/download PDF

7. Facet of Modeling Web Information Systems from a Document-Centric View

Author: Molnár, Bálint and Benczúr, András
Abstract: The modeling of Information Systems in general, and Web Information Systems (WIS) especially, is a permanent issue so that there have been already several attempts and proposals for representing various facets of WIS. In our proposed approach, we focus on the organizational and business activity modeling and we concentrate on documents that represent the information of enterprises in the form of unstructured and semi-structured documents. The compilation of documents mirrors implicitly or explicitly the structure of enterprises, the interrelationship of business processes, and activities and tasks within processes. The documents represent, at the same time, the system roles along with tasks and activities. Our modeling approach concentrates on the co-existence and co-operation of documents and activities of business. The Story Algebra, or more generally the process algebra approach provides a formal framework that promises a formal describing method for modeling precisely the event triggered processes coupled with data in document format within an Enterprise Architecture Framework.
Published: 2013
Full Text: View/download PDF

8. KDD Cup 2007 task 1 winner report

Author: Kurucz, Miklós, Benczúr, András, Kiss, Tamás, Nagy, István, Szabó, Adrienn, and Torma, Balázs
Abstract: KDD Cup 2007 focuses on predicting aspects of movie rating behavior. We present our prediction method for Task 1 "Who Rated What in 2006" where the goal is to predict which users rated which movies in 2006. We use the combination of the following methods, listed in the order of their accuracy: • The predicted number of ratings for each movie based on time series analysis, also using movie and DVD release dates and movie series detection by the edit distance of the titles. • The predicted number of ratings by each user by using the fact that ratings were sampled proportional to the margin. • The low rank approximation of the 0-1 matrix of known user-movie pairs with rating. • Prediction by using the movie-movie similarity matrix. • Association rules obtained by frequent sequence mining of user ratings considered as ordered itemsets. By combining the predictions by linear regression we obtained a prediction with root mean squared error 0.256. The first runner up result was 0.263 while a pure all zeroes prediction already gives 0.279, indicating the hardness of the task.
Published: 2007
Full Text: View/download PDF

9. Augmenting Undirected Edge Connectivity in Õ(n2) Time

Author: Benczúr, András A and Karger, David R
Abstract: We give improved randomized (Monte Carlo) algorithms for undirected edge splitting and edge connectivity augmentation problems. Our algorithms run in time Õ(n2) on n-vertex graphs, making them an Ω̃(m/n) factor faster than the best known deterministic ones on m-edge graphs.
Published: 2000
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

9 results on '"Benczúr, András"'

2. Robust Decentralized Low-Rank Matrix Decomposition

3. Towards a Normal Form and a Query Language for Extended Relations Defined by Regular Expressions

4. Temporal influence over the Last.fm social network

5. Modeling information systems from the viewpoint of active documents

6. The Classification Power of Web Features

7. Facet of Modeling Web Information Systems from a Document-Centric View

8. KDD Cup 2007 task 1 winner report

9. Augmenting Undirected Edge Connectivity in Õ(n2) Time

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Publication Year Range

Publication Type

Journal

Database

9 results on '"Benczúr, András"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources