Author: "Ortiz-Arroyo, Daniel" / Topic: database management - Searchworks@Jio Institute Digital Library Search Results

1. Exploring the Application of Fuzzy Logic and Data Fusion Mechanisms in QAS.

Author: Carbonell, Jaime G., Siekmann, Jörg, Masulli, Francesco, Mitra, Sushmita, Pasi, Gabriella, Ortiz-Arroyo, Daniel, and Christensen, Hans Ulrich
Abstract: In this paper we explore the application of fuzzy logic and data fusion techniques to improve the performance of passage retrieval in open domain Question Answering Systems (QAS). Our experiments show that our proposed mechanisms provide significant performance improvements when compared to other similar systems. [ABSTRACT FROM AUTHOR]
Published: 2007
Full Text: View/download PDF

2. FuzzyPR: An Effective Passage Retrieval System for QAS.

Author: Carbonell, Jaime G., Siekmann, Jörg, An, Aijun, Stefanowski, Jerzy, Ramanna, Sheela, Butz, Cory J., Pedrycz, Witold, Wang, Guoyin, Christensen, Hans Ulrich, and Ortiz-Arroyo, Daniel
Abstract: In this paper we present FuzzyPR, a novel fuzzy logic based passage retrieval system for Question Answering Systems (QAS). FuzzyPR employs a fuzzy logic based similarity measure that includes the best performing models to implement the question reformulation intuition. Our experiments show that FuzzyPR achieves consistently better performance in terms of coverage than JIRS on the TREC corpora and slightly better on the CLEF corpora. [ABSTRACT FROM AUTHOR]
Published: 2007
Full Text: View/download PDF

3. Annotating Documents by Their Intended Meaning to Make Them Self Explaining: An Essential Progress for the Semantic Web.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Blanchon, Hervé, and Boitet, Christian
Abstract: A Self-Explaining Document (SED) is a document enriched with annotations keeping track of all possible interpretations with respect to a given grammar and dictionary, as well as disambiguating choices. If disambiguation is complete and has been done by the author himself, a SED conveys "the author's intention". The availability of SEDs might considerably reduce misunderstanding between authors and readers, and perhaps lead to the assignment of a "meaning certification level" to any part of a document. We present ways to integrate these annotations into an arbitrary XML document (SED-XML), and to make them visible and usable to readers for accessing the "true content" of a document. We also show that, under several constraints, a SED, once translated into a target language L, might be transformed into an SED in L with no human interaction. Hence, the SED structure might be used in multilingual as well as in monolingual contexts, without addition of human work. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

4. Ontology-Based Application Server to the Execution of Imperative Natural Language Requests.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Linhalis, Flávia, and Abreu Moreira, Dilvan
Abstract: This paper is about using ontologies to help the execution of imperative requests expressed in natural language. In order to achieve this goal, we developed the prototype of an Ontology-Based Application Server to the execution of Natural Language requests (NL-OBAS). The NL-OBAS provides services to allow users to describe requests in several natural languages and uses software components to execute them. One of the advantages of our approach is that natural language is first converted to an interlingua, UNL (Universal Networking Language). The interlingua allows the use of different human languages to express the requests (other systems are restricted to English). The semantics of the interlingua, enhanced by ontologies, is used to retrieve the appropriate software components to compose a dynamic service to execute the requests expressed in natural language. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

5. An XML Framework for a Basque Question Answering System.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Ansa, Olatz, Arregi, Xabier, Otegi, Arantxa, and Valverde, Andoni
Abstract: This paper presents a general platform for a Basque monolingual question answering (QA) system. It focuses on the architecture of the platform, paying special attention to: 1) the integration of the development and evaluation environments, and 2) the systematic use of XML declarative files to control the execution of the modules and the communication between them. Moreover, a first pilot experiment is discussed. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

6. A Hybrid Approach for Relation Extraction Aimed at the Semantic Web.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Specia, Lucia, and Motta, Enrico
Abstract: We present an approach for relation extraction from texts aimed to enrich the semantic annotations produced by a semantic web portal. The approach exploits linguistic and empirical strategies, by means of a pipeline method involving processes such as a parser, part-of-speech tagger, named entity recognition system, pattern-based classification and word sense disambiguation models, and resources such as an ontology, knowledge base and lexical databases. With the use of knowledge intensive strategies to process the input data and corpus-based techniques to deal both with unpredicted cases and ambiguity problems, we expect to accurately discover most of the relevant relations for known and new entities, in an automated way. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

7. Similarity Between Multi-valued Thesaurus Attributes: Theory and Application in Multimedia Systems.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Matthé, Tom, Caluwe, Rita, Tré, Guy, Hallez, Axel, Verstraete, Jörg, Leman, Marc, Cornelis, Olmo, Moelants, Dirk, and Gansemans, Jos
Abstract: In this paper, the theoretical aspects of calculating the similarity between sets and their generalizations (multisets, fuzzy sets and fuzzy multisets) are presented. Afterwards, this theory is applied to enhance the facilities for accessing a multimedia system, namely when searching for correspondence between multi-valued attributes, which are coupled with a thesaurus. Furthermore, to allow flexibility in this search, thesauri with similarities defined between the thesaurus terms are considered. As a possible application, the DEKKMMA project, which concerns an audio archive of African music, is introduced. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF
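
Illustrative sketch for record 7: the abstract does not say which similarity measure the paper uses, so the Jaccard index is assumed here purely as an example of how one measure extends from sets to multisets and fuzzy sets. The function names and the sample instrument terms are hypothetical.

```python
from collections import Counter

def jaccard_set(a, b):
    """Jaccard index of two crisp sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def jaccard_multiset(a, b):
    """Multiset version: element-wise min counts over element-wise max counts."""
    a, b = Counter(a), Counter(b)
    inter = sum((a & b).values())  # Counter '&' keeps the minimum count
    union = sum((a | b).values())  # Counter '|' keeps the maximum count
    return inter / union if union else 1.0

def jaccard_fuzzy(a, b):
    """Fuzzy version: a and b map each term to a membership degree in [0, 1]."""
    keys = set(a) | set(b)
    inter = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    union = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return inter / union if union else 1.0

# Hypothetical multi-valued "instrument" attributes of two audio records.
crisp = jaccard_set({"drum", "harp"}, {"harp", "flute"})
multi = jaccard_multiset(["drum", "drum", "harp"], ["drum", "harp", "harp"])
fuzzy = jaccard_fuzzy({"drum": 0.8, "harp": 0.4}, {"drum": 0.5})
```

A thesaurus with similarities between terms, as the abstract mentions, would further soften the exact-match comparison used above; that extension is not attempted here.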

8. Flexible Intensional Query-Answering for RDF Peer-to-Peer Systems.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, and Majkić, Zoran
Abstract: We consider Peer-To-Peer (P2P) database systems with RDF ontologies and with the semantic characterization of P2P mappings based on logical views over a local peer's ontology. Such virtual-predicate based mappings need an embedding of RDF ontologies into first-order predicate logic, or into some of its sublanguages such as, for example, logic programs for deductive databases. We consider a peer as a local epistemic logic system with its own belief based on RDF tuples, independent from other peers and their own beliefs. This motivates the need for a semantic characterization of P2P mappings based not on the extension but on the meaning of concepts used in the mappings, that is, based on intensional logic. We show that it adequately models a robust weakly-coupled framework of RDF ontologies and supports decidable query answering for the union of conjunctive queries. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

9. Fuzzy Ontologies for the Semantic Web.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Sanchez, Elie, and Yamanoi, Takahiro
Abstract: Several connections between Fuzzy Logic, the Semantic Web, and its components (Ontologies, Description Logics) are presented. A Fuzzy Ontology structure, Lexicon and Knowledge Base are then introduced and illustrated by an example ("Ontology of Art"). [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

10. Three-Dimensional Representation of Conceptual Fuzzy Relations.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Olivas, Jose A., and Rios, Samuel
Abstract: In this work, T-DiCoR is presented (Three Dimensional Conceptual Representation) as a tool for representing the fuzzy relations among the most representative concepts of a domain. Using this tool in a Metasearcher, the user may observe what other concepts are related to the searched concept, and what the connection forces are (fuzzy relations between concepts). This knowledge can be useful for making new queries with words conceptually related in a specific domain with the original ones. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

11. The Flow Control of Audio Data Using Distributed Terminal Mixing in Multi-point Communication.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Kim, Young-Mi, and Hwang, Dae-Joon
Abstract: This paper describes an audio flow control method based on audio mixing that is more efficient, in terms of quantitative performance, than the existing P2P (Peer-to-Peer) method. Compared with the existing P2P method, the central mixing and distributed terminal mixing methods reduce global network usage and each terminal's CPU load, and we expect that more sessions and more terminals can be served with the same amount of network bandwidth and the same number of computers. With the P2P method in audio communication, each speaker and listener must connect to each other. It therefore has the critical defect that, as the number of participants grows, the network bandwidth usage and each terminal's CPU load grow rapidly, so the number of participants in a session is severely restricted. In comparison with the P2P method, the central mixing method has a great advantage in network usage and terminal CPU load. Regardless of the number of speakers and listeners, every participant can speak to and listen to all other participants using just one stream's worth of data and CPU load. However, all the network usage and CPU load of "Audio decompression->Buffering->Mixing->Audio Compression" is concentrated on the central server, so the number of sessions and terminals that one server can handle is highly restricted. This study solves the problems of the server's CPU load and network load by using the distributed terminal mixing method. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF
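
Illustrative sketch for record 11: a rough count of audio streams per node under the two baseline topologies the abstract contrasts. It assumes a full mesh for P2P, that every one of n >= 2 participants both speaks and listens, and that each stream costs the same. The function names are hypothetical, and the distributed terminal mixing method itself is not modeled because the abstract does not describe its topology.

```python
def p2p_streams_per_terminal(n):
    """Full-mesh P2P: each terminal exchanges one stream with every other terminal."""
    return {"sent": n - 1, "received": n - 1}

def central_streams_per_terminal(n):
    """Central mixing: each terminal sends one stream and receives one mixed stream."""
    return {"sent": 1, "received": 1}

def central_streams_at_server(n):
    """Central mixing: the server decodes n incoming streams and sends n mixed ones."""
    return {"received": n, "sent": n}

# Per-terminal load grows linearly with n under P2P but stays constant with
# central mixing, where the growth is shifted onto the single server.
for n in (2, 5, 10):
    print(n, p2p_streams_per_terminal(n),
          central_streams_per_terminal(n), central_streams_at_server(n))
```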

12. Analysis and Validation of Information Access Through Mono, Multidimensional and Dynamic Taxonomies.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, and Sacco, Giovanni Maria
Abstract: Access to complex information bases through multidimensional, dynamic taxonomies (also improperly known as faceted classification systems) is rapidly becoming pervasive in industry, especially in e-commerce. In this paper, the major shortcomings of conventional, monodimensional taxonomic approaches, such as the independence of different branches of the taxonomy and insufficient scalability, are discussed. The dynamic taxonomy approach, the first and most complete model for multidimensional taxonomic access to date, is reviewed and compared to conventional taxonomies. We analyze the reducing power of dynamic taxonomies and conventional taxonomies and report experimental results on real data, which confirm that monodimensional taxonomies are not useful for browsing/retrieval on large databases, whereas dynamic taxonomies can effectively manage very large databases and exhibit a very fast convergence. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

13. Question Answering with Imperfect Temporal Information.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Schockaert, Steven, Ahn, David, Cock, Martine, and Kerre, Etienne E.
Abstract: A temporal question answering system must be able to deduce which qualitative temporal relation holds between two events, a reasoning task that is complicated by the fact that historical events tend to have a gradual beginning and ending. In this paper, we introduce an algebra of temporal relations that is well-suited to represent the qualitative temporal information we have at our disposal. We provide a practical algorithm for deducing new temporal knowledge, and show how this can be used to answer questions that require several pieces of qualitative and quantitative temporal information to be combined. Finally, we propose a heuristic technique to cope with inconsistencies that may arise when integrating qualitative and quantitative information. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

14. Using Knowledge Representation Languages for Video Annotation and Retrieval.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Bertini, M., D'Amico, G., Bimbo, A., and Torniai, C.
Abstract: Effective usage of multimedia digital libraries has to deal with the problem of building efficient content annotation and retrieval tools. In the video domain in particular, different techniques for manual and automatic annotation and retrieval have been proposed. Despite the existence of well-defined and extensive standards for video content description, such as MPEG-7, these languages are not explicitly designed for automatic annotation and retrieval purposes. Usage of linguistic ontologies for video annotation and retrieval is a common practice to classify video elements by establishing relationships between video contents and linguistic terms that specify domain concepts at different abstraction levels. The main issue related to the use of description languages such as MPEG-7 or linguistic ontologies is that linguistic terms are appropriate to distinguish event and object categories but are inadequate when they must describe specific or complex patterns of events or video entities. In this paper we propose the usage of knowledge representation languages to define ontologies enriched with visual information that can be used effectively for video annotation and retrieval. Differences between content description languages and knowledge representation languages are shown, and the advantages of using enriched ontologies for both the annotation and the retrieval process are presented in terms of enhanced user experience in browsing and querying video digital libraries. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

15. Evaluating the Effectiveness of a Knowledge Representation Based on Ontology in Ontoweb System.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Bueno, Tania C. D., Bedin, Sonali, Cancellier, Fabricia, and Hoeschl, Hugo C.
Abstract: In the past few years, several studies have emphasized the use of ontologies as an alternative to information organization. The notion of ontology has become popular in fields such as intelligent information integration, information retrieval on the Internet, and knowledge management. Different groups use different approaches to develop and verify the effectiveness of ontologies [1] [2] [3]. This diversity can make it difficult to formulate formal evaluation methodologies. This paper seeks to provide a way to identify the effectiveness of the knowledge representation based on ontology that was developed through Knowledge Based System tools. The reason is that all processing and storage of gathered information and knowledge base organization is done using this structure. Our evaluation is based on case studies in the Ontoweb system [4], involving a real-world ontology for the money laundering domain. Our results indicate that modification of ontology structure can effectively reveal faults, as long as they adversely affect the program state. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

16. Enhancing Short Text Retrieval in Databases.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Marín, N., Martín-Bautista, M. J., Prados, M., and Vila, M. A.
Abstract: In this paper, we present a mechanism to deal with short text structures in relational databases. Text fields are transformed into a special knowledge representation named AP-structure, based on the Apriori algorithm from the data mining area. Once the abstract data type is obtained, the text fields can be summarized, mined, and queried in an easy way. The operations to query these fields are the main aim of this paper. Keywords: Semantic querying, short texts, AP-sets, frequent itemsets, knowledge structure. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

17. Face Detection Using Sketch Operators and Vertical Symmetry.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Hyun Joo So, Mi Hye Kim, Yun Su Chung, and Nam Chul Kim
Abstract: In this paper, we propose an algorithm for detecting a face in a target image using sketch operators and vertical facial symmetry (VFS). The former are operators which effectively reflect perceptual characteristics of human visual system to compute sketchiness of pixels and the latter means the bilateral symmetry which a face shows about its central longitudinal axis. In the proposed algorithm, horizontal and vertical sketch images are first obtained from a target image by using a directional BDIP (block difference inverse probabilities) operator which is modified from the BDIP operator. The pair of sketch images is next transformed into a generalized symmetry magnitude (GSM) image by the generalized symmetry transform (GST). From the GSM image, face candidates are then extracted which are quadrangular regions enclosing the triangles that satisfy eyes-mouth triangle (EMT) conditions and VFS. The sketch image for each candidate is obtained by the BDIP operator and classified into a face or nonface by the Bayesian classifier. Among the face candidates classified into faces, one with the largest VFS becomes the output where the EMT gives the location of two eyes and a mouth of a target face. If the procedure detects no face, then it is executed again after illumination compensation on the target image. Experimental results for 1,000 320x240 target images of various backgrounds and circumstances show that the proposed method yields about 97% detection rate and takes a time less than 0.25 second per target image. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

18. Data Stream Synopsis Using SaintEtiQ.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Pham, Quang-Khai, Mouaddib, Noureddine, and Raschia, Guillaume
Abstract: In this paper, a novel approach for building synopses is proposed by using a service and message-oriented architecture. The SaintEtiQ summarization system, initially designed for very large stored databases, is by its intrinsic features capable of dealing with the requirements inherent to the data stream environment. Its incremental maintenance of the output summaries and its scalability allow it to be a serious challenger to existing techniques. The resulting summaries present the incoming data in a less precise form, yet remain very informative about the actual content. We present a novel way of exploiting this semantically rich information for query answering with an approach midway between blunt query answering and data mining. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

19. Query Phrase Suggestion from Topically Tagged Session Logs.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Jensen, Eric C., Beitzel, Steven M., Chowdhury, Abdur, and Frieder, Ophir
Abstract: Searchers' difficulty in formulating effective queries for their information needs is well known. Analysis of search session logs shows that users often pose short, vague queries and then struggle with revising them. Interactive query expansion (users selecting terms to add to their queries) dramatically improves effectiveness and satisfaction. Suggesting relevant candidate expansion terms based on the initial query enables users to satisfy their information needs faster. We find that suggesting query phrases other users have found it necessary to add for a given query (mined from session logs) dramatically improves the quality of suggestions over simply using cooccurrence. However, this exacerbates the sparseness problem faced when mining short queries that lack features. To mitigate this, we tag query phrases with higher level topical categories to mine more general rules, finding that this enables us to make suggestions for approximately 10% more queries while maintaining an acceptable false positive rate. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

20. Navigating Multimodal Meeting Recordings with the Meeting Miner.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Bouamrane, Matt-Mouley, and Luz, Saturnino
Abstract: We present Meeting Miner, a multimodal meeting browser for navigating recordings of online text and speech collaborative meetings. Meetings are recorded through a collaborative writing environment specially designed to capture participants' activities. This information, usually lost in common recordings of multimodal meetings, offers novel possibilities for indexing, navigation and information retrieval in archived meetings. Meeting Miner uses temporal information from the logs of actions captured on self-contained information items (paragraphs of text) to uncover potential information links between these semantic data units. A novel space-based action navigation scheme is presented. Keywords and topic search as well as more advanced queries can be performed by the system. We illustrate the system navigation modalities with several browsing examples. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

21. Discrimination-Based Criteria for the Evaluation of Classifiers.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Dang, Thanh Ha, Marsala, Christophe, Bouchon-Meunier, Bernadette, and Boucher, Alain
Abstract: Evaluating the performance of classifiers is a difficult task in machine learning. Many criteria have been proposed and used in such a process. Each criterion measures some facets of classifiers. However, none is good enough for all cases. In this communication, we justify the use of discrimination measures for evaluating classifiers. The justification is mainly based on a hierarchical model for discrimination measures, which was introduced and used in the induction of decision trees. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

22. Structural and Semantic Modeling of Audio for Content-Based Querying and Browsing.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Sert, Mustafa, Baykal, Buyurman, and Yazıcı, Adnan
Abstract: A typical content-based audio management system deals with three aspects, namely audio segmentation and classification, audio analysis, and content-based retrieval of audio. In this paper, we integrate the three aspects of content-based audio management into a single framework and propose an efficient method for flexible querying and browsing of auditory data. More specifically, we utilize two robust feature sets, namely MPEG-7 Audio Spectrum Flatness (ASF) and Mel Frequency Cepstral Coefficients (MFCC), as the underlying features in order to improve the content-based retrieval accuracy, since both features have some advantages for distinct types of audio (e.g., music and speech). The proposed system provides a wide range of opportunities to query and browse audio data by content, such as querying and browsing for a chorus section, sound effects, and query-by-example. In addition, the clients can express their queries in the form of point, range, and k-nearest neighbor, which are particularly significant in the multimedia domain. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

23. On Semantically-Augmented XML-Based P2P Information Systems.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, and Cuzzocrea, Alfredo
Abstract: Knowledge representation and extraction techniques can be efficiently used to improve data modeling and IR functionalities of P2P Information Systems, which have recently attracted a lot of attention from industrial and academic researchers. These functionalities can be achieved by pushing semantics in both data and queries, and exploiting the derived expressiveness to improve file sharing primitives and lookup mechanisms made available from first-generation P2P systems. XML-based P2P Information Systems are a more specific and interesting instance of this class of systems, where the overall data domain is composed of very large, Internet-like distributed XML repositories from which users extract useful knowledge mainly by means of IR methodologies implemented on top of XML join queries. This paper focuses on several aspects of XML-based P2P Information Systems, ranging from foundations and definitions to knowledge representation and extraction models and algorithms, along with their experimental evaluation. However, the results presented in this paper can also be adapted to deal with any kind of data format (e.g., HTML). [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

24. Information Theoretic Approach to Information Extraction.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, and Amati, Giambattista
Abstract: We use the hypergeometric distribution to extract relevant information from documents. The hypergeometric distribution gives the probability estimate of observing a given term frequency with respect to a prior. The lower the probability, the higher the amount of information carried by the term. Given a subset of documents, the information items are weighted by using the inversely related function of the hypergeometric distribution. We here provide an exemplifying introduction to topic-driven information extraction from a document collection based on the hypergeometric distribution. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF
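
Illustrative sketch for record 24: one way the weighting idea in the abstract could look in code. The hypergeometric probability formula is standard, but the abstract does not name its "inversely related function", so the negative base-2 logarithm is an assumption here, as are the function names and the toy counts.

```python
from math import comb, log2

def hypergeom_pmf(k, N, K, n):
    """P(X = k): probability of seeing a term k times in a sample of n tokens
    drawn without replacement from N tokens, K of which are that term."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def term_information(k, N, K, n):
    """Assumed weight: -log2 of the probability, so rarer observations score higher."""
    return -log2(hypergeom_pmf(k, N, K, n))

# Toy numbers: a collection of 100 tokens contains the term 5 times;
# a 5-token subset is inspected.
expected = term_information(0, N=100, K=5, n=5)    # term absent: likely, low weight
surprising = term_information(5, N=100, K=5, n=5)  # all 5 occurrences: unlikely, high weight
```

Under this assumption, a term concentrated in the chosen subset far beyond what the collection-wide prior predicts receives a large weight, which matches the abstract's statement that lower probability means more information.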

25. UNL as a Text Content Representation Language for Information Extraction.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Cardeñosa, Jesús, Gallardo, Carolina, and Iraola, Luis
Abstract: This paper describes a new approach for describing contents through the use of interlinguas in order to facilitate the extraction of specific pieces of information. The authors highlight the different dimensions of a document and how these dimensions define the capacities of their respective contents to be found in the scalable process of finding information. A specific interlingua, UNL, will be described. This approach is illustrated both with rich examples of the followed model and with actual applications, including the description of some running projects based on the interlingual representation of contents. Keywords: Textual contents representation, Interlinguas, UNL. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

26. Multi-module Image Classification System.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Kim, Wonil, Sangyoon Oh, Sanggil Kang, and Dongkyun Kim
Abstract: In this paper, we propose an image classification system employing multiple modules. The proposed system hierarchically categorizes given sports images into one of the predefined sports classes, eight in this experiment. The image is first categorized into one of two classes in the global module. The corresponding local module is selected accordingly, and then used in the local classification step. By employing multiple modules, the system can specialize each local module properly for the given class feature. The simulation results show that the proposed system successfully classifies images with a correct rate of over 70%. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

27. Cooperative Discovery of Interesting Action Rules.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Dardzińska, Agnieszka, and Raś, Zbigniew W.
Abstract: Action rules introduced in [12] and extended further to e-action rules [21] have been investigated in [22], [13], [20]. They assume that attributes in a database are divided into two groups: stable and flexible. In general, an action rule can be constructed from two rules extracted earlier from the same database. Furthermore, we assume that these two rules describe two different decision classes and our goal is to re-classify objects from one of these classes into the other one. Flexible attributes are essential in achieving that goal since they provide a tool for giving the user hints about what changes within some values of flexible attributes are needed for a given set of objects to re-classify them into a new decision class. There are two aspects of interestingness of rules that have been studied in data mining literature, objective and subjective measures [8], [1], [14], [15], [23]. In this paper we focus on the cost of an action rule, which was introduced in [22] as an objective measure. An action rule was called interesting if its cost is below and its support higher than some user-defined threshold values. We assume that our attributes are hierarchical and we focus on solving the failing problem of interesting action rules discovery. Our process is cooperative and it has some similarities with cooperative answering of queries presented in [3], [5], [6]. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

28. Partition-Based Approach to Processing Batches of Frequent Itemset Queries.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Grudzinski, Przemyslaw, Wojciechowski, Marek, and Zakrzewicz, Maciej
Abstract: We consider the problem of optimizing processing of batches of frequent itemset queries. The problem is a particular case of multiple-query optimization, where the goal is to minimize the total execution time of the set of queries. We propose an algorithm that is a combination of the Mine Merge method, previously proposed for processing of batches of frequent itemset queries, and the Partition algorithm for memory-based frequent itemset mining. The experiments show that the novel approach outperforms the original Mine Merge and sequential processing in the majority of cases. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

29. Mining Interest Navigation Patterns Based on Hybrid Markov Model.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Yijun Yu, Huaizhong Lin, Yimin Yu, and Chun Chen
Abstract: Each user accesses a Website with certain interest. The interest is associated with his navigation patterns. The interest navigation patterns represent different interest of the users. In this paper, hybrid Markov model is proposed for interest navigation pattern discovery. The novel model is better in prediction overlay rate and prediction correct rate than traditional Markov models. User group interest is also defined in this paper. The probability of user group interest navigation from one page to another is computed by navigation path characteristics and time characteristics. Compared with the previous ones, the results of the experiment show that the performance is improved efficiently by the hybrid Markov model. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

30. Optimal Associative Neighbor Mining Using Attributes for Ubiquitous Recommendation Systems.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Kyung-Yong Jung, Hee-Joung Hwang, and Un-Gu Kang
Abstract: Ubiquitous recommendation systems predict new items of interest for a user, based on predictive relationship discovered between the user and other participants in Ubiquitous Commerce. In this paper, optimal associative neighbor mining, using attributes, for the purpose of improving accuracy and performance in ubiquitous recommendation systems, is proposed. This optimal associative neighbor mining selects the associative users that have similar preferences by extracting the attributes that most affect preferences. The associative user pattern comprising 3-AUs (groups of associative users composed of 3-users), is grouped through the ARHP algorithm. The approach is empirically evaluated, for comparison with the nearest-neighbor model and k-means clustering, using the MovieLens datasets. This method can solve the large-scale dataset problem without deteriorating accuracy quality. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

31. Assisted Query Formulation Using Normalised Word Vector and Dynamic Ontological Filtering.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Dreher, Heinz, and Williams, Robert
Abstract: Information seekers using the usual search techniques and engines are delighted by the sheer power of the technology at their command - speed, quantity. Upon closer inspection of the results, and reflection upon the next stages of the information seeking knowledge work, users are typically overwhelmed, and frustrated. We propose a partial solution by focusing on the query formulation aspect of the information seeking problem. First we introduce our version of a semantic analysis algorithm, named Normalised Word Vector, and explain its application in assisted query formulation. Secondly we introduce our ideas of supporting query refinement via Dynamic Ontological Filtering. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

32. Flexible Shape-Based Query Rewriting.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Chalhoub, Georges, Chbeir, Richard, and Yetongnon, Kokou
Abstract: A visual query is based on pictorial representation of conceptual entities and operations. One of the most important features used in visual queries is the shape. Despite its intuitive writing, a shape-based visual query usually suffers of a complexity processing related to two major parameters: 1-the imprecise user request, 2-shapes may undergo several types of transformation. Several methods are provided in the literature to assist the user during query writing. On one hand, relevance feedback technique is widely used to rewrite the initial user query. On the other hand, shape transformations are considered by current shape-based retrieval approaches without any user intervention. In this paper, we present a new cooperative approach based on the shape neighborhood concept allowing the user to rewrite a shape-based visual query according to his preferences with high flexibility in terms of including (or excluding) only some shape transformations and of result sorting. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

33. Dynamically Personalized Web Service System to Mobile Devices.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Sanggil Kang, Wonik Park, and Young-Kuk Kim
Abstract: We introduce a novel personalized web service system through mobile devices. By providing only users' preferred web pages or smaller readable sections, service elements, the problem of the limitation of resource of mobile devices can be solved. In this paper, the preferred service elements are obtained from the statistical preference transactions among web pages for each web site. In computing the preference, we consider the ratio of the length of each web page and users' staying time on it. Also, our system dynamically provides the personalized web service according to the different three cases such as the beginning stage, the positive feedback, and the negative feedback. In the experimental section, we demonstrate our personalized web service system and show how much the resource of mobile devices can be saved. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

34. Using Dynamic Fuzzy Ontologies to Understand Creative Environments.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Calegari, Silvia, and Loregian, Marco
Abstract: This paper presents a method to model knowledge in creative environments using dynamic fuzzy ontologies. Dynamic fuzzy ontologies are ontologies that evolve in time to adapt to the environment in which they are used, and whose taxonomies and relationships among concepts are enriched with fuzzy weights (i.e., numeric values between 0 and 1). Such cognitive artifacts can provide for higher user awareness in learning environments, as well as for greater creative stimulus for knowledge discovery. This paper gives the definitions of dynamic fuzzy ontologies, the details of how fuzzy values are dynamically assigned to concepts and relations, and presents an experimental evaluation of the proposed approach. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

35. Improving the User-System Interaction in a Web Multi-agent System Using Fuzzy Multi-granular Linguistic Information.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Herrera-Viedma, E., Porcel, C., Lopez-Herrera, A.G., Alonso, S., and Zafra, A.
Abstract: Nowadays, information gathering on the Internet is a complex activity and Internet users need systems to assist them to obtain the information required. In earlier studies [5, 6, 16] we presented different fuzzy linguistic multi-agent models for helping users in their information gathering processes on the Web. In this paper, we present a new fuzzy linguistic multi-agent model to access information on the Web that incorporates the use of fuzzy multi-granular linguistic modeling to improve its user-system interaction and be more user-friendly. Keywords: Web, intelligent agents, fuzzy linguistic modelling. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

36. The Lookahead Principle for Preference Elicitation: Experimental Results.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Viappiani, Paolo, Faltings, Boi, and Pu, Pearl
Abstract: Preference-based search is the problem of finding an item that matches best with a user's preferences. User studies show that example-based tools for preference-based search can achieve significantly higher accuracy when they are complemented with suggestions chosen to inform users about the available choices. We discuss the problem of eliciting preferences in example-based tools and present the lookahead principle for generating suggestions. We compare two different implementations of this principle and we analyze logs of real user interactions to evaluate them. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

37. Personalized Web Recommendation Based on Path Clustering.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Yijun Yu, Huaizhong Lin, Yimin Yu, and Chun Chen
Abstract: Each user accesses a Website with certain interests. The interest can be manifested by the sequence of each Web user access. The access paths of all Web users can be clustered. The effectiveness and efficiency are two problems in clustering algorithms. This paper provides a clustering algorithm for personalized Web recommendation. It is path clustering based on competitive agglomeration (PCCA). The path similarity and the center of a cluster are defined for the proposed algorithm. The algorithm relies on competitive agglomeration to get best cluster numbers automatically. Recommending based on the algorithm doesn't disturb users and needn't any registration information. Experiments are performed to compare the proposed algorithm with two other algorithms and the results show that the improvement of recommending performance is significant. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

38. A Flexible News Filtering Model Exploiting a Hierarchical Fuzzy Categorization.

Author: Larsen, Henrik Legind, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Bordogna, Gloria, Pagani, Marco, Pasi, Gabriella, and Villa, Robert
Abstract: In this paper we present a novel news filtering model based on flexible and soft filtering criteria and exploiting a fuzzy hierarchical categorization of news. The filtering module is designed to provide news professionals and general users with an interactive and personalised tool for news gathering and delivery. It exploits content-based filtering criteria and category-based filtering techniques to deliver to the user a ranked list of either news or clusters of news. In fact, if the user prefers to have a synthetic view of the topics of recent news pushed by the stream, the system filters groups (clusters) of news having homogenous contents, identified automatically by the application of a fuzzy clustering algorithm that organizes the recent news into a fuzzy hierarchy. The filter can be trained explicitly by the user to learn his/her interests as well as implicitly by monitoring his/her interaction with the system. Several filtering criteria can be applied to select and rank news to the users based on the user's information preferences and presentation preferences. User preferences specify what information (the contents of interest) is relevant to the user, the sources that provide reliable information, and the period of time during which the information remains relevant. Each individual news or cluster of news homogeneous with respect to their content is selected based on a customizable multi criteria decision making approach and ranked based on a combination of criteria specified by the user in his/her presentation preferences. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

39. Robust Query Processing for Personalized Information Access on the Semantic Web.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Dolog, Peter, Stuckenschmidt, Heiner, and Wache, Holger
Abstract: Research in Cooperative Query answering is triggered by the observation that users are often not able to correctly formulate queries to databases that return the intended result. Due to a lack of knowledge of the contents and the structure of a database, users will often only be able to provide very broad queries. Existing methods for automatically refining such queries based on user profiles often overshoot the target resulting in queries that do not return any answer. In this paper, we investigate methods for automatically relaxing such over-constraint queries based on domain knowledge and user preferences. We describe a framework for information access that combines query refinement and relaxation in order to provide robust, personalized access to heterogeneous RDF data as well as an implementation in terms of rewriting rules and explain its application in the context of e-learning systems. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

40. Search Strategies for Finding Annotations and Annotated Documents: The FAST Service.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Agosti, Maristella, and Ferro, Nicola
Abstract: This paper discusses two kinds of search strategies supported by the Flexible Annotation Service Tool (FAST), an annotation service that can be used by different Digital Library Management Systems (DLMSs). The first strategy concerns the search and retrieval of annotations, considered as stand-alone documents; while, the second one regards how to exploit annotations in order to search and retrieve annotated documents which are relevant for a user query. This paper describes the proposed search strategies in the light of the architectural design choices needed to support them. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

41. Why Using Structural Hints in XML Retrieval?

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Sauvagnat, Karen, Boughanem, Mohand, and Chrisment, Claude
Abstract: When querying XML collections, users cannot always express their need in a precise way. Systems should therefore support vagueness at both the content and structural level of queries. This paper presents a relevance-oriented method for ranking XML components. The aim here is to evaluate whether structural hints help to better answer the user needs. We experiment (within the INEX framework) with user needs expressed in a flexible way (i.e., with or without structural hints). Results show that they clearly improve performance, even if they are expressed in an "artificial way". Relevance seems therefore to be closely linked to structure. Moreover, too complex structural hints do not lead to better results. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

42. Using a Fuzzy Object-Relational Database for Colour Image Retrieval.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Barranco, Carlos D., Medina, Juan M., Chamorro-Martínez, Jesús, and Soto-Hidalgo, José M.
Abstract: The paper presents a fuzzy database management system, and a fuzzy method for dominant colour description of images, on which an image retrieval system is built. The paper shows the suitability of the fuzzy database management system for this kind of applications when the images are characterized by fuzzy data. The synergy of these two introduced components, improves traditional image retrieval systems in three aspects: natural and automatic image description, a natural and easy query language, and high performance in query resolution. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

43. Fuzzy Query Answering in Motor Racing Domain.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Bandini, Stefania, Mereghetti, Paolo, and Radaelli, Paolo
Abstract: Nuances in natural languages can be useful to effectively describe preferences and constraints over a complex and few formalized domain. In this paper we describe the architecture of a query answering system for the domain of motor racing which uses fuzzy logic and domain knowledge in order to carry out searches dealing with vague expression, either as search constraints or as relationship between entities attribute values. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

44. On Tuning OWA Operators in a Flexible Querying Interface.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Zadrożny, Sławomir, and Kacprzyk, Janusz
Abstract: The use of the Yager's OWA operators within a flexible querying interface is discussed. The key issue is the adaptation of an OWA operator to the specifics of a user's query. Some well-known approaches to the manipulation of the weights vector are reconsidered and a new one is proposed that is simple and efficient. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

45. Approximate Querying of XML Fuzzy Data.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Buche, Patrice, Dibie-Barthélemy, Juliette, and Wattez, Fanny
Abstract: The MIEL++ system integrates data expressed in two different formalisms: a relational database and an XML database. The XML database is filled with data semi-automatically retrieved from the Web, which have been semantically enriched according to the ontology used in the relational database. These data may be imprecise and represented as possibility distributions. The MIEL++ querying system scans the two databases simultaneously in a transparent way for the end-user. To scan the XML database, the MIEL query is translated into an XML tree query. In this paper, we propose to introduce flexibility into the query processing of the XML database, in order to take into account the imperfections due to the semantic enrichment of its data. This flexibility relies on fuzzy queries and query rewriting which consists in generating a set of approximate queries from an original query using three transformation techniques: deletion, renaming and insertion of query nodes. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

46. A Hierarchical Document Clustering Environment Based on the Induced Bisecting k-Means.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Archetti, F., Campanelli, P., Fersini, E., and Messina, E.
Abstract: The steady increase of information on WWW, digital library, portal, database and local intranet, gave rise to the development of several methods to help user in Information Retrieval, information organization and browsing. Clustering algorithms are of crucial importance when there are no labels associated to textual information or documents. The aim of clustering algorithms, in the text mining domain, is to group documents concerning with the same topic into the same cluster, producing a flat or hierarchical structure of clusters. In this paper we present a Knowledge Discovery System for document processing and clustering. The clustering algorithm implemented in this system, called Induced Bisecting k-Means, outperforms the Standard Bisecting k-Means and is particularly suitable for on line applications when computational efficiency is a crucial aspect. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

47. Evaluation of System Measures for Incomplete Relevance Judgment in IR.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Wu, Shengli, and McClean, Sally
Abstract: Incomplete relevance judgment has become a norm for the evaluation of some major information retrieval evaluation events such as TREC, but its effect on some system measures has not been well understood. In this paper, we evaluate four system measures, namely mean average precision, R-precision, normalized average precision over all documents, and normalized discount cumulative gain, under incomplete relevance judgment. Among them, the measure of normalized average precision over all documents is introduced, and both mean average precision and R-precision are generalized for graded relevance judgment. These four measures have a common characteristic: complete relevance judgment is required for the calculation of their accurate values. We empirically investigate these measures through extensive experimentation of TREC data and aim to find the effect of incomplete relevance judgment on them. From these experiments, we conclude that incomplete relevance judgment affects all these four measures' values significantly. When using the pooling method in TREC, the more incomplete the relevance judgment is, the higher the values of all these measures usually become. We also conclude that mean average precision is the most sensitive but least reliable measure, normalized discount cumulative gain and normalized average precision over all documents are the most reliable but least sensitive measures, while R-precision is in the middle. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

48. Highly Heterogeneous XML Collections: How to Retrieve Precise Results?

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Sanz, Ismael, Mesiti, Marco, Guerrini, Giovanna, and Llavori, Rafael Berlanga
Abstract: Highly heterogeneous XML collections are thematic collections exploiting different structures: the parent-child or ancestor-descendant relationships are not preserved and vocabulary discrepancies in the element names can occur. In this setting current approaches return answers with low precision. By means of similarity measures and semantic inverted indices we present an approach for improving the precision of query answers without compromising performance. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

49. Towards Flexible Information Retrieval Based on CP-Nets.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Boubekeur, Fatiha, Boughanem, Mohand, and Tamine-Lechani, Lynda
Abstract: This paper describes a flexible information retrieval approach based on CP-Nets (Conditional Preferences Networks). The CP-Net formalism is used for both representing qualitative queries (expressing user preferences) and representing documents in order to carry out the retrieval process. Our contribution focuses on the difficult task of term weighting in the case of qualitative queries. In this context, we propose an accurate algorithm based on UCP-Net features to automatically weight Boolean queries. Furthermore, we also propose a flexible approach for query evaluation based on a flexible aggregation operator adapted to the CP-Net semantics. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

50. A Fuzzy Extension for the XPath Query Language.

Author: Larsen, Henrik Legind, Pasi, Gabriella, Ortiz-Arroyo, Daniel, Andreasen, Troels, Christiansen, Henning, Campi, Alessandro, Guinea, Sam, and Spoletini, Paola
Abstract: XML has become a widespread format for data exchange over the Internet. The current state of the art in querying XML data is represented by XPath and XQuery, both of which define binary predicates. In this paper, we advocate that binary selection can at times be restrictive due to the very nature of XML, and to the uses that are made of it. We therefore suggest a querying framework, called FXPath, based on fuzzy logics. In particular, we propose the use of fuzzy predicates for the definition of more "vague" and softer queries. We also introduce a function called "deep-similar", which aims at substituting XPath's typical "deep-equal" function. Its goal is to provide a degree of similarity between two XML trees, assessing whether they are similar both structure-wise and content-wise. The approach is exemplified in the field of e-learning metadata. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

63 results on '"Ortiz-Arroyo, Daniel"'
