27 results on '"Andreas Rauber"'
Search Results
2. Explainable cyber-physical energy systems based on knowledge graph
- Author
-
Andreas Rauber, Ralf Mosshammer, Peb Ruswono Aryan, Marta Sabou, Alfred Einfalt, Tomasz Miksa, Daniel Hauer, and Fajar J. Ekaputra
- Subjects
Demand response ,Causality (physics) ,Range (mathematics) ,Smart grid ,Computer science ,Scale (chemistry) ,Energy (esotericism) ,Graph (abstract data type) ,Ontology (information science) ,Data science - Abstract
Explainability can help cyber-physical systems alleviate risk in automating decisions that affect our lives. Building an explainable cyber-physical system requires deriving explanations from system events and causality between the system elements. Cyber-physical energy systems such as smart grids involve cyber and physical aspects of energy systems and other elements, namely social and economic. Moreover, a smart-grid scale can range from a small village to a large region across countries. Therefore, integrating these varieties of data and knowledge is a fundamental challenge in building an explainable cyber-physical energy system. This paper aims to use a knowledge-graph-based framework to solve this challenge. The framework consists of an ontology to model and link data from various sources and a graph-based algorithm to derive explanations from the events. A simulated demand response scenario covering the above aspects further demonstrates the applicability of this framework.
- Published
- 2021
- Full Text
- View/download PDF
3. Linked data processing provenance
- Author
-
Tuan-Dat Trinh, Ba-Lam Do, Peter Wetz, Peb Ruswono Aryan, Elmar Kiesling, Fajar J. Ekaputra, A Min Tjoa, and Andreas Rauber
- Subjects
Data processing ,Process (engineering) ,Computer science ,Reliability (computer networking) ,Context (language use) ,02 engineering and technology ,Linked data ,computer.software_genre ,Data science ,World Wide Web ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Mashup ,computer ,Data journalism ,TRACE (psycholinguistics) - Abstract
The growth of Linked Data has created a promising environment for data exploration and a growing number of tools allow users to interactively integrate data from various sources. Eliciting the reliability of the results of such ad-hoc integration processes, consistently recreating those results, and identifying changes upon re-execution, however, can be difficult. Automated process provenance trail creation can provide major benefits in this context, because (i) it enables users to trace the contribution of individual sources and processing steps to the final outcome and judge whether the result can be trusted; (ii) it ensures repeatability and raises the trustworthiness of results; (iii) it ideally enables reconstruction of Linked Data integration processes from the provenance information embedded in the final result. In this paper, we present a provenance model that facilitates automatic generation of semantic provenance information for generic Linked Data integration processes. We implement the generic model in a collaborative mashup environment and evaluate it by means of an example application. We find that the model provides a solid foundation for verifiability and contributes towards making Linked Data integration processes more open, transparent, and reusable, which is crucial in domains where the origin of data is essential, such as, for instance, statistical analyses, scientific research, and data journalism.
- Published
- 2017
- Full Text
- View/download PDF
4. When is the Time Ripe for Natural Language Processing for Patent Passage Retrieval?
- Author
-
Allan Hanbury, Linda Andersson, Joao Palotti, Andreas Rauber, and Mihai Lupu
- Subjects
Vocabulary ,Relation (database) ,Computer science ,media_common.quotation_subject ,02 engineering and technology ,computer.software_genre ,Domain (software engineering) ,Text mining ,Rule-based machine translation ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,media_common ,Information retrieval ,business.industry ,05 social sciences ,Word formation ,Noun phrase ,Information extraction ,Domain knowledge ,Language model ,Artificial intelligence ,0509 other social sciences ,050904 information & library sciences ,business ,computer ,Natural language processing - Abstract
Patent text is a mixture of legal terms and domain specific terms. In technical English text, a multi-word unit method is often deployed as a word formation strategy in order to expand the working vocabulary, i.e. introducing a new concept without the invention of an entirely new word. In this paper we explore query generation using natural language processing technologies in order to capture domain specific concepts represented as multi-word units. We examine a range of query generation methods using both linguistic and statistical information. We also propose a new method to identify domain specific terms from other more general phrases. We apply a machine learning approach using domain knowledge and corpus linguistic information in order to learn domain specific terms in relation to phrases' Termhood values. The experiments are conducted on the English part of the CLEF-IP 2013 test collection. The outcome of the experiments shows that the favoured method in terms of PRES and recall is when a language model is used and search terms are extracted with a part-of-speech tagger and a noun phrase chunker. With our proposed methods we improve each evaluation metric significantly compared to the existing state-of-the-art for the CLEF-IP 2013 test collection: for PRES@100 by 26% (0.544 from 0.433), for recall@100 by 17% (0.631 from 0.540) and on document MAP by 57% (0.300 from 0.191).
- Published
- 2016
- Full Text
- View/download PDF
5. Session details: Panels
- Author
-
Hideo Joho and Andreas Rauber
- Subjects
Multimedia ,Computer science ,Session (computer science) ,computer.software_genre ,computer - Published
- 2015
- Full Text
- View/download PDF
6. Resilient Web Services for Timeless Business Processes
- Author
-
Marco Unterberger, Andreas Rauber, Tomasz Miksa, and Rudolf Mayer
- Subjects
Web standards ,medicine.medical_specialty ,Web development ,computer.internet_protocol ,Computer science ,business.industry ,Services computing ,computer.software_genre ,Business Process Execution Language ,World Wide Web ,medicine ,Web service ,business ,WS-Policy ,computer ,Web modeling ,Data Web - Abstract
Many business and scientific processes make extensive use of service-oriented architectures, using distributed services. These are often provided by third parties and are thus not under direct control of process owners. In this paper we discuss the issues of ensuring continuous and faithful execution of processes in distributed environments, focusing specifically on Web Services. Recently, we introduced a specification of Resilient Web Services, that makes current Web Services more robust, and a framework for the monitoring of Web Services, that allows detecting anomalies. In this paper, we describe alternative implementations of the framework for monitoring of Web Services. We also present possible approaches easing the deployment of Resilient Web Services: a framework consisting of tools deployable at the Web Service operator site enabling easy transformation of a regular Web Service into a Resilient Web Service, and a registry with notifications that decorates existing Web Services with resilient methods.
- Published
- 2014
- Full Text
- View/download PDF
7. Are Species Identification Tools Biodiversity-friendly?
- Author
-
Concetto Spampinato, Hervé Goëau, Andreas Rauber, Alexis Joly, WP Willem Pier Vellinga, Hervé Glotin, Bob Fisher, Pierre Bonnet, and Henning Müller
- Subjects
Computer science ,Fish ,Identification (biology) ,Ecological data ,Data science - Abstract
This paper discusses the results of the LifeCLEF 2014 multimedia identification challenges with regards to the requirements of real-world ecological surveillance systems. In particular, we study the identification performances of the evaluated systems as a function of the ordinariness or rarity of the species in the dataset. This allows us to assess the ability of the underlying methods to be robust to heavily tailed distributions such as the ones encountered in real-world collections of life observations. Results show that all methods are more or less affected by the long-tail curse but that the best methods making use of classifiers with good discrimination capacities do resist the phenomenon pretty well.
- Published
- 2014
- Full Text
- View/download PDF
8. A Combined Approach of Structured and Non-structured IR in Multimodal Domain
- Author
-
Serwah Sabetghadam, Ralf Bierig, Andreas Rauber, and Mihai Lupu
- Subjects
Information retrieval ,Computer science ,Multimodal data ,Graph (abstract data type) ,Document retrieval ,Hybrid approach ,Semantic Web ,Image retrieval ,Combined approach - Abstract
We present a generic model for multimodal information retrieval, leveraging different information sources to improve the effectiveness of a retrieval system. The proposed method is able to take into account both explicit and latent semantics present in the data and can be used to answer complex queries that are currently answerable neither by document retrieval systems nor by semantic web systems. By providing a hybrid approach combining IR and structured search techniques, we prepare a framework applicable to multimodal data collections. To test its effectiveness, we instantiate the model for an image retrieval task.
- Published
- 2014
- Full Text
- View/download PDF
9. Feature selection in a cartesian ensemble of feature subspace classifiers for music categorisation
- Author
-
Andreas Rauber, Pedro J. Ponce de León, Carlos Pérez-Sancho, José M. Iñesta, and Rudolf Mayer
- Subjects
business.industry ,Dimensionality reduction ,Context (language use) ,Linear classifier ,Feature selection ,Pattern recognition ,Machine learning ,computer.software_genre ,ComputingMethodologies_PATTERNRECOGNITION ,Feature (computer vision) ,One-class classification ,Artificial intelligence ,business ,computer ,Subspace topology ,Curse of dimensionality ,Mathematics - Abstract
In this paper, we evaluate the impact of feature selection on the classification accuracy and the achieved dimensionality reduction, which benefits the time needed on training classification models. Our classification scheme therein is a Cartesian ensemble classification system, based on the principle of late fusion and feature subspaces. These feature subspaces describe different aspects of the same data set. We use it for the ensemble classification of multiple feature sets from the audio and symbolic domains. We present an extensive set of experiments in the context of music genre classification, based on Music IR benchmark datasets. We show that while feature selection does not benefit classification accuracy, it greatly reduces the dimensionality of each feature subspace, and thus adds to great gains in the time needed to train the individual classification models that form the ensemble.
- Published
- 2010
- Full Text
- View/download PDF
10. Identification of low/high retrievable patents using content-based features
- Author
-
Andreas Rauber and Shariq Bashir
- Subjects
Set (abstract data type) ,Handling system ,Identification (information) ,Measure (data warehouse) ,Information retrieval ,Computer science ,InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL ,Content (measure theory) ,Patent retrieval ,Visual Word ,Retrievability - Abstract
Document retrievability is a measurement used in information retrieval for identifying the bias of retrieval systems. In order to measure system bias for a specific document collection, an exhaustive set of queries is processed, measuring the frequency with which each document is retrieved. For better understanding and handling system bias, we need to understand the characteristics of documents that influence retrievability, and ideally be able to identify documents with high and low retrievability in advance. For this purpose, we identify a number of content-based features, which can be used effectively to classify a corpus into documents with low and high retrievability w.r.t. a specific retrieval system. Our experiments on patent collections show that these features can achieve more than 80% classification accuracy on different systems, and hint at the need to combine different retrieval systems for optimizing recall.
- Published
- 2009
- Full Text
- View/download PDF
11. Interacting with (semi-) automatically extracted context of digital objects
- Author
-
Robert Neumayer, Rudolf Mayer, and Andreas Rauber
- Subjects
Range (mathematics) ,Information retrieval ,Computer science ,Interpretation (philosophy) ,Context (language use) ,Granularity ,Digital library ,Appropriate use ,Task (project management) - Abstract
The context in which digital objects are created, modified, or used is essential for the interpretation of information entities, for retrieval settings, for establishing their authenticity, as well as ensuring appropriate use. Therefore, determining this context of creation and use of digital objects is an essential task for many areas and applications, from (huge) digital library settings to end-user applications such as search. However, context is notoriously difficult and laboursome to establish and document, and when it has to be entered and maintained manually by the creator of the digital objects, it is often missing or partially incomplete or incorrect. Thus, this paper proposes an approach to (semi-) automatically determine the context of creation and usage of digital objects. Various facets of context along different dimensions are automatically detected, and are combined in pivot-table inspired views, at multiple levels of granularity, which then allow the extraction of the most appropriate connections to other digital objects. Finally, this context can be used for a range of applications, such as search and navigation.
- Published
- 2009
- Full Text
- View/download PDF
12. Requirements modelling and evaluation for digital preservation
- Author
-
Christoph Becker and Andreas Rauber
- Subjects
Database ,Requirements engineering ,Process (engineering) ,Computer science ,business.industry ,Context (language use) ,computer.software_genre ,Automation ,Documentation ,Digital preservation ,Component (UML) ,business ,Software engineering ,computer ,Selection (genetic algorithm) - Abstract
Most methods for the general problem of Commercial-off-the-shelf component selection use goal-oriented requirements modelling and multi-criteria decision making techniques and are applicable across a wide range of domains. This usually implies high levels of complexity. Recently a very specific selection problem emerged in the context of digital preservation. The selection of the most suitable tool to keep a type of digital object alive when the original technical environment ceases to exist is a highly complex domain-specific selection problem with several peculiarities: Highly homogeneous functionality across tools, complex evaluation of quality across settings, and a high need for automation, standardisation, and documentation. This paper describes an evidence-based empirical methodology for COTS component selection in digital preservation through controlled experimentation. We describe the specific selection problem, show how the process of utility analysis can be tailored to fit the problem space and describe the methodology, which is geared towards automated evaluation in an empirical setting. We outline existing tool support and discuss case studies and future directions.
- Published
- 2009
- Full Text
- View/download PDF
13. Ambient music experience in real and virtual worlds using audio similarity
- Author
-
Jakob Frank, Ewald Peiszer, Thomas Lidy, Ronald Genswaider, and Andreas Rauber
- Subjects
Metadata ,Range (music) ,Multimedia ,Point (typography) ,Human–computer interaction ,Event (computing) ,Computer science ,Pop music automation ,Metaverse ,computer.software_genre ,Set (psychology) ,Music visualization ,computer - Abstract
Sound and, specifically, music is a medium that is used for a wide range of purposes in different situations in very different ways. Ways for music selection and consumption may range from completely passive, almost unnoticed perception of background sound environments to the very specific selection of a particular recording of a piece of music with a specific orchestra and conductor on a certain event. Different systems and interfaces exist for the broad range of needs in music consumption. Locating a particular recording is well supported by traditional search interfaces via metadata. Other interfaces support the creation of playlists via artist or album selection, up to more artistic installation of sound environments that users can navigate through. In this paper we will present a set of systems that support the creation of as well as the navigation in musical spaces, both in the real world as well as in virtual environments. We show some common principles and point out further directions for a more direct coupling of the various spaces and interaction methods.
- Published
- 2008
- Full Text
- View/download PDF
14. Map-based music interfaces for mobile devices
- Author
-
Thomas Lidy, Andreas Rauber, Jakob Frank, and Peter Hlavac
- Subjects
Self-organizing map ,Multimedia ,InformationSystems_INFORMATIONINTERFACESANDPRESENTATION(e.g.,HCI) ,Computer science ,business.industry ,computer.software_genre ,Music visualization ,Visualization ,Audio analyzer ,business ,computer ,Mobile device ,Digital audio ,Graphical user interface - Abstract
The pervasion of digital music calls for novel techniques to search, retrieve and access music collections. Particularly mobile devices are, due to their limited display size and input capabilities, in a need of possibilities for intuitive and quick selection of music that go beyond mere browsing through song lists and directories. We propose a graphical user interface for mobile devices presenting a music map that organizes a music collection automatically by sound similarity through audio analysis. This map provides an overview over large audio collections and offers several interaction possibilities to give users a quick and direct access to their music. It allows instant creation of playlists based on music of a desired genre by pointing on clusters or drawing paths on the map. The application not only eases access to music, but also enables novel application scenarios for collaborative music experience. The software has been implemented for a range of mobile devices, such as PDAs and smartphones.
- Published
- 2008
- Full Text
- View/download PDF
15. Combination of audio and lyrics features for genre classification in digital audio collections
- Author
-
Robert Neumayer, Rudolf Mayer, and Andreas Rauber
- Subjects
Structure (mathematical logic) ,Multimedia ,Computer science ,Rhyme ,media_common.quotation_subject ,Feature (machine learning) ,Music information retrieval ,Musical ,computer.software_genre ,Lyrics ,computer ,Digital audio ,media_common - Abstract
In many areas multimedia technology has made its way into the mainstream. In the case of digital audio this is manifested in numerous online music stores having turned into profitable businesses. The widespread user adoption of digital audio both on home computers and mobile players shows the size of this market. Thus, ways to automatically process and handle the growing size of private and commercial collections become increasingly important; along goes a need to make music interpretable by computers. The most obvious representation of audio files is their sound - there are, however, more ways of describing a song, for instance its lyrics, which describe songs in terms of content words. Lyrics of music may be orthogonal to its sound, and differ greatly from other texts regarding their (rhyme) structure. Consequently, the exploitation of these properties has potential for typical music information retrieval tasks such as musical genre classification; so far, there is a lack of means to efficiently combine these modalities. In this paper, we present findings from investigating advanced lyrics features such as the frequency of certain rhyme patterns, several parts-of-speech features, and statistic features such as words per minute (WPM). We further analyse in how far a combination of these features with existing acoustic feature sets can be exploited for genre classification and provide experiments on two test collections.
- Published
- 2008
- Full Text
- View/download PDF
16. Personal & soho archiving
- Author
-
Andreas Rauber, Kevin Stadler, Stephan Strodl, and Florian Motlik
- Subjects
Home environment ,SIMPLE (military communications protocol) ,Multimedia ,business.industry ,Computer science ,Best practice ,computer.software_genre ,Term (time) ,Software ,Backup ,Digital preservation ,Preservation solutions ,business ,computer - Abstract
Digital objects require appropriate measures for digital preservation to ensure that they can be accessed and used in the near and far future. While heritage institutions have been addressing the challenges posed by digital preservation needs for some time, private users and SOHOs (Small Office/Home Office) are less prepared to handle these challenges. Yet, both have increasing amounts of data that represent considerable value, be it office documents or family photographs. Backup, common practice of home users, avoids the physical loss of data, but it does not prevent the loss of the ability to render and use the data in the long term. Research and development in the area of digital preservation is driven by memory institutions and large businesses. The available tools, services and models are developed to meet the demands of these professional settings. This paper analyses the requirements and challenges of preservation solutions for private users and SOHOs. Based on the requirements and supported by available tools and services, we are designing and implementing a home archiving system to provide digital preservation solutions specifically for digital holdings in the small office and home environment. It hides the technical complexity of digital preservation challenges and provides simple and automated services based on established best practice examples. The system combines bit preservation and logical preservation strategies to avoid loss of data and the ability to access and use them. A first software prototype, called Hoppla, is presented in this paper.
- Published
- 2008
- Full Text
- View/download PDF
17. Plato
- Author
-
Christoph Becker, Hannes Kulovits, Andreas Rauber, and Hans Hofman
- Subjects
Decision support system ,Emulation ,Knowledge management ,computer.internet_protocol ,business.industry ,Computer science ,Process (engineering) ,Service-oriented architecture ,Plan (drawing) ,Object (computer science) ,Digital preservation ,Architecture ,Software engineering ,business ,computer - Abstract
The fast changes of technologies in today's information landscape have considerably shortened the lifespan of digital objects. Digital preservation has become a pressing challenge. Different strategies such as migration and emulation have been proposed; however, the decision for a specific tool e.g. for format migration or an emulator is very complex. The process of evaluating potential solutions against specific requirements and building a plan for preserving a given set of objects is called preservation planning. So far, it is a mainly manual, sometimes ad-hoc process with little or no tool support. This paper presents a service-oriented architecture and decision support tool that implements a solid preservation planning process and integrates services for content characterisation, preservation action and automatic object comparison to provide maximum support for preservation planning endeavours.
- Published
- 2008
- Full Text
- View/download PDF
18. A generic XML language for characterising objects to support digital preservation
- Author
-
Volker Heydegger, Jan Schnasse, Manfred Thaller, Christoph Becker, and Andreas Rauber
- Subjects
World Wide Web ,Computer science ,Digital preservation ,Dominance (economics) ,computer.internet_protocol ,media_common.quotation_subject ,Quality (business) ,Context (language use) ,computer ,XML ,media_common - Abstract
The dominance of digital objects in today's information landscape has changed the way humankind creates and exchanges information. However, it has also brought an entirely new problem: the longevity of digital objects. Due to the fast changes in technologies, digital documents have a short lifespan before they become obsolete. Digital preservation, i.e. actions to ensure longevity of digital information, thus has become a pressing challenge. Different strategies such as migration and emulation have been proposed; however, the decision between available tools for format migration is very complex. Preservation planning supports decision makers in reaching accountable decisions by evaluating potential strategies against well-defined requirements. Especially the evaluation of different migration tools for digital preservation has to rely on validating the converted objects and thus on an analysis of the logical structure and the content of documents. This paper presents the eXtensible Characterisation Languages (XCL) that support the automatic validation of document conversions and the evaluation of migration quality by hierarchically decomposing a document and representing documents from different sources in an abstract XML language. We present the context of the development of these languages and tools and describe the overall concept and features of the languages and how they can be applied to the evaluation of digital preservation solutions.
- Published
- 2008
- Full Text
- View/download PDF
19. Shaping 3D multimedia environments
- Author
-
Andreas Pesenhofer, Thomas Lidy, Helmut Berger, Dieter Merkl, Andreas Rauber, Ronald Genswaider, and Michael Dittenbach
- Subjects
Self-organizing map ,Multimedia ,Computer science ,Game engine ,Human–computer interaction ,Interface (computing) ,computer.software_genre ,computer - Abstract
In this paper we describe The MediaSquare, a 3D Multimedia Environment we are currently developing, where users are impersonated as avatars enabling them to browse and experience multimedia content by literally walking through it. Users may engage in conversations with others, exchange experiences as well as collaboratively explore and enjoy the featured content. The combination of algorithms from the area of artificial intelligence with state-of-the-art 3D virtual environments creates an intuitive interface that provides access to automatically structured multimedia data taking advantage of spatial metaphors.
- Published
- 2007
- Full Text
- View/download PDF
20. How to choose a digital preservation strategy
- Author
-
Stephan Strodl, Christoph Becker, Robert Neumayer, and Andreas Rauber
- Subjects
Utility analysis ,Emulation ,Work (electrical) ,Risk analysis (engineering) ,Order (exchange) ,Digital preservation ,Computer science ,Management science ,Planning approach ,Digital library ,Variety (cybernetics) - Abstract
An increasing number of institutions throughout the world face legal obligations or business needs to collect and preserve digital objects over several decades. A range of tools exists today to support the variety of preservation strategies such as migration or emulation. Yet, different preservation requirements across institutions and settings make the decision on which solution to implement very difficult. This paper presents the PLANETS Preservation Planning approach. It provides an approved way to make informed and accountable decisions on which solution to implement in order to optimally preserve digital objects for a given purpose. It is based on Utility Analysis to evaluate the performance of various solutions against well-defined requirements and goals. The viability of this approach is shown in a range of case studies for different settings. We present its application to two scenarios of web archives, two collections of electronic publications, and a collection of multimedia art. This work focuses on the different requirements and goals in the various preservation settings.
- Published
- 2007
- Full Text
- View/download PDF
21. Content-based organization and visualization of music archives
- Author
-
Elias Pampalk, Dieter Merkl, and Andreas Rauber
- Subjects
Self-organizing map ,Multimedia ,Computer science ,Metaphor ,Interface (Java) ,media_common.quotation_subject ,Feature extraction ,computer.software_genre ,Visualization ,Raw audio format ,Psychoacoustics ,User interface ,computer ,media_common - Abstract
With Islands of Music we present a system which facilitates exploration of music libraries without requiring manual genre classification. Given pieces of music in raw audio format we estimate their perceived sound similarities based on psychoacoustic models. Subsequently, the pieces are organized on a 2-dimensional map so that similar pieces are located close to each other. A visualization using a metaphor of geographic maps provides an intuitive interface where islands resemble genres or styles of music. We demonstrate the approach using a collection of 359 pieces of music.
- Published
- 2002
- Full Text
- View/download PDF
22. Content-based music indexing and organization
- Author
-
Andreas Rauber, Elias Pampalk, and Dieter Merkl
- Subjects
Self-organizing map ,Information retrieval ,Audio signal ,Multimedia ,Interface (Java) ,Electronic music ,Computer science ,Similarity (psychology) ,Search engine indexing ,Pop music automation ,computer.software_genre ,Cluster analysis ,computer - Abstract
While electronic music archives are gaining popularity, access to and navigation within these archives is usually limited to text-based queries or manually predefined genre category browsing. We present a system that automatically organizes a music collection according to the perceived sound similarity resembling genres or styles of music. Audio signals are processed according to psychoacoustic models to obtain a time-invariant representation of its characteristics. Subsequent clustering provides an intuitive interface where similar pieces of music are grouped together on a map display.
- Published
- 2002
- Full Text
- View/download PDF
23. Automatic text representation, classification and labeling in European law
- Author
-
Andreas Rauber, Erich Schweighofer, and Michael Dittenbach
- Subjects
Text corpus ,Structure (mathematical logic) ,Information retrieval ,Artificial neural network ,Computer science ,business.industry ,Search engine indexing ,Representation (systemics) ,computer.software_genre ,Legal research ,ComputingMethodologies_PATTERNRECOGNITION ,Text mining ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,Segmentation ,Artificial intelligence ,business ,computer ,Natural language processing - Abstract
The huge text archives and retrieval systems of legal information have not achieved yet the representation in the well-known subject-oriented structure of legal commentaries. Content-based classification and text analysis remains a high priority research topic. In the joint KONTERM, SOM and LabelSOM projects, learning techniques of neural networks are used to achieve similar high compression rates of classification and analysis like in manual legal indexing. The produced maps of legal text corpora cluster related documents in units that are described with automatically selected descriptors. Extensive tests with text corpora in European case law have shown the feasibility of this approach. Classification and labeling proved very helpful for legal research. The Growing Hierarchical Self-Organizing Map represents very interesting generalities and specialties of legal text corpora. The segmentation into document parts improved very much the quality of labeling. The next challenge would be a change from tf × idf vector representation to a modified vector representation taking into account thesauri or ontologies considering learned properties of legal text corpora.
- Published
- 2001
- Full Text
- View/download PDF
24. Integrating automatic genre analysis into digital libraries
- Author
-
Andreas Rauber and Alexander Müller-Kögler
- Subjects
World Wide Web ,Structure (mathematical logic) ,Focus (computing) ,Information retrieval ,Content analysis ,Computer science ,Information system ,Representation (arts) ,Document clustering ,Digital library ,Visualization - Abstract
With the number and types of documents in digital library systems increasing, tools for automatically organizing and presenting the content have to be found. While many approaches focus on topic-based organization and structuring, hardly any system incorporates automatic structural analysis and representation. Yet, genre information (unconsciously) forms one of the most distinguishing features in conventional libraries and in information searches. In this paper we present an approach to automatically analyze the structure of documents and to integrate this information into an automatically created content-based organization. In the resulting visualization, documents on similar topics, yet representing different genres, are depicted as books in differing colors. This representation supports users intuitively in locating relevant information presented in a relevant form.
- Published
- 2001
- Full Text
- View/download PDF
25. SOMLib
- Author
-
Andreas Rauber and Dieter Merkl
- Subjects
Structure (mathematical logic) ,Self-organizing map ,Information retrieval ,Artificial neural network ,Computer science ,Computational intelligence ,Document clustering ,Digital library ,Data science ,Field (computer science) ,Visualization - Abstract
Digital Libraries have gained tremendous interest with numerous research projects addressing the wealth of challenges in this field. While computational intelligence systems are being used for specific tasks in this arena, the majority of projects relies on conventional techniques for the basic structure of the library itself. With the SOMLib project we create a digital library system that uses a neural network-based core for library representation and query processing. The self-organizing map, a popular unsupervised neural network model, is used to automatically structure a document collection. Based on this core, additional modules integrate distributed libraries and create an intuitive representation of the library, automatically labeling the various topical sections in the document collection.
- Published
- 1999
- Full Text
- View/download PDF
26. A Framework for Automatic Labeling of Log Datasets from Model-driven Testbeds for HIDS Evaluation
- Author
-
Max Landauer, Maximilian Frank, Florian Skopik, Wolfgang Hotwagner, Markus Wurzenberger, and Andreas Rauber
- Full Text
- View/download PDF
27. Proceedings of the 4th Workshop on Patent Information Retrieval, PaIR '11, Glasgow, Scotland, UK, October 24, 2011
- Author
-
Mihai Lupu, Andreas Rauber, and Allan Hanbury
- Published
- 2011
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library