5,525 results for "INF/01 - INFORMATICA"
Search Results
2. Structure learning and knowledge extraction with Continuous Time Bayesian Network
- Author
Bregoli, Alessandro
- Abstract
Healthcare, finance, telecommunications, social networks, e-commerce and homeland security are a few instances of real-world domains where the system to be studied involves several variables whose value changes over time. Studying such systems consists in understanding how they work, in making accurate predictions about their evolution over time, and consequently in making effective decisions. To this end, huge amounts of data are typically collected by measuring the value of several variables over time, with the aim of modeling the underlying data-generating process, i.e., the process which rules the evolution of the system under study. These ambitious goals, i.e., understanding, predicting and making effective decisions, are pursued by feeding the collected data into powerful artificial intelligence and machine learning algorithms to recover the underlying data-generating process. This dissertation studies and analyzes systems described by discrete-valued variables whose value changes over continuous time. In particular, continuous-time Bayesian networks, a type of probabilistic graphical model, are studied. The first constraint-based algorithm for learning the structure of a continuous-time Bayesian network from the available data is developed, together with its computational complexity analysis. This algorithm is further extended to tackle the problem of multivariate time-series classification in continuous time. Continuous-time Bayesian networks are used to formulate and solve the problem of sentry state identification in the case where the structure of the probabilistic graphical model is known.
- Published
- 2024
3. Creating Virtual Reality Scenarios for Pedestrian Experiments Focusing on Social Interactions
- Author
Alderighi, M.; Baldoni, M.; Baroglio, C.; Micalizio, R.; Tedeschi, S.; Briola, D.; Tinti, F.; Vizzari, G.
- Abstract
Designing and running real-world pedestrian experiments can be complex and costly, and can even have ethical implications. Virtual Reality can represent an alternative, enabling the execution of experiments in a virtual environment with synthetic humans (i.e., agents) interacting with human subjects. To achieve a high degree of realism, such virtual humans should behave realistically from many points of view: in this paper, we focus on how they move inside the environment. We propose the design, and a first prototype, of a new tool based on Unity that simplifies the setup of realistic scenarios for experiments in VR with humans. In particular, this tool lets the modeler integrate external pathfinding models so as to achieve realistic and believable scenarios for experiments with human subjects.
- Published
- 2024
4. Deep learning in motor imagery EEG signal decoding: A Systematic Review
- Author
Saibene, A.; Ghaemi, H.; Dagdevir, E.
- Abstract
Thanks to the fast evolution of electroencephalography (EEG)-based brain-computer interfaces (BCIs) and computing technologies, as well as the availability of large EEG datasets, decoding motor imagery (MI) EEG signals is rapidly shifting from traditional machine learning (ML) to deep learning (DL) approaches. Furthermore, real-world MI-EEG BCI applications are progressively requiring higher generalization capabilities, which can be achieved by leveraging publicly available MI-EEG datasets and high-performance decoding models. Within this context, this paper provides a systematic review of DL approaches for MI-EEG decoding, focusing on studies that work on publicly available EEG-MI datasets. This review paper firstly provides a clear overview of these datasets that can be used for DL model training and testing. Afterwards, considering each dataset, related DL studies are discussed with respect to the four decoding paradigms identified in the literature, i.e., subject-dependent, subject-independent, transfer learning, and global decoding paradigms. Having analyzed the reviewed studies, the current trends and strategies, popular architectures, baseline models that are used for comprehensive analysis, and techniques to ensure reproducibility of the results in DL-based MI-EEG decoding are also identified and discussed. The selection and screening of the studies included in this review follow the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, leading to a comprehensive analysis of 394 papers published between January 1, 2017, and January 23, 2023.
- Published
- 2024
5. Triplétoile: Extraction of knowledge from microblogging text
- Author
Zavarella, V.; Consoli, S.; Reforgiato Recupero, D.; Fenu, G.; Angioni, S.; Buscaldi, D.; Dessí, D.; Osborne, F.
- Abstract
Numerous methods and pipelines have recently emerged for the automatic extraction of knowledge graphs from documents such as scientific publications and patents. However, adapting these methods to incorporate alternative text sources like micro-blogging posts and news has proven challenging, as they struggle to model the open-domain entities and relations typically found in these sources. In this paper, we propose an enhanced information extraction pipeline tailored to the extraction of a knowledge graph comprising open-domain entities from micro-blogging posts on social media platforms. Our pipeline leverages dependency parsing and classifies entity relations in an unsupervised manner through hierarchical clustering over word embeddings. We provide a use case on extracting semantic triples from a corpus of 100 thousand tweets about digital transformation and publicly release the generated knowledge graph. On the same dataset, we conduct two experimental evaluations, showing that the system produces triples with precision over 95% and outperforms similar pipelines by around 5% in terms of precision, while generating a comparatively higher number of triples.
- Published
- 2024
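The relation-grouping step described in the Triplétoile entry above (unsupervised classification of entity relations via hierarchical clustering over word embeddings) can be illustrated with a minimal sketch. The embedding model, the distance threshold, and the example phrases below are assumptions chosen for illustration, not details taken from the paper.

```python
# Hedged sketch: group relation phrases by hierarchical clustering over
# sentence/word embeddings, loosely in the spirit of the pipeline above.
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers
from sklearn.cluster import AgglomerativeClustering

relation_phrases = [
    "acquired", "bought", "purchased",         # likely one cluster
    "partnered with", "collaborates with",     # another cluster
    "launched", "released",                    # another cluster
]

model = SentenceTransformer("all-MiniLM-L6-v2")           # assumed embedding model
embeddings = model.encode(relation_phrases, normalize_embeddings=True)

# Agglomerative clustering with a distance threshold instead of a fixed k,
# so the number of relation types is discovered from the data.
clustering = AgglomerativeClustering(
    n_clusters=None,
    distance_threshold=0.8,   # illustrative value
    metric="cosine",
    linkage="average",
)
labels = clustering.fit_predict(embeddings)

for phrase, label in zip(relation_phrases, labels):
    print(f"cluster {label}: {phrase}")
```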
6. Neuroergonomic Attention Assessment in Safety-Critical Tasks: EEG Indices and Subjective Metrics Validation in a Novel Task-Embedded Reaction Time Paradigm
- Author
Bjegojević, Bojana; Pušica, Miloš; Gianini, Gabriele; Gligorijević, Ivan; Cromie, Sam; Leva, Maria Chiara
- Abstract
Background/Objectives: This study addresses the gap in methodological guidelines for neuroergonomic attention assessment in safety-critical tasks, focusing on validating EEG indices, including the engagement index (EI) and beta/alpha ratio, alongside subjective ratings. Methods: A novel task-embedded reaction time paradigm was developed to evaluate the sensitivity of these metrics to dynamic attentional demands in a more naturalistic multitasking context. By manipulating attention levels through varying secondary tasks in the NASA MATB-II task while maintaining a consistent primary reaction-time task, this study successfully demonstrated the effectiveness of the paradigm. Results: Results indicate that both the beta/alpha ratio and EI are sensitive to changes in attentional demands, with beta/alpha being more responsive to dynamic variations in attention, and EI reflecting more the overall effort required to sustain performance, especially in conditions where maintaining attention is challenging. Conclusions: The potential for predicting attention lapses through the integration of performance metrics, EEG measures, and subjective assessments was demonstrated, providing a more nuanced understanding of dynamic fluctuations of attention in multitasking scenarios, mimicking those in real-world safety-critical tasks. These findings provide a foundation for advancing methods to monitor attention fluctuations accurately and mitigate risks in critical scenarios, such as train driving or automated vehicle operation, where maintaining a high attention level is crucial.
- Published
- 2024
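A minimal sketch of how band-power ratios like those discussed in the entry above can be computed from a raw EEG channel using Welch's method. The sampling rate, the band limits, and the common EI formulation beta / (alpha + theta) are assumptions for illustration; the study's actual preprocessing and electrode selection are not described in the abstract.

```python
# Hedged sketch: spectral attention indices from one EEG channel.
# Band edges, sampling rate, and the EI definition are illustrative assumptions.
import numpy as np
from scipy.signal import welch

def band_power(freqs, psd, lo, hi):
    """Integrate the power spectral density over [lo, hi) Hz."""
    mask = (freqs >= lo) & (freqs < hi)
    return np.trapz(psd[mask], freqs[mask])

def attention_indices(eeg, fs=256.0):
    """Return (beta/alpha ratio, engagement index) for a 1-D EEG segment."""
    freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))
    theta = band_power(freqs, psd, 4.0, 8.0)
    alpha = band_power(freqs, psd, 8.0, 13.0)
    beta = band_power(freqs, psd, 13.0, 30.0)
    beta_alpha = beta / alpha
    engagement = beta / (alpha + theta)   # common EI formulation (assumed here)
    return beta_alpha, engagement

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_eeg = rng.standard_normal(10 * 256)   # 10 s of synthetic data
    print(attention_indices(fake_eeg, fs=256.0))
```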
7. FILO: Automated FIx-LOcus Identification for Android Framework Compatibility Issues
- Author
Mobilio, M.; Riganelli, O.; Micucci, D.; Mariani, L.
- Abstract
Keeping up with the fast evolution of mobile operating systems is challenging for developers, who have to frequently adapt their apps to the upgrades and behavioral changes of the underlying API framework. Those changes often break backward compatibility. The consequence is that apps, if not updated, may misbehave and suffer unexpected crashes if executed within an evolved environment. Being able to quickly identify the portion of the app that should be modified to provide compatibility with new API versions can be challenging. To facilitate the debugging activities of problems caused by backward incompatible upgrades of the operating system, this paper presents FILO, a technique that is able to recommend the method that should be modified to implement the fix by analyzing a single failing execution. FILO can also provide additional information and key symptomatic anomalous events that can help developers understand the reason for the failure, therefore facilitating the implementation of the fix. We evaluated FILO against 18 real compatibility problems related to Android upgrades and compared it with Spectrum-Based Localization approaches. Results show that FILO is able to efficiently and effectively identify the fix-locus in the apps.
- Published
- 2024
8. The future of human and animal digital health platforms
- Author
Bok, P.-B.; Micucci, D.
- Abstract
Electronic Health (eHealth) has emerged as a pivotal driver of change in modern healthcare, reshaping the way medical information is collected, processed, and utilized. eHealth includes digital solutions aimed at improving healthcare delivery, management, and accessibility. The Internet of Medical Things (IoMT) is specifically focused on establishing connections between medical devices and sensors to gather and transmit health-related data. Its primary objective is to enhance healthcare by facilitating real-time monitoring, employing data analytics, and integrating intelligent medical devices. The IoMT and, more broadly, eHealth are yielding positive outcomes, prompting their expanding application into the animal domain. Recent technological advancements facilitate the integration of health platforms, fostering a connection between human and animal health for improved well-being. This article introduces a conceptual framework that synthesizes the main activities in the medical data acquisition-processing pipeline. The framework has been derived from an analysis of the state of the art in the field of the IoMT in human healthcare. Furthermore, the article explores the application of eHealth concepts in the animal domain. Addressing both human and animal health, the paper summarizes the outstanding issues that need to be addressed for the full integration of these technologies into daily life.
- Published
- 2024
9. Collaborative Intelligence for Safety-Critical Industries: A Literature Review
- Author
Ramos, Inês F.; Gianini, Gabriele; Leva, Maria Chiara; Damiani, Ernesto
- Abstract
While AI-driven automation can increase the performance and safety of systems, humans should not be replaced in safety-critical systems but should be integrated to collaborate and mitigate each other’s limitations. The current trend in Industry 5.0 is towards human-centric collaborative paradigms, with an emphasis on collaborative intelligence (CI) or Hybrid Intelligent Systems. In this survey, we search and review recent work that employs AI methods for collaborative intelligence applications, specifically those that focus on safety and safety-critical industries. We aim to contribute to the research landscape and industry by compiling and analyzing a range of scenarios where AI can be used to achieve more efficient human–machine interactions, improved collaboration, coordination, and safety. We define a domain-focused taxonomy to categorize the diverse CI solutions, based on the type of collaborative interaction between intelligent systems and humans, the AI paradigm used and the domain of the AI problem, while highlighting safety issues. We investigate 91 articles on CI research published between 2014 and 2023, providing insights into the trends, gaps, and techniques used, to guide recommendations for future research opportunities in the fast developing collaborative intelligence field.
- Published
- 2024
10. A comparative analysis of knowledge injection strategies for large language models in the scholarly domain
- Author
Cadeddu, A.; Chessa, A.; De Leo, V.; Fenu, G.; Motta, E.; Osborne, F.; Reforgiato Recupero, D.; Salatino, A.; Secchi, L.
- Abstract
In recent years, transformer-based models have emerged as powerful tools for natural language processing tasks, demonstrating remarkable performance in several domains. However, they still present significant limitations. These shortcomings become more noticeable when dealing with highly specific and complex concepts, particularly within the scientific domain. For example, transformer models have particular difficulties when processing scientific articles due to the domain-specific terminologies and sophisticated ideas often encountered in scientific literature. To overcome these challenges and further enhance the effectiveness of transformers in specific fields, researchers have turned their attention to the concept of knowledge injection. Knowledge injection is the process of incorporating outside knowledge into transformer models to improve their performance on certain tasks. In this paper, we present a comprehensive study of knowledge injection strategies for transformers within the scientific domain. Specifically, we provide a detailed overview and comparative assessment of four primary methodologies, evaluating their efficacy in the task of classifying scientific articles. For this purpose, we constructed a new benchmark including both 24K labelled papers and a knowledge graph of 9.2K triples describing pertinent research topics. We also developed a full codebase to easily re-implement all knowledge injection strategies in different domains. A formal evaluation indicates that the majority of the proposed knowledge injection methodologies significantly outperform the baseline established by Bidirectional Encoder Representations from Transformers.
- Published
- 2024
11. Preface for Joint Proceedings of Posters, Demos, Workshops, and Tutorials of SEMANTiCS 2024
- Author
Garijo, D.; Gentile, A. L.; Kurteva, A.; Mannocci, A.; Osborne, F.; Vahdati, S.
- Published
- 2024
12. Preface for the Third International Workshop on Knowledge Graph Generation from Text
- Author
Tiwari, S.; Mihindukulasooriya, N.; Osborne, F.; Kontokostas, D.; D'Souza, J.; Kejriwal, M.; Pellegrino, M. A.; Rula, A.; Labra Gayo, J. E.; Cochez, M.; Alam, M.
- Published
- 2024
13. Proceedings of the 2nd International Workshop on Semantic Technologies and Deep Learning Models for Scientific, Technical and Legal Data co-located with the Extended Semantic Web Conference 2024
- Author
-
Dessi R., Dessi D., Osborne F., Aras H., Dessi, R, Dessi, D, Osborne, F, Aras, H, Dessi R., Dessi D., Osborne F., Aras H., Dessi, R, Dessi, D, Osborne, F, and Aras, H
- Published
- 2024
14. Automating Citation Placement with Natural Language Processing and Transformers
- Author
Dessi, R.; Dessi, D.; Osborne, F.; Aras, H.; Buscaldi, D.; Motta, E.; Murgia, M.; Recupero, D. R.
- Abstract
In scientific writing, references are crucial in supporting claims, spotlighting evidence, and highlighting research gaps. However, where to add a reference and which reference to cite are subjectively chosen by the papers' authors; thus the automation of the task is challenging and requires proper investigation. This paper focuses on the automatic placement of references, considering its diverse approaches depending on writing style and community norms, and investigates the use of transformers and Natural Language Processing heuristics to predict i) if a reference is needed in a scientific statement, and ii) where the reference should be placed within the statement. To this end, the paper examines two techniques, namely Mask-filling (MF) and Named Entity Recognition (NER), and provides insights on how to solve this task.
- Published
- 2024
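One of the two techniques named in the entry above, Mask-filling, can be sketched with a standard masked language model: a mask token is inserted at each candidate gap and the model's predictions are inspected. The model choice, the gap-enumeration strategy, and the use of bracket-like predictions as a crude proxy for a citation marker are all assumptions for illustration; the paper's actual setup may differ substantially.

```python
# Hedged sketch: score candidate positions in a sentence for "a citation could go
# here" using a masked LM, treating bracket-like predictions as a rough proxy for
# a citation marker. Model and heuristic are illustrative assumptions.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="roberta-base")
MASK = fill_mask.tokenizer.mask_token

def citation_scores(sentence_words):
    """For each gap after a word, return the probability mass the MLM assigns
    to bracket-like tokens, used here as a crude citation-likelihood signal."""
    scores = []
    for i in range(1, len(sentence_words) + 1):
        candidate = " ".join(sentence_words[:i] + [MASK] + sentence_words[i:])
        predictions = fill_mask(candidate, top_k=50)
        bracket_mass = sum(
            p["score"] for p in predictions if p["token_str"].strip() in {"[", "("}
        )
        scores.append((i, bracket_mass))
    return scores

words = ("Transformer models achieve state-of-the-art results "
         "on many NLP benchmarks .").split()
for position, score in sorted(citation_scores(words), key=lambda t: -t[1])[:3]:
    print(f"gap after word {position}: {score:.4f}")
```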
15. SemEval-2024 Task 6: SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes
- Author
Atul Kr. Ojha; A. Seza Doğruöz; Harish Tayyar Madabushi; Giovanni Da San Martino; Sara Rosenthal; Aiala Rosá; Mickus, Timothee; Zosa, Elaine; Vazquez, Raul; Vahtola, Teemu; Tiedemann, Jörg; Segonne, Vincent; Raganato, Alessandro; Apidianaki, Marianna
- Abstract
This paper presents the results of the SHROOM, a shared task focused on detecting hallucinations: outputs from natural language generation (NLG) systems that are fluent, yet inaccurate. Such cases of overgeneration put in jeopardy many NLG applications, where correctness is often mission-critical. The shared task was conducted with a newly constructed dataset of 4000 model outputs labeled by 5 annotators each, spanning 3 NLP tasks: machine translation, paraphrase generation, and definition modeling. The shared task was tackled by a total of 58 different users grouped in 42 teams, out of which 26 elected to write a system description paper; collectively, they submitted over 300 prediction sets on both tracks of the shared task. We observe a number of key trends in how this approach was tackled: many participants rely on a handful of models, and often rely either on synthetic data for fine-tuning or on zero-shot prompting strategies. While a majority of the teams did outperform our proposed baseline system, the performances of top-scoring systems are still consistent with a random handling of the more challenging items.
- Published
- 2024
16. Investigating Environmental, Social, and Governance (ESG) Discussions in News: A Knowledge Graph Analysis Empowered by AI
- Author
Dessi, R.; Dessi, D.; Osborne, F.; Aras, H.; Angioni, S.; Consoli, S.; Recupero, D. R.; Salatino, A.
- Abstract
This paper explores the growing importance of Environmental, Social, and Governance (ESG) criteria in financial assessments and conducts an AI-driven analysis of ESG concepts’ evolution from 1980 to 2022. Focusing on media sources from the United States and the United Kingdom, the study utilizes the Dow Jones News Article dataset for a comprehensive analysis focused on the environmental domain. The research introduces an innovative information extraction technique, transforming extracted data into a knowledge graph. Key findings highlight recent trends in ESG aspects, with a notable emphasis on climate change, renewable energy sources, and biodiversity conservation in the environmental dimension.
- Published
- 2024
17. Leveraging Language Models for Generating Ontologies of Research Topics
- Author
Pisu, A.; Pompianu, L.; Salatino, A.; Osborne, F.; Riboni, D.; Motta, E.; Recupero, D. R.
- Abstract
The current generation of artificial intelligence technologies, such as smart search engines, recommendation systems, tools for systematic reviews, and question-answering applications, plays a crucial role in helping researchers manage and interpret scientific literature. Taxonomies and ontologies of research topics are a fundamental part of this environment as they allow intelligent systems and scientists to navigate the ever-growing number of research papers. However, creating these classifications manually is an expensive and time-consuming process, often resulting in outdated and coarse-grained representations. Consequently, researchers have been focusing on developing automated or semi-automated methods to create taxonomies of research topics. This paper studies the application of transformer-based language models for generating research topic ontologies. Specifically, we have developed a model leveraging SciBERT to identify four semantic relationships between research topics (supertopic, subtopic, same-as, and other) and conducted a comparative analysis against alternative solutions. The preliminary findings indicate that the transformer-based model significantly surpasses the performance of models reliant on traditional features.
- Published
- 2024
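The entry above describes a SciBERT-based classifier over pairs of research topics with four relation labels (supertopic, subtopic, same-as, other). The sketch below shows what such a sequence-pair classifier could look like with the Hugging Face API; the checkpoint name is a real public SciBERT model, but the label wiring and the lack of fine-tuning make this an illustrative assumption rather than the authors' released code.

```python
# Hedged sketch: sequence-pair classification of topic pairs into four relation
# labels, built on SciBERT. Mirrors the task described above, not the authors'
# actual implementation; the classification head here is untrained.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["supertopic", "subtopic", "same-as", "other"]
checkpoint = "allenai/scibert_scivocab_uncased"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint,
    num_labels=len(LABELS),
    id2label=dict(enumerate(LABELS)),
    label2id={label: i for i, label in enumerate(LABELS)},
)

def predict_relation(topic_a: str, topic_b: str) -> str:
    """Classify the relation holding from topic_a to topic_b.
    In practice the model would first be fine-tuned on labelled topic pairs."""
    inputs = tokenizer(topic_a, topic_b, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]

print(predict_relation("machine learning", "deep learning"))
```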
18. Classifying Scientific Topic Relationships with SciBERT
- Author
Garijo, D.; Gentile, A. L.; Kurteva, A.; Mannocci, A.; Osborne, F.; Vahdati, S.; Pisu, A.; Pompianu, L.; Salatino, A.; Riboni, D.; Motta, E.; Recupero, D. R.
- Abstract
Current AI systems, including smart search engines, recommendation systems, tools for streamlining literature reviews, and interactive question-answering platforms, are becoming indispensable for researchers to navigate and understand the vast landscape of scientific knowledge. Taxonomies and ontologies of research topics are key to this process, but manually creating them is costly and often leads to outdated results. This poster paper shows the use of the SciBERT model to automatically generate research topic ontologies. Our model excels at identifying semantic relationships between research topics, outperforming traditional methods. This approach promises to streamline the creation of accurate and up-to-date ontologies, enhancing the effectiveness of AI tools for researchers.
- Published
- 2024
19. Automating Gender-Inclusive Language Modification in Italian University Administrative Documents
- Author
Cerabolini, Aurora; Pasi, Gabriella; Viviani, Marco
- Abstract
In this work, we address the issue of automating the identification of non-inclusive language in administrative documents of Italian universities as well as providing gender-inclusive corrections. To achieve this objective, data from various Italian universities were gathered, leading to the creation of a dictionary containing potentially non-inclusive terms, and of a dataset containing gender non-inclusive sentences and their corresponding inclusive versions. Subsequently, three distinct approaches have been defined and evaluated: a rule-based and two neural approaches. In the development of the rule-based approach, Italian Part-of-Speech tagging, dependency parsing, and morphologization techniques were employed to detect masculine trigger words within sentences, ascertain whether they functioned as generic masculine terms, and offer gender-inclusive alternatives. In contrast, for the implementation of the two neural approaches, both the mT5 model and ChatGPT were utilized, and their respective outputs were compared against the rewritten sentences they generated. The experimental evaluations conducted suggest the effectiveness of the proposed solutions.
- Published
- 2024
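The rule-based approach described in the entry above relies on Italian POS tagging and morphological analysis to spot masculine generic terms and offer inclusive alternatives. A minimal sketch with spaCy follows; the spaCy model name is a real Italian pipeline, while the tiny term dictionary and the suggested replacements are illustrative assumptions, far simpler than the resources built in the paper.

```python
# Hedged sketch: flag masculine nouns from a small dictionary of potentially
# non-inclusive terms and suggest an alternative. Dictionary and suggestions
# are illustrative only.
import spacy

nlp = spacy.load("it_core_news_sm")   # python -m spacy download it_core_news_sm

# Toy dictionary: masculine generic -> possible inclusive reformulation.
INCLUSIVE_ALTERNATIVES = {
    "studenti": "componenti della comunità studentesca",
    "professori": "corpo docente",
    "candidati": "persone candidate",
}

def flag_non_inclusive(text: str):
    """Return (token, morphological gender, suggestion) for flagged nouns."""
    doc = nlp(text)
    findings = []
    for token in doc:
        if token.pos_ == "NOUN" and token.lower_ in INCLUSIVE_ALTERNATIVES:
            gender = token.morph.get("Gender")          # e.g. ['Masc']
            if "Masc" in gender:
                findings.append(
                    (token.text, gender, INCLUSIVE_ALTERNATIVES[token.lower_])
                )
    return findings

print(flag_non_inclusive("Gli studenti e i professori sono invitati alla riunione."))
```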
20. Knowledge Graphs for Digital Transformation Monitoring in Social Media
- Author
Zavarella, V.; Recupero, D. R.; Consoli, S.; Fenu, G.; Angioni, S.; Buscaldi, D.; Dessi, D.; Osborne, F.
- Abstract
Several techniques and workflows have emerged recently for automatically extracting knowledge graphs from documents like scientific articles and patents. However, adapting these approaches to integrate alternative text sources such as micro-blogging posts and news and to model open-domain entities and relationships commonly found in these sources is still challenging. This paper introduces an improved information extraction pipeline designed specifically for extracting a knowledge graph comprising open-domain entities from micro-blogging posts on social media platforms. Our pipeline utilizes dependency parsing and employs unsupervised classification of entity relations through hierarchical clustering over word embeddings. We present a case study involving the extraction of semantic triples from a tweet collection concerning digital transformation and show through two experimental evaluations on the same dataset that our system achieves precision rates exceeding 95% and surpasses similar pipelines by approximately 5% in terms of precision, while also generating a notably higher number of triples.
- Published
- 2024
21. Integrated visualization of metabolomics and transcriptomics with Galaxy
- Author
Ferrari, M.; Lapi, F.; Penati, L.; Vanoni, M.; Galuzzi, B.; Damiani, C.
- Abstract
The regulation of cell metabolism is complex and multifold. Hence the metabolic alterations that have been reported in many physio-pathological conditions can be fully characterized only by using model-based multi-omics data integration frameworks. We present here version 2 of the Marea4Galaxy tool, integrated into the Galaxy platform. The previous version of Marea4Galaxy allowed users to visualize deregulated reactions at the transcriptomic level. The new version extends these capabilities by enabling the simultaneous visualization of deregulated reactions at the metabolic level using metabolomics data. Significant improvements have been made, including a more comprehensive metabolic network model, a module for extracting necessary inputs from any metabolic model in XML or JSON format, better compatibility with alternative gene nomenclatures, and faster reaction activity scores (RASs) calculation. We demonstrate the utility of this tool by comparing different groups of cancer cell lines using paired datasets from the Cancer Cell Line Encyclopedia.
- Published
- 2024
22. Preface of the Workshop on Deep Learning and Large Language Models for Knowledge Graphs (DL4KG)
- Author
Alam, M.; Buscaldi, D.; Reforgiato Recupero, D.; Cochez, M.; Gesese, G. A.; Osborne, F.
- Abstract
The use of Knowledge Graphs (KGs), which constitute large networks of real-world entities and their interrelationships, has grown rapidly. A substantial body of research has emerged, exploring the integration of deep learning (DL) and large language models (LLMs) with KGs. This workshop aims to bring together leading researchers in the field to discuss and foster collaborations on the intersection of KGs and DL/LLMs.
- Published
- 2024
23. Large Language Models for Scientific Question Answering: An Extensive Analysis of the SciQA Benchmark
- Author
Meroño Peñuela, A.; Dimou, A.; Troncy, R.; Hartig, O.; Acosta, M.; Alam, M.; Paulheim, H.; Lisena, P.; Lehmann, J.; Meloni, A.; Motta, E.; Osborne, F.; Recupero, D. R.; Salatino, A. A.; Vahdati, S.
- Abstract
The SciQA benchmark for scientific question answering aims to represent a challenging task for next-generation question-answering systems on which vanilla large language models fail. In this article, we provide an analysis of the performance of language models on this benchmark including prompting and fine-tuning techniques to adapt them to the SciQA task. We show that both fine-tuning and prompting techniques with intelligent few-shot selection allow us to obtain excellent results on the SciQA benchmark. We discuss the valuable lessons and common error categories, and outline their implications on how to optimise large language models for question answering over knowledge graphs.
- Published
- 2024
24. Ontology-Based Generation of Data Platform Assets
- Author
He, J.; Palpanas, T.; Hu, X.; Cuzzocrea, A.; Dou, D.; Slezak, D.; Wang, W.; Gruca, A.; Lin, J. C. W.; Agrawal, R.; De Leo, V.; Fenu, G.; Greco, D.; Bidotti, N.; Platter, P.; Motta, E.; Nuzzolese, A. G.; Osborne, F.; Recupero, D. R.
- Abstract
The design and management of modern big data platforms are extremely complex. It requires carefully integrating multiple storage and computational platforms as well as implementing approaches to protect and audit data access. Therefore, onboarding new data and implementing new data transformation processes is typically time-consuming and expensive. In many cases, enterprises construct their data platforms without a clear distinction between logical and technical concerns. Consequently, these platforms lack sufficient abstraction and are closely tied to particular technologies, making the adaptation to technological evolution very costly. This paper illustrates a novel approach to designing data platform models based on a formal ontology that structures various domain components into an accessible knowledge graph. We also describe the preliminary version of AGILE-DM, a novel ontology that we built for this purpose. Our solution is flexible, technologically agnostic, and more adaptable to changes and technical advancements.
- Published
- 2024
25. Bootstrap Your Conversions: Thompson Sampling for Partially Observable Delayed Rewards
- Author
Stella, Fabio; Gigli, Marco
- Published
- 2024
26. Identifying Semantic Relationships Between Research Topics Using Large Language Models in a Zero-Shot Learning Setting
- Author
Aggarwal, T.; Salatino, A.; Osborne, F.; Motta, E.
- Abstract
Knowledge Organization Systems (KOS), such as ontologies, taxonomies, and thesauri, play a crucial role in organising scientific knowledge. They help scientists navigate the vast landscape of research literature and are essential for building intelligent systems such as smart search engines, recommendation systems, conversational agents, and advanced analytics tools. However, the manual creation of these KOSs is costly, time-consuming, and often leads to outdated and overly broad representations. As a result, researchers have been exploring automated or semi-automated methods for generating ontologies of research topics. This paper analyses the use of large language models (LLMs) to identify semantic relationships between research topics. We specifically focus on six open and lightweight LLMs (up to 10.7 billion parameters) and use two zero-shot reasoning strategies to identify four types of relationships: broader, narrower, same-as, and other. Our preliminary analysis indicates that Dolphin2.1-OpenOrca-7B performs strongly in this task, achieving a 0.853 F1-score against a gold standard of 1,000 relationships derived from the IEEE Thesaurus. These promising results bring us one step closer to the next generation of tools for automatically curating KOSs, ultimately making the scientific literature easier to explore.
- Published
- 2024
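The zero-shot setup described in the entry above can be sketched as a prompt that asks an LLM to pick one of the four relations between two topics, with the reply mapped back to a label. The prompt wording and the backend placeholder below are assumptions; the paper's exact prompts and the Dolphin2.1-OpenOrca-7B inference stack are not reproduced here.

```python
# Hedged sketch: zero-shot relation classification between research topics.
# `ask_llm` is a placeholder for whatever chat/completion backend is used;
# the prompt text is an illustrative assumption, not the paper's prompt.
from typing import Callable

RELATIONS = ("broader", "narrower", "same-as", "other")

PROMPT_TEMPLATE = (
    "You are an expert in research-topic taxonomies.\n"
    "Which relation holds from '{a}' to '{b}'?\n"
    "Answer with exactly one word among: broader, narrower, same-as, other."
)

def classify_relation(topic_a: str, topic_b: str,
                      ask_llm: Callable[[str], str]) -> str:
    """Build the zero-shot prompt, query the model, and map the reply to a label."""
    reply = ask_llm(PROMPT_TEMPLATE.format(a=topic_a, b=topic_b)).lower()
    for relation in RELATIONS:
        if relation in reply:
            return relation
    return "other"   # fall back when the reply cannot be parsed

# Usage with a dummy backend (replace with a real LLM call):
print(classify_relation("neural networks", "deep learning",
                        ask_llm=lambda prompt: "narrower"))
```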
27. Data collection from a stormwater detention facility. Proposal for a Collaboration Agreement
- Author
Crosta, Giovanni Franco
- Abstract
A condominium building located in Gallarate is equipped with a tank system for the detention of intense rainfall (hereinafter "the Facility"), located on the first basement level and built in June 2019. The Facility acts as a storm overflow during cloudbursts, preventing water from backing up into the residential units on the ground floor. The statistics of extreme local rainfall events (lasting a few minutes, with intensities of a few mm) are generally unknown. Battery-powered, programmable recording hydrometers that measure and store water-column and temperature values are commercially available. Since 2017 the author has designed, built, and operated networks of such instruments and has applied suitable mathematical methods to process and interpret the corresponding data. Recording hydrometry at the Facility would provide, at no cost, information on the hydrology of extreme events, at least at the local level. This presentation is of a pre-design nature (in the sense of a PESTEL analysis and of UNI ISO EN 21502:2021) and argues for a Collaboration Agreement between the author and the condominium for the collection and processing of data measured at the Facility. By applying an estimate by analogy with events recorded by another instrument network operating a short distance (<500 m) from the building, the results potentially obtainable under the Agreement are anticipated.
- Published
- 2024
28. Enhancing Scientific Knowledge Graph Generation Pipelines with LLMs and Human-in-the-Loop
- Author
Tsaneva, S.; Dessi, D.; Osborne, F.; Sabou, M.
- Abstract
Scientific Knowledge Graphs have recently become a powerful tool for exploring the research landscape and assisting scientific inquiry. It is crucial to generate and validate these resources to ensure they offer a comprehensive and accurate representation of specific research fields. However, manual approaches are not scalable, while automated methods often result in lower-quality resources. In this paper, we investigate novel validation techniques to improve the accuracy of automated KG generation methodologies, leveraging both a human-in-the-loop (HiL) and a large language model (LLM)-in-the-loop. Using the automated generation pipeline of the Computer Science Knowledge Graph as a case study, we demonstrate that precision can be increased by 12% (from 75% to 87%) using only LLMs. Moreover, a hybrid approach incorporating both LLMs and HiL significantly enhances both precision and recall, resulting in a 4% increase in the F1 score (from 77% to 81%).
- Published
- 2024
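The LLM-in-the-loop validation step described in the entry above can be sketched as follows: each automatically extracted triple is turned into a yes/no verification question and kept only if the model accepts it. The prompt wording and the backend placeholder are assumptions; the actual Computer Science Knowledge Graph pipeline is not reproduced here.

```python
# Hedged sketch: filter automatically extracted triples with an LLM verifier.
# `ask_llm` is a backend placeholder; the verification prompt is an illustrative
# assumption rather than the pipeline's actual prompt.
from typing import Callable, Iterable, List, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)

def verify_triples(triples: Iterable[Triple],
                   ask_llm: Callable[[str], str]) -> List[Triple]:
    """Keep only the triples the LLM judges to be correct statements."""
    accepted = []
    for subject, predicate, obj in triples:
        prompt = (
            f"Statement: '{subject}' {predicate} '{obj}'.\n"
            "Is this statement correct in the context of computer science research? "
            "Answer yes or no."
        )
        if ask_llm(prompt).strip().lower().startswith("yes"):
            accepted.append((subject, predicate, obj))
    return accepted

candidates = [
    ("BERT", "is used for", "natural language processing"),
    ("BERT", "is a type of", "database index"),
]
# Dummy backend that accepts only the first candidate; replace with a real LLM.
print(verify_triples(candidates, ask_llm=lambda p: "yes" if "processing" in p else "no"))
```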
29. Visual Analytics for Sustainable Mobility: Usability Evaluation and Knowledge Acquisition for Mobility-as-a-Service (MaaS) Data Exploration
- Author
Delfini, Lorenzo; Spahiu, Blerina; Vizzari, Giuseppe
- Abstract
Urban mobility systems generate a massive volume of real-time data, providing an exceptional opportunity to understand and optimize transportation networks. To harness this potential, we developed UrbanFlow Milano, an interactive map-based dashboard designed to explore the intricate patterns of shared mobility use within the city of Milan. By placing users at the center of the analysis, UrbanFlow empowers them to visualize, filter, and interact with data to uncover valuable insights. Through a comprehensive user study, we observed how individuals interact with the dashboard, gaining critical feedback to refine its design and enhance its effectiveness. Our research contributes to the advancement of user-centric visual analytics tools that facilitate data-driven decision-making in urban planning and transportation management.
- Published
- 2024
30. Artificial intelligence for literature reviews: opportunities and challenges
- Author
Bolanos, F.; Salatino, A.; Osborne, F.; Motta, E.
- Abstract
This paper presents a comprehensive review of the use of Artificial Intelligence (AI) in Systematic Literature Reviews (SLRs). A SLR is a rigorous and organised methodology that assesses and integrates prior research on a given topic. Numerous tools have been developed to assist and partially automate the SLR process. The increasing role of AI in this field shows great potential in providing more effective support for researchers, moving towards the semi-automatic creation of literature reviews. Our study focuses on how AI techniques are applied in the semi-automation of SLRs, specifically in the screening and extraction phases. We examine 21 leading SLR tools using a framework that combines 23 traditional features with 11 AI features. We also analyse 11 recent tools that leverage large language models for searching the literature and assisting academic writing. Finally, the paper discusses current trends in the field, outlines key research challenges, and suggests directions for future research. We highlight three primary research challenges: integrating advanced AI solutions, such as large language models and knowledge graphs, improving usability, and developing a standardised evaluation framework. We also propose best practices to ensure more robust evaluations in terms of performance, usability, and transparency. Overall, this review offers a detailed overview of AI-enhanced SLR tools for researchers and practitioners, providing a foundation for the development of next-generation AI solutions in this field.
- Published
- 2024
31. Optimizing Tourism Accommodation Offers by Integrating Language Models and Knowledge Graph Technologies
- Author
Cadeddu, A.; Chessa, A.; De Leo, V.; Fenu, G.; Motta, E.; Osborne, F.; Reforgiato Recupero, D.; Salatino, A.; Secchi, L.
- Abstract
Online platforms have become the primary means for travellers to search, compare, and book accommodations for their trips. Consequently, online platforms and revenue managers must acquire a comprehensive understanding of these dynamics to formulate competitive and appealing offerings. Recent advancements in natural language processing, specifically through the development of large language models, have demonstrated significant progress in capturing the intricate nuances of human language. On the other hand, knowledge graphs have emerged as potent instruments for representing and organizing structured information. Nevertheless, effectively integrating these two powerful technologies remains an ongoing challenge. This paper presents an innovative deep learning methodology that combines large language models with domain-specific knowledge graphs for the classification of tourism offers. The main objective of our system is to assist revenue managers in the following two fundamental dimensions: (i) comprehending the market positioning of their accommodation offerings, taking into consideration factors such as accommodation price and availability, together with user reviews and demand, and (ii) optimizing the presentation and characteristics of the offerings themselves, with the intention of improving their overall appeal. For this purpose, we developed a domain knowledge graph covering a variety of information about accommodations and implemented targeted feature engineering techniques to enhance the information representation within a large language model. To evaluate the effectiveness of our approach, we conducted a comparative analysis against alternative methods on four datasets about accommodation offers in London. The proposed solution obtained excellent results, significantly outperforming alternative methods.
- Published
- 2024
32. 4th International Workshop on Scientific Knowledge: Representation, Discovery, and Assessment
- Author
-
Salatino A., Mannocci A., Osborne F., Rehm G., Schimmler S., Salatino, A, Mannocci, A, Osborne, F, Rehm, G, Schimmler, S, Salatino A., Mannocci A., Osborne F., Rehm G., Schimmler S., Salatino, A, Mannocci, A, Osborne, F, Rehm, G, and Schimmler, S
- Published
- 2024
33. Assessing AI-Based Code Assistants in Method Generation Tasks
- Author
Corso, V.; Mariani, L.; Micucci, D.; Riganelli, O.
- Abstract
AI-based code assistants are increasingly popular as a means to enhance productivity and improve code quality. This study compares four AI-based code assistants, GitHub Copilot, Tabnine, ChatGPT, and Google Bard, in method generation tasks, assessing their ability to produce accurate, correct, and efficient code. Results show that code assistants are useful, with complementary capabilities, although they rarely generate ready-to-use correct code.
- Published
- 2024
34. Analyzing Prompt Influence on Automated Method Generation: An Empirical Study with Copilot
- Author
Fagadau, Ionut Daniel; Mariani, Leonardo; Micucci, Daniela; Riganelli, Oliviero
- Abstract
Generative AI is changing the way developers interact with software systems, providing services that can produce and deliver new content, crafted to satisfy the actual needs of developers. For instance, developers can ask for new code directly from within their IDEs by writing natural language prompts, and integrated services based on generative AI, such as Copilot, immediately respond to prompts by providing ready-to-use code snippets. Formulating the prompt appropriately, and incorporating the useful information while avoiding any information overload, can be an important factor in obtaining the right piece of code. The task of designing good prompts is known as prompt engineering. In this paper, we systematically investigate the influence of eight prompt features on the style and content of prompts, and on the correctness, complexity, size, and similarity to the developers' code of the generated code. We specifically consider the task of using Copilot with 124,800 prompts, obtained by systematically combining the eight considered prompt features, to generate the implementation of 200 Java methods. Results show how some prompt features, such as the presence of examples and the summary of the purpose of the method, can significantly influence the quality of the result.
- Published
- 2024
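The study in the entry above systematically combines eight prompt features into 124,800 prompts. The sketch below shows the general idea of enumerating prompt variants from feature switches; the feature names and template are invented for illustration and do not reproduce the paper's eight features (which are not all simple on/off flags).

```python
# Hedged sketch: enumerate prompt variants by toggling prompt features.
# Feature names and template are illustrative assumptions; the study's eight
# features and their value ranges are defined in the paper itself.
from itertools import product

FEATURES = ["method_signature", "docstring_summary", "usage_example", "io_types"]

def build_prompt(task: str, flags: dict) -> str:
    parts = [f"// Task: {task}"]
    if flags["method_signature"]:
        parts.append("// Include the exact method signature to implement.")
    if flags["docstring_summary"]:
        parts.append("// Include a one-line summary of the method's purpose.")
    if flags["usage_example"]:
        parts.append("// Include an example call with expected output.")
    if flags["io_types"]:
        parts.append("// Describe parameter and return types.")
    return "\n".join(parts)

def all_prompt_variants(task: str):
    for values in product([False, True], repeat=len(FEATURES)):
        flags = dict(zip(FEATURES, values))
        yield flags, build_prompt(task, flags)

variants = list(all_prompt_variants("reverse a linked list"))
print(len(variants), "prompt variants")      # 2^4 = 16 for this toy feature set
print(variants[-1][1])
```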
35. Generating Java Methods: An Empirical Assessment of Four AI-Based Code Assistants
- Author
Corso, Vincenzo; Mariani, Leonardo; Micucci, Daniela; Riganelli, Oliviero
- Abstract
AI-based code assistants are promising tools that can facilitate and speed up code development. They exploit machine learning algorithms and natural language processing to interact with developers, suggesting code snippets (e.g., method implementations) that can be incorporated into projects. Recent studies empirically investigated the effectiveness of code assistants using simple exemplary problems (e.g., the re-implementation of well-known algorithms), which fail to capture the spectrum and nature of the tasks actually faced by developers. In this paper, we expand the knowledge in the area by comparatively assessing four popular AI-based code assistants, namely GitHub Copilot, Tabnine, ChatGPT, and Google Bard, with a dataset of 100 methods that we constructed from real-life open-source Java projects, considering a variety of cases for complexity and dependency from contextual elements. Results show that Copilot is often more accurate than other techniques, yet none of the assistants is completely subsumed by the rest of the approaches. Interestingly, the effectiveness of these solutions dramatically decreases when dealing with dependencies outside the boundaries of single classes.
- Published
- 2024
36. Measuring Software Testability via Automatically Generated Test Cases
- Author
Guglielmo, L.; Mariani, L.; Denaro, G.
- Abstract
Estimating software testability can crucially assist software managers to optimize test budgets and software quality. In this paper, we propose a new approach that radically differs from the traditional approach of pursuing testability measurements based on software metrics, e.g., the size of the code or the complexity of the designs. Our approach exploits automatic test generation and mutation analysis to quantify the evidence about the relative hardness of developing effective test cases. In the paper, we elaborate on the intuitions and the methodological choices that underlie our proposal for estimating testability, introduce a technique and a prototype that allows for concretely estimating testability accordingly, and discuss our findings out of a set of experiments in which we compare the performance of our estimations both against and in combination with traditional software metrics. The results show that our testability estimates capture a complementary dimension of testability that can be synergistically combined with approaches based on software metrics to improve the accuracy of predictions.
- Published
- 2024
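The measurement in the entry above combines automatically generated tests with mutation analysis. The toy sketch below shows mutation analysis in its simplest form: run a test suite against mutants of a function and report the fraction killed. The mutants and tests here are hand-written and invented for illustration; the paper's technique generates tests automatically and works on real programs.

```python
# Hedged sketch: a toy mutation-analysis loop. The "mutation score" is the
# fraction of mutants killed (detected) by the test suite; the paper combines
# this kind of evidence with automatic test generation to estimate testability.

def original(a, b):
    return max(a, b)

# Hand-written mutants standing in for automatically generated ones.
MUTANTS = [
    lambda a, b: min(a, b),          # wrong function
    lambda a, b: max(a, b) + 1,      # off-by-one
    lambda a, b: a,                  # ignores second argument
]

TESTS = [
    lambda f: f(1, 2) == 2,
    lambda f: f(5, 3) == 5,
    lambda f: f(-1, -1) == -1,
]

def mutation_score(mutants, tests):
    killed = 0
    for mutant in mutants:
        # A mutant is killed if at least one test fails on it.
        if any(not test(mutant) for test in tests):
            killed += 1
    return killed / len(mutants)

assert all(test(original) for test in TESTS)          # suite passes on the original
print(f"mutation score: {mutation_score(MUTANTS, TESTS):.2f}")
```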
37. Exploring Environmental, Social, and Governance (ESG) Discourse in News: An AI-Powered Investigation through Knowledge Graph Analysis
- Author
Angioni, S.; Consoli, S.; Dessi, D.; Osborne, F.; Recupero, D. R.; Salatino, A.
- Abstract
In recent years, the significance of Environmental, Social, and Governance criteria in assessing financial investments has grown significantly. This paper presents an AI-driven analysis of ESG concepts and their evolution from 1980 to 2022, with a specific focus on media sources from the United States and the United Kingdom. The primary data source utilized is the Dow Jones News Article dataset, providing a comprehensive and high-quality collection of news articles. The study introduces a novel technique for information extraction from news articles, involving the structuring of extracted data into a knowledge graph. The findings identified key trends associated with ESG aspects emerging in recent years. In the environmental dimension, we identified a pronounced emphasis on climate change, renewable energy sources, and biodiversity conservation. Within the social aspect, the analysis pointed out the increasing significance of issues such as racism, gender identity, and human rights, as well as the increasing role of charities, and the ethical challenges of modern supply chains. Finally, in the governance domain, the findings emphasized issues related to corporate governance accountability, workplace ethics, and the conduct and remuneration of executives.
- Published
- 2024
38. Deep Representation Learning for Open Vocabulary Electroencephalography-to-Text Decoding
- Author
Amrani, H.; Micucci, D.; Napoletano, P.
- Abstract
Previous research has demonstrated the potential of using pre-trained language models for decoding open vocabulary Electroencephalography (EEG) signals captured through a non-invasive Brain-Computer Interface (BCI). However, the impact of embedding EEG signals in the context of language models and the effect of subjectivity remain unexplored, leading to uncertainty about the best approach to enhance decoding performance. Additionally, current evaluation metrics used to assess decoding effectiveness are predominantly syntactic and do not provide insights into the comprehensibility of the decoded output for human understanding. We present an end-to-end architecture for non-invasive brain recordings that brings modern representational learning approaches to neuroscience. Our proposal introduces the following innovations: 1) an end-to-end deep learning architecture for open vocabulary EEG decoding, incorporating a subject-dependent representation learning module for raw EEG encoding, a BART language model, and a GPT-4 sentence refinement module; 2) a more comprehensive sentence-level evaluation metric based on the BERTScore; 3) an ablation study that analyses the contributions of each module within our proposal, providing valuable insights for future research. We evaluate our approach on two publicly available datasets, ZuCo v1.0 and v2.0, comprising EEG recordings of 30 subjects engaged in natural reading tasks. Our model achieves a BLEU-1 score of 42.75%, a ROUGE-1-F of 33.28%, and a BERTScore-F of 53.86%, improving over the previous state of the art by 1.40%, 2.59%, and 3.20%, respectively.
- Published
- 2024
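The entry above reports sentence-level evaluation with BERTScore alongside BLEU and ROUGE. A minimal sketch of computing BERTScore for decoded sentences with the bert-score package follows; the example sentences are invented and the package defaults stand in for whatever configuration the paper used.

```python
# Hedged sketch: sentence-level evaluation of decoded text with BERTScore.
# pip install bert-score ; the example sentences are invented for illustration.
from bert_score import score

decoded = ["the subject read a sentence about the weather"]
reference = ["the participant was reading a sentence about the weather"]

precision, recall, f1 = score(decoded, reference, lang="en", verbose=False)
print(f"BERTScore P={precision.mean().item():.4f} "
      f"R={recall.mean().item():.4f} F1={f1.mean().item():.4f}")
```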
39. Certifying Accuracy, Privacy, and Robustness of ML-Based Malware Detection
- Author
Bena, Nicola; Anisetti, Marco; Gianini, Gabriele; Ardagna, Claudio A.
- Abstract
Recent advances in artificial intelligence (AI) are radically changing how systems and applications are designed and developed. In this context, new requirements and regulations emerge, such as the AI Act, placing increasing focus on strict non-functional requirements, such as privacy and robustness, and how they are verified. Certification is considered the most suitable solution for non-functional verification of modern distributed systems, and is increasingly pushed forward in the verification of AI-based applications. In this paper, we present a novel dynamic malware detector driven by the requirements in the AI Act, which goes beyond standard support for high accuracy, and also considers privacy and robustness. Privacy aims to limit the need of malware detectors to examine the entire system in depth requiring administrator-level permissions; robustness refers to the ability to cope with malware mounting evasion attacks to escape detection. We then propose a certification scheme to evaluate non-functional properties of malware detectors, which is used to comparatively evaluate our malware detector and two representative deep-learning solutions in literature.
- Published
- 2024
40. Artificial Intelligence and Gender-Fair Language in School Books: Pedagogical Insights from Potentialities of Using An Autocorrect in Education.
- Author
Persico, Greta; Rosola, Martina; Frenda, Simona
- Abstract
This paper reflects on the educational application of an autocorrect for Italian designed to make it easier to adopt gender-fair language consistently in administrative documents. Sexism in Italian produces important effects in educational contexts, as confirmed, for example, by research analyzing school texts. The literature highlights how certain gendered expressions influence our cognition, and how masculine terms evoke masculine images with the effect of excluding, disempowering, and rendering invisible women, non-binary, and trans people. The analysis of sexism in Italian is thorough, and several national and international bodies have issued gender-fair language guidelines that constitute a vast body of know-that on the subject. However, there is a lack of operational tools to facilitate their implementation, bridging the gap between the know-that and the know-how. This contribution aims to explore, from an interdisciplinary perspective, the pedagogical and training potentialities arising from the use of a gender-fair autocorrect in education. In particular, we argue that the benefit of such a tool is twofold: on the one hand, it produces fair texts and, on the other, it helps its users develop the ability to recognize and replace sexist expressions. (A toy sketch of such a replacement tool follows this entry.)
- Published
- 2024
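As an illustration of turning gender-fair "know-that" into "know-how", here is a toy, dictionary-based autocorrect sketch. The replacement table and the suggest function are hypothetical and purely illustrative; they are not taken from the tool described in the entry above.

```python
# Toy sketch of a dictionary-based "gender-fair autocorrect": it flags masculine
# generics and suggests more inclusive alternatives. The table below is a
# hypothetical example, not the rule set of the tool described in the entry above.

import re

SUGGESTIONS = {
    "i professori": "il corpo docente",          # "the (male) professors" -> "the teaching staff"
    "gli studenti": "la comunità studentesca",   # "the (male) students" -> "the student community"
    "i cittadini": "la cittadinanza",            # "the (male) citizens" -> "the citizenry"
}

def suggest(text: str) -> list[tuple[str, str]]:
    """Return (found_expression, suggested_alternative) pairs for `text`."""
    hits = []
    for expression, alternative in SUGGESTIONS.items():
        if re.search(rf"\b{re.escape(expression)}\b", text, flags=re.IGNORECASE):
            hits.append((expression, alternative))
    return hits

print(suggest("Si invitano i professori e gli studenti alla riunione."))
# -> [('i professori', 'il corpo docente'), ('gli studenti', 'la comunità studentesca')]
```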
41. RecGraph: recombination-aware alignment of sequences to variation graphs
- Author
-
Avila Cartes, J., Bonizzoni, P., Ciccolella, S., Della Vedova, G., Denti, L., Didelot, X., Monti, D. C., and Pirola, Y.
- Abstract
Motivation: Bacterial genomes present more variability than human genomes, which requires important adjustments in computational tools developed for human data. In particular, bacteria exhibit a mosaic structure due to homologous recombinations, but this fact is not sufficiently captured by standard read mappers that align against linear reference genomes. The recent introduction of pangenomics provides some insights in that context, as a pangenome graph can represent the variability within a species. However, the concept of a sequence-to-graph alignment that captures the presence of recombinations has not been previously investigated. Results: In this paper, we extend the notion of sequence-to-graph alignment to a variation graph that incorporates a recombination, so that recombinations are explicitly represented and evaluated in an alignment. Moreover, we present a dynamic programming approach for the special case where there is at most one recombination; we implement this case as RecGraph. From a modelling point of view, a recombination corresponds to identifying a new path of the variation graph, where the new path is composed of two halves, each extracted from an original path, possibly joined by a new arc. Our experiments show that RecGraph accurately aligns simulated recombinant bacterial sequences that have at most one recombination, providing evidence for the presence of recombination events. Availability and implementation: Our implementation is open source and available at https://github.com/AlgoLab/RecGraph. (A simplified sketch of the one-recombination idea follows this entry.)
- Published
- 2024
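To illustrate the "at most one recombination" idea in the simplest possible setting, the following sketch aligns a read against two linear paths A and B, allowing one switch from a prefix of A to a suffix of B via edit-distance dynamic programming. This is a deliberately simplified illustration under that assumption; RecGraph itself works on full variation graphs and is considerably more general.

```python
# Simplified sketch of alignment with at most one recombination, assuming the
# variation graph is reduced to two linear paths A and B.

def edit_prefix_table(read, ref):
    """dp[i][j] = edit distance between read[:i] and ref[:j] (global DP)."""
    n, m = len(read), len(ref)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = dp[i - 1][j - 1] + (read[i - 1] != ref[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp

def align_with_one_recombination(read, path_a, path_b):
    """Best cost of aligning `read` to (some prefix of A) + (some suffix of B)."""
    fwd = edit_prefix_table(read, path_a)              # read[:i] vs prefixes of A
    bwd = edit_prefix_table(read[::-1], path_b[::-1])  # read[i:] vs suffixes of B
    n = len(read)
    best = float("inf")
    for i in range(n + 1):                             # split point in the read
        left = min(fwd[i])                             # best prefix of A for read[:i]
        right = min(bwd[n - i])                        # best suffix of B for read[i:]
        best = min(best, left + right)
    return best

# Toy example: the read is a mosaic of the first half of A and the last half of B.
A = "ACGTACGTTT"
B = "TTTTGGCCGG"
read = "ACGTAGCCGG"
print(align_with_one_recombination(read, A, B))   # -> 0 for this mosaic read
```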
42. A New Angle: On Evolving Rotation Symmetric Boolean Functions
- Author
-
Smith, S., Correia, J., Cintrano, C., Carlet, Claude, Durasevic, Marko, Gasperov, Bruno, Jakobovic, Domagoj, Mariot, Luca, and Picek, Stjepan
- Abstract
Rotation symmetric Boolean functions represent an interesting class of Boolean functions, as they are relatively rare compared to general Boolean functions. At the same time, the functions in this class can have excellent cryptographic properties, making them interesting for various practical applications. The use of metaheuristics to construct rotation symmetric Boolean functions is a direction that has been explored for almost twenty years. Despite that, there are very few results considering evolutionary computation methods. This paper uses several evolutionary algorithms to evolve rotation symmetric Boolean functions with different properties. Despite using generic metaheuristics, we obtain results that are competitive with prior work relying on customized heuristics. Surprisingly, we find that bitstring and floating-point encodings work better than the tree encoding. Moreover, evolving highly nonlinear general Boolean functions is easier than evolving rotation symmetric ones. (A sketch of the underlying rotation-symmetry and nonlinearity computations follows this entry.)
- Published
- 2024
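Two building blocks underlying this line of work are checking whether a truth table is rotation symmetric and measuring its nonlinearity via the Walsh-Hadamard transform. The sketch below implements the textbook definitions; it is not the paper's code, and the majority-function example is only an illustration.

```python
# Rotation-symmetry check and nonlinearity via the fast Walsh-Hadamard transform.

def rotate_left(x, n):
    """Cyclically rotate the n-bit input x by one position."""
    return ((x << 1) | (x >> (n - 1))) & ((1 << n) - 1)

def is_rotation_symmetric(truth_table, n):
    """f is rotation symmetric iff f(x) is invariant under cyclic input rotations."""
    return all(truth_table[x] == truth_table[rotate_left(x, n)]
               for x in range(1 << n))

def nonlinearity(truth_table, n):
    """nl(f) = 2^(n-1) - max_a |W_f(a)| / 2, computed with a fast WHT."""
    w = [1 - 2 * b for b in truth_table]          # map 0/1 to +1/-1
    step = 1
    while step < (1 << n):
        for start in range(0, 1 << n, 2 * step):
            for i in range(start, start + step):
                a, b = w[i], w[i + step]
                w[i], w[i + step] = a + b, a - b
        step *= 2
    return (1 << (n - 1)) - max(abs(v) for v in w) // 2

# Example: the majority function on 3 inputs is rotation symmetric.
n = 3
tt = [1 if bin(x).count("1") >= 2 else 0 for x in range(1 << n)]
print(is_rotation_symmetric(tt, n), nonlinearity(tt, n))   # True 2
```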
43. Look into the Mirror: Evolving Self-dual Bent Boolean Functions
- Author
-
Giacobini, M., Xue, B., Manzoni, L., Carlet, Claude, Durasevic, Marko, Jakobovic, Domagoj, Mariot, Luca, and Picek, Stjepan
- Abstract
Bent Boolean functions are important objects in cryptography and coding theory, and there are several general approaches for constructing such functions. Metaheuristics have proved to be a strong choice as they can provide many bent functions, even when the size of the Boolean function is large (e.g., more than 20 inputs). While bent Boolean functions represent only a small part of all Boolean functions, there are several subclasses of bent functions providing specific properties and challenges. One of the more interesting subclasses comprises (anti-)self-dual bent Boolean functions. This paper provides detailed experimentation with evolutionary algorithms with the goal of evolving (anti-)self-dual bent Boolean functions. We experiment with two encodings and two fitness functions to evolve self-dual bent Boolean functions. Our experiments consider Boolean functions with sizes of up to 16 inputs, and we successfully construct self-dual bent functions for each dimension. Moreover, we notice that the difficulty of evolving self-dual bent functions is similar to that of evolving bent Boolean functions, despite self-dual bent functions being much rarer. (A sketch of the self-duality check follows this entry.)
- Published
- 2024
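The self-duality notion used above can be checked directly on the Walsh-Hadamard spectrum: f is bent iff |W_f(a)| = 2^(n/2) for all a, and its dual f~ is read off from the sign of W_f(a), since W_f(a) = 2^(n/2) * (-1)^(f~(a)). The sketch below implements these standard definitions (not the paper's fitness functions) and classifies a classic 4-variable bent function.

```python
# Bentness and (anti-)self-duality check via the Walsh-Hadamard spectrum.
# f is self-dual iff its dual f~ equals f, and anti-self-dual iff f~ = f XOR 1.

def walsh_spectrum(truth_table, n):
    """Return [W_f(a) for a in 0..2^n-1] via the fast Walsh-Hadamard transform."""
    w = [1 - 2 * b for b in truth_table]
    step = 1
    while step < (1 << n):
        for start in range(0, 1 << n, 2 * step):
            for i in range(start, start + step):
                a, b = w[i], w[i + step]
                w[i], w[i + step] = a + b, a - b
        step *= 2
    return w

def classify(truth_table, n):
    w = walsh_spectrum(truth_table, n)
    half = 1 << (n // 2)                         # 2^(n/2); n must be even for bentness
    if any(abs(v) != half for v in w):
        return "not bent"
    dual = [0 if v == half else 1 for v in w]    # sign of W_f(a) encodes f~(a)
    if dual == truth_table:
        return "self-dual bent"
    if all(d != t for d, t in zip(dual, truth_table)):
        return "anti-self-dual bent"
    return "bent (neither self-dual nor anti-self-dual)"

# Example: f(x1,x2,x3,x4) = x1*x2 XOR x3*x4 is a classic bent function in 4 variables,
# and this particular one happens to be self-dual.
n = 4
tt = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1)) for x in range(1 << n)]
print(classify(tt, n))   # -> self-dual bent
```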
44. Variational autoencoders-enabled high-fidelity reconstruction and effective anomaly detection in EEG data
- Author
-
Cisotto, Giulia
- Abstract
Electroencephalography (EEG) is a multi-channel time series that provides information about individual brain activity for diagnostics, neurorehabilitation, and other applications (including emotion recognition). With the recent success of artificial intelligence in neuroscience, a number of deep learning (DL) models have been proposed for classification, anomaly detection, and pattern recognition tasks in EEG. Two main issues challenge the existing DL models for EEG: the large cross-subject variability and the variability of the models' training effectiveness depending on the characteristics of the input data. In this talk, I will discuss the most relevant issues in obtaining high-fidelity reconstruction of EEG recordings, highlighting the most relevant and successful related work. Then, I will show how we reached almost perfect reconstruction with our hvEEGNet model (based on variational autoencoders; a preprint is available). Finally, I will discuss the impact of our work, with special attention to the importance of bringing together domain knowledge and machine learning competences. High-fidelity reconstruction can enable several applications in neuroscience and neurorehabilitation, and at the end of this talk you will hear about some of them (from brain-computer interfaces to anomaly detection and transfer learning). (A minimal VAE reconstruction sketch follows this entry.)
- Published
- 2024
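As a rough illustration of the reconstruction objective behind VAE-based EEG models, here is a minimal PyTorch sketch: an encoder producing a latent Gaussian, a decoder reconstructing the multi-channel signal, and a loss combining reconstruction error with a KL term. The architecture, channel count, and window length are arbitrary toy choices, not hvEEGNet.

```python
# Minimal VAE sketch for multi-channel time-series reconstruction (PyTorch).
# Only the generic objective is illustrated: reconstruction loss + KL regularizer.

import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, n_channels=22, n_samples=512, latent_dim=32):
        super().__init__()
        flat = n_channels * n_samples
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(flat, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                 nn.Linear(256, flat))
        self.shape = (n_channels, n_samples)

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # reparameterization trick
        x_hat = self.dec(z).view(-1, *self.shape)
        return x_hat, mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    recon = nn.functional.mse_loss(x_hat, x, reduction="mean")
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

# One toy forward/backward pass on random "EEG-like" data (batch, channels, samples).
model = TinyVAE()
x = torch.randn(8, 22, 512)
x_hat, mu, logvar = model(x)
loss = vae_loss(x, x_hat, mu, logvar)
loss.backward()
print(float(loss))
```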
45. Insights Gained After a Decade of Cellular Automata-Based Cryptography
- Author
-
Mariot, Luca
- Abstract
Cellular Automata (CA) have been extensively used to implement symmetric cryptographic primitives, such as pseudorandom number generators and S-boxes. However, most of the research in this field, except the very early works, seems to be published in non-cryptographic venues. This phenomenon poses a problem of relevance: are CA of any use to cryptographers nowadays? This paper provides insights into this question by briefly outlining the history of CA-based cryptography. In doing so, the paper identifies some shortcomings in the research addressing the design of symmetric primitives exclusively from a CA standpoint, alongside some recommendations for future research. Notably, the paper remarks that researchers working in CA and cryptography often tackle similar problems, albeit under different perspectives and terminologies. This observation indicates that there is still ample room for fruitful collaborations between the CA and cryptography communities in the future.
- Published
- 2024
46. A Discrete Particle Swarm Optimizer for the Design of Cryptographic Boolean Functions
- Author
-
Mariot, Luca, Leporati, Alberto, and Manzoni, Luca
- Abstract
A Particle Swarm Optimizer (PSO) for the search of balanced Boolean functions with good cryptographic properties is proposed in this paper. The algorithm is a modified version of the permutation PSO by Hu, Eberhart and Shi, which preserves the Hamming weight of the particles' positions, coupled with the Hill Climbing method devised by Millan, Clark and Dawson to improve the nonlinearity and deviation from correlation immunity of Boolean functions. The parameters for the PSO velocity equation are tuned by means of two meta-optimization techniques, namely Local Unimodal Sampling (LUS) and Continuous Genetic Algorithms (CGA), finding that CGA produces better results. Using the CGA-evolved parameters, the PSO algorithm is then run on the spaces of Boolean functions from n=7 to n=12 variables. The results of the experiments are reported, showing that this new PSO algorithm generates Boolean functions featuring similar or better combinations of nonlinearity, correlation immunity and propagation criterion with respect to the ones obtained by other optimization methods. (A sketch of the weight-preserving move and nonlinearity fitness follows this entry.)
- Published
- 2024
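The key ingredient of such searches is a move that preserves balancedness: swapping a 1 and a 0 in the truth table keeps the Hamming weight constant, while nonlinearity serves as (part of) the fitness. The sketch below shows this move inside a plain hill climber; it is a generic illustration, not the permutation-PSO velocity update of the paper.

```python
# Balancedness-preserving search sketch: swap-based moves scored by nonlinearity.

import random

def nonlinearity(tt, n):
    """nl(f): minimum Hamming distance from f to any affine function a.x XOR c
    (brute force over all affine functions; fine for small n)."""
    best = len(tt)
    for a in range(1 << n):
        for c in (0, 1):
            dist = sum(((bin(a & x).count("1") & 1) ^ c) != f for x, f in enumerate(tt))
            best = min(best, dist)
    return best

def swap_move(tt):
    """Swap one 1 and one 0 in the truth table: preserves the Hamming weight."""
    ones = [i for i, b in enumerate(tt) if b == 1]
    zeros = [i for i, b in enumerate(tt) if b == 0]
    i, j = random.choice(ones), random.choice(zeros)
    new = tt[:]
    new[i], new[j] = 0, 1
    return new

def hill_climb(n, iters=300):
    size = 1 << n
    tt = [1] * (size // 2) + [0] * (size // 2)
    random.shuffle(tt)                          # random balanced starting point
    best = nonlinearity(tt, n)
    for _ in range(iters):
        cand = swap_move(tt)
        nl = nonlinearity(cand, n)
        if nl >= best:                          # accept ties to keep exploring
            tt, best = cand, nl
    return best

random.seed(0)
# For reference, the maximum nonlinearity of a balanced 6-variable function is 26.
print(hill_climb(6))
```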
47. A Systematic Evaluation of Evolving Highly Nonlinear Boolean Functions in Odd Sizes
- Author
-
Carlet, Claude, Ðurasevic, Marko, Jakobovic, Domagoj, Picek, Stjepan, and Mariot, Luca
- Abstract
Boolean functions are mathematical objects used in diverse applications. Different applications also have different requirements, making the research on Boolean functions very active. In the last 30 years, evolutionary algorithms have been shown to be a strong option for evolving Boolean functions of different sizes and with different properties. Still, most of those works consider similar settings and provide results that are mostly interesting from the evolutionary algorithm's perspective. This work considers the problem of evolving highly nonlinear Boolean functions in odd sizes. While the problem formulation sounds simple, the problem is remarkably difficult, and the related work is extremely scarce. We consider three solution encodings and four Boolean function sizes and run a detailed experimental analysis. Our results show that the problem is challenging and that finding optimal solutions is impossible except for the smallest tested size. However, once we added local search to the evolutionary algorithm, we managed to find a Boolean function in nine inputs with nonlinearity 241, which, to our knowledge, had never been accomplished before with evolutionary algorithms.
- Published
- 2024
48. On Maximal Families of Binary Polynomials with Pairwise Linear Common Factors
- Author
-
Gadouleau, Maximilien, Mariot, Luca, and Mazzone, Federico
- Published
- 2024
49. Assessing the Ability of Genetic Programming for Feature Selection in Constructing Dispatching Rules for Unrelated Machine Environments
- Author
-
Đurasević, Marko, Jakobović, Domagoj, Picek, Stjepan, and Mariot, Luca
- Abstract
The automated design of dispatching rules (DRs) with genetic programming (GP) has become an important research direction in recent years. One of the most important decisions in applying GP to generate DRs is determining the features of the scheduling problem to be used during the evolution process. Unfortunately, there are no clear rules or guidelines for the design or selection of such features, and often the features are simply defined without investigating their influence on the performance of the algorithm. However, the performance of GP can depend significantly on the features provided to it, and a poor or inadequate selection of features for a given problem can result in the algorithm performing poorly. In this study, we examine in detail the features that GP should use when developing DRs for unrelated machine scheduling problems. Different types of features are investigated, and the best combination of these features is determined using two selection methods. The obtained results show that the design and selection of appropriate features are crucial for GP, as they improve the results by about 7% compared to using only the simplest terminal nodes without selection. In addition, the results show that it is not possible to outperform more sophisticated manually designed DRs when only the simplest problem features are used as terminal nodes. This shows how important it is to design appropriate composite terminal nodes to produce high-quality DRs. (A small sketch of a dispatching rule in use follows this entry.)
- Published
- 2024
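Operationally, a dispatching rule is a priority function over problem features that the scheduler queries whenever a machine becomes free; GP evolves the body of that function from terminal nodes (features). The sketch below shows a hand-written rule of that kind; the chosen features and the rule itself are illustrative assumptions, not the paper's terminal set.

```python
# Sketch of a dispatching rule (DR) in an unrelated-machines setting: a priority
# function scores jobs for a free machine and the scheduler greedily picks the best.

from dataclasses import dataclass

@dataclass
class Job:
    name: str
    due_date: float
    proc_time: dict   # machine -> processing time (unrelated machines)

def priority(job, machine, now):
    """A hand-written DR of the kind GP would evolve (smaller is better here):
    combines processing time on this machine with slack until the due date."""
    pt = job.proc_time[machine]
    slack = job.due_date - now - pt
    return pt + max(slack, 0.0)          # a composite of simple terminal features

def dispatch(jobs, machine, now):
    """Pick the next job for `machine` according to the DR."""
    return min(jobs, key=lambda j: priority(j, machine, now))

jobs = [
    Job("J1", due_date=10.0, proc_time={"M1": 4.0, "M2": 6.0}),
    Job("J2", due_date=6.0,  proc_time={"M1": 5.0, "M2": 3.0}),
]
print(dispatch(jobs, "M1", now=0.0).name)   # -> J2 (its tighter slack dominates)
```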
50. Invisible to Machines: Designing AI that Supports Vision Work in Radiology
- Author
-
Anichini, G., Natali, C., and Cabitza, F.
- Abstract
In this article, we provide an analysis focusing on the clinical use of two deep learning-based automatic detection tools in the field of radiology. The value of these technologies, conceived to assist physicians in the reading of imaging data (such as X-rays), is generally assessed by human-machine performance comparison, which does not take into account the complexity of the radiologists' interpretation process in its social, tacit and emotional dimensions. In this radiological vision work, data that inform the physician about the context surrounding a visible anomaly are essential to the definition of its pathological nature. Likewise, experiential data resulting from the contextual tacit knowledge that regulates professional conduct allow for the assessment of an anomaly according to the radiologist's, and patient's, experience. These data, which remain excluded from artificial intelligence processing, question the gap between the norms incorporated by the machine and those leveraged in the daily work of radiologists. The possibility that automated detection may modify the incorporation or the exercise of tacit knowledge raises questions about the impact of AI technologies on medical work. This article aims to highlight how the standards that emerge from the observation practices of radiologists challenge the automation of their vision work, but also under what conditions AI technologies are considered "objective" and trustworthy by professionals.
- Published
- 2024