Journal: expert systems with applications / Publication Year Range: Last 10 years / Topic: computer vision (cv) - Searchworks@Jio Institute Digital Library Search Results

Showing total 2 results

1. A comprehensive survey on applications of transformers for deep learning tasks.

Author: Islam, Saidul, Elmekki, Hanae, Elsebai, Ahmed, Bentahar, Jamal, Drawel, Nagat, Rjoub, Gaith, and Pedrycz, Witold
Subjects: *ARTIFICIAL neural networks, *DEEP learning, *TRANSFORMER models, *NATURAL language processing, *RECURRENT neural networks, *COMPUTER vision
Abstract: Transformers are Deep Neural Networks (DNN) that utilize a self-attention mechanism to capture contextual relationships within sequential data. Unlike traditional neural networks and variants of Recurrent Neural Networks (RNNs), such as Long Short-Term Memory (LSTM), Transformer models excel at managing long dependencies among input sequence elements and facilitate parallel processing. Consequently, Transformer-based models have garnered significant attention from researchers in the field of artificial intelligence. This is due to their tremendous potential and impressive accomplishments, which extend beyond Natural Language Processing (NLP) tasks to encompass various domains, including Computer Vision (CV), audio and speech processing, healthcare, and the Internet of Things (IoT). Although several survey papers have been published, spotlighting the Transformer's contributions in specific fields, architectural disparities, or performance assessments, there remains a notable absence of a comprehensive survey paper that encompasses its major applications across diverse domains. Therefore, this paper addresses this gap by conducting an extensive survey of proposed Transformer models spanning from 2017 to 2022. Our survey encompasses the identification of the top five application domains for Transformer-based models, namely: NLP, CV, multi-modality, audio and speech processing, and signal processing. We analyze the influence of highly impactful Transformer-based models within these domains and subsequently categorize them according to their respective tasks, employing a novel taxonomy. Our primary objective is to illuminate the existing potential and future prospects of Transformers for researchers who are passionate about this area, thereby contributing to a more comprehensive understanding of this groundbreaking technology.
• The paper presents a comprehensive survey on transformers for deep learning tasks.
• The paper conducts a thorough analysis on highly effective models in five domains.
• The paper classifies the models based on respective tasks using a proposed taxonomy.
• The characteristics of the surveyed models are deeply explored and analyzed.
• Future directions and challenges for transformer-based models are deciphered.
[ABSTRACT FROM AUTHOR]
Published: 2024

2. Evolution of visual data captioning Methods, Datasets, and evaluation Metrics: A comprehensive survey.

Author: Sharma, Dhruv, Dhiman, Chhavi, and Kumar, Dinesh
Subjects: *DEEP learning, *NATURAL language processing, *COMPUTER vision, *VIDEO surveillance, *HUMAN-robot interaction, *PEOPLE with visual disabilities
Abstract: Automatic Visual Captioning (AVC) generates syntactically and semantically correct sentences by describing important objects, attributes, and their relationships with each other. It is classified into two categories: image captioning and video captioning. It is widely used in various applications such as assistance for the visually impaired, human-robot interaction, video surveillance systems, scene understanding, etc. With the unprecedented success of deep-learning in Computer Vision and Natural Language Processing, the past few years have seen a surge of research in this domain. In this survey, the state-of-the-art is classified based on how they conceptualize the captioning problem, viz., traditional approaches that cast visual description either as retrieval or template-based description and deep learning approaches. A detailed review of existing methods, highlighting their pros and cons, societal impact as the number of citations, architectures used, datasets experimented on and GitHub link is presented. Moreover, the survey also provides an overview of the benchmark image and video datasets and the evaluation measures that have been developed to assess the quality of machine-generated captions. It is observed that dense or paragraph generation and Change Image Captioning (CIC) are stimulating the research community more due to the near-to-human abstraction ability. Finally, the paper explores future directions in the area of automatic visual caption generation. [ABSTRACT FROM AUTHOR]
Published: 2023
