Author: "Berlincioni, Lorenzo" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Berlincioni, Lorenzo"' showing total 28 results

1. Spatio-temporal Transformers for Action Unit Classification with Event Cameras

Author: Cultrera, Luca, Becattini, Federico, Berlincioni, Lorenzo, Ferrari, Claudio, and Del Bimbo, Alberto
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Face analysis has been studied from different angles to infer emotion, poses, shapes, and landmarks. Traditionally RGB cameras are used, yet for fine-grained tasks standard sensors might not be up to the task due to their latency, making it impossible to record and detect micro-movements that carry a highly informative signal, which is necessary for inferring the true emotions of a subject. Event cameras have been increasingly gaining interest as a possible solution to this and similar high-frame-rate tasks. We propose a novel spatiotemporal Vision Transformer model that uses Shifted Patch Tokenization (SPT) and Locality Self-Attention (LSA) to enhance the accuracy of Action Unit classification from event streams. We also address the lack of labeled event data in the literature, which can be considered one of the main causes of an existing gap between the maturity of RGB and neuromorphic vision models. Gathering data is harder in the event domain since it cannot be crawled from the web and labeling frames should take into account event aggregation rates and the fact that static parts might not be visible in certain frames. To this end, we present FACEMORPHIC, a temporally synchronized multimodal face dataset composed of RGB videos and event streams. The dataset is annotated at a video level with facial Action Units and contains streams collected with various possible applications, ranging from 3D shape estimation to lip-reading. We then show how temporal synchronization can allow effective neuromorphic face analysis without the need to manually annotate videos: we instead leverage cross-modal supervision bridging the domain gap by representing face shapes in a 3D space. Our proposed model outperforms baseline methods by effectively capturing spatial and temporal information, crucial for recognizing subtle facial micro-expressions.
Comment: Under review at CVIU. arXiv admin note: substantial text overlap with arXiv:2409.10213
Published: 2024

2. Neuromorphic Facial Analysis with Cross-Modal Supervision

Author: Becattini, Federico, Cultrera, Luca, Berlincioni, Lorenzo, Ferrari, Claudio, Leonardo, Andrea, and Del Bimbo, Alberto
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Traditional approaches for analyzing RGB frames are capable of providing a fine-grained understanding of a face from different angles by inferring emotions, poses, shapes, and landmarks. However, when it comes to subtle movements, standard RGB cameras might fall behind due to their latency, making it hard to detect micro-movements that carry highly informative cues to infer the true emotions of a subject. To address this issue, the usage of event cameras to analyze faces is gaining increasing interest. Nonetheless, all the expertise matured for RGB processing is not directly transferable to neuromorphic data due to a strong domain shift and intrinsic differences in how data is represented. The lack of labeled data can be considered one of the main causes of this gap, yet gathering data is harder in the event domain since it cannot be crawled from the web and labeling frames should take into account event aggregation rates and the fact that static parts might not be visible in certain frames. In this paper, we first present FACEMORPHIC, a multimodal temporally synchronized face dataset comprising both RGB videos and event streams. The data is labeled at a video level with facial Action Units and also contains streams collected with a variety of applications in mind, ranging from 3D shape estimation to lip-reading. We then show how temporal synchronization can allow effective neuromorphic face analysis without the need to manually annotate videos: we instead leverage cross-modal supervision bridging the domain gap by representing face shapes in a 3D space.
Comment: Accepted for publication at the ECCV 2024 workshop on Neuromorphic Vision: Advantages and Applications of Event Cameras (NEVI)
Published: 2024

3. Garment Attribute Manipulation with Multi-level Attention

Author: Casula, Vittorio, Berlincioni, Lorenzo, Cultrera, Luca, Becattini, Federico, Pero, Chiara, Bisogni, Carmen, Bertini, Marco, and Del Bimbo, Alberto
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In the rapidly evolving field of online fashion shopping, the need for more personalized and interactive image retrieval systems has become paramount. Existing methods often struggle with precisely manipulating specific garment attributes without inadvertently affecting others. To address this challenge, we propose GAMMA (Garment Attribute Manipulation with Multi-level Attention), a novel framework that integrates attribute-disentangled representations with a multi-stage attention-based architecture. GAMMA enables targeted manipulation of fashion image attributes, allowing users to refine their searches with high accuracy. By leveraging a dual-encoder Transformer and memory block, our model achieves state-of-the-art performance on popular datasets like Shopping100k and DeepFashion.
Comment: Accepted for publication at the ECCV 2024 workshop FashionAI
Published: 2024

4. Prompt and Prejudice

Author: Berlincioni, Lorenzo, Cultrera, Luca, Becattini, Federico, Bertini, Marco, and Del Bimbo, Alberto
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computers and Society
Abstract: This paper investigates the impact of using first names in Large Language Models (LLMs) and Vision Language Models (VLMs), particularly when prompted with ethical decision-making tasks. We propose an approach that appends first names to ethically annotated text scenarios to reveal demographic biases in model outputs. Our study involves a curated list of more than 300 names representing diverse genders and ethnic backgrounds, tested across thousands of moral scenarios. Following the auditing methodologies from the social sciences, we propose a detailed analysis involving popular LLMs/VLMs to contribute to the field of responsible AI by emphasizing the importance of recognizing and mitigating biases in these systems. Furthermore, we introduce a novel benchmark, the Practical Scenarios Benchmark (PSB), designed to assess the presence of biases involving gender or demographic prejudices in everyday decision-making scenarios as well as practical scenarios where an LLM might be used to make sensible decisions (e.g., granting mortgages or insurance). This benchmark allows for a comprehensive comparison of model behaviors across different demographic categories, highlighting the risks and biases that may arise in practical applications of LLMs and VLMs.
Comment: Accepted at ECCV workshop FAILED
Published: 2024

5. Neuromorphic valence and arousal estimation

Author: Berlincioni, Lorenzo, Cultrera, Luca, Becattini, Federico, and Bimbo, Alberto Del
Published: 2024
Full Text: View/download PDF

6. Neuromorphic Face Analysis: a Survey

Author: Becattini, Federico, Berlincioni, Lorenzo, Cultrera, Luca, and Del Bimbo, Alberto
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Emerging Technologies
Abstract: Neuromorphic sensors, also known as event cameras, are a class of imaging devices mimicking the function of biological visual systems. Unlike traditional frame-based cameras, which capture fixed images at discrete intervals, neuromorphic sensors continuously generate events that represent changes in light intensity or motion in the visual field with high temporal resolution and low latency. These properties have proven to be interesting in modeling human faces, both from an effectiveness and a privacy-preserving point of view. Neuromorphic face analysis, however, is still a raw and unstructured field of research, with several attempts at addressing different tasks with no clear standard or benchmark. This survey paper presents a comprehensive overview of capabilities, challenges and emerging applications in the domain of neuromorphic face analysis, to outline promising directions and open issues. After discussing the fundamental working principles of neuromorphic vision and presenting an in-depth overview of the related research, we explore the current state of available data, standard data representations, emerging challenges, and limitations that require further investigation. This paper aims to highlight the recent progress in this evolving field, providing both experienced and new researchers with an all-encompassing analysis of the state of the art along with its problems and shortcomings.
Comment: Submitted to Pattern Recognition Letters
Published: 2024

7. Diffusion Based Augmentation for Captioning and Retrieval in Cultural Heritage

Author: Cioni, Dario, Berlincioni, Lorenzo, Becattini, Federico, and del Bimbo, Alberto
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Cultural heritage applications and advanced machine learning models are creating a fruitful synergy to provide effective and accessible ways of interacting with artworks. Smart audio-guides, personalized art-related content and gamification approaches are just a few examples of how technology can be exploited to provide additional value to artists or exhibitions. Nonetheless, from a machine learning point of view, the amount of available artistic data is often not enough to train effective models. Off-the-shelf computer vision modules can still be exploited to some extent, yet a severe domain shift is present between art images and standard natural image datasets used to train such models. As a result, this can lead to degraded performance. This paper introduces a novel approach to address the challenges of limited annotated data and domain shifts in the cultural heritage domain. By leveraging generative vision-language models, we augment art datasets by generating diverse variations of artworks conditioned on their captions. This augmentation strategy enhances dataset diversity, bridging the gap between natural images and artworks, and improving the alignment of visual cues with knowledge from general-purpose datasets. The generated variations assist in training vision and language models with a deeper understanding of artistic characteristics and that are able to generate better captions with appropriate jargon., Comment: Accepted at ICCV 2023 4th Workshop on e-Heritage
Published: 2023
Full Text: View/download PDF

8. 4DSR-GCN: 4D Video Point Cloud Upsampling using Graph Convolutional Networks

Author: Berlincioni, Lorenzo, Berretti, Stefano, Bertini, Marco, and Del Bimbo, Alberto
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Multimedia
Abstract: Time-varying sequences of 3D point clouds, or 4D point clouds, are now being acquired at an increasing pace in several applications (e.g., LiDAR in autonomous or assisted driving). In many cases, such a volume of data is transmitted, thus requiring that proper compression tools are applied to either reduce the resolution or the bandwidth. In this paper, we propose a new solution for upscaling and restoration of time-varying 3D video point clouds after they have been heavily compressed. In consideration of the recent growing relevance of 3D applications, we focused on a model allowing user-side upscaling and artifact removal for 3D video point clouds. Our model consists of a specifically designed Graph Convolutional Network (GCN) that combines Dynamic Edge Convolution and Graph Attention Networks for feature aggregation in a Generative Adversarial setting. Taking inspiration from PointNet++, we present a different way to sample dense point clouds with the intent to make these modules work in synergy to provide each node enough features about its neighbourhood in order to later on generate new vertices. Compared to other solutions in the literature that address the same task, our proposed model is capable of obtaining comparable results in terms of quality of the reconstruction, while using a substantially lower number of parameters (about 300KB), making our solution deployable in edge computing devices such as LiDAR.
Published: 2023

9. Neuromorphic Event-based Facial Expression Recognition

Author: Berlincioni, Lorenzo, Cultrera, Luca, Albisani, Chiara, Cresti, Lisa, Leonardo, Andrea, Picchioni, Sara, Becattini, Federico, and Del Bimbo, Alberto
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Recently, event cameras have shown large applicability in several computer vision fields, especially concerning tasks that require high temporal resolution. In this work, we investigate the usage of this kind of data for emotion recognition by presenting NEFER, a dataset for Neuromorphic Event-based Facial Expression Recognition. NEFER is composed of paired RGB and event videos representing human faces labeled with the respective emotions and also annotated with face bounding boxes and facial landmarks. We detail the data acquisition process and provide a baseline method for RGB and event data. The collected data captures subtle micro-expressions, which are hard to spot with RGB data, yet emerge in the event domain. We report a doubled recognition accuracy for the event-based approach, proving the effectiveness of a neuromorphic approach for analyzing fast and hardly detectable expressions and the emotions they conceal.
Published: 2023

10. Upsampling 4D Point Clouds of Human Body via Adversarial Generation

Author: Berlincioni, Lorenzo, Berretti, Stefano, Bertini, Marco, Del Bimbo, Alberto, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Foresti, Gian Luca, editor, Fusiello, Andrea, editor, and Hancock, Edwin, editor
Published: 2024
Full Text: View/download PDF

11. Upsampling 4D Point Clouds of Human Body via Adversarial Generation

Author: Berlincioni, Lorenzo, primary, Berretti, Stefano, additional, Bertini, Marco, additional, and Del Bimbo, Alberto, additional
Published: 2024
Full Text: View/download PDF

12. Partially fake it till you make it: mixing real and fake thermal images for improved object detection

Author: Bongini, Francesco, Berlincioni, Lorenzo, Bertini, Marco, and Del Bimbo, Alberto
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this paper we propose a novel data augmentation approach for visual content domains that have scarce training datasets, compositing synthetic 3D objects within real scenes. We show the performance of the proposed system in the context of object detection in thermal videos, a domain where 1) training datasets are very limited compared to visible spectrum datasets and 2) creating full realistic synthetic scenes is extremely cumbersome and expensive due to the difficulty in modeling the thermal properties of the materials of the scene. We compare different augmentation strategies, including state-of-the-art approaches obtained through RL techniques, the injection of simulated data and the employment of a generative model, and study how to best combine our proposed augmentation with these other techniques. Experimental results demonstrate the effectiveness of our approach, and our single-modality detector achieves state-of-the-art results on the FLIR ADAS dataset.
Published: 2021

13. Robust pedestrian detection in thermal imagery using synthesized images

Author: Kieu, My, Berlincioni, Lorenzo, Galteri, Leonardo, Bertini, Marco, Bagdanov, Andrew D., and Del Bimbo, Alberto
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this paper we propose a method for improving pedestrian detection in the thermal domain using two stages: first, a generative data augmentation approach is used, then a domain adaptation method using generated data adapts an RGB pedestrian detector. Our model, based on the Least-Squares Generative Adversarial Network, is trained to synthesize realistic thermal versions of input RGB images which are then used to augment the limited amount of labeled thermal pedestrian images available for training. We apply our generative data augmentation strategy in order to adapt a pretrained YOLOv3 pedestrian detector to detection in the thermal-only domain. Experimental results demonstrate the effectiveness of our approach: using less than 50% of available real thermal training data, and relying on synthesized data generated by our model in the domain adaptation phase, our detector achieves state-of-the-art results on the KAIST Multispectral Pedestrian Detection Benchmark; even when more real thermal data is available, adding GAN-generated images to the training data results in improved performance, thus showing that these images act as an effective form of data augmentation. To the best of our knowledge, our detector achieves the best single-modality detection results on KAIST with respect to the state of the art.
Comment: Accepted at ICPR2020
Published: 2021

14. Multiple Future Prediction Leveraging Synthetic Trajectories

Author: Berlincioni, Lorenzo, Becattini, Federico, Seidenari, Lorenzo, and Del Bimbo, Alberto
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
Abstract: Trajectory prediction is an important task, especially in autonomous driving. The ability to forecast the position of other moving agents can lead to effective planning, ensuring safety for the autonomous vehicle as well as for the observed entities. In this work we propose a data-driven approach based on Markov Chains to generate synthetic trajectories, which are useful for training a multiple future trajectory predictor. The advantages are twofold: on the one hand, synthetic samples can be used to augment existing datasets and train more effective predictors; on the other hand, the approach allows generating samples with multiple ground truths, corresponding to diverse equally likely outcomes of the observed trajectory. We define a trajectory prediction model and a loss that explicitly address the multimodality of the problem and we show that combining synthetic and real data leads to prediction improvements, obtaining state-of-the-art results.
Comment: Accepted at ICPR2020
Published: 2020

15. Inner Eye Canthus Localization for Human Body Temperature Screening

Author: Ferrari, Claudio, Berlincioni, Lorenzo, Bertini, Marco, and Del Bimbo, Alberto
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this paper, we propose an automatic approach for localizing the inner eye canthus in thermal face images. We first coarsely detect 5 facial keypoints corresponding to the center of the eyes, the nosetip and the ears. Then we compute a sparse 2D-3D points correspondence using a 3D Morphable Face Model (3DMM). This correspondence is used to project the entire 3D face onto the image, and subsequently locate the inner eye canthus. Detecting this location makes it possible to obtain the most precise body temperature measurement for a person using a thermal camera. We evaluated the approach on a thermal face dataset provided with manually annotated landmarks. However, such manual annotations are normally conceived to identify facial parts such as eyes, nose and mouth, and are not specifically tailored for localizing the eye canthus region. As an additional contribution, we enrich the original dataset by using the annotated landmarks to deform and project the 3DMM onto the images. Then, by manually selecting a small region corresponding to the eye canthus, we enrich the dataset with additional annotations. By using the manual landmarks, we ensure the correctness of the 3DMM projection, which can be used as ground truth for future evaluations. Moreover, we supply the dataset with the 3D head poses and per-point visibility masks for detecting self-occlusions. The data will be publicly released.
Published: 2020

16. Semantic Road Layout Understanding by Generative Adversarial Inpainting

Author: Berlincioni, Lorenzo, Becattini, Federico, Galteri, Leonardo, Seidenari, Lorenzo, and Del Bimbo, Alberto
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Autonomous driving is becoming a reality, yet vehicles still need to rely on complex sensor fusion to understand the scene they act in. The ability to discern static environment and dynamic entities provides a comprehension of the road layout that poses constraints to the reasoning process about moving objects. We pursue this through a GAN-based semantic segmentation inpainting model to remove all dynamic objects from the scene and focus on understanding its static components such as streets, sidewalks and buildings. We evaluate this task on the Cityscapes dataset and on a novel synthetically generated dataset obtained with the CARLA simulator and specifically designed to quantitatively evaluate semantic segmentation inpaintings. We compare our methods with a variety of baselines working both in the RGB and segmentation domains.
Published: 2018

17. Road Layout Understanding by Generative Adversarial Inpainting

Author: Berlincioni, Lorenzo, Becattini, Federico, Galteri, Leonardo, Seidenari, Lorenzo, Bimbo, Alberto Del, Escalante, Hugo Jair, Series Editor, Guyon, Isabelle, Series Editor, Escalera, Sergio, Series Editor, Ayache, Stephane, editor, Wan, Jun, editor, Madadi, Meysam, editor, Güçlü, Umut, editor, and Baró, Xavier, editor
Published: 2019
Full Text: View/download PDF

18. Vehicle Trajectories from Unlabeled Data Through Iterative Plane Registration

Author: Becattini, Federico, Seidenari, Lorenzo, Berlincioni, Lorenzo, Galteri, Leonardo, Del Bimbo, Alberto, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Ricci, Elisa, editor, Rota Bulò, Samuel, editor, Snoek, Cees, editor, Lanz, Oswald, editor, Messelodi, Stefano, editor, and Sebe, Nicu, editor
Published: 2019
Full Text: View/download PDF

19. 4DSR-GCN: 4D Video Point Cloud Upsampling using Graph Convolutional Networks

Author: Berlincioni, Lorenzo, primary, Berretti, Stefano, additional, Bertini, Marco, additional, and Bimbo, Alberto Del, additional
Published: 2023
Full Text: View/download PDF

20. Diffusion Based Augmentation for Captioning and Retrieval in Cultural Heritage

Author: Cioni, Dario, primary, Berlincioni, Lorenzo, additional, Becattini, Federico, additional, and Del Bimbo, Alberto, additional
Published: 2023
Full Text: View/download PDF

21. Neuromorphic Event-based Facial Expression Recognition

Author: Berlincioni, Lorenzo, primary, Cultrera, Luca, additional, Albisani, Chiara, additional, Cresti, Lisa, additional, Leonardo, Andrea, additional, Picchioni, Sara, additional, Becattini, Federico, additional, and Del Bimbo, Alberto, additional
Published: 2023
Full Text: View/download PDF

22. Road Layout Understanding by Generative Adversarial Inpainting

Author: Berlincioni, Lorenzo, primary, Becattini, Federico, additional, Galteri, Leonardo, additional, Seidenari, Lorenzo, additional, and Bimbo, Alberto Del, additional
Published: 2019
Full Text: View/download PDF

23. Autonomous Driving Research with CARLA Simulator

Author: Berlincioni, Lorenzo, primary
Published: 2022
Full Text: View/download PDF

24. Partially Fake it Till you Make It

Author: Bongini, Francesco, primary, Berlincioni, Lorenzo, additional, Bertini, Marco, additional, and Del Bimbo, Alberto, additional
Published: 2021
Full Text: View/download PDF

25. Inner Eye Canthus Localization for Human Body Temperature Screening

Author: Ferrari, Claudio, primary, Berlincioni, Lorenzo, additional, Bertini, Marco, additional, and Del Bimbo, Alberto, additional
Published: 2021
Full Text: View/download PDF

26. Multiple Future Prediction Leveraging Synthetic Trajectories

Author: Berlincioni, Lorenzo, primary, Becattini, Federico, additional, Seidenari, Lorenzo, additional, and Del Bimbo, Alberto, additional
Published: 2021
Full Text: View/download PDF

27. Robust pedestrian detection in thermal imagery using synthesized images

Author: Kieu, My, primary, Berlincioni, Lorenzo, additional, Galteri, Leonardo, additional, Bertini, Marco, additional, Bagdanov, Andrew D., additional, and del Bimbo, Alberto, additional
Published: 2021
Full Text: View/download PDF

28. Partially Fake it Till you Make It

Author: Bongini, Francesco, Berlincioni, Lorenzo, Bertini, Marco, and Del Bimbo, Alberto
Full Text: View/download PDF
