910 results on '"[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]"'
Search Results
2. Deep Learning identifies new morphological patterns of Homologous Recombination Deficiency in luminal breast cancers from whole slide images
- Author
-
Marc-Henri Stern, Etienne Decencière, Thomas Walter, François-Clément Bidard, Tristan Lazard, Guillaume Bataillon, Dominique Stoppa-Lyonnet, Tatiana Popova, Peter Naylor, Anne Vincent Salomon, Centre de Bioinformatique (CBIO), Mines Paris - PSL (École nationale supérieure des mines de Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL), Cancer et génome: Bioinformatique, biostatistiques et épidémiologie d'un système complexe, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut Curie [Paris]-Institut National de la Santé et de la Recherche Médicale (INSERM), Unité de génétique et biologie des cancers (U830), Institut Curie [Paris]-Institut National de la Santé et de la Recherche Médicale (INSERM), Institut Curie - Saint Cloud (ICSC), Université de Versailles Saint-Quentin-en-Yvelines (UVSQ), Centre de Morphologie Mathématique (CMM), Université Paris sciences et lettres (PSL), Génétique et Biologie du Développement, Institut Curie [Paris]-Institut National de la Santé et de la Recherche Médicale (INSERM)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), ANR-17-CONV-0005,Q-LIFE,Institut Q-LIFE(2017), ANR-19-P3IA-0001,PRAIRIE,PaRis Artificial Intelligence Research InstitutE(2019), Walter, Thomas, Institut Q-LIFE - - Q-LIFE2017 - ANR-17-CONV-0005 - CONV - VALID, and PaRis Artificial Intelligence Research InstitutE - - PRAIRIE2019 - ANR-19-P3IA-0001 - P3IA - VALID
- Subjects
Genome instability, homologous recombination deficiency (HRD), bias, whole slide images, Computational biology, Biology, breast cancer, [INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV], visualization, [INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM], Interpretability, [SDV.BIBS] Life Sciences [q-bio]/Quantitative Methods [q-bio.QM], Deep learning, prediction, artificial intelligence, molecular subtype, Phenotype, [INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV], PARP inhibitor, Homologous recombination, Homologous Recombination Deficiency, computational pathology - Abstract
Homologous Recombination DNA-repair Deficiency (HRD) is a well-recognized marker of sensitivity to platinum-salt and PARP-inhibitor chemotherapies in ovarian and breast cancers (BC). HRD causes high genomic instability and is currently determined by BRCA1/2 sequencing or by genomic signatures, but its morphological manifestation is not well understood. Deep Learning (DL) is a powerful machine learning technique that has recently been shown capable of predicting genomic signatures from stained tissue slides. However, DL is known to be sensitive to dataset biases and lacks interpretability. Here, we present and evaluate a strategy to control for biases in retrospective cohorts. We train a deep-learning model that predicts HRD in a controlled cohort with unprecedented accuracy (AUC: 0.86), and we develop a new visualization technique that allows automatic extraction of new morphological features related to HRD. We analyze the extracted morphological patterns in detail, opening new hypotheses on the phenotypic impact of HRD.
- Published
- 2022
3. MSMT-CNN for Solar Active Region Detection with Multi-Spectral Analysis
- Author
-
Majedaldein Almahasneh, Adeline Paiement, Xianghua Xie, Jean Aboudarham, Paiement, Adeline, Department of Computer Science [Swansea], Swansea University, Laboratoire d'Informatique et Systèmes (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), DYNamiques de l’Information (DYNI), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), Observatoire de Paris, Université Paris sciences et lettres (PSL), Laboratoire d'études spatiales et d'instrumentation en astrophysique = Laboratory of Space Studies and Instrumentation in Astrophysics (LESIA), Institut national des sciences de l'Univers (INSU - CNRS)-Observatoire de Paris, and Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Université Paris Cité (UPCité)
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI], [PHYS.ASTR.IM] Physics [physics]/Astrophysics [astro-ph]/Instrumentation and Methods for Astrophysics [astro-ph.IM], General Computer Science, Object detection, [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing, Computer Networks and Communications, Solar images, [INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE], [INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV], [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG], Artificial Intelligence, Deep neural networks, Active regions, Multi-spectral images, [PHYS.ASTR.SR] Physics [physics]/Astrophysics [astro-ph]/Solar and Stellar Astrophysics [astro-ph.SR], Computer Graphics and Computer-Aided Design, Computer Science Applications, [INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV], Computational Theory and Mathematics - Abstract
Precisely detecting solar active regions (AR) in multi-spectral images is a challenging yet important task for understanding solar activity and its influence on space weather. A main challenge is that each modality captures a different location of these 3D objects, as opposed to more traditional multi-spectral imaging scenarios where all image bands observe the same scene. We present a multi-task deep learning framework that exploits the dependencies between image bands to produce 3D AR detections, where each image band (and physical location) has its own set of results. We investigate different feature fusion strategies, in which information from the different image modalities is aggregated at different semantic levels throughout the network. This allows the network to benefit from the joint analysis while preserving band-specific information. We compare our detection method against baseline approaches for solar image analysis (multi-channel coronal hole detection, SPOCA for ARs (Verbeeck et al. Astron Astrophys 561:16, 2013)) and a state-of-the-art deep learning method (Faster R-CNN), and show improved performance in detecting ARs jointly from multiple bands. We also evaluate our approach on synthetic data with similar spatial configurations, obtained from annotated multi-modal magnetic resonance images.
- Published
- 2022
4. Generalized Feedback Loop for Joint Hand-Object Pose Estimation
- Author
-
Vincent Lepetit, Paul Wohlhart, Markus Oberweger, Institute for Computer Graphics and Vision [Graz] (ICG), Graz University of Technology [Graz] (TU Graz), Google Inc., Google Inc [Mountain View], Research at Google-Research at Google, Laboratoire Bordelais de Recherche en Informatique (LaBRI), Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB), Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Université Sciences et Technologies - Bordeaux 1-Université Bordeaux Segalen - Bordeaux 2, and Lepetit, Vincent
- Subjects
FOS: Computer and information sciences, Computer science, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, 02 engineering and technology, Convolutional neural network, Image (mathematics), [INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV], [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG], Artificial Intelligence, 0202 electrical engineering, electronic engineering, information engineering, Computer vision, Pose, Applied Mathematics, 020207 software engineering, Feedback loop, Object (computer science), Computational Theory and Mathematics, 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Artificial intelligence, Joint (audio engineering), Software - Abstract
We propose an approach to estimating the 3D pose of a hand, possibly holding an object, from a depth image. We show that the mistakes made by a Convolutional Neural Network trained to predict the 3D pose can be corrected by a feedback loop whose components are also deep networks, optimized using training data. This approach generalizes to a hand interacting with an object, so we jointly estimate the 3D pose of the hand and the 3D pose of the object. Our approach performs on par with state-of-the-art methods for 3D hand pose estimation, and outperforms state-of-the-art methods for joint hand-object pose estimation when using depth images only. It is also efficient: our implementation runs in real time on a single GPU. (arXiv admin note: substantial text overlap with arXiv:1609.09698)
- Published
- 2020
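The feedback-loop idea in the hand-object pose entry above (predict, synthesize, compare, correct) can be sketched numerically. Everything below is a made-up stand-in for the paper's trained networks: the pose is a 2-vector, the "depth image" is a 1D profile, and a finite-difference descent step with backtracking plays the role of the learned updater.

```python
import numpy as np

def synthesizer(pose):
    # Hypothetical "renderer": a 1D depth-like profile as a function
    # of a 2-parameter pose (the paper uses a learned synthesizer CNN).
    grid = np.linspace(-1.0, 1.0, 32)
    return np.exp(-((grid - pose[0]) ** 2) / (0.1 + pose[1] ** 2))

def predictor(observed):
    # Crude initial estimate (a trained CNN plays this role in the paper).
    return np.array([0.0, 0.6])

def updater(observed, pose):
    # Propose a correction from the discrepancy between observed and
    # synthesized profiles; a learned network plays this role in the paper.
    def err(p):
        return np.sum((synthesizer(p) - observed) ** 2)
    e0, grad = err(pose), np.zeros_like(pose)
    for i in range(len(pose)):       # finite-difference gradient
        p = pose.copy()
        p[i] += 1e-4
        grad[i] = (err(p) - e0) / 1e-4
    for step in (0.5, 0.1, 0.02, 0.004):   # accept first improving step
        cand = pose - step * grad
        if err(cand) < e0:
            return cand
    return pose

true_pose = np.array([0.3, 0.4])
observed = synthesizer(true_pose)    # the "input depth image"
pose = predictor(observed)
for _ in range(50):                  # the feedback loop
    pose = updater(observed, pose)
```

After the loop, the synthesized profile matches the observation much more closely than the initial prediction did, which is the essence of the iterative refinement.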
5. Learning the spatiotemporal variability in longitudinal shape data sets
- Author
-
Stanley Durrleman, Alexandre Bône, Olivier Colliot, Algorithms, models and methods for images and signals of the human brain (ARAMIS), Sorbonne Université (SU)-Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Institut du Cerveau et de la Moëlle Epinière = Brain and Spine Institute (ICM), Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [APHP]-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [APHP]-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), This work has been partly funded by the European Research Council with grant 678304, European Union’s Horizon 2020 research and innovation program with grant 666992, and the program Investissements d’avenir ANR-10-IAIHU-06.Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bio-engineering, and through generous contributions from the following: AbbVie, Alzheimers Association, Alzheimers Drug Discovery Foundation, Araclon Biotech, BioClinica, Inc., Biogen, Bristol-Myers Squibb Company, CereSpir, Inc., Cogstate, Eisai Inc., Elan Pharmaceuticals, Inc., Eli Lilly and Company, EuroImmun, F. 
Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc., Fujirebio, GE Healthcare, IXICO Ltd., Janssen Alzheimer Immunotherapy Research & Development, LLC., Johnson & Johnson Pharmaceutical Research & Development LLC., Lumosity, Lundbeck, Merck & Co., Inc., Meso Scale Diagnostics, LLC., NeuroRx Research, Neurotrack Technologies, Novartis Pharmaceuticals Corporation, Pfizer Inc., Piramal Imaging, Servier, Takeda Pharmaceutical Company, and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimers Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California., Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), ANR-19-P3IA-0001,PRAIRIE,PaRis Artificial Intelligence Research InstitutE(2019), ANR-05-PADD-0003,TRANS,Transformations de l'élevage et dynamiques des espaces(2005), Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Institut du Cerveau = 
Paris Brain Institute (ICM), Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Sorbonne Université (SU)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), Bône, Alexandre, PaRis Artificial Intelligence Research InstitutE - - PRAIRIE2019 - ANR-19-P3IA-0001 - P3IA - VALID, and Programme fédérateur Agriculture et Développement Durable - Transformations de l'élevage et dynamiques des espaces - - TRANS2005 - ANR-05-PADD-0003 - ADD - VALID
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI], Disease progression modeling, Large deformation diffeomorphic metric mapping, Computer science, Statistical shape analysis, 02 engineering and technology, Stochastic approximation, Computational morphometry, [INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV], [STAT.ML] Statistics [stat]/Machine Learning [stat.ML], Artificial Intelligence, 0202 electrical engineering, electronic engineering, information engineering, [SDV.NEU] Life Sciences [q-bio]/Neurons and Cognition [q-bio.NC], Set (psychology), [INFO.INFO-MS] Computer Science [cs]/Mathematical Software [cs.MS], Longitudinal data, Pattern recognition, Statistical model, [MATH.MATH-DG] Mathematics [math]/Differential Geometry [math.DG], Pattern recognition (psychology), Trajectory, 020201 artificial intelligence & image processing, Medical imaging, Computer Vision and Pattern Recognition, Noise (video), Artificial intelligence, Software - Abstract
In this paper, we propose a generative statistical model to learn the spatiotemporal variability in longitudinal shape data sets, which contain repeated observations of a set of objects or individuals over time. From all the short-term sequences of individual data, the method estimates a long-term normative scenario of shape changes and a tubular coordinate system around this trajectory. Each individual data sequence is therefore (i) mapped onto a specific portion of the trajectory accounting for differences in pace of progression across individuals, and (ii) shifted in the shape space to account for intrinsic shape differences across individuals that are independent of the progression of the observed process. The parameters of the model are estimated using a stochastic approximation of the expectation–maximization algorithm. The proposed approach is validated on a simulated data set, illustrated on the analysis of facial expression in video sequences, and applied to the modeling of the progressive atrophy of the hippocampus in Alzheimer’s disease patients. These experiments show that one can use the method to reconstruct data at the precision of the noise, to highlight significant factors that may modulate the progression, and to simulate entirely synthetic longitudinal data sets reproducing the variability of the observed process.
- Published
- 2020
6. Multi-scale superpatch matching using dual superpixel descriptors
- Author
-
Merlin Boyer, Rémi Giraud, Michaël Clément, Giraud, Rémi, Institut Polytechnique de Bordeaux (Bordeaux INP), Laboratoire de l'intégration, du matériau au système (IMS), Université Sciences et Technologies - Bordeaux 1-Institut Polytechnique de Bordeaux-Centre National de la Recherche Scientifique (CNRS), Laboratoire Bordelais de Recherche en Informatique (LaBRI), and Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)
- Subjects
FOS: Computer and information sciences, Computer science, Computer Vision and Pattern Recognition (cs.CV), Dimensionality reduction, Computer Science - Computer Vision and Pattern Recognition, [INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV], Pattern recognition, Image processing, 02 engineering and technology, 01 natural sciences, [INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV], Artificial Intelligence, 0103 physical sciences, Signal Processing, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Artificial intelligence, 010306 general physics, Software - Abstract
Over-segmentation into superpixels is a very effective dimensionality reduction strategy, enabling fast dense image processing. The main issue of this approach is the inherent irregularity of the image decomposition compared to standard hierarchical multi-resolution schemes, especially when searching for similar neighboring patterns. Several works have attempted to overcome this issue by taking the region irregularity into account in their comparison model. Nevertheless, they remain sub-optimal for providing robust and accurate superpixel neighborhood descriptors, since they only compute features within each region, poorly capturing contour information at superpixel borders. In this work, we address these limitations by introducing the dual superpatch, a novel superpixel neighborhood descriptor. This structure contains features computed in reduced superpixel regions, as well as at the interfaces of multiple superpixels, to explicitly capture contour structure information. We also introduce a fast multi-scale non-local matching framework for searching for similar descriptors at different resolution levels in an image dataset. The proposed dual superpatch captures similar structured patterns at different scales more accurately, and we demonstrate the robustness and performance of this new strategy on matching and supervised labeling applications.
- Published
- 2020
7. Approche multi-critère pour la caractérisation des adventices
- Author
-
Vayssade, Jehan-Antoine, vayssade, jehan-antoine, and Boucles sensorimotrices robotiques pour le désherbage autonome - - ROSEAU2017 - ANR-17-ROSE-0002 - Challenge ROSE - VALID
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI], Artificial intelligence, Precision agriculture, [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing, Agriculture de précision, Statistics, [SDV.SA.STA] Life Sciences [q-bio]/Agricultural sciences/Sciences and technics of agriculture, [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG], Intelligence artificielle, Prédiction, Image analysis, [INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV], Vision par ordinateur, Statistiques, Computer vision, Prediction, Analyse d'image - Abstract
The objective of this thesis is to develop a way to detect weeds in a field using multispectral images, in order to determine which weeds should be eliminated during the current crop cycle, particularly at the early growth stages. The multi-criteria approach considers the spatial arrangement, spectral signature, morphology and texture of the plants in the plots. This work proposes a method for selecting the criteria that provide optimal discrimination for a given setup. Prior to extracting these criteria, a set of methods was developed to correct the errors of the acquisition device, to precisely detect the vegetation, and then to identify within the vegetation the individuals on which the different criteria can be computed. For this individual detection step, the leaf scale proved more suitable than the plant scale. Vegetation detection and leaf identification rely on deep learning methods capable of handling dense foliage. The introduction of these methods into a standard processing chain constitutes the originality of this manuscript, each part of which was the subject of an article. Concerning the acquisition device, a spectral band registration method was developed, with an accuracy on the order of one pixel. New vegetation indices based on artificial intelligence constitute one of the scientific contributions of this thesis: as an indication, they reach a mIoU of 82.19% where standard indices plateau at 63.93%-73.71%, and they work in uncontrolled environments. By extension, a leaf detection method based on contour detection was defined, which appears advantageous on our multispectral data. Finally, the best pairs of properties were identified for crop/weed discrimination at the leaf level, with classification performance reaching 91%.
- Published
- 2022
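The mIoU figures quoted in the abstract above are means of per-class intersection-over-union scores between a predicted and a reference segmentation. A minimal sketch of the metric on a made-up two-class example (0 = soil, 1 = vegetation):

```python
import numpy as np

def mean_iou(pred, target, n_classes):
    # Mean intersection-over-union across the classes present in the data.
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                      # skip classes absent from both masks
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x4 masks, invented for illustration.
target = np.array([[0, 0, 1, 1],
                   [0, 1, 1, 1]])
pred   = np.array([[0, 0, 1, 0],
                   [0, 1, 1, 1]])
score = mean_iou(pred, target, 2)   # IoU is 3/4 for class 0 and 4/5 for class 1
```

Here the per-class IoUs are 0.75 and 0.8, giving a mIoU of 0.775.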
8. Artificial intelligence: your questions answered
- Author
-
Lucey, Simon, Ma-Wyatt, Anna, van den Hengel, Anton, Reid, Ian, Nicholson, Kathy, Dalby, Paul A., Mcmillen, Caroline, Evans, Michael, Wallace, Catriona, Monro, Tanya, Shoebridge, Michael, Slonim, Adam, Schuber, Misha, Patrick, Rex, Reid, Adam, Diguet, Jean-Philippe, and Dr Kathy Nicholson and Adam Slonim
- Subjects
Machine Learning, [INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV], Artificial Intelligence, [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG] - Published
- 2022
9. SynWoodScape: Synthetic Surround-view Fisheye Camera Dataset for Autonomous Driving
- Author
-
Ahmed Rida Sekkat, Yohan Dupuis, Varun Ravi Kumar, Hazem Rashed, Senthil Yogamani, Pascal Vasseur, Paul Honeine, and Honeine, Paul
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI], FOS: Computer and information sciences, Control and Optimization, [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing, Mechanical Engineering, Fisheye Cameras, Computer Vision and Pattern Recognition (cs.CV), Biomedical Engineering, [INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE], Computer Science - Computer Vision and Pattern Recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG], [STAT.ML] Statistics [stat]/Machine Learning [stat.ML], Computer Science Applications, Human-Computer Interaction, [INFO.INFO-CY] Computer Science [cs]/Computers and Society [cs.CY], [INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV], Artificial Intelligence, Control and Systems Engineering, Computer Vision and Pattern Recognition, Omnidirectional vision, Automated Driving, Synthetic Datasets, [MATH.MATH-ST] Mathematics [math]/Statistics [math.ST], [SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing - Abstract
Surround-view cameras are a primary sensor for automated driving, used for near-field perception. They are among the most commonly used sensors in commercial vehicles, primarily for parking visualization and automated parking. Four fisheye cameras with a 190° field of view cover the 360° around the vehicle. Due to their high radial distortion, standard algorithms do not extend easily to fisheye images. Previously, we released the first public fisheye surround-view dataset, WoodScape. In this work, we release a synthetic version of the surround-view dataset that addresses many of its weaknesses and extends it. Firstly, it is not possible to obtain ground truth for pixel-wise optical flow and depth on real data. Secondly, WoodScape did not have all four cameras annotated simultaneously, in order to sample diverse frames; this means that multi-camera algorithms could not be designed to obtain a unified output in bird's-eye-view space, which is enabled in the new dataset. We implemented surround-view fisheye geometric projections in the CARLA simulator matching WoodScape's configuration and created SynWoodScape. We release 80k images from the synthetic dataset with annotations for 10+ tasks, along with the baseline code and supporting scripts. (Comment: IEEE Robotics and Automation Letters (RA-L) and IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022). An initial sample of the dataset is released at https://drive.google.com/drive/folders/1N5rrySiw1uh9kLeBuOblMbXJ09YsqO7I)
- Published
- 2022
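For intuition about the fisheye geometry in the SynWoodScape entry above: an equidistant model r = f·θ maps the angle θ between a ray and the optical axis linearly to a radial image distance, so rays beyond 90° off-axis (needed for a 190° field of view) still land inside the image. This is a generic illustrative model with made-up intrinsics, not the polynomial radial model used to describe the WoodScape cameras.

```python
import numpy as np

def project_equidistant(point_3d, f=320.0, cx=640.0, cy=480.0):
    # Equidistant fisheye projection: radial distance r = f * theta.
    # f, cx, cy are made-up intrinsics for illustration only.
    x, y, z = point_3d
    theta = np.arctan2(np.hypot(x, y), z)   # angle from the optical axis
    r = f * theta                           # equidistant mapping
    phi = np.arctan2(y, x)                  # azimuth in the image plane
    return np.array([cx + r * np.cos(phi), cy + r * np.sin(phi)])

# A ray 45° off-axis lands f * pi/4 ≈ 251 pixels from the principal point.
u, v = project_equidistant([1.0, 0.0, 1.0])
```

Note that a pinhole model would need r = f·tan(θ), which diverges at 90°; the linear mapping is what makes ultra-wide fields of view representable.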
10. Learning Laplacians in Chebyshev Graph Convolutional Networks
- Author
-
Hichem Sahbi, Machine Learning and Information Access (MLIA), LIP6, Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), and Sahbi, Hichem
- Subjects
Discrete mathematics, [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI], [INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV], Computer science, Graph (abstract data type), Artificial intelligence, Chebyshev filter - Abstract
Spectral graph convolutional networks (GCNs) are deep models that aim to extend neural networks to arbitrary irregular domains. The principle of these networks is to project graph signals using the eigendecomposition of their Laplacians, perform filtering in the spectral domain, and then back-project the filtered signals onto the input graph domain. However, the success of these operations depends heavily on the relevance of the Laplacians used, which are mostly handcrafted, making GCNs clearly sub-optimal. In this paper, we introduce a novel spectral GCN that learns not only the usual convolutional parameters but also the Laplacian operators. The latter are designed end-to-end as part of a recursive Chebyshev decomposition, with the particularity of conveying both the differential and the non-differential properties of the learned representations -- with increasing order and discrimination power -- without overparametrizing the trained GCNs. Extensive experiments, conducted on the challenging task of skeleton-based action recognition, show the generalization ability of our proposed Laplacian design and its outperformance of different baselines (built upon handcrafted and other learned Laplacians) as well as the related work.
- Published
- 2021
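The recursive Chebyshev decomposition mentioned in the abstract above is, in its standard handcrafted-Laplacian (ChebNet-style) form, the filter y = Σ_k θ_k T_k(L̂)x with the recurrence T_k = 2·L̂·T_{k-1} − T_{k-2}, where L̂ is the Laplacian rescaled to spectrum [-1, 1]. A minimal sketch on a fixed 3-node path graph with arbitrary filter weights (the paper's contribution, learning the Laplacian itself end-to-end, is not reproduced here):

```python
import numpy as np

def cheb_filter(A, x, theta):
    # Chebyshev spectral filtering of a graph signal x with weights theta.
    d = A.sum(axis=1)
    L = np.diag(d) - A                           # combinatorial Laplacian
    lmax = np.linalg.eigvalsh(L).max()
    L_hat = 2.0 * L / lmax - np.eye(len(A))      # rescale spectrum to [-1, 1]
    Tx = [x, L_hat @ x]                          # T_0(L_hat)x, T_1(L_hat)x
    for _ in range(2, len(theta)):
        Tx.append(2.0 * L_hat @ Tx[-1] - Tx[-2])  # Chebyshev recurrence
    return sum(t * tx for t, tx in zip(theta, Tx))

A = np.array([[0., 1., 0.],      # path graph 1-2-3
              [1., 0., 1.],
              [0., 1., 0.]])
x = np.array([1.0, 0.0, 0.0])    # an impulse signal on the first node
y = cheb_filter(A, x, theta=[0.5, 0.3, 0.2])
```

An order-K filter is strictly K-hop localized: the output at a node depends only on nodes at most K edges away, which is what makes the decomposition cheap compared to a full eigendecomposition-based filter.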
11. Graph kernels based on linear patterns: Theoretical and experimental comparisons
- Author
-
Paul Honeine, Benoit Gaüzère, Linlin Jia, Equipe Apprentissage (DocApp - LITIS), Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes (LITIS), Université Le Havre Normandie (ULH), Normandie Université (NU)-Normandie Université (NU)-Université de Rouen Normandie (UNIROUEN), Normandie Université (NU)-Institut national des sciences appliquées Rouen Normandie (INSA Rouen Normandie), Institut National des Sciences Appliquées (INSA)-Normandie Université (NU)-Institut National des Sciences Appliquées (INSA)-Université Le Havre Normandie (ULH), Institut National des Sciences Appliquées (INSA)-Normandie Université (NU)-Institut National des Sciences Appliquées (INSA), ANR-18-CE23-0014,APi,Apprivoiser la Pré-image(2018), PNRIA, Honeine, Paul, and APPEL À PROJETS GÉNÉRIQUE 2018 - Apprivoiser la Pré-image - - APi2018 - ANR-18-CE23-0014 - AAPG2018 - VALID
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI] ,Theoretical computer science ,Computational complexity theory ,[INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing ,Computer science ,[INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE] ,02 engineering and technology ,Python Implementation ,[INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE] ,[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,[STAT.ML]Statistics [stat]/Machine Learning [stat.ML] ,[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing ,[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG] ,[INFO.INFO-CY]Computer Science [cs]/Computers and Society [cs.CY] ,Artificial Intelligence ,[MATH.MATH-ST]Mathematics [math]/Statistics [math.ST] ,020204 information systems ,Machine learning ,0202 electrical engineering, electronic engineering, information engineering ,[MATH.MATH-ST] Mathematics [math]/Statistics [math.ST] ,[SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing ,computer.programming_language ,Linear Patterns ,Walks ,Graph representations ,Paths ,General Engineering ,Graph Kernels ,Kernel methods ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,[INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG] ,Python (programming language) ,Graph representation ,[STAT.ML] Statistics [stat]/Machine Learning [stat.ML] ,Regression ,Computer Science Applications ,Vertex (geometry) ,[INFO.INFO-CY] Computer Science [cs]/Computers and Society [cs.CY] ,Kernel method ,Graph (abstract data type) ,020201 artificial intelligence & image processing ,computer ,[SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing ,MathematicsofComputing_DISCRETEMATHEMATICS - Abstract
International audience; Graph kernels are powerful tools to bridge the gap between machine learning and data encoded as graphs. Most graph kernels are based on the decomposition of graphs into a set of patterns. The similarity between two graphs is then deduced from the similarity between corresponding patterns. Kernels based on linear patterns constitute a good trade-off between accuracy and computational complexity. In this work, we propose a thorough investigation and comparison of graph kernels based on different linear patterns, namely walks and paths. First, all these kernels are explored in detail, including their mathematical foundations, pattern structures and computational complexity. Experiments are then performed on various benchmark datasets exhibiting different types of graphs, including labeled and unlabeled graphs, graphs with different numbers of vertices, graphs with different average vertex degrees, and linear and non-linear graphs. Finally, for regression and classification tasks, the accuracy and computational complexity of these kernels are compared and analyzed against baseline kernels based on non-linear patterns. Suggestions are given for choosing kernels according to the type of graph dataset. This work leads to a clear comparison of the strengths and weaknesses of these kernels. An open-source Python library containing an implementation of all discussed kernels is publicly available on GitHub, thereby promoting and facilitating the use of graph kernels in machine learning problems.
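A minimal sketch of the path-pattern idea (hypothetical code, not the library's API): compare two unlabeled graphs through the histograms of their shortest-path lengths, a simple instance of a kernel based on linear patterns.

```python
from collections import deque

def bfs_lengths(adj, src):
    """Shortest-path lengths (in edges) from src in an unweighted graph."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def path_histogram(adj):
    """Histogram of shortest-path lengths over all ordered vertex pairs."""
    hist = {}
    for u in adj:
        for v, d in bfs_lengths(adj, u).items():
            if u != v:
                hist[d] = hist.get(d, 0) + 1
    return hist

def shortest_path_kernel(adj1, adj2):
    """Count pairs of equal-length shortest paths across the two graphs."""
    h1, h2 = path_histogram(adj1), path_histogram(adj2)
    return sum(h1[d] * h2.get(d, 0) for d in h1)
```

For labeled graphs, real shortest-path kernels additionally compare the endpoint labels of each path; this unlabeled variant only keeps the lengths.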
- Published
- 2021
12. A Registration Error Estimation Framework for Correlative Imaging
- Author
-
Guillaume Potier, Frédéric Lavancier, Stephan Kunne, Perrine Paul-Gilloteaux, Unité de recherche de l'institut du thorax (ITX-lab), Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-Université de Nantes - UFR de Médecine et des Techniques Médicales (UFR MEDECINE), Université de Nantes (UN)-Université de Nantes (UN), Laboratoire de Mathématiques Jean Leray (LMJL), Université de Nantes - UFR des Sciences et des Techniques (UN UFR ST), Université de Nantes (UN)-Université de Nantes (UN)-Centre National de la Recherche Scientifique (CNRS), Structure fédérative de recherche François Bonamy (SFR François Bonamy), Université de Nantes (UN)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-Institut de Recherche en Santé de l'Université de Nantes (IRS-UN), ANR-18-CE45-0015,CROCOVAL,Recalage transmodal en microscopies corrélatives pour la caractérisation physiopathologique de la valvulopathie(2018), ANR-10-INBS-0004,France-BioImaging,Développment d'une infrastructure française distribuée coordonnée(2010), European Project: CA17121,COMULIS, Paul-Gilloteaux, Perrine, APPEL À PROJETS GÉNÉRIQUE 2018 - Recalage transmodal en microscopies corrélatives pour la caractérisation physiopathologique de la valvulopathie - - CROCOVAL2018 - ANR-18-CE45-0015 - AAPG2018 - VALID, Développment d'une infrastructure française distribuée coordonnée - - France-BioImaging2010 - ANR-10-INBS-0004 - INBS - VALID, COST COMULIS Correlated Multimodal Imaging in Life Science - COMULIS - CA17121 - INCOMING, unité de recherche de l'institut du thorax UMR1087 UMR6291 (ITX), Université de Nantes - UFR de Médecine et des Techniques Médicales (UFR MEDECINE), Université de Nantes (UN)-Université de Nantes (UN)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS), Centre National de la Recherche Scientifique 
(CNRS)-Université de Nantes - UFR des Sciences et des Techniques (UN UFR ST), and ANR-11-LABX-0020,LEBESGUE,Centre de Mathématiques Henri Lebesgue : fondements, interactions, applications et Formation(2011)
- Subjects
Correlative ,FOS: Computer and information sciences ,Computer science ,[SDV.IB.IMA]Life Sciences [q-bio]/Bioengineering/Imaging ,Computer Vision and Pattern Recognition (cs.CV) ,[SDV]Life Sciences [q-bio] ,Computer Science - Computer Vision and Pattern Recognition ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,02 engineering and technology ,Quantitative Biology - Quantitative Methods ,Image (mathematics) ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,[STAT.AP] Statistics [stat]/Applications [stat.AP] ,[MATH.MATH-ST]Mathematics [math]/Statistics [math.ST] ,0202 electrical engineering, electronic engineering, information engineering ,Computer vision ,Point (geometry) ,[MATH.MATH-ST] Mathematics [math]/Statistics [math.ST] ,Quantitative Methods (q-bio.QM) ,[STAT.AP]Statistics [stat]/Applications [stat.AP] ,business.industry ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,020207 software engineering ,Sample (graphics) ,[SDV] Life Sciences [q-bio] ,Transformation (function) ,Workflow ,[SDV.IB.IMA] Life Sciences [q-bio]/Bioengineering/Imaging ,[INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV] ,FOS: Biological sciences ,Computer Science::Computer Vision and Pattern Recognition ,[INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV] ,020201 artificial intelligence & image processing ,Artificial intelligence ,Affine transformation ,Noise (video) ,business - Abstract
Correlative imaging workflows are now widely used in bioimaging and aim to image the same sample using at least two different and complementary imaging modalities. Part of the workflow relies on finding the transformation linking a source image to a target image. We are specifically interested in the estimation of registration error in point-based registration. We propose an application of multivariate linear regression to solve the registration problem, allowing us to propose a framework for the estimation of the associated error in the case of rigid and affine transformations and with anisotropic noise. These developments can be used as a decision-support tool for biologists analyzing multimodal correlative images and are available in Ec-CLEM, an open-source plugin for the Icy platform., Comment: 10 pages 2 figures (made of 10 panels in total)
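The regression viewpoint can be sketched as follows (hypothetical code; the paper's error-estimation framework is more elaborate and explicitly models anisotropic noise). Fitting an affine transformation to paired fiducial points is an ordinary multivariate least-squares problem:

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine transform mapping src points onto dst points.

    src, dst: (n, d) arrays of paired fiducial coordinates.
    Returns (A, t, residuals) with dst ~= src @ A.T + t.
    """
    n = src.shape[0]
    X = np.hstack([src, np.ones((n, 1))])           # design matrix with intercept
    coef, *_ = np.linalg.lstsq(X, dst, rcond=None)  # multivariate linear regression
    A, t = coef[:-1].T, coef[-1]
    residuals = dst - (src @ A.T + t)               # per-point registration error
    return A, t, residuals
```

The per-point residuals are the raw material for any downstream error statistics (e.g. a fiducial registration error estimate).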
- Published
- 2021
13. 3D-Aware Ellipse Prediction for Object-Based Camera Pose Estimation
- Author
-
Marie-Odile Berger, Matthieu Zins, Gilles Simon, Recalage visuel avec des modèles physiquement réalistes (TANGRAM), Inria Nancy - Grand Est, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Department of Algorithms, Computation, Image and Geometry (LORIA - ALGO), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS), Centre National de la Recherche Scientifique [CNRS], Zins, Matthieu, Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS), Augmentation visuelle d'environnements complexes (MAGRIT-POST), Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en 
Informatique et en Automatique (Inria)-Department of Algorithms, Computation, Image and Geometry (LORIA - ALGO), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS), Visual Augmentation of Complex Environments (MAGRIT), Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL), and Sciencesconf.org, CCSD
- Subjects
FOS: Computer and information sciences ,Camera pose ,I.4 ,Computer science ,Computation ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,02 engineering and technology ,[INFO] Computer Science [cs] ,010501 environmental sciences ,Ellipse ,01 natural sciences ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,détection d'objets ,0202 electrical engineering, electronic engineering, information engineering ,localisation visuelle ,Computer vision ,[INFO]Computer Science [cs] ,Pose ,0105 earth and related environmental sciences ,Ground truth ,ellipse ,business.industry ,65D19 ,Deep learning ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,object detection ,Object (computer science) ,Ellipsoid ,020201 artificial intelligence & image processing ,Augmented reality ,Artificial intelligence ,ellipsoid ,ellipsoïde ,business ,Pose de caméra - Abstract
In this paper, we propose a method for coarse camera pose computation which is robust to viewing conditions and does not require a detailed model of the scene. This method meets the growing need for easy deployment of robotics or augmented reality applications in any environment, especially those for which neither an accurate 3D model nor a large amount of ground-truth data is available. It exploits the ability of deep learning techniques to reliably detect objects regardless of viewing conditions. Previous works have also shown that abstracting the geometry of a scene of objects by an ellipsoid cloud makes it possible to compute the camera pose accurately enough for various application needs. Though promising, these approaches use the ellipses fitted to the detection bounding boxes as an approximation of the imaged objects. In this paper, we go one step further and propose a learning-based method which detects improved elliptic approximations of objects which are coherent with the 3D ellipsoid in terms of perspective projection. Experiments show that the accuracy of the computed pose significantly increases thanks to our method and is more robust to the variability of the boundaries of the detection boxes. This is achieved with very little effort in terms of training data acquisition -- a few hundred calibrated images of which only three need manual object annotation. Code and models are released at https://github.com/zinsmatt/3D-Aware-Ellipses-for-Visual-Localization., Comment: Presented at 3DV 2020. Code and models released at https://github.com/zinsmatt/3D-Aware-Ellipses-for-Visual-Localization
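The bounding-box ellipse approximation that the paper improves upon can be sketched as follows (hypothetical helper names): the baseline simply takes the axis-aligned ellipse inscribed in the detection box.

```python
import numpy as np

def box_to_ellipse(xmin, ymin, xmax, ymax):
    """Axis-aligned ellipse inscribed in a detection box: (center, semi-axes)."""
    center = np.array([(xmin + xmax) / 2.0, (ymin + ymax) / 2.0])
    axes = np.array([(xmax - xmin) / 2.0, (ymax - ymin) / 2.0])
    return center, axes

def ellipse_points(center, axes, n=100):
    """Sample n points on the ellipse boundary (useful for reprojection checks)."""
    t = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    return center + np.stack([axes[0] * np.cos(t), axes[1] * np.sin(t)], axis=1)
```

The paper's point is precisely that this inscribed ellipse is a crude stand-in for the projected 3D ellipsoid, which motivates learning better elliptic detections.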
- Published
- 2021
14. Morphological and logarithmic analysis of large image databases
- Author
-
Guillaume Noyel, Centre de Recherche en Sciences et Technologies de l'Information et de la Communication - EA 3804 (CRESTIC), Université de Reims Champagne-Ardenne (URCA), University of Strathclyde [Glasgow], Université de Reims Champagne-Ardenne, Michel Jourlin, European Project: 717108,H2020,H2020-SMEINST-1-2015,Eye Light(2016), Noyel, Guillaume, and Eye fundus colour images enhancement service for Diabetic Retinopathy diagnosis - Eye Light - - H20202016-03-01 - 2016-08-31 - 717108 - VALID
- Subjects
Artificial intelligence ,Eye fundus images ,Radiographie ,imagerie médicale ,Reconstruction 3D ,Calcul vectoriel ,Colour ,percolation ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Segmentation ,Métriques fonctionnelles d’Asplund ,Diabetic retinopathy ,Vector processing ,region homogeneity ,stereovision ,Image registration ,ACM: I.: Computing Methodologies/I.5: PATTERN RECOGNITION ,Public health ,Radial distortions ,segmentation vasculaire ,multimodal acquisition ,vessel segmentation ,[MATH.MATH-NA] Mathematics [math]/Numerical Analysis [math.NA] ,Acquisition multimodale ,Recalage d’image déformable ,Robustness to lighting variations ,Contrastes perceptuels ,Texture analysis ,low contrast images ,Accélération d’algorithmes ,Images à faibles contrastes ,Parallelisation ,large image databases ,Medical imaging ,logarithmic mathematical morphology ,Calibrage ,fond d’oeil ,[SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing ,radiography ,ACM: I.: Computing Methodologies ,[MATH.MATH-NA]Mathematics [math]/Numerical Analysis [math.NA] ,Apprentissage profond ,Contrast enhancement ,Apprentissage statistique ,rétine ,perceptual contrasts ,Morphologie mathématique logarithmique ,Logarithmic image processing ,Retina ,Recalage d’image ,rétinopathie diabétique ,Industrial control ,B-spline ,B-splines ,distorsion radiale ,Machine learning ,traitement massif d’image ,Robustesse aux variations d’éclairement ,Industrie ,functional Asplund metrics ,Industry ,Contrôle industriel ,3D reconstruction ,speed up of algorithms ,ACM: G.: Mathematics of Computing/G.1: NUMERICAL ANALYSIS ,[SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing ,ACM: I.: Computing Methodologies/I.4: IMAGE PROCESSING AND COMPUTER VISION ,grandes banques d’images ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Deep learning ,Intelligence artificielle ,calibration ,Morphologie mathématique ,massive image 
processing ,Mathematical morphology ,Parallélisation ,santé publique ,Couleur ,homographie affine ,affine homography ,Analyse de texture ,Deformable image registration ,Homogénéité de région ,Amélioration des contrastes ,Stéréovision - Abstract
With the massive use of digital photography, large databases of images have been built during the last decades, both in industry and in medicine. However, the images were generally captured under different lighting conditions, camera poses and camera types, which makes it difficult to compare images of the same scene. I have proposed methods to analyse these large databases which address these issues. They have been applied to the field of tyres, for the control of their visual aspect and the study of their performance, as well as to the field of healthcare, for the diagnosis and follow-up of diabetic retinopathy. These methods fall into the frameworks of Mathematical Morphology and Logarithmic Image Processing (LIP). In particular, thanks to academic partnerships, I have developed approaches for morphological segmentation and classification by machine learning. They have been used to analyse images of tyre surfaces (which can be textured or three-dimensional) in order to look for defects. They have also served for the automatic analysis of eye-fundus images from patients with diabetes. Because of the large amount of data to analyse, I have sped up algorithms both by reducing their complexity and by programming them efficiently. The programs can run in parallel over several processor cores or be written in vectorized form (i.e., several numbers processed at once instead of a single one). Thanks to the LIP model, I have proposed methods to improve colour contrast in retinal images. I have studied the properties of the functional Asplund metrics, which are defined either with the LIP-additive law or with the LIP-multiplicative law. The first metric, which is LIP-additive, is robust to lighting variations caused by a change in the exposure time of the camera or in the source intensity. The second one, which is LIP-multiplicative, is robust to lighting variations due to changes in the opacity of the captured object. 
These metrics are useful for pattern matching, thanks to distance maps between a reference function and an image. I have established the link between these maps of Asplund distances and Mathematical Morphology. Then, I have created the new framework of Logarithmic Mathematical Morphology, which is based on the fundamental operations of erosion and dilation defined with the LIP-additive law. This gives them the interesting property of being adaptive to lighting variations caused by a change of camera exposure time. Other operators robust to these variations have been defined in this novel framework. The latter, together with the LIP model, has allowed the introduction of segmentation methods robust to lighting variations. Percolation techniques have also been studied with logarithmic colour contrasts. In addition, I have designed and built a three-dimensional and multimodal acquisition system based on stereovision, in the visible-light and X-ray domains, for tyre imaging. The complete chain has been studied and validated, including stereoscopic acquisition, calibration of the acquisition system and 3D reconstruction. Moreover, in order to compare images of radial cuts of tyres with their 2D plans, a deformable registration method based on cubic B-splines has been successfully tested. For the longitudinal (i.e., temporal) analysis of eye-fundus images, I have introduced a superimposition model made of an affine homography (i.e., a rotation, a translation and an anisotropic scaling) and one or two corrections of radial distortions, depending on the number of cameras used for the acquisition. Compared to other state-of-the-art methods, on series of image pairs from public-health databases captured one year apart, the proposed approach gives better results. 
These research works form my contribution to the analysis of large image databases in industry or in medicine.
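For readers unfamiliar with the LIP model, the two laws underlying the metrics and operators above can be sketched as follows (standard LIP formulas with gray tones in [0, M); the function names are mine): the LIP-additive law combines two absorbing media, and LIP scalar multiplication is its iterated form.

```python
import numpy as np

M = 256.0  # upper bound of the gray-tone range in the LIP model

def lip_add(f, g):
    """LIP-additive law: f (+) g = f + g - f*g/M (superposition of filters)."""
    return f + g - (f * g) / M

def lip_scalar(lmbda, f):
    """LIP scalar multiplication: lmbda (x) f = M - M*(1 - f/M)**lmbda."""
    return M - M * (1.0 - f / M) ** lmbda
```

Adding a constant gray tone with `lip_add` models a change of camera exposure time, which is why operators built on this law inherit robustness to such lighting variations.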
- Published
- 2021
15. Where are my clothes? A multi-level approach for evaluating deep instance segmentation architectures on fashion images
- Author
-
Aurélie Bugeau, Warren Jouanneau, Nicolas Papadakis, Marc Palyart, Laurent Vezard, Lectra, Laboratoire Bordelais de Recherche en Informatique (LaBRI), Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB), Université de Bordeaux (UB), Institut de Mathématiques de Bordeaux (IMB), Université Bordeaux Segalen - Bordeaux 2-Université Sciences et Technologies - Bordeaux 1-Université de Bordeaux (UB)-Institut Polytechnique de Bordeaux (Bordeaux INP)-Centre National de la Recherche Scientifique (CNRS), and Jouanneau, Warren
- Subjects
business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Pattern recognition ,Context (language use) ,Image segmentation ,Clothing ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Pattern recognition (psychology) ,Segmentation ,Artificial intelligence ,State (computer science) ,business - Abstract
International audience; In this paper we present an extensive evaluation of instance segmentation in the context of images containing clothes. We propose a multi-level evaluation that complements the classical overlap criterion given by IoU. In particular, we quantify both the contour accuracy and the color-content accuracy of the predicted segmentation masks. We demonstrate that the proposed evaluation framework yields meaningful insights into model performance through experiments conducted on five state-of-the-art instance segmentation methods.
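A minimal sketch of such a multi-level comparison (hypothetical code, not the paper's framework): the classical mask IoU, complemented by a color-content score computed as a histogram intersection over the pixels each mask covers.

```python
import numpy as np

def mask_iou(pred, gt):
    """Classical overlap criterion between two boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

def color_histogram_similarity(image, pred, gt, bins=8):
    """Histogram intersection of the intensities covered by each mask, in [0, 1]."""
    h_pred, _ = np.histogram(image[pred], bins=bins, range=(0, 256))
    h_gt, _ = np.histogram(image[gt], bins=bins, range=(0, 256))
    h_pred = h_pred / max(h_pred.sum(), 1)  # normalize to probability vectors
    h_gt = h_gt / max(h_gt.sum(), 1)
    return float(np.minimum(h_pred, h_gt).sum())
```

A mask can have a high IoU yet miss the garment's distinctive colors (or vice versa), which is the kind of disagreement a multi-level evaluation surfaces.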
- Published
- 2021
16. Roses are Red, Violets are Blue… But Should VQA expect Them To?
- Author
-
Grigory Antipov, Christian Wolf, Corentin Kervadec, Moez Baccouche, Orange Labs [Cesson-Sévigné], Orange Labs, Extraction de Caractéristiques et Identification (imagine), Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS), Université Lumière - Lyon 2 (UL2)-École Centrale de Lyon (ECL), Université de Lyon-Université de Lyon-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-Centre National de la Recherche Scientifique (CNRS)-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Université Lumière - Lyon 2 (UL2)-École Centrale de Lyon (ECL), Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA), Université de Lyon-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-Université Lumière - Lyon 2 (UL2)-École Centrale de Lyon (ECL), Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS), Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-École Centrale de Lyon (ECL), Université de Lyon-Université Lumière - Lyon 2 (UL2)-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Université Lumière - Lyon 2 (UL2), and Kervadec, Corentin
- Subjects
FOS: Computer and information sciences ,Exploit ,business.industry ,Computer science ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,020207 software engineering ,02 engineering and technology ,Measure (mathematics) ,Visualization ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Knowledge extraction ,Metric (mathematics) ,0202 electrical engineering, electronic engineering, information engineering ,Benchmark (computing) ,Question answering ,Artificial intelligence ,business ,Diversity (business) - Abstract
Visual Question Answering (VQA) models are notorious for their tendency to rely on dataset biases. The large and unbalanced diversity of questions and concepts involved in VQA, together with the lack of high-standard annotated data, tends to prevent models from learning to `reason', leading them to perform `educated guesses' instead, relying on specific training-set statistics, which is not helpful for generalizing to real-world scenarios. In this paper, we claim that the standard evaluation metric, which consists in measuring the overall in-domain accuracy, is misleading. Since questions and concepts are unequally distributed, it tends to favor models that exploit subtle training-set statistics. Alternatively, naively evaluating generalization by introducing an artificial distribution shift between train and test splits is also not completely satisfying. First, the shifts do not reflect real-world tendencies, resulting in unsuitable models; second, since the shifts are artificially handcrafted, trained models are specifically designed for this particular setting and paradoxically do not generalize to other configurations. We propose the GQA-OOD benchmark designed to overcome these concerns: we measure and compare accuracy over both rare and frequent question-answer pairs, and argue that the former is better suited to the evaluation of reasoning abilities, which we experimentally validate with models trained to exploit biases to varying degrees. In a large-scale study involving 7 VQA models and 3 bias-reduction techniques, we also experimentally demonstrate that these models fail to address questions involving infrequent concepts, and we provide recommendations for future directions of research.
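The head/tail idea can be sketched as follows (a simplified stand-in: the benchmark's own grouping of rare vs. frequent question-answer pairs is more refined). Accuracy is reported separately on the rarest fraction of the answer vocabulary and on the rest.

```python
from collections import Counter

def head_tail_accuracy(samples, tail_fraction=0.2):
    """Split accuracy by answer frequency: (acc-tail, acc-head).

    samples: list of (answer, correct) pairs. The `tail_fraction` of the
    answer vocabulary with the lowest frequency counts defines the tail.
    """
    freq = Counter(answer for answer, _ in samples)
    ranked = sorted(freq, key=freq.get)                 # rarest answers first
    n_tail = max(1, int(len(ranked) * tail_fraction))
    tail = set(ranked[:n_tail])

    def acc(pairs):
        pairs = list(pairs)
        return sum(c for _, c in pairs) / len(pairs) if pairs else float("nan")

    return (acc((a, c) for a, c in samples if a in tail),      # acc-tail
            acc((a, c) for a, c in samples if a not in tail))  # acc-head
```

A model that answers every question with the majority answer scores well on the head but collapses on the tail, which is exactly the failure mode the overall accuracy hides.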
- Published
- 2021
17. PLOP: Learning without Forgetting for Continual Semantic Segmentation
- Author
-
Arnaud Dapogny, Yifu Chen, Arthur Douillard, Matthieu Cord, Heuritech, Machine Learning and Information Access (MLIA), LIP6, Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), Datakalab, and Douillard, Arthur
- Subjects
Scheme (programming language) ,FOS: Computer and information sciences ,Forgetting ,Computer science ,business.industry ,Deep learning ,Computer Vision and Pattern Recognition (cs.CV) ,Pooling ,Computer Science - Computer Vision and Pattern Recognition ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,02 engineering and technology ,Semantics ,030218 nuclear medicine & medical imaging ,03 medical and health sciences ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,0302 clinical medicine ,0202 electrical engineering, electronic engineering, information engineering ,Feature (machine learning) ,Entropy (information theory) ,020201 artificial intelligence & image processing ,Segmentation ,Artificial intelligence ,business ,computer ,computer.programming_language - Abstract
Deep learning approaches are nowadays ubiquitously used to tackle computer vision tasks such as semantic segmentation, requiring large datasets and substantial computational power. Continual learning for semantic segmentation (CSS) is an emerging trend that consists in updating an old model by sequentially adding new classes. However, continual learning methods are usually prone to catastrophic forgetting. This issue is further aggravated in CSS where, at each step, old classes from previous iterations are collapsed into the background. In this paper, we propose Local POD, a multi-scale pooling distillation scheme that preserves long- and short-range spatial relationships at the feature level. Furthermore, we design an entropy-based pseudo-labelling of the background w.r.t. classes predicted by the old model to deal with background shift and avoid catastrophic forgetting of the old classes. Our approach, called PLOP, significantly outperforms state-of-the-art methods in existing CSS scenarios, as well as in newly proposed challenging benchmarks. Comment: Accepted at CVPR 2021, code: https://github.com/arthurdouillard/CVPR2021_PLOP
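A pooling-distillation loss in the spirit of Local POD can be sketched as below: at several scales the feature map is divided into a grid of regions, each region is pooled along width and along height, and the new model's pooled statistics are matched to the frozen old model's. This is a hypothetical re-implementation from the abstract's description, not the authors' released code (linked above).

```python
import numpy as np

def local_pod_loss(feat_new, feat_old, scales=(1, 2, 4)):
    """Multi-scale pooling distillation: split the (N, C, H, W) feature
    maps into an s x s grid at each scale s, pool every region along
    width and along height, and penalize the squared difference between
    the new and old (frozen) models' pooled descriptors."""
    n, c, h, w = feat_new.shape
    loss = 0.0
    for s in scales:
        ph, pw = h // s, w // s
        for i in range(s):
            for j in range(s):
                new = feat_new[:, :, i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
                old = feat_old[:, :, i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
                # width-pooled and height-pooled descriptors of the region
                loss += np.mean((new.mean(axis=3) - old.mean(axis=3)) ** 2)
                loss += np.mean((new.mean(axis=2) - old.mean(axis=2)) ** 2)
    return loss / (2 * sum(s * s for s in scales))
```

Pooling before comparing (rather than matching raw activations) is what leaves the new model slack to learn the new classes while still preserving spatial statistics of the old ones.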
- Published
- 2021
18. Single-view robot pose and joint angle estimation via render & compare
- Author
-
Justin Carpentier, Josef Sivic, Mathieu Aubry, Yann Labbé, Département d'informatique - ENS Paris (DI-ENS), École normale supérieure - Paris (ENS-PSL), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS), Models of visual object recognition and scene understanding (WILLOW), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Paris (ENS-PSL), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria), Laboratoire d'Informatique Gaspard-Monge (LIGM), École des Ponts ParisTech (ENPC)-Centre National de la Recherche Scientifique (CNRS)-Université Gustave Eiffel, Czech Institute of Informatics, Robotics and Cybernetics [Prague] (CIIRC), Czech Technical University in Prague (CTU), This work was partially supported by the HPC resources from GENCI-IDRIS (Grant 011011181R1), the European Regional Development Fund under the project IMPACT (reg. no. CZ.02.1.01/0.0/0.0/15 003/0000468), Louis Vuitton ENS Chair on Artificial In-telligence, and the French government under management of Agence Nationale de la Recherche as part of the 'Investissements d’avenir' program, reference ANR-19-P3IA-0001 (PRAIRIE 3IA Institute)., ANR-19-P3IA-0001,PRAIRIE,PaRis Artificial Intelligence Research InstitutE(2019), Labbé, Yann, and PaRis Artificial Intelligence Research InstitutE - - PRAIRIE2019 - ANR-19-P3IA-0001 - P3IA - VALID
- Subjects
FOS: Computer and information sciences ,Computer science ,business.industry ,Computer Vision and Pattern Recognition (cs.CV) ,[INFO.INFO-RB] Computer Science [cs]/Robotics [cs.RO] ,Computer Science - Computer Vision and Pattern Recognition ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Context (language use) ,Degrees of freedom (mechanics) ,Synthetic data ,Visualization ,Computer Science - Robotics ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Articulated robot ,Benchmark (computing) ,[INFO.INFO-RB]Computer Science [cs]/Robotics [cs.RO] ,Robot ,Artificial intelligence ,business ,Robotics (cs.RO) ,Parametrization - Abstract
We introduce RoboPose, a method to estimate the joint angles and the 6D camera-to-robot pose of a known articulated robot from a single RGB image. This is an important problem to grant mobile and itinerant autonomous systems the ability to interact with other robots using only visual information in non-instrumented environments, especially in the context of collaborative robotics. It is also challenging because robots have many degrees of freedom and an infinite space of possible configurations that often result in self-occlusions and depth ambiguities when imaged by a single camera. The contributions of this work are three-fold. First, we introduce a new render & compare approach for estimating the 6D pose and joint angles of an articulated robot that can be trained from synthetic data, generalizes to new unseen robot configurations at test time, and can be applied to a variety of robots. Second, we experimentally demonstrate the importance of the robot parametrization for the iterative pose updates and design a parametrization strategy that is independent of the robot structure. Finally, we show experimental results on existing benchmark datasets for four different robots and demonstrate that our method significantly outperforms the state of the art. Code and pre-trained models are available on the project webpage https://www.di.ens.fr/willow/research/robopose/. Accepted at CVPR 2021 (Oral).
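The render & compare loop can be illustrated on a toy problem: render the current pose hypothesis, measure the discrepancy with the observed image, and iterate pose updates until the rendering matches the observation. The 1D "renderer" and the finite-difference update below are illustrative stand-ins; the paper instead renders the full robot model and trains a network to predict the pose update.

```python
import numpy as np

def render(pose):
    """Toy 'renderer': maps a (phase, amplitude) pose to a 1D image."""
    x = np.linspace(0.0, 1.0, 64, endpoint=False)
    return pose[1] * np.sin(2 * np.pi * (x - pose[0]))

def render_and_compare(observed, pose_init, lr=0.01, iters=500):
    """Iteratively refine the pose: render the current hypothesis,
    compare it to the observation, and step the pose to reduce the
    discrepancy (here via a numerical gradient on the image error)."""
    pose = np.array(pose_init, dtype=float)
    for _ in range(iters):
        grad = np.zeros_like(pose)
        for k in range(pose.size):
            step = np.zeros_like(pose)
            step[k] = 1e-4
            up = np.mean((render(pose + step) - observed) ** 2)
            down = np.mean((render(pose - step) - observed) ** 2)
            grad[k] = (up - down) / 2e-4
        pose -= lr * grad
    return pose
```

The choice of parametrization matters here in the same way the abstract argues: how `pose` is encoded determines how well the iterative updates behave.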
- Published
- 2021
19. Subsequent Keyframe Generation for Visual Servoing
- Author
-
Jocelyn Buisson, Nathan Crombez, Zhi Yan, Yassine Ruichek, Connaissance et Intelligence Artificielle Distribuées [Dijon] (CIAD), Université de Technologie de Belfort-Montbeliard (UTBM)-Université de Bourgogne (UB), and Crombez, Nathan
- Subjects
Rest (physics) ,Service robot ,0209 industrial biotechnology ,Computer science ,business.industry ,Generalization ,[INFO.INFO-RB] Computer Science [cs]/Robotics [cs.RO] ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,02 engineering and technology ,Object (computer science) ,Visual servoing ,Automation ,Visualization ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,020901 industrial engineering & automation ,Robustness (computer science) ,[INFO.INFO-RB]Computer Science [cs]/Robotics [cs.RO] ,Computer vision ,Artificial intelligence ,business - Abstract
International audience; In this paper, we study the problem of autonomous and reliable positioning of a camera w.r.t. an object when only the latter is known, but not the rest of the scene. We propose to combine the advantages and efficiency of a visual servoing scheme with the generalization ability of a generative adversarial network. The paper describes how to efficiently create a synthetic dataset in order to train a network that predicts an intermediate visual keyframe between two images. Subsequent predictions are used as visual features to autonomously converge towards the desired pose, even for large displacements. We show that the proposed method can be used without any prior knowledge of the scene appearance except for the object itself, while being robust to various lighting conditions and specular surfaces. We provide experimental results, both in simulation and on a real service robot platform, to validate and evaluate the effectiveness, robustness, and accuracy of our approach.
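The scheme of converging through subsequent predicted keyframes can be sketched as below. The midpoint `predict_keyframe` is a stand-in for the trained generative network, and the proportional update is a stand-in for the visual servoing control law; both names and the scalar "images" are illustrative assumptions.

```python
import numpy as np

def predict_keyframe(current, desired):
    """Stand-in for the trained network: here simply the midpoint
    between the two 'images' (the paper uses a GAN prediction)."""
    return (current + desired) / 2.0

def servo(current, desired, gain=0.5, tol=1e-3, max_iter=100):
    """Converge toward the desired view through subsequent intermediate
    keyframes, as in the abstract's scheme (toy version where an image
    is just an array of intensities)."""
    for _ in range(max_iter):
        if np.max(np.abs(current - desired)) < tol:
            break
        key = predict_keyframe(current, desired)
        # servo toward the predicted intermediate keyframe, not the
        # distant goal, which keeps each step small and feasible
        current = current + gain * (key - current)
    return current
```

Servoing toward a nearby predicted keyframe instead of the distant goal is what lets the method handle large displacements.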
- Published
- 2021
20. A 3D Omnidirectional Sensor For Mobile Robot Applications
- Author
-
Xavier Savatier, Belahcene Mazari, Rémi Boutteau, Jean-Yves Ertaud, Pôle Instrumentation, Informatique et Systèmes, Institut de Recherche en Systèmes Electroniques Embarqués (IRSEEM), Université de Rouen Normandie (UNIROUEN), Normandie Université (NU)-Normandie Université (NU)-École Supérieure d’Ingénieurs en Génie Électrique (ESIGELEC)-Université de Rouen Normandie (UNIROUEN), Normandie Université (NU)-Normandie Université (NU)-École Supérieure d’Ingénieurs en Génie Électrique (ESIGELEC), and Boutteau, Rémi
- Subjects
0209 industrial biotechnology ,business.industry ,Computer science ,Optical flow ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Mobile robot ,Robotics ,02 engineering and technology ,[INFO] Computer Science [cs] ,Simultaneous localization and mapping ,Catadioptric system ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,020901 industrial engineering & automation ,Motion estimation ,0202 electrical engineering, electronic engineering, information engineering ,Robot ,Structure from motion ,020201 artificial intelligence & image processing ,Computer vision ,[INFO]Computer Science [cs] ,Artificial intelligence ,business ,ComputingMilieux_MISCELLANEOUS - Abstract
In most of the missions a mobile robot has to achieve (intervention in hostile environments, preparation of military intervention, mapping, etc.), two main tasks have to be completed: navigation and 3D environment perception. Vision-based solutions have therefore been widely used in autonomous robotics because they provide a large amount of information useful for detection, tracking, pattern recognition and scene understanding. Nevertheless, the main limitations of this kind of system are the limited field of view and the loss of depth perception. A 360-degree field of view offers many advantages for navigation, such as easier motion estimation using specific properties of optical flow (Mouaddib, 2005) and more robust feature extraction and tracking. Interest in omnidirectional vision has therefore grown significantly over the past few years, and several methods are being explored to obtain a panoramic image: rotating cameras (Benosman & Devars, 1998), multi-camera systems and catadioptric sensors (Baker & Nayar, 1999). Catadioptric sensors, i.e. the combination of a camera and a mirror with a revolution shape, are nevertheless the only systems that can provide a panoramic image instantaneously without moving parts, and are thus well adapted to mobile robot applications. Depth perception can be retrieved using a set of images taken from at least two different viewpoints, either by moving the camera or by using several cameras at different positions. The use of camera motion to recover the geometrical structure of the scene and the camera's positions is known as Structure From Motion (SFM). Excellent results have been obtained in recent years with SFM approaches (Pollefeys et al., 2004; Nister, 2001), but with off-line algorithms that need to process all the images simultaneously.
SFM is consequently not well adapted to the exploration of an unknown environment, because the robot needs to build the map and localize itself in this map during its exploration of the world. The on-line approach, known as SLAM (Simultaneous Localization and Mapping), is one of the most active research areas in robotics, since it can provide real autonomy to a mobile robot. Some interesting results have been obtained in the last few years, principally in building 2D maps of indoor environments using laser range-finders. A survey of these algorithms can be found in the tutorials of Durrant-Whyte and Bailey (Durrant-Whyte & Bailey, 2006; Bailey & Durrant-Whyte, 2006).
- Published
- 2021
21. Robust Rational Polynomial Camera Modelling for SAR and Pushbroom Imaging
- Author
-
Gabriele Facciolo, Roger Mari, Carlo de Franchis, Roland Akiki, Jean-Michel Morel, CB - Centre Borelli - UMR 9010 (CB), Service de Santé des Armées-Institut National de la Santé et de la Recherche Médicale (INSERM)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Ecole Normale Supérieure Paris-Saclay (ENS Paris Saclay)-Université Paris Cité (UPCité), Centre de Mathématiques et de Leurs Applications (CMLA), École normale supérieure - Cachan (ENS Cachan)-Centre National de la Recherche Scientifique (CNRS), Departament de Tecnologies de la Informació i les Comunicacions, Universitat Pompeu Fabra [Barcelona] (UPF), Akiki, Roland, and Service de Santé des Armées-Institut National de la Santé et de la Recherche Médicale (INSERM)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Ecole Normale Supérieure Paris-Saclay (ENS Paris Saclay)-Université de Paris (UP)
- Subjects
Synthetic aperture radar ,FOS: Computer and information sciences ,010504 meteorology & atmospheric sciences ,Computer science ,Physics::Instrumentation and Detectors ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,0211 other engineering and technologies ,02 engineering and technology ,Rational polynomial ,01 natural sciences ,Set (abstract data type) ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,FOS: Electrical engineering, electronic engineering, information engineering ,Point (geometry) ,Computer vision ,Image sensor ,Adaptive optics ,021101 geological & geomatics engineering ,0105 earth and related environmental sciences ,business.industry ,Image and Video Processing (eess.IV) ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Electrical Engineering and Systems Science - Image and Video Processing ,Cover (topology) ,[INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV] ,[INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV] ,Satellite ,Artificial intelligence ,business - Abstract
The Rational Polynomial Camera (RPC) model can be used to describe a variety of image acquisition systems in remote sensing, notably optical and Synthetic Aperture Radar (SAR) sensors. RPC functions relate 3D to 2D coordinates and vice versa, regardless of physical sensor specificities, which has made them an essential tool for harnessing satellite images in a generic way. This article describes a terrain-independent algorithm to accurately derive an RPC model from a set of 3D-2D point correspondences, based on a regularized least squares fit. The performance of the method is assessed by varying the point correspondences and the size of the area that they cover. We test the algorithm on SAR and optical data, deriving RPCs from physical sensor models or from other RPC models after composition with corrective functions.
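The regularized least-squares fit the abstract describes can be illustrated in one variable: writing y ≈ P(x)/Q(x) with the constant coefficient of Q fixed to 1, the relation y·Q(x) = P(x) is linear in the remaining coefficients and can be solved in ridge form (AᵀA + λI)c = Aᵀy. A real RPC uses cubic polynomials in three variables per image coordinate; the sketch below only illustrates the linearization, and all names are assumptions.

```python
import numpy as np

def fit_rational(x, y, deg=3, lam=1e-6):
    """Fit y ~ P(x)/Q(x), P and Q of degree `deg`, Q's constant term
    fixed to 1, via a regularized linear least squares solve of
    y*Q(x) = P(x) in the unknown coefficients."""
    # Design matrix: [x^0..x^deg | -y*x^1..-y*x^deg]
    P_cols = np.stack([x ** k for k in range(deg + 1)], axis=1)
    Q_cols = np.stack([-y * x ** k for k in range(1, deg + 1)], axis=1)
    A = np.hstack([P_cols, Q_cols])
    # Ridge-regularized normal equations keep the (possibly
    # rank-deficient) system well posed, as in terrain-independent fits.
    c = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)
    p, q = c[:deg + 1], np.concatenate([[1.0], c[deg + 1:]])
    return p, q

def eval_rational(p, q, x):
    num = sum(pk * x ** k for k, pk in enumerate(p))
    den = sum(qk * x ** k for k, qk in enumerate(q))
    return num / den
```

The regularization matters because over-parameterized rational fits admit many equivalent coefficient sets (common factors of P and Q); the ridge term selects a stable one.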
- Published
- 2021
22. Deep learning for brain disorders: from data processing to disease treatment
- Author
-
Elina Thibeau-Sutre, Ninon Burgos, Olivier Colliot, Simona Bottani, Johann Faouzi, Algorithms, models and methods for images and signals of the human brain (ARAMIS), Sorbonne Université (SU)-Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Institut du Cerveau = Paris Brain Institute (ICM), Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Sorbonne Université (SU)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), The research leading to these results has received funding from the French government under management of Agence Nationale de la Recherche as part of the 'Investissements d'avenir' program, reference ANR-19-P3IA-0001 (PRAIRIE 3IA Institute) and reference ANR-10-IAIHU-06 (Agence Nationale de la Recherche-10-IA Institut Hospitalo-Universitaire-6), from the ICM Big Brain Theory Program (project PredictICD), and from the Abeona Foundation (project Brain@Scale)., ANR-19-P3IA-0001,PRAIRIE,PaRis Artificial Intelligence Research InstitutE(2019), Colliot, Olivier, PaRis Artificial Intelligence Research InstitutE - - PRAIRIE2019 - ANR-19-P3IA-0001 - P3IA - VALID, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Institut du Cerveau et de la Moëlle Epinière = Brain and Spine Institute (ICM), Institut National de la Santé et de la Recherche Médicale 
(INSERM)-CHU Pitié-Salpêtrière [AP-HP], Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], and Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
medicine.medical_specialty ,Neurology ,Computer science ,[SDV.IB.IMA]Life Sciences [q-bio]/Bioengineering/Imaging ,[SDV.NEU.NB]Life Sciences [q-bio]/Neurons and Cognition [q-bio.NC]/Neurobiology ,education ,[INFO.INFO-IM] Computer Science [cs]/Medical Imaging ,030218 nuclear medicine & medical imaging ,Environmental data ,Diagnosis, Differential ,03 medical and health sciences ,0302 clinical medicine ,Quality of life (healthcare) ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Medical imaging ,medicine ,[INFO.INFO-IM]Computer Science [cs]/Medical Imaging ,Humans ,Precision Medicine ,Molecular Biology ,[SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing ,Data processing ,Brain Diseases ,Modalities ,business.industry ,Deep learning ,[SDV.NEU.NB] Life Sciences [q-bio]/Neurons and Cognition [q-bio.NC]/Neurobiology ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Genomics ,Precision medicine ,Data science ,3. Good health ,Treatment Outcome ,[SDV.IB.IMA] Life Sciences [q-bio]/Bioengineering/Imaging ,[INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV] ,[INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV] ,Disease Progression ,Artificial intelligence ,Smartphone ,business ,[SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing ,030217 neurology & neurosurgery ,Information Systems - Abstract
In order to reach precision medicine and improve patients’ quality of life, machine learning is increasingly used in medicine. Brain disorders are often complex and heterogeneous, and several modalities such as demographic, clinical, imaging, genetic and environmental data have been studied to improve their understanding. Deep learning, a subfield of machine learning, provides complex algorithms that can learn from such varied data. It has become the state of the art in numerous fields, including computer vision and natural language processing, and is increasingly being applied in medicine. In this article, we review the use of deep learning for brain disorders. More specifically, we identify the main applications, the disorders concerned, and the types of architectures and data used. Finally, we provide guidelines to bridge the gap between research studies and clinical routine.
- Published
- 2021
23. CoMoGAN: continuous model-guided image-to-image translation
- Author
-
Fabio Pizzati, Pietro Cerri, Raoul de Charette, Pizzati, Fabio, Robotics & Intelligent Transportation Systems (RITS), Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), and VisLab
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Continuous modelling ,Computer science ,business.industry ,Computer Science - Artificial Intelligence ,Computer Vision and Pattern Recognition (cs.CV) ,Normalization (image processing) ,Computer Science - Computer Vision and Pattern Recognition ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Pattern recognition ,02 engineering and technology ,Translation (geometry) ,Image (mathematics) ,Machine Learning (cs.LG) ,Artificial Intelligence (cs.AI) ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Position (vector) ,Pattern recognition (psychology) ,0202 electrical engineering, electronic engineering, information engineering ,Code (cryptography) ,Image translation ,020201 artificial intelligence & image processing ,Artificial intelligence ,business - Abstract
CoMoGAN is a continuous GAN relying on the unsupervised reorganization of the target data on a functional manifold. To that end, we introduce a new Functional Instance Normalization layer and a residual mechanism, which together disentangle image content from its position on the target manifold. We rely on naive physics-inspired models to guide the training while allowing private model/translation features. CoMoGAN can be used with any GAN backbone and allows new types of image translation, such as cyclic image translation (e.g. timelapse generation) or detached linear translation. It outperforms the literature on all datasets. Our code is available at http://github.com/cv-rits/CoMoGAN. CVPR 2021 (oral).
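A layer in the spirit of Functional Instance Normalization can be sketched as a standard instance norm whose scale and shift are produced by a function of the continuous target parameter φ, so the output style moves continuously with φ. The exact layer in the paper differs; the linear mapping and all parameter names below are assumptions.

```python
import numpy as np

def functional_instance_norm(x, phi, w_gamma, b_gamma, w_beta, b_beta, eps=1e-5):
    """Per-channel instance normalization whose affine parameters are a
    (here linear) function of the continuous target position phi.
    x: (N, C, H, W) array; phi: scalar; w_*/b_*: (C,) arrays."""
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    x_norm = (x - mean) / np.sqrt(var + eps)
    # scale/shift depend on phi, so style varies continuously with it
    gamma = (w_gamma * phi + b_gamma).reshape(1, -1, 1, 1)
    beta = (w_beta * phi + b_beta).reshape(1, -1, 1, 1)
    return gamma * x_norm + beta
```

Because γ and β are functions of φ rather than fixed parameters, sweeping φ traces a continuous path of translations (e.g. a day-to-night timelapse) instead of a discrete set of styles.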
- Published
- 2021
24. Apprentissage semi-supervisé de dictionnaire et de réseaux de neurones profonds
- Author
-
Tran, Khanh-Hung, Intelligence Artificielle et Apprentissage Automatique (LI3A), Département Métrologie Instrumentation & Information (DM2I), Laboratoire d'Intégration des Systèmes et des Technologies (LIST (CEA)), Direction de Recherche Technologique (CEA) (DRT (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Direction de Recherche Technologique (CEA) (DRT (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-Laboratoire d'Intégration des Systèmes et des Technologies (LIST (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay, Astrophysique Interprétation Modélisation (AIM (UMR_7158 / UMR_E_9005 / UM_112)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Institut national des sciences de l'Univers (INSU - CNRS)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Université Paris Cité (UPCité), Université Paris-Saclay, Jean-Luc Starck, Fred Maurice Ngolè Mboula, Laboratoire d'Intégration des Systèmes et des Technologies (LIST), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Institut national des sciences de l'Univers (INSU - CNRS)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Université de Paris (UP), and STAR, ABES
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI] ,Apprentissage profond ,Apprentissage de variétés ,[INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing ,online learning ,[INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE] ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Deep learning ,[INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG] ,[INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE] ,artificial intelligence ,Dictionary Learning ,[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] ,Manifold learning ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,machine learning ,[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG] ,[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing ,Apprentissage de dictionnaire ,Adversarial learning ,Apprentissage antagoniste - Abstract
Since the 2010s, machine learning (ML) has been one of the topics attracting the most attention from scientific researchers. Many ML models have demonstrated their ability to produce excellent results in various fields such as computer vision, natural language processing and robotics. However, most of these models use supervised learning, which requires massive annotation. The objective of this thesis is therefore to study and propose semi-supervised learning approaches, which have many advantages over supervised learning. Instead of directly applying a semi-supervised classifier to the original representation of the data, we use models that integrate a representation learning stage before the classification stage, to better adapt to the non-linearity of the data. In the first part, we revisit the tools that allow us to build our semi-supervised models. We first present two types of model that include representation learning in their architecture, dictionary learning and neural networks, as well as the optimization methods for each type of model; in the case of neural networks, we also describe the problem of adversarial examples. We then present the techniques that often accompany semi-supervised learning, such as manifold learning and pseudo-labeling. In the second part, we work on dictionary learning. We outline three general steps to build a semi-supervised model from a supervised one. We then propose our semi-supervised model to deal with the classification problem, typically in the case of a low number of training samples (both labelled and unlabelled). On the one hand, we enforce the preservation of the data structure from the original space in the sparse code space (manifold learning), which acts as a regularization of the sparse codes; on the other hand, we integrate a semi-supervised classifier in the sparse code space.
In addition, we perform sparse coding for the test samples while also taking the preservation of the data structure into account. This method improves the accuracy rate compared to other existing methods. In the third part, we work on neural network models. We propose an approach called "manifold attack" which reinforces manifold learning. This approach is inspired by adversarial learning: virtual points are found that disrupt the cost function of manifold learning (by maximizing it) while the model parameters are fixed; the model parameters are then updated by minimizing this cost function while the virtual points are fixed. We also provide criteria for limiting the space to which the virtual points belong, and a method for initializing them. This approach provides not only an improvement in the accuracy rate but also significant robustness to adversarial examples. Finally, we analyze the similarities and differences, as well as the advantages and disadvantages, of dictionary learning and neural network models, and we propose some perspectives on both types of model. In the case of semi-supervised dictionary learning, we propose some techniques inspired by neural networks; for neural networks, we propose to integrate manifold attack into generative models.
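The max-min alternation described in the abstract can be sketched on a toy linear embedding: the virtual point first ascends the manifold-smoothness penalty (and is projected back into its allowed region, per the criteria limiting where virtual points may live), then the model parameters descend the same penalty. The linear model, the single-pair penalty, and the step sizes below are illustrative assumptions, not the thesis's formulation.

```python
import numpy as np

def manifold_attack_step(W, x, v, center, radius, lr_v=0.1, lr_w=0.01):
    """One alternation of a toy 'manifold attack' on a linear embedding
    f(z) = W @ z with manifold penalty L = ||f(x) - f(v)||^2:
    (1) gradient *ascent* on the virtual point v (W fixed), projected
        back into the ball that limits where virtual points may live;
    (2) gradient *descent* on the model parameters W (v fixed)."""
    # (1) attack: move v to increase the penalty
    grad_v = -2.0 * W.T @ (W @ x - W @ v)
    v = v + lr_v * grad_v
    offset = v - center
    norm = np.linalg.norm(offset)
    if norm > radius:                      # projection onto the ball
        v = center + offset * (radius / norm)
    # (2) defense: update the model to decrease the penalty
    diff = W @ x - W @ v
    grad_W = 2.0 * np.outer(diff, x - v)
    W = W - lr_w * grad_W
    return W, v
```

Alternating these two steps trains the embedding to stay smooth even at the worst-case points near the data manifold, which is where the reported robustness to adversarial examples comes from.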
- Published
- 2021
25. Active region detection in multi-spectral solar images
- Author
-
Xianghua Xie, Jean Aboudarham, Adeline Paiement, Majedaldein Almahasneh, Department of Computer Science [Swansea], Swansea University, DYNamiques de l’Information (DYNI), Laboratoire d'Informatique et Systèmes (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), Observatoire de Paris, Université Paris sciences et lettres (PSL), Laboratoire d'études spatiales et d'instrumentation en astrophysique = Laboratory of Space Studies and Instrumentation in Astrophysics (LESIA), Institut national des sciences de l'Univers (INSU - CNRS)-Observatoire de Paris, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Université Paris Cité (UPCité), Paiement, Adeline, and Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI] ,[PHYS.ASTR.IM]Physics [physics]/Astrophysics [astro-ph]/Instrumentation and Methods for Astrophysic [astro-ph.IM] ,[INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing ,Computer science ,Joint Analysis ,Multi-spectral Images ,[INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE] ,Coronal hole ,Multi spectral ,02 engineering and technology ,Space weather ,Solar Images ,[INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE] ,Image (mathematics) ,[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] ,Set (abstract data type) ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG] ,[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing ,0202 electrical engineering, electronic engineering, information engineering ,Computer vision ,Active Regions ,Modality (human–computer interaction) ,[PHYS.ASTR.SR] Physics [physics]/Astrophysics [astro-ph]/Solar and Stellar Astrophysics [astro-ph.SR] ,business.industry ,Deep learning ,Region detection ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,020207 software engineering ,[INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG] ,[PHYS.ASTR.SR]Physics [physics]/Astrophysics [astro-ph]/Solar and Stellar Astrophysics [astro-ph.SR] ,[INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV] ,[INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV] ,020201 artificial intelligence & image processing ,Artificial intelligence ,[PHYS.ASTR.IM] Physics [physics]/Astrophysics [astro-ph]/Instrumentation and Methods for Astrophysic [astro-ph.IM] ,business - Abstract
International audience; Precisely detecting solar Active Regions (AR) from multi-spectral images is a challenging yet important task for understanding solar activity and its influence on space weather. A main challenge comes from each modality capturing a different location of these 3D objects, as opposed to more traditional multi-spectral imaging scenarios where all image bands observe the same scene. We present a multi-task deep learning framework that exploits the dependencies between image bands to produce 3D AR detections, where each image band (and physical location) has its own set of results. We compare our detection method against baseline approaches for solar image analysis (multi-channel coronal hole detection, SPOCA for ARs (Verbeeck et al., 2013)) and a state-of-the-art deep learning method (Faster RCNN), and show enhanced performance in detecting ARs jointly from multiple bands.
- Published
- 2021
26. DR2S : Deep Regression with Region Selection for Camera Quality Evaluation
- Author
-
Stéphane Lathuilière, Marcelin Tworski, Salim Belkarfa, Marco Cagnazzo, Attilio Fiandrotti, Multimédia (MM), Laboratoire Traitement et Communication de l'Information (LTCI), Institut Mines-Télécom [Paris] (IMT)-Télécom Paris-Institut Mines-Télécom [Paris] (IMT)-Télécom Paris, Département Images, Données, Signal (IDS), Télécom ParisTech, DSAIDIS, and Lathuilière, Stéphane
- Subjects
FOS: Computer and information sciences ,Measure (data warehouse) ,Computer science ,business.industry ,media_common.quotation_subject ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Pattern recognition ,02 engineering and technology ,Texture (music) ,01 natural sciences ,Regression ,Image (mathematics) ,010309 optics ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Perception ,0103 physical sciences ,Pattern recognition (psychology) ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Quality (business) ,Artificial intelligence ,business ,media_common - Abstract
International audience; In this work, we tackle the problem of estimating a camera's capability to preserve fine texture details under a given lighting condition. Importantly, our texture preservation measurement should coincide with human perception. Consequently, we formulate our problem as a regression one and introduce a deep convolutional network to estimate a texture quality score. At training time, we use ground-truth quality scores provided by expert human annotators in order to obtain a subjective quality measure. In addition, we propose a region selection method to identify the image regions that are better suited to measuring perceptual quality. Finally, our experimental evaluation shows that our learning-based approach outperforms existing methods and that our region selection algorithm consistently improves the quality estimation.
- Published
- 2021
27. Generating Private Data Surrogates for Vision Related Tasks
- Author
-
Julien Rabin, Ryan Webster, Loic Simon, Frédéric Jurie, Equipe Image - Laboratoire GREYC - UMR6072, Groupe de Recherche en Informatique, Image et Instrumentation de Caen (GREYC), Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Ingénieurs de Caen (ENSICAEN), Normandie Université (NU)-Normandie Université (NU)-Université de Caen Normandie (UNICAEN), Normandie Université (NU)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Ingénieurs de Caen (ENSICAEN), Normandie Université (NU), Financement de la Région Normandie RIN NormanD’eep, IAPR, ANR-16-CE23-0006,Deep_in_France,Réseaux de neurones profonds pour l'apprentissage(2016), Rabin, Julien, and Réseaux de neurones profonds pour l'apprentissage - - Deep_in_France2016 - ANR-16-CE23-0006 - AAPG2016 - VALID
- Subjects
Information privacy ,Computer science ,media_common.quotation_subject ,Inference ,02 engineering and technology ,010501 environmental sciences ,Machine learning ,computer.software_genre ,01 natural sciences ,Surrogate data ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG] ,Classifier (linguistics) ,0202 electrical engineering, electronic engineering, information engineering ,Quality (business) ,0105 earth and related environmental sciences ,media_common ,business.industry ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,[INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG] ,Construct (python library) ,Variety (cybernetics) ,Task (computing) ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,computer - Abstract
This work was also presented at SPML19, the ICML Workshop on Security and Privacy of Machine Learning (2019-06-14), Long Beach, California, USA; International audience; With the widespread application of deep networks in industry, membership inference attacks, i.e. the ability to discern training data from a model, become increasingly problematic for data privacy. Recent work suggests that generative networks may be robust against membership attacks. In this work, we build on this observation, offering a general-purpose solution to the membership privacy problem. As the primary contribution, we demonstrate how to construct surrogate datasets, using images from GAN generators, labelled with a classifier trained on the private dataset. Next, we show this surrogate data can further be used for a variety of downstream tasks (here classification and regression), while being resistant to membership attacks. We study a variety of different GANs proposed in the literature, concluding that higher-quality GANs result in better surrogate data with respect to the task at hand.
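A minimal sketch of the surrogate-dataset construction the abstract describes (the generator, the private classifier, and the downstream least-squares model here are toy stand-ins under stated assumptions, not the paper's actual networks):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-ins: a trained GAN generator and a classifier that was
# trained on the private dataset (here both are simple toy functions).
W_gan = rng.normal(size=(8, 4))                 # latent z -> "image" features
w_private = rng.normal(size=8)                  # private classifier weights

def generator(z):
    return np.tanh(z @ W_gan.T)

def private_classifier(x):
    return (x @ w_private > 0).astype(int)

# Surrogate dataset: GAN samples labelled by the private model.  Only these
# pairs are released; the private training images themselves never are.
z = rng.normal(size=(500, 4))
x_sur = generator(z)
y_sur = private_classifier(x_sur)

# Downstream task: fit a fresh model on the surrogate data alone
# (a least-squares linear classifier, for brevity).
w_new, *_ = np.linalg.lstsq(x_sur, 2.0 * y_sur - 1.0, rcond=None)

# Agreement with the private classifier on fresh generated samples.
x_test = generator(rng.normal(size=(500, 4)))
agreement = ((x_test @ w_new > 0).astype(int)
             == private_classifier(x_test)).mean()
```

The downstream model only ever sees generated images and transferred labels, which is what makes the pipeline resistant to membership attacks on the original data.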
- Published
- 2021
28. Automatic Estimation of Self-Reported Pain by Interpretable Representations of Motion Dynamics
- Author
-
Mohamed Daoudi, Stefano Berretti, Pietro Pala, Zakia Hammal, Benjamin Szczapa, Alberto Del Bimbo, Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 (CRIStAL), Centrale Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS), Dipartimento di Sistemi e Informatica (DSI), Università degli Studi di Firenze = University of Florence [Firenze] (UNIFI), Ecole nationale supérieure Mines-Télécom Lille Douai (IMT Lille Douai), Institut Mines-Télécom [Paris] (IMT), Carnegie Mellon University [Pittsburgh] (CMU), DAOUDI, Mohamed, Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189 (CRIStAL), Ecole Centrale de Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS), Università degli Studi di Firenze = University of Florence [Firenze], and Università degli Studi di Firenze = University of Florence (UniFI)
- Subjects
FOS: Computer and information sciences ,Motion dynamics ,Rank (linear algebra) ,Computer science ,Visual analogue scale ,business.industry ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Pattern recognition ,010103 numerical & computational mathematics ,02 engineering and technology ,Computer Science::Human-Computer Interaction ,01 natural sciences ,Article ,Intensity (physics) ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Artificial intelligence ,0101 mathematics ,business - Abstract
We propose an automatic method for pain intensity measurement from video. For each video, pain intensity was measured from the dynamics of facial movement using 66 facial points. A Gram matrix formulation was used to represent the facial point trajectories on the Riemannian manifold of symmetric positive semi-definite matrices of fixed rank. Curve fitting and temporal alignment were then used to smooth the extracted trajectories. A Support Vector Regression model was then trained to encode the extracted trajectories into ten pain intensity levels consistent with the Visual Analogue Scale for pain intensity measurement. The proposed approach was evaluated using the UNBC McMaster Shoulder Pain Archive and was compared to the state-of-the-art on the same data. Using both 5-fold cross-validation and leave-one-subject-out cross-validation, our results are competitive with respect to state-of-the-art methods., Comment: accepted at ICPR 2020 Conference
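The Gram-matrix representation of facial landmarks can be sketched as follows (the toy landmark data is assumed for illustration); it shows why the representation lands on the manifold of positive semi-definite matrices of fixed rank and is invariant to rotation and translation of the face:

```python
import numpy as np

def gram_representation(landmarks):
    # Centering removes translation; G = X Xᵀ then removes rotation, since
    # (X R)(X R)ᵀ = X Xᵀ for any rotation R.  rank(G) ≤ 2 for 2D landmarks,
    # so G lies on the manifold of PSD matrices of fixed (low) rank.
    X = landmarks - landmarks.mean(axis=0)
    return X @ X.T

rng = np.random.default_rng(2)
pts = rng.normal(size=(66, 2))                    # 66 facial points, as in the paper
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
moved = pts @ R.T + np.array([3.0, -1.5])         # same face, rotated and shifted

G_ref, G_moved = gram_representation(pts), gram_representation(moved)
```

Because the two Gram matrices coincide, trajectories of such matrices capture facial dynamics rather than head pose.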
- Published
- 2021
29. Hierarchical Head Design for Object Detectors
- Author
-
Frédéric Jurie, Shivang Agarwal, Equipe Image - Laboratoire GREYC - UMR6072, Groupe de Recherche en Informatique, Image et Instrumentation de Caen (GREYC), Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Ingénieurs de Caen (ENSICAEN), Normandie Université (NU)-Normandie Université (NU)-Université de Caen Normandie (UNICAEN), Normandie Université (NU)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Ingénieurs de Caen (ENSICAEN), Normandie Université (NU), DGA RAPID-DRAAF, and Jurie, Frederic
- Subjects
Computer science ,Computer Vision ,Feature extraction ,Inference ,02 engineering and technology ,010501 environmental sciences ,01 natural sciences ,2D Object Detection ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Deep Learning ,Bounding overwatch ,0202 electrical engineering, electronic engineering, information engineering ,Computer vision ,0105 earth and related environmental sciences ,business.industry ,Deep learning ,Detector ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,object detection ,Object (computer science) ,Object detection ,Anchors ,Pattern recognition (psychology) ,020201 artificial intelligence & image processing ,Artificial intelligence ,business - Abstract
The notion of anchor plays a major role in modern detection algorithms such as Faster-RCNN [1] or the SSD detector [2]. Anchors relate the features of the last layers of the detector to bounding boxes containing objects in images. Despite their importance, the literature on object detection has not paid real attention to them. The motivation of this paper comes from the observations that (i) each anchor learns to classify and regress candidate objects independently and (ii) insufficient examples are available for each anchor in the case of small-scale datasets. This paper addresses these questions by proposing a novel hierarchical head for the SSD detector. The new design has the added advantage of no extra weights at inference time compared to the original design, while improving detector performance for small training sets. Improved performance on PASCAL-VOC and state-of-the-art performance on FlickrLogos-47 validate the method. We also show when the proposed design does not give an additional performance gain over the original design.
- Published
- 2021
30. P2D: a self-supervised method for depth estimation from polarimetry
- Author
-
Olivier Morel, Ralph Seulin, Daniel Braun, Désiré Sidibé, Marc Blanchon, Fabrice Meriaudeau, Equipe VIBOT - VIsion pour la roBOTique [ImViA EA7535 - ERL CNRS 6000] (VIBOT), Centre National de la Recherche Scientifique (CNRS)-Imagerie et Vision Artificielle [Dijon] (ImViA), Université de Bourgogne (UB)-Université de Bourgogne (UB), Informatique, BioInformatique, Systèmes Complexes (IBISC), Université d'Évry-Val-d'Essonne (UEVE)-Université Paris-Saclay, and Sidibé, Désiré
- Subjects
FOS: Computer and information sciences ,0209 industrial biotechnology ,Monocular ,Computer science ,business.industry ,Computer Vision and Pattern Recognition (cs.CV) ,Polarimetry ,Computer Science - Computer Vision and Pattern Recognition ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,02 engineering and technology ,010501 environmental sciences ,01 natural sciences ,Regularization (mathematics) ,Term (time) ,020901 industrial engineering & automation ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Specularity ,Robustness (computer science) ,Depth map ,Computer vision ,Artificial intelligence ,Transparency (data compression) ,business ,0105 earth and related environmental sciences - Abstract
Monocular depth estimation is a recurring subject in the field of computer vision. Its ability to describe scenes via a depth map, while reducing the constraints related to the formulation of perspective geometry, tends to favor its use. However, despite the constant improvement of algorithms, most methods exploit only colorimetric information. Consequently, robustness to events to which the modality is not sensitive, like specularity or transparency, is neglected. In response to this phenomenon, we propose using polarimetry as an input for a self-supervised monodepth network. Therefore, we propose exploiting polarization cues to encourage accurate reconstruction of scenes. Furthermore, we add a polarimetric regularization term to a state-of-the-art method to take specific advantage of the data. Our method is evaluated both qualitatively and quantitatively, demonstrating that the contribution of this new information as well as an enhanced loss function improves depth estimation results, especially for specular areas., 8 pages, submitted to ICPR2020 second round
- Published
- 2021
31. Polarimetric image augmentation
- Author
-
Olivier Morel, Ralph Seulin, Désiré Sidibé, Fabrice Meriaudeau, Marc Blanchon, Equipe VIBOT - VIsion pour la roBOTique [ImViA EA7535 - ERL CNRS 6000] (VIBOT), Centre National de la Recherche Scientifique (CNRS)-Imagerie et Vision Artificielle [Dijon] (ImViA), Université de Bourgogne (UB)-Université de Bourgogne (UB), Informatique, BioInformatique, Systèmes Complexes (IBISC), Université d'Évry-Val-d'Essonne (UEVE)-Université Paris-Saclay, and Sidibé, Désiré
- Subjects
FOS: Computer and information sciences ,0209 industrial biotechnology ,Augmentation procedure ,business.industry ,Computer Vision and Pattern Recognition (cs.CV) ,Deep learning ,Computer Science - Computer Vision and Pattern Recognition ,Polarimetry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,02 engineering and technology ,Image segmentation ,Convolutional neural network ,Data modeling ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,020901 industrial engineering & automation ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Computer vision ,Segmentation ,Artificial intelligence ,Specular reflection ,business - Abstract
Robotics applications in urban environments are subject to obstacles that exhibit specular reflections hampering autonomous navigation. On the other hand, these reflections are highly polarized, and this extra information can successfully be used to segment the specular areas. In nature, polarized light is obtained by reflection or scattering. Deep Convolutional Neural Networks (DCNNs) have shown excellent segmentation results, but require a significant amount of data to achieve their best performance. The lack of data is usually overcome by using augmentation methods. However, unlike RGB images, polarization images are not simply scalar (intensity) images, and standard augmentation techniques cannot be applied straightforwardly. We propose to enhance deep learning models through a regularized augmentation procedure applied to polarimetric data in order to characterize scenes more effectively under challenging conditions. We subsequently observe an average 18.1% improvement in IoU between non-augmented and regularized training procedures on real-world data., 7 pages, submitted to ICPR2020 second round
- Published
- 2021
32. ActionSpotter: Deep Reinforcement Learning Framework for Temporal Action Spotting in Videos
- Author
-
Adrien Chan-Hon-Tong, Guillaume Vaudaux-Ruth, Catherine Achard, Institut des Systèmes Intelligents et de Robotique (ISIR), Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), Sorbonne Université (SU), Perception, Interaction, Robotique sociales (PIROS), Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), Sorbonne Université, DTIS, ONERA, Université Paris Saclay (COmUE) [Palaiseau], ONERA-Université Paris Saclay (COmUE), DTIS, ONERA, Université Paris Saclay [Palaiseau], ONERA-Université Paris-Saclay, Vaudaux-Ruth, Guillaume, Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université (SU), and Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université (SU)
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI] ,FOS: Computer and information sciences ,Computer Science - Machine Learning ,Computer science ,Computer Science - Artificial Intelligence ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,02 engineering and technology ,010501 environmental sciences ,01 natural sciences ,Machine Learning (cs.LG) ,[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,0202 electrical engineering, electronic engineering, information engineering ,Reinforcement learning ,Reinforcement learning algorithm ,ComputingMilieux_MISCELLANEOUS ,0105 earth and related environmental sciences ,Ground truth ,business.industry ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Pattern recognition ,Spotting ,Artificial Intelligence (cs.AI) ,Action (philosophy) ,Pattern recognition (psychology) ,Key frame ,020201 artificial intelligence & image processing ,Artificial intelligence ,business - Abstract
Summarizing video content is an important task in many applications. This task can be defined as the computation of the ordered list of actions present in a video. Such a list could be extracted using action detection algorithms. However, it is not necessary to determine the temporal boundaries of actions to know their existence. Moreover, localizing precise boundaries usually requires dense video analysis to be effective. In this work, we propose to directly compute this ordered list by sparsely browsing the video and selecting one frame per action instance, a task known in the literature as action spotting. To do this, we propose ActionSpotter, a spotting algorithm that takes advantage of Deep Reinforcement Learning to efficiently spot actions while adapting its video browsing speed, without additional supervision. Experiments performed on the THUMOS14 and ActivityNet datasets show that our framework outperforms state-of-the-art detection methods. In particular, the spotting mean Average Precision on THUMOS14 is significantly improved from 59.7% to 65.6% while skipping 23% of the video.
- Published
- 2021
33. Kernel-based Graph Convolutional Networks
- Author
-
Hichem Sahbi, Machine Learning and Information Access (MLIA), LIP6, Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), and Sahbi, Hichem
- Subjects
Theoretical computer science ,Computational complexity theory ,Computer science ,02 engineering and technology ,010501 environmental sciences ,Overfitting ,01 natural sciences ,Kernel (linear algebra) ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG] ,0202 electrical engineering, electronic engineering, information engineering ,Representation (mathematics) ,0105 earth and related environmental sciences ,action recognition ,business.industry ,Deep learning ,Sorting ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,kernel machines ,[INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG] ,Graph ,Kernel (image processing) ,020201 artificial intelligence & image processing ,Node (circuits) ,Artificial intelligence ,graph convolutional networks ,business ,Reproducing kernel Hilbert space - Abstract
International audience; Learning graph convolutional networks (GCNs) is an emerging field which aims at generalizing deep learning to arbitrary non-regular domains. Most of the existing GCNs follow a neighborhood aggregation scheme, where the representation of a node is recursively obtained by aggregating its neighboring node representations using averaging or sorting operations. However, these operations are either ill-posed or too weak to be discriminant, or they increase the number of training parameters and thereby the computational complexity and the risk of overfitting. In this paper, we introduce a novel GCN framework that achieves spatial graph convolution in a reproducing kernel Hilbert space. The latter makes it possible to design, via implicit kernel representations, convolutional graph filters in a high-dimensional and more discriminating space without increasing the number of training parameters. The particularity of our GCN model also resides in its ability to achieve convolutions without explicitly realigning nodes in the receptive fields of the learned graph filters with those of the input graphs, thereby making convolutions permutation-agnostic and well defined. Experiments conducted on the challenging task of skeleton-based action recognition show the superiority of the proposed method against different baselines as well as the related work.
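A rough, simplified sketch of the idea of kernel-based graph convolution (an illustrative construction, not the paper's exact formulation): the filter is parameterized by coefficients over kernel evaluations, so it lives in the RKHS induced by the kernel rather than in the raw feature space.

```python
import numpy as np

def rbf_gram(X, gamma=0.5):
    # K[i, j] = exp(-gamma * ||x_i - x_j||^2): implicit RKHS feature map
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_graph_conv(A, X, alpha, gamma=0.5):
    # One spatial graph convolution: neighborhood mean-aggregation applied to
    # kernel evaluations instead of raw features; the learned parameters are
    # the coefficients `alpha`, independent of the RKHS dimension.
    A_hat = A + np.eye(len(A))                  # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))    # mean aggregation
    K = rbf_gram(X, gamma)
    return np.tanh(D_inv @ A_hat @ K @ alpha)

# toy 4-node path graph with 2-dim node features and 3 output channels
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], float)
X = np.arange(8.0).reshape(4, 2)
alpha = np.full((4, 3), 0.1)                    # filter coefficients
H = kernel_graph_conv(A, X, alpha)
```

Note that the number of parameters in `alpha` depends only on the number of kernel anchors and output channels, never on the (possibly infinite) dimension of the kernel's feature space.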
- Published
- 2021
34. Incorporating depth information into few-shot semantic segmentation
- Author
-
Désiré Sidibé, Fabrice Meriaudeau, Olivier Morel, Yifei Zhang, Equipe VIBOT - VIsion pour la roBOTique [ImViA EA7535 - ERL CNRS 6000] (VIBOT), Centre National de la Recherche Scientifique (CNRS)-Imagerie et Vision Artificielle [Dijon] (ImViA), Université de Bourgogne (UB)-Université de Bourgogne (UB), Informatique, BioInformatique, Systèmes Complexes (IBISC), Université d'Évry-Val-d'Essonne (UEVE)-Université Paris-Saclay, Joint MSc in VIsion and RoBOTics [VIBOT] (Master VIBOT), Université de Bourgogne (UB), and Sidibé, Désiré
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI] ,Artificial neural network ,Computer science ,business.industry ,[INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,020206 networking & telecommunications ,02 engineering and technology ,Image segmentation ,Semantics ,Visualization ,[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing ,Metric (mathematics) ,0202 electrical engineering, electronic engineering, information engineering ,Embedding ,RGB color model ,Segmentation ,Computer vision ,Artificial intelligence ,business - Abstract
International audience; Few-shot segmentation presents a significant challenge for semantic scene understanding under limited supervision. Namely, this task aims at generalizing the segmentation ability of the model to new categories given a few samples. In order to obtain complete scene information, we extend the RGB-centric methods to take advantage of complementary depth information. In this paper, we propose a two-stream deep neural network based on metric learning. Our method, known as RDNet, learns class-specific prototype representations within RGB and depth embedding spaces, respectively. The learned prototypes provide effective semantic guidance on the corresponding RGB and depth query image, leading to more accurate performance. Moreover, we build a novel outdoor scene dataset, known as Cityscapes-3i, using labeled RGB images and depth images from the Cityscapes dataset. We also perform ablation studies to explore the effective use of depth information in few-shot segmentation tasks. Experiments on Cityscapes-3i show that our method achieves excellent results with visual and complementary geometric cues from only a few labeled examples.
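A minimal sketch of the prototype mechanism described in the abstract, for a single hypothetical RGB feature stream (all names and the random embeddings are illustrative assumptions; in a two-stream design, an analogous depth stream would yield depth prototypes to combine with these):

```python
import numpy as np

def masked_avg_prototype(features, mask):
    # class prototype = average of the pixel embeddings under the support mask
    w = mask[..., None]
    return (features * w).sum(axis=(0, 1)) / max(float(w.sum()), 1e-8)

def segment_by_prototypes(features, protos):
    # label each query pixel by its most cosine-similar class prototype
    f = features / (np.linalg.norm(features, axis=-1, keepdims=True) + 1e-8)
    p = protos / (np.linalg.norm(protos, axis=-1, keepdims=True) + 1e-8)
    return (f @ p.T).argmax(axis=-1)

rng = np.random.default_rng(3)
feat = rng.normal(size=(8, 8, 16))                 # hypothetical support embeddings
mask_fg = np.zeros((8, 8))
mask_fg[2:6, 2:6] = 1                              # support annotation
protos = np.stack([masked_avg_prototype(feat, 1 - mask_fg),   # background
                   masked_avg_prototype(feat, mask_fg)])      # foreground
pred = segment_by_prototypes(feat, protos)         # query = support, for brevity
```

Because the prototypes are computed from only the few annotated support pixels, no per-class weights need to be trained for a new category.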
- Published
- 2021
35. Exploring Deep Registration Latent Spaces
- Author
-
Théophraste Henry, Enzo Battistella, Marie-Pierre Revel, Maria Vakalopoulou, Marvin Lerousseau, Amaury Leroy, Nikos Paragios, Guillaume Chassagnon, Théo Estienne, Stergios Christodoulidis, Eric Deutsch, ESTIENNE, Théo, Mathématiques et Informatique pour la Complexité et les Systèmes (MICS), CentraleSupélec-Université Paris-Saclay, Radiothérapie Moléculaire et Innovation Thérapeutique (RaMo-IT), Institut Gustave Roussy (IGR)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Université Paris-Saclay, OPtimisation Imagerie et Santé (OPIS), Inria Saclay - Ile de France, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre de vision numérique (CVN), Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Université Paris-Saclay-CentraleSupélec-Université Paris-Saclay, TheraPanacea [Paris], Service de Radiologie [CHU Cochin], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Hôpital Cochin [AP-HP], and Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI] ,Computer Science - Machine Learning ,Computer science ,Computer Science - Artificial Intelligence ,Physics::Medical Physics ,Computer Science - Computer Vision and Pattern Recognition ,[INFO.INFO-IM] Computer Science [cs]/Medical Imaging ,[SDV.CAN]Life Sciences [q-bio]/Cancer ,Space (mathematics) ,Field (computer science) ,[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] ,Deep Learning-based Medical Image Registration ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,[SDV.CAN] Life Sciences [q-bio]/Cancer ,Simple (abstract algebra) ,Encoding (memory) ,Deformable Registration ,[INFO.INFO-IM]Computer Science [cs]/Medical Imaging ,Interpretability ,Basis (linear algebra) ,business.industry ,Deep learning ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Pattern recognition ,Explainability ,Deep neural networks ,Artificial intelligence ,business - Abstract
Explainability of deep neural networks is one of the most challenging and interesting problems in the field. In this study, we investigate the topic focusing on the interpretability of deep learning-based registration methods. In particular, with the appropriate model architecture and using a simple linear projection, we decompose the encoding space, generating a new basis, and we empirically show that this basis captures various decomposed anatomically aware geometrical transformations. We perform experiments using two different datasets focusing on lungs and hippocampus MRI. We show that such an approach can decompose the highly convoluted latent spaces of registration pipelines into an orthogonal space with several interesting properties. We hope that this work can shed some light on a better understanding of deep learning-based registration methods., Comment: 13 pages, 5 figures + 3 figures in supplementary materials. Accepted to DART 2021 workshop
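A minimal sketch of decomposing a latent space with a simple linear projection, assuming hypothetical latent codes; here SVD supplies the orthogonal basis, standing in for the paper's learned decomposition:

```python
import numpy as np

rng = np.random.default_rng(4)
# hypothetical latent codes produced by a registration encoder (samples x dim)
Z = rng.normal(size=(200, 16)) @ rng.normal(size=(16, 16))

# simple linear projection: SVD of the centered codes gives an orthonormal
# basis of latent directions; in the paper's spirit, individual directions
# would align with interpretable geometric transformations
Zc = Z - Z.mean(axis=0)
U, S, Vt = np.linalg.svd(Zc, full_matrices=False)
basis = Vt                                 # orthonormal rows

coords = Zc @ basis.T                      # codes expressed in the new basis
recon = coords @ basis + Z.mean(axis=0)    # lossless reconstruction
```

Walking along a single row of `basis` in latent space and decoding would then probe what one decomposed direction encodes.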
- Published
- 2021
36. Random walkers on morphological trees: A segmentation paradigm
- Author
-
Barbara Romaniuk, Nicolas Passat, Stephanie Servagi-Vernat, Francisco Javier Alvarez Padilla, Dimitri Papathanassiou, Benoît Naegel, D. Morland, Centre de Recherche en Sciences et Technologies de l'Information et de la Communication - EA 3804 (CRESTIC), Université de Reims Champagne-Ardenne (URCA), Universidad de Guadalajara, Laboratoire des sciences de l'ingénieur, de l'informatique et de l'imagerie (ICube), Institut National des Sciences Appliquées - Strasbourg (INSA Strasbourg), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Strasbourg (UNISTRA)-Centre National de la Recherche Scientifique (CNRS)-École Nationale du Génie de l'Eau et de l'Environnement de Strasbourg (ENGEES)-Réseau nanophotonique et optique, Centre National de la Recherche Scientifique (CNRS)-Université de Strasbourg (UNISTRA)-Université de Haute-Alsace (UHA) Mulhouse - Colmar (Université de Haute-Alsace (UHA))-Centre National de la Recherche Scientifique (CNRS)-Université de Strasbourg (UNISTRA)-Université de Haute-Alsace (UHA) Mulhouse - Colmar (Université de Haute-Alsace (UHA))-Matériaux et nanosciences d'Alsace (FMNGE), Institut de Chimie du CNRS (INC)-Université de Strasbourg (UNISTRA)-Université de Haute-Alsace (UHA) Mulhouse - Colmar (Université de Haute-Alsace (UHA))-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-Institut de Chimie du CNRS (INC)-Université de Strasbourg (UNISTRA)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS), Département de Radiothérapie, Institut Jean Godinot, Département de Médecine Nucléaire, Institut Jean Godinot, ANR-11-INBS-0006,FLI,France Life Imaging(2011), École Nationale du Génie de l'Eau et de l'Environnement de Strasbourg (ENGEES)-Université de Strasbourg (UNISTRA)-Institut National des Sciences Appliquées - Strasbourg (INSA Strasbourg), Institut 
National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Les Hôpitaux Universitaires de Strasbourg (HUS)-Centre National de la Recherche Scientifique (CNRS)-Matériaux et Nanosciences Grand-Est (MNGE), Université de Strasbourg (UNISTRA)-Université de Haute-Alsace (UHA) Mulhouse - Colmar (Université de Haute-Alsace (UHA))-Institut National de la Santé et de la Recherche Médicale (INSERM)-Institut de Chimie du CNRS (INC)-Centre National de la Recherche Scientifique (CNRS)-Université de Strasbourg (UNISTRA)-Université de Haute-Alsace (UHA) Mulhouse - Colmar (Université de Haute-Alsace (UHA))-Institut National de la Santé et de la Recherche Médicale (INSERM)-Institut de Chimie du CNRS (INC)-Centre National de la Recherche Scientifique (CNRS)-Réseau nanophotonique et optique, Université de Strasbourg (UNISTRA)-Université de Haute-Alsace (UHA) Mulhouse - Colmar (Université de Haute-Alsace (UHA))-Centre National de la Recherche Scientifique (CNRS)-Université de Strasbourg (UNISTRA)-Centre National de la Recherche Scientifique (CNRS), Passat, Nicolas, and Infrastructures - France Life Imaging - - FLI2011 - ANR-11-INBS-0006 - INBS - VALID
- Subjects
Maximally stable extremal regions ,Watershed ,Tree of shapes ,Computer science ,PET/CT ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,02 engineering and technology ,Mathematical morphology ,Region-based attributes ,01 natural sciences ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Segmentation ,Component-tree ,Random walker algorithm ,Artificial Intelligence ,Cut ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,010306 general physics ,Multimodality ,Pixel ,business.industry ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Pattern recognition ,Directed graph ,Image segmentation ,Graph ,Vertex (geometry) ,[INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV] ,Computer Science::Computer Vision and Pattern Recognition ,[INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV] ,Signal Processing ,020201 artificial intelligence & image processing ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Software - Abstract
International audience; The problem of image segmentation is often considered in the framework of graphs. In this context, two main paradigms exist: in the first, the vertices of an undirected graph represent the pixels (leading, e.g., to the watershed, random walker or graph cut approaches); in the second, the vertices of a directed graph represent connected regions, leading to the so-called morphological trees (e.g. the component-trees or the trees of shapes). Various approaches have been proposed for carrying out segmentation from images modeled by such morphological trees, by computing cuts of these trees or by selecting relevant nodes based on descriptive attributes. In this article, we propose a new way of carrying out segmentation from morphological trees. Our approach takes advantage of the morphological tree of an image, enriched with multiple attributes at each node, by using the maximally stable extremal region and random walker paradigms to define an optimal cut leading to a final segmentation. Experiments carried out on multimodal medical images emphasize the potential relevance of this approach.
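The threshold decomposition behind component-trees can be illustrated on a 1D signal: each upper level set splits into connected components, and the components nest as the level increases, which yields the tree; nodes can then be selected from descriptive attributes. A minimal pure-Python sketch (not the authors' implementation; the area filter is only an example attribute):

```python
def level_components(signal, level):
    """Connected components (runs) of the upper level set {i : signal[i] >= level}."""
    comps, cur = [], []
    for i, v in enumerate(signal):
        if v >= level:
            cur.append(i)
        else:
            if cur:
                comps.append(tuple(cur))
            cur = []
    if cur:
        comps.append(tuple(cur))
    return comps

signal = [1, 3, 2, 0, 4, 4, 1]
# Components nest across increasing levels; the nesting relation is the component-tree.
tree = {lvl: level_components(signal, lvl) for lvl in range(1, max(signal) + 1)}
# Attribute-based node selection, e.g. keep components of area (length) >= 2:
selected = {lvl: [c for c in comps if len(c) >= 2] for lvl, comps in tree.items()}
```

Each component at level `l` is contained in exactly one component at level `l - 1`, which is what makes the tree well defined.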
- Published
- 2021
37. A CNN cloud detector for panchromatic satellite images
- Author
-
Charles Hessel, Jérémy Anger, R. Grompone, Jean-Michel Morel, Mariano Rodríguez, Gabriele Facciolo, C. de Franchis, Rodríguez, Mariano, Centre de Mathématiques et de Leurs Applications (CMLA), and École normale supérieure - Cachan (ENS Cachan)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
010504 meteorology & atmospheric sciences ,Exploit ,Computer science ,single-band ,0211 other engineering and technologies ,[INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE] ,Cloud computing ,02 engineering and technology ,[INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE] ,01 natural sciences ,Convolutional neural network ,Cloud detector ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Computer vision ,Time series ,Astrophysics::Galaxy Astrophysics ,021101 geological & geomatics engineering ,0105 earth and related environmental sciences ,business.industry ,Detector ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Spectral bands ,Panchromatic film ,panchromatic ,Satellite ,Artificial intelligence ,business ,CNN ,satellite images - Abstract
International audience; Cloud detection is a crucial step for automatic satellite image analysis. Some cloud detection methods exploit specially designed spectral bands; others base the detection on time series, or on the inter-band delay in push-broom satellites. Nevertheless, many use cases occur where these methods do not apply. This paper describes a convolutional neural network for cloud detection in panchromatic, single-frame images. Only a per-image annotation is required, indicating which images contain clouds and which are cloud-free. Our experiments show that, in spite of using less information, the proposed method produces competitive results.
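The weak supervision described above (a per-image cloudy/cloud-free label driving a pixel-scoring network) can be sketched with a simple pooling step. The max-pooling aggregation and the toy logit map below are illustrative assumptions, not the paper's exact architecture:

```python
import math

def image_probability(pixel_scores):
    """Aggregate per-pixel cloud logits into one image-level probability.
    Max pooling: the image is 'cloudy' if its most cloud-like pixel is."""
    m = max(max(row) for row in pixel_scores)
    return 1.0 / (1.0 + math.exp(-m))

def bce_loss(prob, label):
    """Binary cross-entropy against the per-image annotation (1 = cloudy)."""
    eps = 1e-12
    return -(label * math.log(prob + eps) + (1 - label) * math.log(1 - prob + eps))

scores = [[-3.0, -1.0], [0.5, 2.0]]   # toy 2x2 logit map from a CNN
p = image_probability(scores)          # sigmoid of the strongest pixel logit
loss = bce_loss(p, 1)                  # trainable with only the image-level label
```

Because only the pooled score is supervised, the network still learns a per-pixel score map as a by-product, which is the appeal of this kind of weak annotation.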
- Published
- 2021
38. Multiscale Attention-Based Prototypical Network For Few-Shot Semantic Segmentation
- Author
-
Yifei Zhang, Désiré Sidibé, Fabrice Meriaudeau, Olivier Morel, Sidibé, Désiré, Equipe VIBOT - VIsion pour la roBOTique [ImViA EA7535 - ERL CNRS 6000] (VIBOT), Centre National de la Recherche Scientifique (CNRS)-Imagerie et Vision Artificielle [Dijon] (ImViA), Université de Bourgogne (UB)-Université de Bourgogne (UB), Informatique, BioInformatique, Systèmes Complexes (IBISC), and Université d'Évry-Val-d'Essonne (UEVE)-Université Paris-Saclay
- Subjects
business.industry ,Computer science ,Deep learning ,Feature extraction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Pattern recognition ,02 engineering and technology ,Image segmentation ,010501 environmental sciences ,Semantics ,01 natural sciences ,Image (mathematics) ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Minimum bounding box ,Feature (computer vision) ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Segmentation ,Artificial intelligence ,business ,0105 earth and related environmental sciences - Abstract
International audience; Deep learning-based image understanding techniques require a large number of labeled images for training. Few-shot semantic segmentation, in contrast, aims at generalizing the segmentation ability of the model to new categories given only a few labeled samples. To tackle this problem, we propose a novel prototypical network (MAPnet) with multiscale feature attention. To fully exploit the representative features of target classes, we first extract rich contextual information from labeled support images via a multiscale feature enhancement module. The prototypes learned from the support features provide further semantic guidance on the query image. We then adaptively integrate multiple similarity-guided probability maps via an attention mechanism, yielding an optimal pixel-wise prediction. The proposed method was validated on the PASCAL-5i dataset in terms of 1-way N-shot evaluation. We also test the model with weak annotations, including scribble and bounding box annotations. Both the qualitative and quantitative results demonstrate the advantages of our approach over other state-of-the-art methods.
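The prototype computation at the core of such networks can be sketched as masked average pooling over support features, followed by a cosine-similarity match on query pixels. This is a generic prototypical-network sketch with assumed toy feature maps, not the MAPnet code:

```python
import math

def masked_average_prototype(features, mask):
    """Average the support feature vectors over the annotated (mask == 1) pixels."""
    dim = len(features[0][0])
    proto, n = [0.0] * dim, 0
    for frow, mrow in zip(features, mask):
        for f, m in zip(frow, mrow):
            if m:
                proto = [p + x for p, x in zip(proto, f)]
                n += 1
    return [p / n for p in proto]

def cosine(u, v):
    """Cosine similarity between two feature vectors (assumed non-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Toy 2x2 support feature map (2-D features per pixel) and its binary class mask
support = [[[1.0, 0.0], [0.9, 0.1]],
           [[0.0, 1.0], [0.1, 0.9]]]
mask = [[1, 1], [0, 0]]
proto = masked_average_prototype(support, mask)
# Each query pixel is then scored by its similarity to the prototype:
sim = cosine(proto, [1.0, 0.0])
```

In the multiscale variant, one such similarity map is produced per scale and the maps are fused by attention; the sketch above shows a single scale.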
- Published
- 2021
39. Deep Learning for Semantic Segmentation
- Author
-
Alexandre Benoit, Patrick Lambert, Emna Amri, Badih Ghattas, Joris Fournel, Laboratoire d'Informatique, Systèmes, Traitement de l'Information et de la Connaissance (LISTIC), Université Savoie Mont Blanc (USMB [Université de Savoie] [Université de Chambéry]), Institut de Mathématiques de Marseille (I2M), Aix Marseille Université (AMU)-École Centrale de Marseille (ECM)-Centre National de la Recherche Scientifique (CNRS), Benois-Pineau, Jenny, Zemmari, Akka, and Benoit, Alexandre
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI] ,[INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing ,Computer science ,[INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE] ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,02 engineering and technology ,[INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE] ,Machine learning ,computer.software_genre ,[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] ,Task (project management) ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing ,[STAT.ML]Statistics [stat]/Machine Learning [stat.ML] ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Segmentation ,[INFO.INFO-MM] Computer Science [cs]/Multimedia [cs.MM] ,Class (computer programming) ,Machine learning / Deep learning approaches ,business.industry ,Deep learning ,Search engine indexing ,[INFO.INFO-MM]Computer Science [cs]/Multimedia [cs.MM] ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Image segmentation ,Object (computer science) ,[STAT.ML] Statistics [stat]/Machine Learning [stat.ML] ,semantic segmentation ,Object detection ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,computer - Abstract
Segmentation is a fundamental problem but not the ultimate goal: it is a stepping stone to higher-level application problems. It consists in associating each low-level image pixel with the class it locally represents. This task complements image analysis tasks such as visual scene classification and instance-level object detection. It enables high-level applications in a variety of domains, from image and video indexing to autonomous vehicle driving and medical image analysis. Recently, deep learning approaches have pushed image segmentation and object instance segmentation into a new era with impressive performance levels. However, several challenges have to be faced to train these approaches effectively for each case study: few training samples, specific data, strong target imbalance, and so on. This chapter reviews the image segmentation task and recent advanced strategies for facing these potential issues in a variety of application domains. Current challenges are highlighted for future research directions.
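One common remedy for the strong target imbalance mentioned above is inverse-frequency class weighting in the segmentation loss. A minimal sketch (this weighting scheme is a standard choice, not one prescribed by the chapter):

```python
from collections import Counter

def inverse_frequency_weights(label_map):
    """Per-class weights inversely proportional to pixel frequency, so that
    rare classes contribute as much to the loss as dominant ones."""
    counts = Counter(l for row in label_map for l in row)
    total = sum(counts.values())
    return {c: total / (len(counts) * n) for c, n in counts.items()}

labels = [[0, 0, 0, 1],
          [0, 0, 0, 0]]                 # class 1 covers only 1 of 8 pixels
w = inverse_frequency_weights(labels)   # rare class 1 gets a much larger weight
```

These weights would typically multiply the per-pixel cross-entropy terms; smoothed variants (e.g. median-frequency balancing) are also common.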
- Published
- 2021
40. Suivi multi-objets basé sur des tracklets dans un réseau de caméras
- Author
-
Dorai, Yosra, STAR, ABES, Institut Pascal (IP), Centre National de la Recherche Scientifique (CNRS)-Université Clermont Auvergne (UCA)-Institut national polytechnique Clermont Auvergne (INP Clermont Auvergne), Université Clermont Auvergne (UCA)-Université Clermont Auvergne (UCA), Université Clermont Auvergne, Université de Sousse (Tunisie), Frédéric Chausse, and Najoua Essoukri Ben Amara
- Subjects
Artificial intelligence ,Apprentissage profond ,Suivi ,Re-identification ,Systèmes de transport intelligents ,Ré-identification ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG] ,Image processing ,Systèmes de vision ,Vision par ordinateur ,Vision systems ,Intelligent transport systems ,Apparence ,Tracklet ,Tracking ,Description globale ,Traitement d’images ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Appearance ,Deep learning ,[INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG] ,Intelligence artificielle ,Classification ,Réseaux de neurones convolutifs ,Computer vision ,Convolutional neural networks ,Global description ,LSTM - Abstract
Today, cameras invade our lives: they are used in more and more fields and installed everywhere in public and private environments, particularly for video surveillance, in order to identify people or vehicles not only with a single camera but across the whole network. Exploiting data from a camera network has become a necessity nowadays to address security problems or even simple inspection. These vision systems, which can be based on artificial intelligence algorithms, present topical challenges. This PhD deals with multi-object tracking and focuses on re-identification in a network of cameras. The challenge is to determine the position of an object relative to the field of view of each camera. It becomes particularly complex due to the change of object appearance from one camera to another, the variation of luminosity, the angle of view, and so on. The objective of this work is to propose a reliable solution for re-identifying objects in a network of cameras that is robust to the different complexities of re-identification. In this context, we propose to exploit the history of the detected object and to form a set of detection nodes which we call a "tracklet". This is a new approach inspired by various works in the state of the art. Our tracklet-based contributions cover two phases: tracking within a camera and re-identification across a network of cameras.
In order to optimize the performance of tracking and re-identification algorithms, a trusted detection system is needed. In the literature, convolutional neural networks (CNNs) have been very successful in multi-object detection, so we used a deep learning method to detect objects. Although this method has a high detection rate, it also produces false positives and false negatives, especially when the test set differs from the training set. This led us to propose our original tracklet-based tracking method, which corrects detection defects and improves tracking quality. Our strategy first builds tracklets from detections by signature comparison, then builds trajectories from the association of tracklets. However, these trajectories can present breaks due to missed detections. An update step fills them through an interpolation process that reconstitutes the non-detected objects.
As for re-identification, our contributions concern, on the one hand, increasing the volume of training data. The neural network that re-identifies a trajectory in each camera requires a large volume of data for its training; for some cameras this volume may be too small, hence the need to generate additional data. Our contribution is to generate new samples from tracklets detected in another camera with an auto-encoder (GAN), which allows us to transfer only true positives automatically, without verification. At the detector level in re-identification, objects are described by parts, which makes it possible to recognize them afterwards even if they do not appear completely in another camera. On the other hand, our contribution concerns the comparison of tracklets coming from the different cameras of the network. The proposed improvements have been evaluated on public and private datasets. The results achieved by our approaches show performance comparable to that of existing systems.
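The update step that reconstitutes non-detected objects can be sketched as linear interpolation of bounding boxes across detection gaps in a trajectory. This is an illustrative reconstruction, not the thesis code, and `fill_gaps` is a hypothetical helper name:

```python
def fill_gaps(track):
    """Linearly interpolate missing bounding boxes (None) inside a trajectory.
    Boxes are (x1, y1, x2, y2) tuples; gaps between two detections are filled
    frame by frame, mimicking the update step of the tracklet-based tracker."""
    track = list(track)
    known = [i for i, b in enumerate(track) if b is not None]
    for a, b in zip(known, known[1:]):
        for t in range(a + 1, b):
            alpha = (t - a) / (b - a)
            track[t] = tuple((1 - alpha) * u + alpha * v
                             for u, v in zip(track[a], track[b]))
    return track

# Object detected at frames 0 and 3, missed at frames 1 and 2
boxes = [(0.0, 0.0, 10.0, 10.0), None, None, (30.0, 0.0, 40.0, 10.0)]
filled = fill_gaps(boxes)   # frame 1 ≈ (10, 0, 20, 10), frame 2 ≈ (20, 0, 30, 10)
```

Gaps before the first or after the last detection are left as `None`, since extrapolation is a different (and riskier) operation than interpolation.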
- Published
- 2021
41. Deep convolutional neural networks for scene understanding and motion planning for self-driving vehicles
- Author
-
Loukkal, Abdelhak, STAR, ABES, Heuristique et Diagnostic des Systèmes Complexes [Compiègne] (Heudiasyc), Université de Technologie de Compiègne (UTC)-Centre National de la Recherche Scientifique (CNRS), Université de Technologie de Compiègne, and Yves Grandvalet
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI] ,Artificial intelligence ,Lidar ,[INFO.INFO-RB] Computer Science [cs]/Robotics [cs.RO] ,Autonomous vehicles ,Algorithmes de navigation ,[INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE] ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Pattern recognition systems ,Robotics ,[INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE] ,[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] ,Multisensor data fusion ,Robot vision ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,[INFO.INFO-RB]Computer Science [cs]/Robotics [cs.RO] ,Perception ,Computer vision ,Neural networks - Abstract
During this thesis, perception approaches for self-driving vehicles were developed using deep convolutional neural networks applied to monocular camera images and High-Definition map (HD-map) rasterized images. We focused on camera-only solutions instead of leveraging sensor fusion with range sensors, because cameras are the most cost-effective and discrete sensors. The objective was also to show that camera-based approaches can perform on a par with LiDAR-based solutions on certain 3D vision tasks. Real-world data was used for training and evaluating the developed approaches, but simulation was also leveraged when annotated data was lacking, or for safety reasons when evaluating driving capabilities. Cameras provide visual information in a projective space where the perspective effect does not preserve the homogeneity of distances. Scene understanding tasks such as semantic segmentation are therefore often operated in the camera-view space and then projected to 3D using a precise depth sensor such as a LiDAR. Having this scene understanding in the 3D space is useful because vehicles evolve in the 3D world and navigation algorithms reason in this space. Our focus was then to leverage geometric knowledge about the camera parameters and its position in the 3D world to develop an approach that allows scene understanding in the 3D space using only a monocular image as input.
Neural networks have also proven useful for more than just perception and are increasingly used for the navigation and planning tasks that build on the perception outputs. Being able to output 3D scene understanding information from a monocular camera has also allowed us to explore the possibility of an end-to-end holistic neural network that takes a camera image as input, extracts intermediate semantic information in the 3D space, and then plans the vehicle's trajectory.
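The geometric back-projection from a monocular image to the 3D ground plane can be sketched with a pinhole model: the ray through a pixel is scaled until it meets the ground plane below the camera. The flat-ground, zero-pitch setup and the toy intrinsics below are simplifying assumptions, not the thesis's actual model:

```python
def pixel_to_ground(u, v, fx, fy, cx, cy, cam_height):
    """Back-project pixel (u, v) onto the ground plane for a forward-looking
    camera at height cam_height, in camera coordinates (x right, y down,
    z forward): the ground is the plane y = cam_height."""
    # Direction of the viewing ray through the pixel (pinhole model)
    dx, dy, dz = (u - cx) / fx, (v - cy) / fy, 1.0
    if dy <= 0:
        return None                 # pixel at or above the horizon: no ground hit
    t = cam_height / dy             # scale so the ray reaches y = cam_height
    return (t * dx, t * dz)         # (lateral offset, forward distance) in meters

# Toy intrinsics: f = 1000 px, principal point (960, 540), camera 1.5 m high
pt = pixel_to_ground(960, 740, 1000, 1000, 960, 540, 1.5)   # ≈ (0.0, 7.5)
```

This inverse perspective mapping is why distances can be reasoned about homogeneously once the scene understanding lives in the 3D (bird's-eye-view) space.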
- Published
- 2021
42. Multilevel Survival Modeling with Structured Penalties for Disease Prediction from Imaging Genetics data
- Author
-
Olivier Colliot, Pascal Lu, Algorithms, models and methods for images and signals of the human brain (ARAMIS), Sorbonne Université (SU)-Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Institut du Cerveau et de la Moëlle Epinière = Brain and Spine Institute (ICM), Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], Sorbonne Université-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], Sorbonne Université-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), The research leading to these results has received funding from the French government under management of Agence Nationale de la Recherche as part of the 'Investissements d’avenir' program, reference ANR-19-P3IA-0001(PRAIRIE 3IA Institute) and reference ANR-10-IAIHU-06 (Agence Nationalede la Recherche-10-IA Institut Hospitalo-Universitaire-6) and from the INRIAProject Lab Program (project Neuromarkers)., 
ANR-19-P3IA-0001,PRAIRIE,PaRis Artificial Intelligence Research InstitutE(2019), Colliot, Olivier, PaRis Artificial Intelligence Research InstitutE - - PRAIRIE2019 - ANR-19-P3IA-0001 - P3IA - VALID, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Institut du Cerveau = Paris Brain Institute (ICM), Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Sorbonne Université (SU)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], and Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
Cox Proportional Hazards ,Computer science ,Imaging genetics ,[SDV.IB.IMA]Life Sciences [q-bio]/Bioengineering/Imaging ,[SDV.NEU.NB]Life Sciences [q-bio]/Neurons and Cognition [q-bio.NC]/Neurobiology ,[INFO.INFO-IM] Computer Science [cs]/Medical Imaging ,02 engineering and technology ,computer.software_genre ,030218 nuclear medicine & medical imaging ,[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI] ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,0302 clinical medicine ,Health Information Management ,0202 electrical engineering, electronic engineering, information engineering ,Additive model ,ComputingMilieux_MISCELLANEOUS ,Multilevel model ,Brain ,Magnetic Resonance Imaging ,3. Good health ,Computer Science Applications ,[INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV] ,[INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV] ,020201 artificial intelligence & image processing ,[SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing ,Algorithms ,Biotechnology ,Survival model ,[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI] ,Machine learning ,03 medical and health sciences ,Neuroimaging ,Alzheimer Disease ,[INFO.INFO-IM]Computer Science [cs]/Medical Imaging ,Humans ,Cognitive Dysfunction ,Alzheimer’s Disease ,Electrical and Electronic Engineering ,[SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing ,Modality (human–computer interaction) ,Modalities ,business.industry ,Proportional hazards model ,[SDV.NEU.NB] Life Sciences [q-bio]/Neurons and Cognition [q-bio.NC]/Neurobiology ,Group Lasso penalty ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,[SDV.IB.IMA] Life Sciences [q-bio]/Bioengineering/Imaging ,Proximal Gradient Methods ,Artificial intelligence ,business ,computer - Abstract
International audience; This paper introduces a framework for disease prediction from multimodal genetic and imaging data. We propose a multilevel survival model which allows predicting the time of occurrence of a future disease state in patients initially exhibiting mild symptoms. This new multilevel setting allows modeling the interactions between genetic and imaging variables. This is in contrast with classical additive models, which treat all modalities in the same manner and can result in undesirable elimination of specific modalities when their contributions are unbalanced. Moreover, the use of a survival model allows overcoming the limitations of previous approaches based on classification, which consider a fixed time frame. Furthermore, we introduce specific penalties taking into account the structure of the different types of data, such as a group lasso penalty over the genetic modality and an ℓ2 penalty over the imaging modality. Finally, we propose a fast optimization algorithm based on a proximal gradient method. The approach was applied to the prediction of Alzheimer's disease (AD) among patients with mild cognitive impairment (MCI), based on genetic (single nucleotide polymorphisms, SNPs) and imaging (anatomical MRI measures) data from the ADNI database. The experiments demonstrate the effectiveness of the method for predicting the time of conversion to AD. It revealed how genetic variants and brain imaging alterations interact in the prediction of future disease status. The approach is generic and could potentially be useful for the prediction of other diseases.
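The structured penalties can be illustrated by their proximal operators, which are the per-iteration updates inside a proximal gradient method: block soft-thresholding for the group lasso (genetic modality) and uniform shrinkage for the ℓ2 penalty (imaging modality). A sketch with toy values, not the paper's implementation:

```python
import math

def prox_group_lasso(groups, lam, step):
    """Block soft-thresholding: prox of lam * sum_g ||w_g||_2.
    Each group is shrunk toward zero; groups with small norm vanish entirely,
    which is how the group lasso selects or discards whole SNP groups."""
    out = []
    for g in groups:
        norm = math.sqrt(sum(w * w for w in g))
        scale = max(0.0, 1.0 - lam * step / norm) if norm > 0 else 0.0
        out.append([scale * w for w in g])
    return out

def prox_l2_squared(w, lam, step):
    """Prox of (lam/2) * ||w||_2^2: uniform shrinkage, no exact zeros,
    so the imaging modality is regularized but never eliminated."""
    return [x / (1.0 + lam * step) for x in w]

groups = [[3.0, 4.0], [0.1, 0.1]]       # two toy SNP groups
shrunk = prox_group_lasso(groups, lam=1.0, step=1.0)
# ||[3, 4]|| = 5 -> kept and scaled by 0.8 ; ||[0.1, 0.1]|| < 1 -> zeroed out
```

The contrast between the two operators mirrors the paper's motivation: sparsity at the group level for genetics, smooth shrinkage for imaging, so neither modality is eliminated by the other.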
- Published
- 2021
43. Similarity Metric Learning
- Author
-
Christophe Garcia, Stefan Duffner, Atilla Baskurt, Khalid Idrissi, Extraction de Caractéristiques et Identification (imagine), Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS), Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-École Centrale de Lyon (ECL), Université de Lyon-Université Lumière - Lyon 2 (UL2)-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Université Lumière - Lyon 2 (UL2), and Duffner, Stefan
- Subjects
Computer Science::Machine Learning ,0209 industrial biotechnology ,Computer science ,[INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE] ,02 engineering and technology ,[INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE] ,Machine learning ,computer.software_genre ,020901 industrial engineering & automation ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Semantic similarity ,Similarity (network science) ,[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG] ,0202 electrical engineering, electronic engineering, information engineering ,Image retrieval ,Artificial neural network ,business.industry ,Deep learning ,Supervised learning ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,[INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG] ,[INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV] ,[INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV] ,Metric (mathematics) ,020201 artificial intelligence & image processing ,Artificial intelligence ,Tuple ,business ,computer - Abstract
International audience; Similarity metric learning models the general semantic similarities and distances between objects and classes of objects (e.g. persons) in order to recognise them. Different strategies and models based on Deep Learning exist and generally consist in learning a non-linear projection into a lower-dimensional vector space where the semantic similarity between instances can be easily measured with a standard distance. As opposed to supervised learning, one does not train the model to predict the class labels, and the actual labels may not even be used or may not be known in advance. Machine learning-based similarity metric learning approaches rather operate in a weakly supervised way: the training target (loss) is defined on the relationship between several instances, i.e. similar or different pairs, triplets or tuples. This learnt distance can then be applied, for example, to two new, unseen examples of unknown classes in order to determine if they belong to the same class or if they are similar. There exist numerous applications for metric learning, such as face or speaker verification, image retrieval, human activity recognition or person re-identification in images. In this chapter, an overview of the principal methods and models used for similarity metric learning with neural networks is given, describing the most common architectures, loss functions and training algorithms.
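The triplet loss mentioned in this abstract is the most common example of a training target defined on the relationship between instances rather than on class labels. A minimal numpy sketch of it (the chapter itself covers many variants; this is just the standard Euclidean form):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss on embedding vectors: the anchor should be
    closer (in Euclidean distance) to the positive than to the negative,
    by at least `margin`; otherwise a penalty proportional to the
    violation is incurred."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)
```

Note that no class labels appear: only the similar/dissimilar relationship between the three instances is used, which is exactly the weak supervision the abstract describes.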
- Published
- 2021
44. Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation
- Author
-
Hugo Germain, Vincent Lepetit, Guillaume Bourmaud, Laboratoire d'Informatique Gaspard-Monge (LIGM), École des Ponts ParisTech (ENPC)-Centre National de la Recherche Scientifique (CNRS)-Université Gustave Eiffel, École des Ponts ParisTech (ENPC), and Lepetit, Vincent
- Subjects
FOS: Computer and information sciences ,Source code ,Computer science ,media_common.quotation_subject ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,02 engineering and technology ,030218 nuclear medicine & medical imaging ,03 medical and health sciences ,0302 clinical medicine ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG] ,Robustness (computer science) ,0202 electrical engineering, electronic engineering, information engineering ,Computer vision ,Pose ,media_common ,business.industry ,Deep learning ,Reprojection error ,Estimator ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,[INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG] ,Memory management ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Feature learning - Abstract
International audience; Absolute camera pose estimation is usually addressed by sequentially solving two distinct subproblems: first a feature matching problem that seeks to establish putative 2D-3D correspondences, and then a Perspective-n-Point problem that minimizes, w.r.t. the camera pose, the sum of so-called Reprojection Errors (RE). We argue that generating putative 2D-3D correspondences 1) leads to an important loss of information that needs to be compensated as far as possible, within RE, through the choice of a robust loss and the tuning of its hyperparameters, and 2) may lead to an RE that conveys erroneous data to the pose estimator. In this paper, we introduce the Neural Reprojection Error (NRE) as a substitute for RE. NRE allows rethinking the camera pose estimation problem by merging it with the feature learning problem, hence leveraging richer information than 2D-3D correspondences and eliminating the need for choosing a robust loss and its hyperparameters. Thus NRE can be used as a training loss to learn image descriptors tailored for pose estimation. We also propose a coarse-to-fine optimization method able to very efficiently minimize a sum of NRE terms w.r.t. the camera pose. We experimentally demonstrate that NRE is a good substitute for RE, as it significantly improves both the robustness and the accuracy of the camera pose estimate while being highly efficient in both computation and memory. From a broader point of view, we believe this new way of merging deep learning and 3D geometry may be useful in other computer vision applications. Source code and model weights will be made available at hugogermain.com/nre.
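For reference, the classical RE that NRE replaces is the quantity minimized in the Perspective-n-Point step: project each 3D point with the candidate pose and sum the squared 2D residuals against the observed keypoints. A minimal pinhole-camera sketch (not the paper's NRE, just the baseline it argues against):

```python
import numpy as np

def reproject(K, R, t, X):
    """Pinhole projection of 3D points X (N, 3) under intrinsics K and pose (R, t)."""
    Xc = X @ R.T + t               # world frame -> camera frame
    xh = Xc @ K.T                  # homogeneous image coordinates
    return xh[:, :2] / xh[:, 2:3]  # perspective division

def reprojection_error(K, R, t, X, x_obs):
    """Classical RE: sum of squared 2D residuals over all putative
    2D-3D correspondences (here with a plain quadratic loss; in practice
    a robust loss with tuned hyperparameters is substituted)."""
    r = reproject(K, R, t, X) - x_obs
    return float(np.sum(r ** 2))
```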
- Published
- 2021
45. Sparse LiDAR and Stereo Fusion (SLS-Fusion) for Depth Estimation and 3D Object Detection
- Author
-
Pierre Duthon, Sergio A. Velastin, Louahdi Khoudour, Nguyen Anh Minh Mai, Alain Crouzil, CROUZIL, Alain, Institution of Engineering and Technology (IET), CoMputational imagINg anD viSion (IRIT-MINDS), Institut de recherche en informatique de Toulouse (IRIT), Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse III - Paul Sabatier (UT3), Université Fédérale Toulouse Midi-Pyrénées-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique (Toulouse) (Toulouse INP), Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées, Centre d'Etudes et d'Expertise sur les Risques, l'Environnement, la Mobilité et l'Aménagement - Equipe-projet STI (Cerema Equipe-projet STI), Centre d'Etudes et d'Expertise sur les Risques, l'Environnement, la Mobilité et l'Aménagement (Cerema), School of Electronic Engineering and Computer Science (EECS), Queen Mary University of London (QMUL), and Carlos III University of Madrid
- Subjects
Autonomous vehicle ,Computer science ,LiDAR stereo fusion ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Depth completion ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,11. Sustainability ,Computer vision ,Informática ,Artificial neural network ,business.industry ,Pseudo lidar ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,2D to 3D conversion ,Pseudo LiDAR ,Object detection ,3D object detection ,Lidar ,Fuse (electrical) ,RGB color model ,Lidar stereo fusion ,Artificial intelligence ,business ,Focus (optics) ,Stereo camera - Abstract
Proceedings of: 11th International Conference on Pattern Recognition Systems (ICPRS-21), conference paper, 17-19 March 2021, Universidad de Talca, Curicó, Chile. The ability to accurately detect and localize objects is recognized as the most important requirement for the perception of self-driving cars. From 2D to 3D object detection, the most difficult part is determining the distance from the ego-vehicle to objects. Expensive technology like LiDAR can provide precise and accurate depth information, so most studies have tended to focus on this sensor, showing a performance gap between LiDAR-based and camera-based methods. Although many authors have investigated how to fuse LiDAR with RGB cameras, as far as we know there are no studies that fuse LiDAR and stereo in a deep neural network for the 3D object detection task. This paper presents SLS-Fusion, a new approach to fuse data from a 4-beam LiDAR and a stereo camera via a neural network for depth estimation, to achieve better dense depth maps and thereby improve 3D object detection performance. Since a 4-beam LiDAR is cheaper than the well-known 64-beam LiDAR, this approach is also classified as a low-cost-sensor-based method. Through evaluation on the KITTI benchmark, it is shown that the proposed method significantly improves depth estimation performance compared to a baseline method. Also, when applying it to 3D object detection, a new state of the art among low-cost-sensor-based methods is achieved.
- Published
- 2021
46. Radiological classification of dementia from anatomical MRI assisted by machine learning-derived maps
- Author
-
Béatrice Marro, Marc Teichmann, Paul Beunon, Olivier Colliot, Jorge Samper-González, Pierre Chagué, Alexandre Morin, Didier Dormont, Marion Houot, Sarah Fadili, Stéphane Epelbaum, Bruno Dubois, Lionel Arrivé, Colliot, Olivier, PaRis Artificial Intelligence Research InstitutE - - PRAIRIE2019 - ANR-19-P3IA-0001 - P3IA - VALID, Institut de Neurosciences Translationnelles de Paris - - IHU-A-ICM2010 - ANR-10-IAHU-0006 - IAHU - VALID, Data-driven models for Progression Of Neurological Disease - EuroPOND - - H2020 Pilier Societal Challenges2016-01-01 - 2019-12-31 - 666992 - VALID, Algorithms, models and methods for images and signals of the human brain (ARAMIS), Sorbonne Université (SU)-Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Institut du Cerveau = Paris Brain Institute (ICM), Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Sorbonne Université (SU)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), Service de Radiologie [CHU Saint-Antoine], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-CHU Saint-Antoine [AP-HP], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Sorbonne Université (SU), Service de Neurologie [CHU Pitié-Salpêtrière], IFR70-CHU Pitié-Salpêtrière [AP-HP], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) 
(AP-HP)-Sorbonne Université (SU), Institut du Cerveau = Paris Brain Institute (ICM), Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Sorbonne Université (SU)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), Institut de la Mémoire et de la Maladie d'Alzheimer [CHU Pitié-Salpétriêre] (IM2A), CHU Pitié-Salpêtrière [AP-HP], Service de Neuroradiologie [CHU Pitié-Salpêtrière], FRONTLAB: Fonctions et dysfonctions de systèmes frontaux [ICM Paris] (FRONTlab), The research leading to these results has received funding from the French government under management of Agence Nationale de la Recherche as part of the 'Investissements d'avenir' program, reference ANR-19-P3IA-0001 (PRAIRIE 3IA Institute)and reference ANR-10-IAIHU-06 (Agence Nationale de la Recherche-10-IA Institut Hospitalo-Universitaire-6),from the European Union H2020 program (project EuroPOND, grant number 666992), and from the Abeona Foundation (project Brain@Scale)., ANR-19-P3IA-0001,PRAIRIE,PaRis Artificial Intelligence Research InstitutE(2019), ANR-10-IAHU-0006,IHU-A-ICM,Institut de Neurosciences Translationnelles de Paris(2010), European Project: 666992,H2020 Pilier Societal Challenges,H2020-PHC-2015-two-stage,EuroPOND(2016), Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Institut du Cerveau et de la Moëlle Epinière = Brain and Spine Institute (ICM), Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne 
Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), Service de neurologie 1 [CHU Pitié-Salpétrière], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-CHU Pitié-Salpêtrière [AP-HP], Institut du Cerveau et de la Moëlle Epinière = Brain and Spine Institute (ICM), Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), Institut de la Mémoire et de la Maladie d'Alzheimer [Paris] (IM2A), Sorbonne Université (SU), FRONTlab - Systèmes frontaux : fonctions et dysfonctions (FRONTlab), CHU Saint-Antoine [AP-HP], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU), Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP], Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP), and Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
medicine.medical_specialty ,Computer science ,diagnosis ,[SDV.IB.IMA]Life Sciences [q-bio]/Bioengineering/Imaging ,[SDV.NEU.NB]Life Sciences [q-bio]/Neurons and Cognition [q-bio.NC]/Neurobiology ,education ,[INFO.INFO-IM] Computer Science [cs]/Medical Imaging ,Diagnostic accuracy ,030218 nuclear medicine & medical imaging ,03 medical and health sciences ,0302 clinical medicine ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Alzheimer Disease ,Machine learning ,medicine ,[INFO.INFO-IM]Computer Science [cs]/Medical Imaging ,Humans ,Dementia ,Radiology, Nuclear Medicine and imaging ,Medical physics ,[SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing ,Radiological and Ultrasound Technology ,medicine.diagnostic_test ,[SDV.NEU.NB] Life Sciences [q-bio]/Neurons and Cognition [q-bio.NC]/Neurobiology ,Brain ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Magnetic resonance imaging ,Alzheimer's disease ,medicine.disease ,Clinical routine ,artificial intelligence ,Magnetic Resonance Imaging ,Support vector machine ,anatomical MRI ,Workflow ,[SDV.IB.IMA] Life Sciences [q-bio]/Bioengineering/Imaging ,[INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV] ,Frontotemporal Dementia ,Radiological weapon ,[INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV] ,Neurology (clinical) ,[SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing ,030217 neurology & neurosurgery ,Frontotemporal dementia - Abstract
Background and purpose Many artificial intelligence tools are currently being developed to assist diagnosis of dementia from magnetic resonance imaging (MRI). However, these tools have so far been difficult to integrate into the clinical routine workflow. In this work, we propose a new, simple way to use them and assess their utility for improving diagnostic accuracy. Materials and methods We studied 34 patients with early-onset Alzheimer's disease (EOAD), 49 with late-onset AD (LOAD), 39 with frontotemporal dementia (FTD) and 24 with depression from the pre-existing cohort CLIN-AD. Support vector machine (SVM) automatic classifiers using 3D T1 MRI were trained to distinguish: LOAD vs. Depression, FTD vs. LOAD, EOAD vs. Depression, EOAD vs. FTD. We extracted SVM weight maps, which are tridimensional representations of the discriminant atrophy patterns used by the classifier to make its decisions, and we printed posters of these maps. Four radiologists (2 senior neuroradiologists and 2 unspecialized junior radiologists) performed a visual classification of the 4 diagnostic pairs using 3D T1 MRI. Classifications were performed twice: first with standard radiological reading and then using the SVM weight maps as a guide. Results Diagnostic performance was significantly improved by the use of the weight maps for the two junior radiologists in the case of FTD vs. EOAD, with an improvement of over 10 percentage points of diagnostic accuracy. Conclusion This tool can improve the diagnostic accuracy of junior radiologists and could be integrated into the clinical routine workflow.
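The "weight map" idea rests on a property of linear SVMs: the learned weight vector has one coefficient per voxel, so it can be reshaped back into image space and visualised as a discriminant pattern. A toy numpy sketch of that idea (a minimal subgradient-descent SVM, not the pipeline actually used in the study):

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Tiny linear SVM trained by subgradient descent on the regularized
    hinge loss. Labels y must be in {-1, +1}; returns the weight vector,
    with one coefficient per input feature (voxel)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        margins = y * (X @ w)
        mask = margins < 1.0              # samples violating the margin
        grad = lam * w
        if mask.any():
            grad = grad - (y[mask, None] * X[mask]).mean(axis=0)
        w -= lr * grad
    return w

def weight_map(w, shape):
    """Reshape voxel-wise weights back into image space: the kind of
    discriminant map that can be printed as a visual reading guide."""
    return w.reshape(shape)
```

In this toy example only feature 0 is informative, so after training the map is concentrated on that single voxel, which is exactly what makes such maps readable as atrophy patterns.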
- Published
- 2021
47. Learning topology: bridging computational topology and machine learning
- Author
-
Davide Moroni, Maria Antonietta Pascali, Istituto di Scienza e Tecnologie dell'Informazione 'A. Faedo', and Moroni, Davide
- Subjects
Computer science ,data analysis ,Data analysis ,02 engineering and technology ,0102 computer and information sciences ,Machine learning ,computer.software_genre ,Topology ,01 natural sciences ,Computational topology ,010305 fluids & plasmas ,symbols.namesake ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,image and shape analysis ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Persistent homology ,0101 mathematics ,Topology (chemistry) ,business.industry ,Deep learning ,010102 general mathematics ,deep learning ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Compendium ,010201 computation theory & mathematics ,Pattern recognition (psychology) ,Euler's formula ,symbols ,020201 artificial intelligence & image processing ,Topological data analysis ,persistent homology ,Artificial intelligence ,Computer Vision and Pattern Recognition ,business ,Image and shape analysis ,·machine learning ,computer - Abstract
The attached file is the postprint version of the published paper.; International audience; Topology is a classical branch of mathematics, born essentially from Euler's studies in the XVIII century, which deals with the abstract notion of shape and geometry. The last decades were characterized by a renewed interest in topology and topology-based tools, due to the birth of computational topology and Topological Data Analysis (TDA). A large and novel family of methods and algorithms computing topological features and descriptors (e.g. persistent homology) has proved to be an effective tool for the analysis of graphs, 3D objects, 2D images, and even heterogeneous datasets. This survey is intended to be a concise but complete compendium that offers the essential basic references and allows the reader to orient themselves among the recent advances in TDA and its applications, with an eye to those related to machine learning and deep learning.
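Persistent homology itself needs a dedicated library, but the most elementary topological descriptor TDA builds upon, the Euler characteristic of a simplicial complex, fits in a few lines and illustrates what "computing topological features" means (example is ours, not from the survey):

```python
def euler_characteristic(simplices):
    """Euler characteristic of a simplicial complex: the alternating sum
    V - E + F - ... of simplex counts per dimension. Each simplex is a
    tuple of vertices; its dimension is len(s) - 1."""
    counts = {}
    for s in simplices:
        dim = len(s) - 1
        counts[dim] = counts.get(dim, 0) + 1
    return sum((-1) ** d * c for d, c in counts.items())
```

A hollow triangle (3 vertices, 3 edges) has characteristic 0, like the circle it is homeomorphic to; filling in the face raises it to 1, like a disk. Topological invariants of this kind are what TDA descriptors generalise across a filtration.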
- Published
- 2021
48. Supervised quality evaluation of binary partition trees for object segmentation
- Author
-
Jimmy Francky Randrianasoa, Eric Desjardin, Nicolas Passat, Camille Kurtz, Nathalie Bednarek, François Rousseau, Pierre Gançarski, Pierre Cettour-Janet, Laboratoire Hubert Curien (LHC), Institut d'Optique Graduate School (IOGS)-Université Jean Monnet - Saint-Étienne (UJM)-Centre National de la Recherche Scientifique (CNRS), Centre de Recherche en Sciences et Technologies de l'Information et de la Communication - EA 3804 (CRESTIC), Université de Reims Champagne-Ardenne (URCA), Laboratoire d'Informatique Paris Descartes (LIPADE - EA 2517), Université Paris Descartes - Paris 5 (UPD5), Laboratoire des sciences de l'ingénieur, de l'informatique et de l'imagerie (ICube), École Nationale du Génie de l'Eau et de l'Environnement de Strasbourg (ENGEES)-Université de Strasbourg (UNISTRA)-Institut National des Sciences Appliquées - Strasbourg (INSA Strasbourg), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Les Hôpitaux Universitaires de Strasbourg (HUS)-Centre National de la Recherche Scientifique (CNRS)-Matériaux et Nanosciences Grand-Est (MNGE), Université de Strasbourg (UNISTRA)-Université de Haute-Alsace (UHA) Mulhouse - Colmar (Université de Haute-Alsace (UHA))-Institut National de la Santé et de la Recherche Médicale (INSERM)-Institut de Chimie du CNRS (INC)-Centre National de la Recherche Scientifique (CNRS)-Université de Strasbourg (UNISTRA)-Université de Haute-Alsace (UHA) Mulhouse - Colmar (Université de Haute-Alsace (UHA))-Institut National de la Santé et de la Recherche Médicale (INSERM)-Institut de Chimie du CNRS (INC)-Centre National de la Recherche Scientifique (CNRS)-Réseau nanophotonique et optique, Université de Strasbourg (UNISTRA)-Université de Haute-Alsace (UHA) Mulhouse - Colmar (Université de Haute-Alsace (UHA))-Centre National de la Recherche Scientifique (CNRS)-Université de Strasbourg (UNISTRA)-Centre National de la Recherche 
Scientifique (CNRS), Service de médecine néonatale et réanimation pédiatrique, CHU de Reims, Laboratoire de Traitement de l'Information Medicale (LaTIM), Université de Brest (UBO)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre Hospitalier Régional Universitaire de Brest (CHRU Brest)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Institut Brestois Santé Agro Matière (IBSAM), Université de Brest (UBO), Département lmage et Traitement Information (IMT Atlantique - ITI), IMT Atlantique (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT), ANR-15-CE23-0009,MAIA,Analyse multiphysiques fondée sur l'imagerie pour la compréhension du développement cérébral des prématurés(2015), ANR-17-CE23-0015,TIMES,Exploitation de masses de données hétérogènes à haute fréquence temporelle pour l'analyse des changements environnementaux(2017), ANR-18-CE23-0025,HIATUS,Images aériennes historiques pour la caractérisation des transformations des territoires(2018), Laboratoire Hubert Curien [Saint Etienne] (LHC), Université Jean Monnet [Saint-Étienne] (UJM)-Centre National de la Recherche Scientifique (CNRS)-Institut d'Optique Graduate School (IOGS), Institut National des Sciences Appliquées - Strasbourg (INSA Strasbourg), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Strasbourg (UNISTRA)-Centre National de la Recherche Scientifique (CNRS)-École Nationale du Génie de l'Eau et de l'Environnement de Strasbourg (ENGEES)-Réseau nanophotonique et optique, Centre National de la Recherche Scientifique (CNRS)-Université de Strasbourg (UNISTRA)-Université de Haute-Alsace (UHA) Mulhouse - Colmar (Université de Haute-Alsace (UHA))-Centre National de la Recherche Scientifique (CNRS)-Université de Strasbourg (UNISTRA)-Université de Haute-Alsace (UHA) Mulhouse - Colmar (Université de Haute-Alsace (UHA))-Matériaux et 
nanosciences d'Alsace (FMNGE), Institut de Chimie du CNRS (INC)-Université de Strasbourg (UNISTRA)-Université de Haute-Alsace (UHA) Mulhouse - Colmar (Université de Haute-Alsace (UHA))-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-Institut de Chimie du CNRS (INC)-Université de Strasbourg (UNISTRA)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS), Université de Brest (UBO)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre Hospitalier Régional Universitaire de Brest (CHRU Brest)-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Passat, Nicolas, Analyse multiphysiques fondée sur l'imagerie pour la compréhension du développement cérébral des prématurés - - MAIA2015 - ANR-15-CE23-0009 - AAPG2015 - VALID, Exploitation de masses de données hétérogènes à haute fréquence temporelle pour l'analyse des changements environnementaux - - TIMES2017 - ANR-17-CE23-0015 - AAPG2017 - VALID, and APPEL À PROJETS GÉNÉRIQUE 2018 - Images aériennes historiques pour la caractérisation des transformations des territoires - - HIATUS2018 - ANR-18-CE23-0025 - AAPG2018 - VALID
- Subjects
Computer science ,Supervised quality evaluation ,02 engineering and technology ,Mathematical morphology ,01 natural sciences ,Hierarchical image model ,Set (abstract data type) ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Artificial Intelligence ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Segmentation ,010306 general physics ,Representation (mathematics) ,business.industry ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Pattern recognition ,Binary partition tree ,Object (computer science) ,Tree (data structure) ,[INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV] ,[INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV] ,Signal Processing ,Metric (mathematics) ,020201 artificial intelligence & image processing ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Software ,Object segmentation - Abstract
International audience; The binary partition tree (BPT) allows for the hierarchical representation of images in a multiscale way, by providing a tree of nodes corresponding to image regions. In particular, cuts of a BPT can be interpreted as segmentations of the associated image. Building the BPT of an image then constitutes a relevant preliminary step for optimization-based segmentation methods. A wide literature has been devoted to the construction of BPTs and their involvement in such segmentation tasks. Comparatively, there exist few works dedicated to evaluating the quality of BPTs, i.e. their ability to allow further segmentation methods to compute good results. We propose such a framework for evaluating the quality of a BPT with respect to the object segmentation problem, i.e. the segmentation of one or several objects from an image. This framework is supervised, since the notion of segmentation quality depends not only on the application but also on the user's objectives, expressed via the chosen ground-truth and quality metric. We develop two sides within this framework. First, we propose an intrinsic quality analysis, which relies on the structural coherence of the BPT with respect to ground-truth. More precisely, we evaluate to what extent the BPT structure matches such examples, in a set-theoretic / combinatorial fashion. Second, we propose an extrinsic analysis, by allowing the user to assess the quality of a BPT based on chosen metrics that correspond to the desired properties of the subsequent segmentation. In particular, we evaluate to what extent a BPT can provide good results with respect to such metrics while handling the trade-off with the cardinality of the cuts.
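A very reduced sketch of the kind of supervised quality question asked here: given the pixel sets covered by BPT nodes and a ground-truth object, how well does the best single node match the object under a chosen metric (IoU in this toy; the paper's actual framework is considerably richer and also scores cuts, not just nodes):

```python
def iou(a, b):
    """Intersection-over-union between two pixel sets, a common
    segmentation quality metric."""
    return len(a & b) / len(a | b)

def best_node_score(nodes, ground_truth):
    """Best IoU that any single BPT node (given as its set of covered
    pixels) achieves against the ground-truth object. A well-built tree
    should contain a node scoring close to 1."""
    return max(iou(set(n), ground_truth) for n in nodes)
```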
- Published
- 2021
49. How Transferable are Reasoning Patterns in VQA?
- Author
-
Grigory Antipov, Theo Jaunet, Moez Baccouche, Romain Vuillemot, Corentin Kervadec, Christian Wolf, Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS), Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-École Centrale de Lyon (ECL), Université de Lyon-Université Lumière - Lyon 2 (UL2), Orange Labs, 35512 Cesson-Sévigné, France, Orange Labs R&D [Rennes], France Télécom-France Télécom, Situated Interaction, Collaboration, Adaptation and Learning (SICAL), Université de Lyon-Université Lumière - Lyon 2 (UL2)-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Extraction de Caractéristiques et Identification (imagine), and Kervadec, Corentin
- Subjects
FOS: Computer and information sciences ,Computer science ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,02 engineering and technology ,Machine learning ,computer.software_genre ,Oracle ,Data modeling ,Data visualization ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Deep Learning ,0202 electrical engineering, electronic engineering, information engineering ,Question answering ,Visual Reasoning ,Transformer (machine learning model) ,business.industry ,Deep learning ,[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,020207 software engineering ,Visual reasoning ,Visualization ,Artificial intelligence ,business ,computer ,Visual Question Answering (VQA) - Abstract
International audience; Since its inception, Visual Question Answering (VQA) has been notorious as a task where models are prone to exploit biases in datasets to find shortcuts instead of performing high-level reasoning. Classical methods address this by removing biases from training data, or by adding branches to models to detect and remove biases. In this paper, we argue that uncertainty in vision is a dominating factor preventing the successful learning of reasoning in vision-and-language problems. We train a visual oracle and, in a large-scale study, provide experimental evidence that it is much less prone to exploiting spurious dataset biases compared to standard models. We propose to study the attention mechanisms at work in the visual oracle and compare them with a SOTA Transformer-based model. We provide an in-depth analysis and visualizations of reasoning patterns obtained with an online visualization tool which we make publicly available (https://reasoningpatterns.github.io). We exploit these insights by transferring reasoning patterns from the oracle to a SOTA Transformer-based VQA model taking standard noisy visual inputs, via fine-tuning. In experiments, we report higher overall accuracy, as well as higher accuracy on infrequent answers for each question type, which provides evidence for improved generalization and a decreased dependency on dataset biases.
- Published
- 2021
50. Forming a sparse representation for visual place recognition using a neurorobotic approach
- Author
-
Guillaume Bresson, Olivier Romain, Sylvain Colomer, Nicolas Cuperlier, and Colomer, Sylvain
- Subjects
FOS: Computer and information sciences ,[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI] ,Computer science ,Computer Vision and Pattern Recognition (cs.CV) ,Pooling ,Computer Science - Computer Vision and Pattern Recognition ,[INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV] ,Encoding (memory) ,0502 economics and business ,medicine ,Code (cryptography) ,FOS: Electrical engineering, electronic engineering, information engineering ,050210 logistics & transportation ,Artificial neural network ,business.industry ,05 social sciences ,Image and Video Processing (eess.IV) ,[INFO.INFO-RB] Computer Science [cs]/Robotics [cs.RO] ,[SCCO.NEUR] Cognitive science/Neuroscience ,Pattern recognition ,Sparse approximation ,Electrical Engineering and Systems Science - Image and Video Processing ,Visualization ,Visual cortex ,medicine.anatomical_structure ,Artificial intelligence ,Neural coding ,business - Abstract
This paper introduces a novel unsupervised neural network model for visual information encoding which aims to address the problem of large-scale visual localization. Inspired by the structure of the visual cortex, the model (namely HSD) alternates layers of topologic sparse coding and pooling to build a more compact code of visual information. Intended for visual place recognition (VPR) systems that use local descriptors, the impact of its integration into a bio-inspired model for self-localization (LPMP) is evaluated. Our experimental results on the KITTI dataset show that HSD improves the runtime speed of LPMP by a factor of at least 2 and its localization accuracy by 10%. A comparison with CoHog, a state-of-the-art VPR approach, showed that our method achieves slightly better results.
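The alternation of sparse coding and pooling layers described for HSD can be loosely illustrated with the two classic ingredients: soft-thresholded correlations against a dictionary, followed by max-pooling over groups of atoms to compact the code. A toy one-stage sketch (names and dictionary are ours; HSD's actual topologic sparse coding is more involved):

```python
import numpy as np

def soft_threshold(x, lam):
    """Elementwise shrinkage: the classic proximal step behind many
    sparse-coding schemes, zeroing small responses."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def sparse_code_then_pool(D, signal, lam=0.1, pool=2):
    """One 'sparse coding then pooling' stage: codes are soft-thresholded
    correlations of the signal with the dictionary atoms (rows of D),
    then max-pooled over consecutive groups of `pool` atoms, halving
    the code length while keeping the strongest responses."""
    codes = soft_threshold(D @ signal, lam)
    n = (len(codes) // pool) * pool
    return codes[:n].reshape(-1, pool).max(axis=1)
```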
- Published
- 2021