Author: "Sznitman, Raphael" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Sznitman, Raphael"' showing total 575 results

Start Over Author "Sznitman, Raphael"

575 results on '"Sznitman, Raphael"'

1. Targeted Visual Prompting for Medical Visual Question Answering

Author: Tascon-Morales, Sergio, Márquez-Neila, Pablo, and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: With growing interest in recent years, medical visual question answering (Med-VQA) has rapidly evolved, with multimodal large language models (MLLMs) emerging as an alternative to classical model architectures. Specifically, their ability to add visual information to the input of pre-trained LLMs brings new capabilities for image interpretation. However, simple visual errors cast doubt on the actual visual understanding abilities of these models. To address this, region-based questions have been proposed as a means to assess and enhance actual visual understanding through compositional evaluation. To combine these two perspectives, this paper introduces targeted visual prompting to equip MLLMs with region-based questioning capabilities. By presenting the model with both the isolated region and the region in its context in a customized visual prompt, we show the effectiveness of our method across multiple datasets while comparing it to several baseline models. Our code and data are available at https://github.com/sergiotasconmorales/locvqallm., Comment: Accepted at the MICCAI AMAI Workshop 2024
Published: 2024

2. Learning Non-Linear Invariants for Unsupervised Out-of-Distribution Detection

Author: Doorenbos, Lars, Sznitman, Raphael, and Márquez-Neila, Pablo
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: The inability of deep learning models to handle data drawn from unseen distributions has sparked much interest in unsupervised out-of-distribution (U-OOD) detection, as it is crucial for reliable deep learning models. Despite considerable attention, theoretically-motivated approaches are few and far between, with most methods building on top of some form of heuristic. Recently, U-OOD was formalized in the context of data invariants, allowing a clearer understanding of how to characterize U-OOD, and methods leveraging affine invariants have attained state-of-the-art results on large-scale benchmarks. Nevertheless, the restriction to affine invariants hinders the expressiveness of the approach. In this work, we broaden the affine invariants formulation to a more general case and propose a framework consisting of a normalizing flow-like architecture capable of learning non-linear invariants. Our novel approach achieves state-of-the-art results on an extensive U-OOD benchmark, and we demonstrate its further applicability to tabular data. Finally, we show our method has the same desirable properties as those based on affine invariants., Comment: Accepted at ECCV 2024
Published: 2024

3. Galaxy spectroscopy without spectra: Galaxy properties from photometric images with conditional diffusion models

Author: Doorenbos, Lars, Sextl, Eva, Heng, Kevin, Cavuoti, Stefano, Brescia, Massimo, Torbaniuk, Olena, Longo, Giuseppe, Sznitman, Raphael, and Márquez-Neila, Pablo
Subjects: Astrophysics - Astrophysics of Galaxies, Astrophysics - Instrumentation and Methods for Astrophysics, Computer Science - Artificial Intelligence
Abstract: Modern spectroscopic surveys can only target a small fraction of the vast amount of photometrically cataloged sources in wide-field surveys. Here, we report the development of a generative AI method capable of predicting optical galaxy spectra from photometric broad-band images alone. This method draws from the latest advances in diffusion models in combination with contrastive networks. We pass multi-band galaxy images into the architecture to obtain optical spectra. From these, robust values for galaxy properties can be derived with any methods in the spectroscopic toolbox, such as standard population synthesis techniques and Lick indices. When trained and tested on 64x64-pixel images from the Sloan Digital Sky Survey, the global bimodality of star-forming and quiescent galaxies in photometric space is recovered, as well as a mass-metallicity relation of star-forming galaxies. The comparison between the observed and the artificially created spectra shows good agreement in overall metallicity, age, Dn4000, stellar velocity dispersion, and E(B-V) values. Photometric redshift estimates of our generative algorithm can compete with other current, specialized deep-learning techniques. Moreover, this work is the first attempt in the literature to infer velocity dispersion from photometric images. Additionally, we can predict the presence of an active galactic nucleus up to an accuracy of 82%. With our method, scientifically interesting galaxy properties, normally requiring spectroscopic inputs, can be obtained in future data sets from large-scale photometric surveys alone. The spectra prediction via AI can further assist in creating realistic mock catalogs., Comment: Code is available here: https://github.com/LarsDoorenbos/generate-spectra
Published: 2024

4. Continual Unsupervised Out-of-Distribution Detection

Author: Doorenbos, Lars, Sznitman, Raphael, and Márquez-Neila, Pablo
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Deep learning models excel when the data distribution during training aligns with testing data. Yet, their performance diminishes when faced with out-of-distribution (OOD) samples, leading to great interest in the field of OOD detection. Current approaches typically assume that OOD samples originate from an unconcentrated distribution complementary to the training distribution. While this assumption is appropriate in the traditional unsupervised OOD (U-OOD) setting, it proves inadequate when considering the place of deployment of the underlying deep learning model. To better reflect this real-world scenario, we introduce the novel setting of continual U-OOD detection. To tackle this new setting, we propose a method that starts from a U-OOD detector, which is agnostic to the OOD distribution, and slowly updates during deployment to account for the actual OOD distribution. Our method uses a new U-OOD scoring function that combines the Mahalanobis distance with a nearest-neighbor approach. Furthermore, we design a confidence-scaled few-shot OOD detector that outperforms previous methods. We show our method greatly improves upon strong baselines from related fields.
Published: 2024

5. Masked Image Modelling for retinal OCT understanding

Author: Pissas, Theodoros, Márquez-Neila, Pablo, Wolf, Sebastian, Zinkernagel, Martin, and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: This work explores the effectiveness of masked image modelling for learning representations of retinal OCT images. To this end, we leverage Masked Autoencoders (MAE), a simple and scalable method for self-supervised learning, to obtain a powerful and general representation for OCT images by training on 700K OCT images from 41K patients collected under real world clinical settings. We also provide the first extensive evaluation for a model of OCT on a challenging battery of 6 downstream tasks. Our model achieves strong performance when fully finetuned but can also serve as a versatile frozen feature extractor for many tasks using lightweight adapters. Furthermore, we propose an extension of the MAE pretraining to fuse OCT with an auxiliary modality, namely, IR fundus images and learn a joint model for both. We demonstrate our approach improves performance on a multimodal downstream application. Our experiments utilize most publicly available OCT datasets, thus enabling future comparisons. Our code and model weights are publicly available https://github.com/TheoPis/MIM_OCT.
Published: 2024

6. Cataract-1K: Cataract Surgery Dataset for Scene Segmentation, Phase Recognition, and Irregularity Detection

Author: Ghamsarian, Negin, El-Shabrawi, Yosuf, Nasirihaghighi, Sahar, Putzgruber-Adamitsch, Doris, Zinkernagel, Martin, Wolf, Sebastian, Schoeffmann, Klaus, and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In recent years, the landscape of computer-assisted interventions and post-operative surgical video analysis has been dramatically reshaped by deep-learning techniques, resulting in significant advancements in surgeons' skills, operation room management, and overall surgical outcomes. However, the progression of deep-learning-powered surgical technologies is profoundly reliant on large-scale datasets and annotations. Particularly, surgical scene understanding and phase recognition stand as pivotal pillars within the realm of computer-assisted surgery and post-operative assessment of cataract surgery videos. In this context, we present the largest cataract surgery video dataset that addresses diverse requisites for constructing computerized surgical workflow analysis and detecting post-operative irregularities in cataract surgery. We validate the quality of annotations by benchmarking the performance of several state-of-the-art neural network architectures for phase recognition and surgical scene segmentation. Besides, we initiate the research on domain adaptation for instrument segmentation in cataract surgery by evaluating cross-domain instrument segmentation performance in cataract surgery videos. The dataset and annotations will be publicly available upon acceptance of the paper., Comment: 12 pages, 5 figures, 7 tables
Published: 2023

7. DeepPyramid+: Medical Image Segmentation using Pyramid View Fusion and Deformable Pyramid Reception

Author: Ghamsarian, Negin, Wolf, Sebastian, Zinkernagel, Martin, Schoeffmann, Klaus, and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Semantic Segmentation plays a pivotal role in many applications related to medical image and video analysis. However, designing a neural network architecture for medical image and surgical video segmentation is challenging due to the diverse features of relevant classes, including heterogeneity, deformability, transparency, blunt boundaries, and various distortions. We propose a network architecture, DeepPyramid+, which addresses diverse challenges encountered in medical image and surgical video segmentation. The proposed DeepPyramid+ incorporates two major modules, namely "Pyramid View Fusion" (PVF) and "Deformable Pyramid Reception," (DPR), to address the outlined challenges. PVF replicates a deduction process within the neural network, aligning with the human visual system, thereby enhancing the representation of relative information at each pixel position. Complementarily, DPR introduces shape- and scale-adaptive feature extraction techniques using dilated deformable convolutions, enhancing accuracy and robustness in handling heterogeneous classes and deformable shapes. Extensive experiments conducted on diverse datasets, including endometriosis videos, MRI images, OCT scans, and cataract and laparoscopy videos, demonstrate the effectiveness of DeepPyramid+ in handling various challenges such as shape and scale variation, reflection, and blur degradation. DeepPyramid+ demonstrates significant improvements in segmentation performance, achieving up to a 3.65% increase in Dice coefficient for intra-domain segmentation and up to a 17% increase in Dice coefficient for cross-domain segmentation. DeepPyramid+ consistently outperforms state-of-the-art networks across diverse modalities considering different backbone networks, showcasing its versatility., Comment: 13 pages, 3 figures
Published: 2023

8. Predicting Postoperative Intraocular Lens Dislocation in Cataract Surgery via Deep Learning

Author: Ghamsarian, Negin, Putzgruber-Adamitsch, Doris, Sarny, Stephanie, Sznitman, Raphael, Schoeffmann, Klaus, and El-Shabrawi, Yosuf
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: A critical yet unpredictable complication following cataract surgery is intraocular lens dislocation. Postoperative stability is imperative, as even a tiny decentration of multifocal lenses or inadequate alignment of the torus in toric lenses due to postoperative rotation can lead to a significant drop in visual acuity. Investigating possible intraoperative indicators that can predict post-surgical instabilities of intraocular lenses can help prevent this complication. In this paper, we develop and evaluate the first fully-automatic framework for the computation of lens unfolding delay, rotation, and instability during surgery. Adopting a combination of three types of CNNs, namely recurrent, region-based, and pixel-based, the proposed framework is employed to assess the possibility of predicting post-operative lens dislocation during cataract surgery. This is achieved via performing a large-scale study on the statistical differences between the behavior of different brands of intraocular lenses and aligning the results with expert surgeons' hypotheses and observations about the lenses. We exploit a large-scale dataset of cataract surgery videos featuring four intraocular lens brands. Experimental results confirm the reliability of the proposed framework in evaluating the lens' statistics during the surgery. The Pearson correlation and t-test results reveal significant correlations between lens unfolding delay and lens rotation and significant differences between the intra-operative rotations stability of four groups of lenses. These results suggest that the proposed framework can help surgeons select the lenses based on the patient's eye conditions and predict post-surgical lens dislocation., Comment: 12 pages, 5 figures
Published: 2023

9. Correlation-aware active learning for surgery video segmentation

Author: Wu, Fei, Marquez-Neila, Pablo, Zheng, Mingyi, Rafii-Tari, Hedyeh, and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Semantic segmentation is a complex task that relies heavily on large amounts of annotated image data. However, annotating such data can be time-consuming and resource-intensive, especially in the medical domain. Active Learning (AL) is a popular approach that can help to reduce this burden by iteratively selecting images for annotation to improve the model performance. In the case of video data, it is important to consider the model uncertainty and the temporal nature of the sequences when selecting images for annotation. This work proposes a novel AL strategy for surgery video segmentation, COWAL, COrrelation-aWare Active Learning. Our approach involves projecting images into a latent space that has been fine-tuned using contrastive learning and then selecting a fixed number of representative images from local clusters of video frames. We demonstrate the effectiveness of this approach on two video datasets of surgical instruments and three real-world video datasets. The datasets and code will be made publicly available upon receiving necessary approvals., Comment: WACV 2024, 8 pages, 7 supplementary pages
Published: 2023

10. Learning Super-Resolution Ultrasound Localization Microscopy from Radio-Frequency Data

Author: Hahne, Christopher, Chabouh, Georges, Couture, Olivier, and Sznitman, Raphael
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition, Physics - Medical Physics
Abstract: Ultrasound Localization Microscopy (ULM) enables imaging of vascular structures in the micrometer range by accumulating contrast agent particle locations over time. Precise and efficient target localization accuracy remains an active research topic in the ULM field to further push the boundaries of this promising medical imaging technology. Existing work incorporates Delay-And-Sum (DAS) beamforming into particle localization pipelines, which ultimately determines the ULM image resolution capability. In this paper we propose to feed unprocessed Radio-Frequency (RF) data into a super-resolution network while bypassing DAS beamforming and its limitations. To facilitate this, we demonstrate label projection and inverse point transformation between B-mode and RF coordinate space as required by our approach. We assess our method against state-of-the-art techniques based on a public dataset featuring in silico and in vivo data. Results from our RF-trained network suggest that excluding DAS beamforming offers a great potential to optimize on the ULM resolution performance., Comment: IEEE International Ultrasonics Symposium (IUS), 2023
Published: 2023

11. RF-ULM: Ultrasound Localization Microscopy Learned from Radio-Frequency Wavefronts

Author: Hahne, Christopher, Chabouh, Georges, Chavignon, Arthur, Couture, Olivier, and Sznitman, Raphael
Subjects: Computer Science - Computational Geometry, Computer Science - Computer Vision and Pattern Recognition, Physics - Medical Physics
Abstract: In Ultrasound Localization Microscopy (ULM), achieving high-resolution images relies on the precise localization of contrast agent particles across a series of beamformed frames. However, our study uncovers an enormous potential: The process of delay-and-sum beamforming leads to an irreversible reduction of Radio-Frequency (RF) channel data, while its implications for localization remain largely unexplored. The rich contextual information embedded within RF wavefronts, including their hyperbolic shape and phase, offers great promise for guiding Deep Neural Networks (DNNs) in challenging localization scenarios. To fully exploit this data, we propose to directly localize scatterers in RF channel data. Our approach involves a custom super-resolution DNN using learned feature channel shuffling, non-maximum suppression, and a semi-global convolutional block for reliable and accurate wavefront localization. Additionally, we introduce a geometric point transformation that facilitates seamless mapping to the B-mode coordinate space. To understand the impact of beamforming on ULM, we validate the effectiveness of our method by conducting an extensive comparison with State-Of-The-Art (SOTA) techniques. We present the inaugural in vivo results from a wavefront-localizing DNN, highlighting its real-world practicality. Our findings show that RF-ULM bridges the domain shift between synthetic and real datasets, offering a considerable advantage in terms of precision and complexity. To enable the broader research community to benefit from our findings, our code and the associated SOTA methods are made available at https://github.com/hahnec/rf-ulm.
Published: 2023

12. DeepPyramid+: medical image segmentation using Pyramid View Fusion and Deformable Pyramid Reception

Author: Ghamsarian, Negin, Wolf, Sebastian, Zinkernagel, Martin, Schoeffmann, Klaus, and Sznitman, Raphael
Published: 2024
Full Text: View/download PDF

13. Hyperbolic Random Forests

Author: Doorenbos, Lars, Márquez-Neila, Pablo, Sznitman, Raphael, and Mettes, Pascal
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Hyperbolic space is becoming a popular choice for representing data due to the hierarchical structure - whether implicit or explicit - of many real-world datasets. Along with it comes a need for algorithms capable of solving fundamental tasks, such as classification, in hyperbolic space. Recently, multiple papers have investigated hyperbolic alternatives to hyperplane-based classifiers, such as logistic regression and SVMs. While effective, these approaches struggle with more complex hierarchical data. We, therefore, propose to generalize the well-known random forests to hyperbolic space. We do this by redefining the notion of a split using horospheres. Since finding the globally optimal split is computationally intractable, we find candidate horospheres through a large-margin classifier. To make hyperbolic random forests work on multi-class data and imbalanced experiments, we furthermore outline a new method for combining classes based on their lowest common ancestor and a class-balanced version of the large-margin loss. Experiments on standard and new benchmarks show that our approach outperforms both conventional random forest algorithms and recent hyperbolic classifiers., Comment: Accepted at TMLR. Code available at https://github.com/LarsDoorenbos/HoroRF
Published: 2023

14. StofNet: Super-resolution Time of Flight Network

Author: Hahne, Christopher, Hayoz, Michel, and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing, Physics - Geophysics
Abstract: Time of Flight (ToF) is a prevalent depth sensing technology in the fields of robotics, medical imaging, and non-destructive testing. Yet, ToF sensing faces challenges from complex ambient conditions making an inverse modelling from the sparse temporal information intractable. This paper highlights the potential of modern super-resolution techniques to learn varying surroundings for a reliable and accurate ToF detection. Unlike existing models, we tailor an architecture for sub-sample precise semi-global signal localization by combining super-resolution with an efficient residual contraction block to balance between fine signal details and large scale contextual information. We consolidate research on ToF by conducting a benchmark comparison against six state-of-the-art methods for which we employ two publicly available datasets. This includes the release of our SToF-Chirp dataset captured by an airborne ultrasound transducer. Results showcase the superior performance of our proposed StofNet in terms of precision, reliability and model complexity. Our code is available at https://github.com/hahnec/stofnet., Comment: pre-print
Published: 2023

15. Domain Adaptation for Medical Image Segmentation using Transformation-Invariant Self-Training

Author: Ghamsarian, Negin, Tejero, Javier Gamazo, Neila, Pablo Márquez, Wolf, Sebastian, Zinkernagel, Martin, Schoeffmann, Klaus, and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Models capable of leveraging unlabelled data are crucial in overcoming large distribution gaps between the acquired datasets across different imaging devices and configurations. In this regard, self-training techniques based on pseudo-labeling have been shown to be highly effective for semi-supervised domain adaptation. However, the unreliability of pseudo labels can hinder the capability of self-training techniques to induce abstract representation from the unlabeled target dataset, especially in the case of large distribution gaps. Since the neural network performance should be invariant to image transformations, we look to this fact to identify uncertain pseudo labels. Indeed, we argue that transformation invariant detections can provide more reasonable approximations of ground truth. Accordingly, we propose a semi-supervised learning strategy for domain adaptation termed transformation-invariant self-training (TI-ST). The proposed method assesses pixel-wise pseudo-labels' reliability and filters out unreliable detections during self-training. We perform comprehensive evaluations for domain adaptation using three different modalities of medical images, two different network architectures, and several alternative state-of-the-art domain adaptation methods. Experimental results confirm the superiority of our proposed method in mitigating the lack of target domain annotation and boosting segmentation performance in the target domain., Comment: 11 pages, 5 figures, accepted at 26th international conference on Medical Image Computing & Computer Assisted Intervention (MICCAI 2023)
Published: 2023

16. A reinforcement learning approach for VQA validation: an application to diabetic macular edema grading

Author: Fountoukidou, Tatiana and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Recent advances in machine learning models have greatly increased the performance of automated methods in medical image analysis. However, the internal functioning of such models is largely hidden, which hinders their integration in clinical practice. Explainability and trust are viewed as important aspects of modern methods, for the latter's widespread use in clinical communities. As such, validation of machine learning models represents an important aspect and yet, most methods are only validated in a limited way. In this work, we focus on providing a richer and more appropriate validation approach for highly powerful Visual Question Answering (VQA) algorithms. To better understand the performance of these methods, which answer arbitrary questions related to images, this work focuses on an automatic visual Turing test (VTT). That is, we propose an automatic adaptive questioning method, that aims to expose the reasoning behavior of a VQA algorithm. Specifically, we introduce a reinforcement learning (RL) agent that observes the history of previously asked questions, and uses it to select the next question to pose. We demonstrate our approach in the context of evaluating algorithms that automatically answer questions related to diabetic macular edema (DME) grading. The experiments show that such an agent has similar behavior to a clinician, whereby asking questions that are relevant to key clinical concepts., Comment: 16 pages (+ 23 pages supplementary material)
Published: 2023
Full Text: View/download PDF

17. Localized Questions in Medical Visual Question Answering

Author: Tascon-Morales, Sergio, Márquez-Neila, Pablo, and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Visual Question Answering (VQA) models aim to answer natural language questions about given images. Due to its ability to ask questions that differ from those used when training the model, medical VQA has received substantial attention in recent years. However, existing medical VQA models typically focus on answering questions that refer to an entire image rather than where the relevant content may be located in the image. Consequently, VQA models are limited in their interpretability power and the possibility to probe the model about specific image regions. This paper proposes a novel approach for medical VQA that addresses this limitation by developing a model that can answer questions about image regions while considering the context necessary to answer the questions. Our experimental results demonstrate the effectiveness of our proposed model, outperforming existing methods on three datasets. Our code and data are available at https://github.com/sergiotasconmorales/locvqa., Comment: Appears in Medical Image Computing and Computer Assisted Interventions (MICCAI), 2023
Published: 2023

18. Geometric Ultrasound Localization Microscopy

Author: Hahne, Christopher and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Contrast-Enhanced Ultra-Sound (CEUS) has become a viable method for non-invasive, dynamic visualization in medical diagnostics, yet Ultrasound Localization Microscopy (ULM) has enabled a revolutionary breakthrough by offering ten times higher resolution. To date, Delay-And-Sum (DAS) beamformers are used to render ULM frames, ultimately determining the image resolution capability. To take full advantage of ULM, this study questions whether beamforming is the most effective processing step for ULM, suggesting an alternative approach that relies solely on Time-Difference-of-Arrival (TDoA) information. To this end, a novel geometric framework for micro bubble localization via ellipse intersections is proposed to overcome existing beamforming limitations. We present a benchmark comparison based on a public dataset for which our geometric ULM outperforms existing baseline methods in terms of accuracy and robustness while only utilizing a portion of the available transducer data., Comment: Pre-print accepted for MICCAI 2023
Published: 2023

19. Learning How To Robustly Estimate Camera Pose in Endoscopic Videos

Author: Hayoz, Michel, Hahne, Christopher, Gallardo, Mathias, Candinas, Daniel, Kurmann, Thomas, Allan, Maximilian, and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Purpose: Surgical scene understanding plays a critical role in the technology stack of tomorrow's intervention-assisting systems in endoscopic surgeries. For this, tracking the endoscope pose is a key component, but remains challenging due to illumination conditions, deforming tissues and the breathing motion of organs. Method: We propose a solution for stereo endoscopes that estimates depth and optical flow to minimize two geometric losses for camera pose estimation. Most importantly, we introduce two learned adaptive per-pixel weight mappings that balance contributions according to the input image content. To do so, we train a Deep Declarative Network to take advantage of the expressiveness of deep-learning and the robustness of a novel geometric-based optimization approach. We validate our approach on the publicly available SCARED dataset and introduce a new in-vivo dataset, StereoMIS, which includes a wider spectrum of typically observed surgical settings. Results: Our method outperforms state-of-the-art methods on average and more importantly, in difficult scenarios where tissue deformations and breathing motion are visible. We observed that our proposed weight mappings attenuate the contribution of pixels on ambiguous regions of the images, such as deforming tissues. Conclusion: We demonstrate the effectiveness of our solution to robustly estimate the camera pose in challenging endoscopic surgical scenes. Our contributions can be used to improve related tasks like simultaneous localization and mapping (SLAM) or 3D reconstruction, therefore advancing surgical scene understanding in minimally-invasive surgery., Comment: Accepted at IPCAI 2023
Published: 2023

20. Unsupervised out-of-distribution detection for safer robotically guided retinal microsurgery

Author: Jungo, Alain, Doorenbos, Lars, Da Col, Tommaso, Beelen, Maarten, Zinkernagel, Martin, Márquez-Neila, Pablo, and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Purpose: A fundamental problem in designing safe machine learning systems is identifying when samples presented to a deployed model differ from those observed at training time. Detecting so-called out-of-distribution (OoD) samples is crucial in safety-critical applications such as robotically guided retinal microsurgery, where distances between the instrument and the retina are derived from sequences of 1D images that are acquired by an instrument-integrated optical coherence tomography (iiOCT) probe. Methods: This work investigates the feasibility of using an OoD detector to identify when images from the iiOCT probe are inappropriate for subsequent machine learning-based distance estimation. We show how a simple OoD detector based on the Mahalanobis distance can successfully reject corrupted samples coming from real-world ex vivo porcine eyes. Results: Our results demonstrate that the proposed approach can successfully detect OoD samples and help maintain the performance of the downstream task within reasonable levels. MahaAD outperformed a supervised approach trained on the same kind of corruptions and achieved the best performance in detecting OoD cases from a collection of iiOCT samples with real-world corruptions. Conclusion: The results indicate that detecting corrupted iiOCT data through OoD detection is feasible and does not need prior knowledge of possible corruptions. Consequently, MahaAD could aid in ensuring patient safety during robotically guided microsurgery by preventing deployed prediction models from estimating distances that put the patient at risk., Comment: Accepted at IPCAI 2023
Published: 2023
Full Text: View/download PDF

21. Full or Weak annotations? An adaptive strategy for budget-constrained annotation campaigns

Author: Tejero, Javier Gamazo, Zinkernagel, Martin S., Wolf, Sebastian, Sznitman, Raphael, and Neila, Pablo Márquez
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Annotating new datasets for machine learning tasks is tedious, time-consuming, and costly. For segmentation applications, the burden is particularly high as manual delineations of relevant image content are often extremely expensive or can only be done by experts with domain-specific knowledge. Thanks to developments in transfer learning and training with weak supervision, segmentation models can now also greatly benefit from annotations of different kinds. However, for any new domain application looking to use weak supervision, the dataset builder still needs to define a strategy to distribute full segmentation and other weak annotations. Doing so is challenging, however, as it is a priori unknown how to distribute an annotation budget for a given new dataset. To this end, we propose a novel approach to determine annotation strategies for segmentation datasets, whereby estimating what proportion of segmentation and classification annotations should be collected given a fixed budget. To do so, our method sequentially determines proportions of segmentation and classification annotations to collect for budget-fractions by modeling the expected improvement of the final segmentation model. We show in our experiments that our approach yields annotations that perform very close to the optimal for a number of different annotation budgets and datasets., Comment: CVPR23
Published: 2023

22. Logical Implications for Visual Question Answering Consistency

Author: Tascon-Morales, Sergio, Márquez-Neila, Pablo, and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Despite considerable recent progress in Visual Question Answering (VQA) models, inconsistent or contradictory answers continue to cast doubt on their true reasoning capabilities. However, most proposed methods use indirect strategies or strong assumptions on pairs of questions and answers to enforce model consistency. Instead, we propose a novel strategy intended to improve model performance by directly reducing logical inconsistencies. To do this, we introduce a new consistency loss term that can be used by a wide range of the VQA models and which relies on knowing the logical relation between pairs of questions and answers. While such information is typically not available in VQA datasets, we propose to infer these logical relations using a dedicated language model and use these in our proposed consistency loss function. We conduct extensive experiments on the VQA Introspect and DME datasets and show that our method brings improvements to state-of-the-art VQA models, while being robust across different architectures and settings.
Published: 2023

23. Stochastic Segmentation with Conditional Categorical Diffusion Models

Author: Zbinden, Lukas, Doorenbos, Lars, Pissas, Theodoros, Huber, Adrian Thomas, Sznitman, Raphael, and Márquez-Neila, Pablo
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Semantic segmentation has made significant progress in recent years thanks to deep neural networks, but the common objective of generating a single segmentation output that accurately matches the image's content may not be suitable for safety-critical domains such as medical diagnostics and autonomous driving. Instead, multiple possible correct segmentation maps may be required to reflect the true distribution of annotation maps. In this context, stochastic semantic segmentation methods must learn to predict conditional distributions of labels given the image, but this is challenging due to the typically multimodal distributions, high-dimensional output spaces, and limited annotation data. To address these challenges, we propose a conditional categorical diffusion model (CCDM) for semantic segmentation based on Denoising Diffusion Probabilistic Models. Our model is conditioned to the input image, enabling it to generate multiple segmentation label maps that account for the aleatoric uncertainty arising from divergent ground truth annotations. Our experimental results show that CCDM achieves state-of-the-art performance on LIDC, a stochastic semantic segmentation dataset, and outperforms established baselines on the classical segmentation dataset Cityscapes., Comment: Accepted at ICCV 2023. Code available at https://github.com/LarsDoorenbos/ccdm-stochastic-segmentation
Published: 2023

24. Generating astronomical spectra from photometry with conditional diffusion models

Author: Doorenbos, Lars, Cavuoti, Stefano, Longo, Giuseppe, Brescia, Massimo, Sznitman, Raphael, and Márquez-Neila, Pablo
Subjects: Astrophysics - Instrumentation and Methods for Astrophysics
Abstract: A trade-off between speed and information controls our understanding of astronomical objects. Fast-to-acquire photometric observations provide global properties, while costly and time-consuming spectroscopic measurements enable a better understanding of the physics governing their evolution. Here, we tackle this problem by generating spectra directly from photometry, through which we obtain an estimate of their intricacies from easily acquired images. This is done by using multi-modal conditional diffusion models, where the best out of the generated spectra is selected with a contrastive network. Initial experiments on minimally processed SDSS galaxy data show promising results., Comment: Accepted at NeurIPS 2022 Machine Learning and the Physical Sciences workshop
Published: 2022

25. ULISSE: A Tool for One-shot Sky Exploration and its Application to Active Galactic Nuclei Detection

Author: Doorenbos, Lars, Torbaniuk, Olena, Cavuoti, Stefano, Paolillo, Maurizio, Longo, Giuseppe, Brescia, Massimo, Sznitman, Raphael, and Márquez-Neila, Pablo
Subjects: Astrophysics - Instrumentation and Methods for Astrophysics, Computer Science - Computer Vision and Pattern Recognition
Abstract: Modern sky surveys are producing ever larger amounts of observational data, which makes the application of classical approaches for the classification and analysis of objects challenging and time-consuming. However, this issue may be significantly mitigated by the application of automatic machine and deep learning methods. We propose ULISSE, a new deep learning tool that, starting from a single prototype object, is capable of identifying objects sharing the same morphological and photometric properties, and hence of creating a list of candidate sosia. In this work, we focus on applying our method to the detection of AGN candidates in a Sloan Digital Sky Survey galaxy sample, since the identification and classification of Active Galactic Nuclei (AGN) in the optical band still remains a challenging task in extragalactic astronomy. Intended for the initial exploration of large sky surveys, ULISSE directly uses features extracted from the ImageNet dataset to perform a similarity search. The method is capable of rapidly identifying a list of candidates, starting from only a single image of a given prototype, without the need for any time-consuming neural network training. Our experiments show ULISSE is able to identify AGN candidates based on a combination of host galaxy morphology, color and the presence of a central nuclear source, with a retrieval efficiency ranging from 21% to 65% (including composite sources) depending on the prototype, where the random guess baseline is 12%. We find ULISSE to be most effective in retrieving AGN in early-type host galaxies, as opposed to prototypes with spiral- or late-type properties. Based on the results described in this work, ULISSE can be a promising tool for selecting different types of astrophysical objects in current and future wide-field surveys (e.g. Euclid, LSST etc.) that target millions of sources every single night., Comment: Accepted for publication in A&A
Published: 2022
Full Text: View/download PDF

26. DeepPyramid: Enabling Pyramid View and Deformable Pyramid Reception for Semantic Segmentation in Cataract Surgery Videos

Author: Ghamsarian, Negin, Taschwer, Mario, Sznitman, Raphael, and Schoeffmann, Klaus
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Semantic segmentation in cataract surgery has a wide range of applications contributing to surgical outcome enhancement and clinical risk reduction. However, the varying issues in segmenting the different relevant structures in these surgeries make the designation of a unique network quite challenging. This paper proposes a semantic segmentation network, termed DeepPyramid, that can deal with these challenges using three novelties: (1) a Pyramid View Fusion module which provides a varying-angle global view of the surrounding region centering at each pixel position in the input convolutional feature map; (2) a Deformable Pyramid Reception module which enables a wide deformable receptive field that can adapt to geometric transformations in the object of interest; and (3) a dedicated Pyramid Loss that adaptively supervises multi-scale semantic feature maps. Combined, we show that these modules can effectively boost semantic segmentation performance, especially in the case of transparency, deformability, scalability, and blunt edges in objects. We demonstrate that our approach performs at a state-of-the-art level and outperforms a number of existing methods with a large margin (3.66% overall improvement in intersection over union compared to the best rival approach)., Comment: 11 pages, 4 figures, accepted at 25th international conference on Medical Image Computing & Computer Assisted Intervention (MICCAI 2022). arXiv admin note: substantial text overlap with arXiv:2109.05352
Published: 2022

27. Consistency-preserving Visual Question Answering in Medical Imaging

Author: Tascon-Morales, Sergio, Márquez-Neila, Pablo, and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Visual Question Answering (VQA) models take an image and a natural-language question as input and infer the answer to the question. Recently, VQA systems in medical imaging have gained popularity thanks to potential advantages such as patient engagement and second opinions for clinicians. While most research efforts have been focused on improving architectures and overcoming data-related limitations, answer consistency has been overlooked even though it plays a critical role in establishing trustworthy models. In this work, we propose a novel loss function and corresponding training procedure that allows the inclusion of relations between questions into the training process. Specifically, we consider the case where implications between perception and reasoning questions are known a-priori. To show the benefits of our approach, we evaluate it on the clinically relevant task of Diabetic Macular Edema (DME) staging from fundus imaging. Our experiments show that our method outperforms state-of-the-art baselines, not only by improving model consistency, but also in terms of overall model accuracy. Our code and data are available at https://github.com/sergiotasconmorales/consistency_vqa., Comment: Appears in Medical Image Computing and Computer Assisted Interventions (MICCAI), 2022
Published: 2022

28. Using domain knowledge for robust and generalizable deep learning-based CT-free PET attenuation and scatter correction.

Author: Guo, Rui, Xue, Song, Hu, Jiaxi, Sari, Hasan, Zeimpekis, Konstantinos, Prenosil, George, Wang, Yue, Zhang, Yu, Viscione, Marco, Sznitman, Raphael, Rominger, Axel, Li, Biao, Shi, Kuangyu, and Mingels, Clemens
Subjects: Deep Learning, Image Processing, Computer-Assisted, Magnetic Resonance Imaging, Positron Emission Tomography Computed Tomography, Positron-Emission Tomography
Abstract: Despite the potential of deep learning (DL)-based methods in substituting CT-based PET attenuation and scatter correction for CT-free PET imaging, a critical bottleneck is their limited capability in handling large heterogeneity of tracers and scanners of PET imaging. This study employs a simple way to integrate domain knowledge in DL for CT-free PET imaging. In contrast to conventional direct DL methods, we simplify the complex problem by a domain decomposition so that the learning of anatomy-dependent attenuation correction can be achieved robustly in a low-frequency domain while the original anatomy-independent high-frequency texture can be preserved during the processing. Even with the training from one tracer on one scanner, the effectiveness and robustness of our proposed approach are confirmed in tests of various external imaging tracers on different scanners. The robust, generalizable, and transparent DL development may enhance the potential of clinical translation.
Published: 2022

29. Predicting OCT biological marker localization from weak annotations

Author: Tejero, Javier Gamazo, Neila, Pablo Márquez, Kurmann, Thomas, Gallardo, Mathias, Zinkernagel, Martin, Wolf, Sebastian, and Sznitman, Raphael
Published: 2023
Full Text: View/download PDF

30. Müller matrix polarimetry for pancreatic tissue characterization

Author: Sampaio, Paulo, Lopez-Antuña, Maria, Storni, Federico, Wicht, Jonatan, Sökeland, Greta, Wartenberg, Martin, Márquez-Neila, Pablo, Candinas, Daniel, Demory, Brice-Olivier, Perren, Aurel, and Sznitman, Raphael
Published: 2023
Full Text: View/download PDF

31. A deep learning-based approach for efficient detection and classification of local Ca²⁺ release events in Full-Frame confocal imaging

Author: Dotti, Prisca, Fernandez-Tenorio, Miguel, Janicek, Radoslav, Márquez-Neila, Pablo, Wullschleger, Marcel, Sznitman, Raphael, and Egger, Marcel
Published: 2024
Full Text: View/download PDF

32. Data Invariants to Understand Unsupervised Out-of-Distribution Detection

Author: Doorenbos, Lars, Sznitman, Raphael, and Márquez-Neila, Pablo
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Unsupervised out-of-distribution (U-OOD) detection has recently attracted much attention due its importance in mission-critical systems and broader applicability over its supervised counterpart. Despite this increase in attention, U-OOD methods suffer from important shortcomings. By performing a large-scale evaluation on different benchmarks and image modalities, we show in this work that most popular state-of-the-art methods are unable to consistently outperform a simple anomaly detector based on pre-trained features and the Mahalanobis distance (MahaAD). A key reason for the inconsistencies of these methods is the lack of a formal description of U-OOD. Motivated by a simple thought experiment, we propose a characterization of U-OOD based on the invariants of the training dataset. We show how this characterization is unknowingly embodied in the top-scoring MahaAD method, thereby explaining its quality. Furthermore, our approach can be used to interpret predictions of U-OOD detectors and provides insights into good practices for evaluating future U-OOD methods., Comment: ECCV 2022
Published: 2021

33. A Positive/Unlabeled Approach for the Segmentation of Medical Sequences using Point-Wise Supervision

Author: Lejeune, Laurent and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The ability to quickly annotate medical imaging data plays a critical role in training deep learning frameworks for segmentation. Doing so for image volumes or video sequences is even more pressing as annotating these is particularly burdensome. To alleviate this problem, this work proposes a new method to efficiently segment medical imaging volumes or videos using point-wise annotations only. This allows annotations to be collected extremely quickly and remains applicable to numerous segmentation tasks. Our approach trains a deep learning model using an appropriate Positive/Unlabeled objective function using sparse point-wise annotations. While most methods of this kind assume that the proportion of positive samples in the data is known a-priori, we introduce a novel self-supervised method to estimate this prior efficiently by combining a Bayesian estimation framework and new stopping criteria. Our method iteratively estimates appropriate class priors and yields high segmentation quality for a variety of object types and imaging modalities. In addition, by leveraging a spatio-temporal tracking framework, we regularize our predictions by leveraging the complete data volume. We show experimentally that our approach outperforms state-of-the-art methods tailored to the same problem.
Published: 2021

34. CataNet: Predicting remaining cataract surgery duration

Author: Marafioti, Andrés, Hayoz, Michel, Gallardo, Mathias, Neila, Pablo Márquez, Wolf, Sebastian, Zinkernagel, Martin, and Sznitman, Raphael
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Cataract surgery is a sight saving surgery that is performed over 10 million times each year around the world. With such a large demand, the ability to organize surgical wards and operating rooms efficiently is critical to delivery this therapy in routine clinical care. In this context, estimating the remaining surgical duration (RSD) during procedures is one way to help streamline patient throughput and workflows. To this end, we propose CataNet, a method for cataract surgeries that predicts in real time the RSD jointly with two influential elements: the surgeon's experience, and the current phase of the surgery. We compare CataNet to state-of-the-art RSD estimation methods, showing that it outperforms them even when phase and experience are not considered. We investigate this improvement and show that a significant contributor is the way we integrate the elapsed time into CataNet's feature extractor., Comment: Accepted at MICCAI 2021
Published: 2021

35. Stereo Correspondence and Reconstruction of Endoscopic Data Challenge

Author: Allan, Max, Mcleod, Jonathan, Wang, Congcong, Rosenthal, Jean Claude, Hu, Zhenglei, Gard, Niklas, Eisert, Peter, Fu, Ke Xue, Zeffiro, Trevor, Xia, Wenyao, Zhu, Zhanshi, Luo, Huoling, Jia, Fucang, Zhang, Xiran, Li, Xiaohong, Sharan, Lalith, Kurmann, Tom, Schmid, Sebastian, Sznitman, Raphael, Psychogyios, Dimitris, Azizian, Mahdi, Stoyanov, Danail, Maier-Hein, Lena, and Speidel, Stefanie
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The stereo correspondence and reconstruction of endoscopic data sub-challenge was organized during the Endovis challenge at MICCAI 2019 in Shenzhen, China. The task was to perform dense depth estimation using 7 training datasets and 2 test sets of structured light data captured using porcine cadavers. These were provided by a team at Intuitive Surgical. 10 teams participated in the challenge day. This paper contains 3 additional methods which were submitted after the challenge finished as well as a supplemental section from these teams on issues they found with the dataset.
Published: 2021

36. BASELINE SPECTRAL DOMAIN OPTICAL COHERENCE TOMOGRAPHIC RETINAL LAYER FEATURES IDENTIFIED BY ARTIFICIAL INTELLIGENCE PREDICT THE COURSE OF CENTRAL SEROUS CHORIORETINOPATHY

Author: Ferro Desideri, Lorenzo, Anguita, Rodrigo, Berger, Lieselotte E., Feenstra, Helena M. A., Scandella, Davide, Sznitman, Raphael, Boon, Camiel J. F., van Dijk, Elon H. C., and Zinkernagel, Martin S.
Published: 2024
Full Text: View/download PDF

37. Surgical Data Science -- from Concepts toward Clinical Translation

Author: Maier-Hein, Lena, Eisenmann, Matthias, Sarikaya, Duygu, März, Keno, Collins, Toby, Malpani, Anand, Fallert, Johannes, Feussner, Hubertus, Giannarou, Stamatia, Mascagni, Pietro, Nakawala, Hirenkumar, Park, Adrian, Pugh, Carla, Stoyanov, Danail, Vedula, Swaroop S., Cleary, Kevin, Fichtinger, Gabor, Forestier, Germain, Gibaud, Bernard, Grantcharov, Teodor, Hashizume, Makoto, Heckmann-Nötzel, Doreen, Kenngott, Hannes G., Kikinis, Ron, Mündermann, Lars, Navab, Nassir, Onogur, Sinan, Sznitman, Raphael, Taylor, Russell H., Tizabi, Minu D., Wagner, Martin, Hager, Gregory D., Neumuth, Thomas, Padoy, Nicolas, Collins, Justin, Gockel, Ines, Goedeke, Jan, Hashimoto, Daniel A., Joyeux, Luc, Lam, Kyle, Leff, Daniel R., Madani, Amin, Marcus, Hani J., Meireles, Ozanan, Seitel, Alexander, Teber, Dogu, Ückert, Frank, Müller-Stich, Beat P., Jannin, Pierre, and Speidel, Stefanie
Subjects: Computer Science - Computers and Society, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Recent developments in data science in general and machine learning in particular have transformed the way experts envision the future of surgery. Surgical Data Science (SDS) is a new research field that aims to improve the quality of interventional healthcare through the capture, organization, analysis and modeling of data. While an increasing number of data-driven approaches and clinical applications have been studied in the fields of radiological and clinical data science, translational success stories are still lacking in surgery. In this publication, we shed light on the underlying reasons and provide a roadmap for future advances in the field. Based on an international workshop involving leading researchers in the field of SDS, we review current practice, key achievements and initiatives as well as available standards and tools for a number of topics relevant to the field, namely (1) infrastructure for data acquisition, storage and access in the presence of regulatory constraints, (2) data annotation and sharing and (3) data analytics. We further complement this technical perspective with (4) a review of currently available SDS products and the translational progress from academia and (5) a roadmap for faster clinical translation and exploitation of the full potential of SDS, based on an international multi-round Delphi process.
Published: 2020

38. Information content of JWST-NIRSPEC transmission spectra of warm Neptunes

Author: Guzmán-Mesa, Andrea, Kitzmann, Daniel, Fisher, Chloe, Burgasser, Adam J., Hoeijmakers, H. Jens, Márquez-Neila, Pablo, Grimm, Simon L., Mandell, Avi M., Sznitman, Raphael, and Heng, Kevin
Subjects: Astrophysics - Earth and Planetary Astrophysics
Abstract: Warm Neptunes offer a rich opportunity for understanding exo-atmospheric chemistry. With the upcoming James Webb Space Telescope (JWST), there is a need to elucidate the balance between investments in telescope time versus scientific yield. We use the supervised machine learning method of the random forest to perform an information content analysis on a 11-parameter model of transmission spectra from the various NIRSpec modes. The three bluest medium-resolution NIRSpec modes (0.7 - 1.27 microns, 0.97 - 1.84 microns, 1.66 - 3.07 microns) are insensitive to the presence of CO. The reddest medium-resolution mode (2.87 - 5.10 microns) is sensitive to all of the molecules assumed in our model: CO, CO2, CH4, C2H2, H2O, HCN and NH3. It competes effectively with the three bluest modes on the information encoded on cloud abundance and particle size. It is also competitive with the low-resolution prism mode (0.6 - 5.3 microns) on the inference of every parameter except for the temperature and ammonia abundance. We recommend astronomers to use the reddest medium-resolution NIRSpec mode for studying the atmospheric chemistry of 800-1200 K warm Neptunes; its corresponding high-resolution counterpart offers diminishing returns. We compare our findings to previous JWST information content analyses that favor the blue orders, and suggest that the reliance on chemical equilibrium could lead to biased outcomes if this assumption does not apply. A simple, pressure-independent diagnostic for identifying chemical disequilibrium is proposed based on measuring the abundances of H2O, CO and CO2., Comment: 21 pages, 17 figures, 3 tables. Accepted by AJ
Published: 2020
Full Text: View/download PDF

39. A Question-Centric Model for Visual Question Answering in Medical Imaging

Author: Vu, Minh H., Löfstedt, Tommy, Nyholm, Tufve, and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Image and Video Processing, Statistics - Machine Learning
Abstract: Deep learning methods have proven extremely effective at performing a variety of medical image analysis tasks. With their potential use in clinical routine, their lack of transparency has however been one of their few weak points, raising concerns regarding their behavior and failure modes. While most research to infer model behavior has focused on indirect strategies that estimate prediction uncertainties and visualize model support in the input image space, the ability to explicitly query a prediction model regarding its image content offers a more direct way to determine the behavior of trained models. To this end, we present a novel Visual Question Answering approach that allows an image to be queried by means of a written question. Experiments on a variety of medical and natural image datasets show that by fusing image and question features in a novel way, the proposed approach achieves an equal or higher accuracy compared to current methods., Comment: Accepted at IEEE Transactions on Medical Imaging
Published: 2020
Full Text: View/download PDF

40. Supervised Machine Learning for Intercomparison of Model Grids of Brown Dwarfs: Application to GJ 570D and the Epsilon Indi B Binary System

Author: Oreshenko, Maria, Kitzmann, Daniel, Marquez-Neila, Pablo, Malik, Matej, Bowler, Brendan P., Burgasser, Adam J., Sznitman, Raphael, Fisher, Chloe E., and Heng, Kevin
Subjects: Astrophysics - Solar and Stellar Astrophysics, Astrophysics - Earth and Planetary Astrophysics
Abstract: Self-consistent model grids of brown dwarfs involve complex physics and chemistry, and are often computed using proprietary computer codes, making it challenging to identify the reasons for discrepancies between model and data as well as between the models produced by different research groups. In the current study, we demonstrate a novel method for analyzing brown dwarf spectra, which combines the use of the Sonora, AMES-Cond and HELIOS model grids with the supervised machine learning method of the random forest. Besides performing atmospheric retrieval, the random forest enables information content analysis of the three model grids as a natural outcome of the method, both individually on each grid and by comparing the grids against one another, via computing large suites of mock retrievals. Our analysis reveals that the different choices made in modelling the alkali line shapes hinder the use of the alkali lines as gravity indicators. Nevertheless, the spectrum longward of 1.2 micron encodes enough information on the surface gravity to allow its inference from retrieval. Temperature may be accurately and precisely inferred independent of the choice of model grid, but not the surface gravity. We apply random forest retrieval to three objects: the benchmark T7.5 brown dwarf GJ 570D; and Epsilon Indi Ba (T1.5 brown dwarf) and Bb (T6 brown dwarf), which are part of a binary system and have measured dynamical masses. For GJ 570D, the inferred effective temperature and surface gravity are consistent with previous studies. For Epsilon Indi Ba and Bb, the inferred surface gravities are broadly consistent with the values informed by the dynamical masses., Comment: Accepted for publication in The Astronomical Journal
Published: 2019
Full Text: View/download PDF

41. Interpreting High-Resolution Spectroscopy of Exoplanets Using Cross-Correlations and Supervised Machine Learning

Author: Fisher, Chloe, Hoeijmakers, H. Jens, Kitzmann, Daniel, Márquez-Neila, Pablo, Grimm, Simon L., Sznitman, Raphael, and Heng, Kevin
Subjects: Astrophysics - Earth and Planetary Astrophysics
Abstract: We present a new method for performing atmospheric retrieval on ground-based, high-resolution data of exoplanets. Our method combines cross-correlation functions with a random forest, a supervised machine learning technique, to overcome challenges associated with high-resolution data. A series of cross-correlation functions are concatenated to give a "CCF-sequence" for each model atmosphere, which reduces the dimensionality by a factor of ~100. The random forest, trained on our grid of ~65,000 models, provides a likelihood-free method of retrieval. The pre-computed grid spans 31 values of both temperature and metallicity, and incorporates a realistic noise model. We apply our method to HARPS-N observations of the ultra-hot Jupiter KELT-9b, and obtain a metallicity consistent with solar (logM = $-0.2\pm0.2$). Our retrieved transit chord temperature (T = $6000^{+0}_{-200}$K) is unreliable as the ion cross-correlations lie outside of the training set, which we interpret as being indicative of missing physics in our atmospheric model. We compare our method to traditional nested-sampling, as well as other machine learning techniques, such as Bayesian neural networks. We demonstrate that the likelihood-free aspect of the random forest makes it more robust than nested-sampling to different error distributions, and that the Bayesian neural network we tested is unable to reproduce complex posteriors. We also address the claim in Cobb et al. (2019) that our random forest retrieval technique can be over-confident but incorrect. We show that this is an artefact of the training set, rather than the machine learning method, and that the posteriors agree with those obtained using nested-sampling., Comment: 15 pages, 18 figures
Published: 2019
Full Text: View/download PDF

42. Analysis of optical coherence tomography biomarker probability detection in central serous chorioretinopathy by using an artificial intelligence-based biomarker detector

Author: Ferro Desideri, Lorenzo, Anguita, Rodrigo, Berger, Lieselotte E., Feenstra, Helena M. A., Scandella, Davide, Sznitman, Raphael, Boon, Camiel J. F., van Dijk, Elon H. C., and Zinkernagel, Martin S.
Published: 2024
Full Text: View/download PDF

43. Cataract-1K Dataset for Deep-Learning-Assisted Analysis of Cataract Surgery Videos

Author: Ghamsarian, Negin, El-Shabrawi, Yosuf, Nasirihaghighi, Sahar, Putzgruber-Adamitsch, Doris, Zinkernagel, Martin, Wolf, Sebastian, Schoeffmann, Klaus, and Sznitman, Raphael
Published: 2024
Full Text: View/download PDF

44. Automated liver segmental volume ratio quantification on non-contrast T1–Vibe Dixon liver MRI using deep learning

Author: Zbinden, Lukas, Catucci, Damiano, Suter, Yannick, Hulbert, Leona, Berzigotti, Annalisa, Brönnimann, Michael, Ebner, Lukas, Christe, Andreas, Obmann, Verena Carola, Sznitman, Raphael, and Huber, Adrian Thomas
Published: 2023
Full Text: View/download PDF

45. Machine learning for predicting Plasmodium liver stage development in vitro using microscopy imaging

Author: Otesteanu, Corin F., Caldelari, Reto, Heussler, Volker, and Sznitman, Raphael
Published: 2024
Full Text: View/download PDF

46. Domain Adaptation for Medical Image Segmentation Using Transformation-Invariant Self-training

Author: Ghamsarian, Negin, primary, Gamazo Tejero, Javier, additional, Márquez-Neila, Pablo, additional, Wolf, Sebastian, additional, Zinkernagel, Martin, additional, Schoeffmann, Klaus, additional, and Sznitman, Raphael, additional
Published: 2023
Full Text: View/download PDF

47. Localized Questions in Medical Visual Question Answering

Author: Tascon-Morales, Sergio, primary, Márquez-Neila, Pablo, additional, and Sznitman, Raphael, additional
Published: 2023
Full Text: View/download PDF

48. Geometric Ultrasound Localization Microscopy

Author: Hahne, Christopher, primary and Sznitman, Raphael, additional
Published: 2023
Full Text: View/download PDF

49. Fused Detection of Retinal Biomarkers in OCT Volumes

Author: Kurmann, Thomas, Márquez-Neila, Pablo, Yu, Siqing, Munk, Marion, Wolf, Sebastian, and Sznitman, Raphael
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Optical Coherence Tomography (OCT) is the primary imaging modality for detecting pathological biomarkers associated to retinal diseases such as Age-Related Macular Degeneration. In practice, clinical diagnosis and treatment strategies are closely linked to biomarkers visible in OCT volumes and the ability to identify these plays an important role in the development of ophthalmic pharmaceutical products. In this context, we present a method that automatically predicts the presence of biomarkers in OCT cross-sections by incorporating information from the entire volume. We do so by adding a bidirectional LSTM to fuse the outputs of a Convolutional Neural Network that predicts individual biomarkers. We thus avoid the need to use pixel-wise annotations to train our method, and instead provide fine-grained biomarker information regardless. On a dataset of 416 volumes, we show that our approach imposes coherence between biomarker predictions across volume slices and our predictions are superior to several existing approaches.
Published: 2019

50. Concept-Centric Visual Turing Tests for Method Validation

Author: Fountoukidou, Tatiana and Sznitman, Raphael
Subjects: Computer Science - Machine Learning, Electrical Engineering and Systems Science - Image and Video Processing, Statistics - Machine Learning
Abstract: Recent advances in machine learning for medical imaging have led to impressive increases in model complexity and overall capabilities. However, the ability to discern the precise information a machine learning method is using to make decisions has lagged behind and it is often unclear how these performances are in fact achieved. Conventional evaluation metrics that reduce method performance to a single number or a curve only provide limited insights. Yet, systems used in clinical practice demand thorough validation that such crude characterizations miss. To this end, we present a framework to evaluate classification methods based on a number of interpretable concepts that are crucial for a clinical task. Our approach is inspired by the Turing Test concept and how to devise a test that adaptively questions a method for its ability to interpret medical images. To do this, we make use of a Twenty Questions paradigm whereby we use a probabilistic model to characterize the method's capacity to grasp task-specific concepts, and we introduce a strategy to sequentially query the method according to its previous answers. The results show that the probabilistic model is able to expose both the dataset's and the method's biases, and can be used to reduced the number of queries needed for confident performance evaluation., Comment: 9 pages, 8 figures
Published: 2019
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Region

Database

Publisher

575 results on '"Sznitman, Raphael"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources