Author: "Niethammer, Marc" / Publication Year Range: This year - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Niethammer, Marc"' showing total 11 results

1. LiVOS: Light Video Object Segmentation with Gated Linear Matching

Author: Liu, Qin, Wang, Jianfeng, Yang, Zhengyuan, Li, Linjie, Lin, Kevin, Niethammer, Marc, and Wang, Lijuan
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Semi-supervised video object segmentation (VOS) has been largely driven by space-time memory (STM) networks, which store past frame features in a spatiotemporal memory to segment the current frame via softmax attention. However, STM networks face memory limitations due to the quadratic complexity of softmax matching, restricting their applicability as video length and resolution increase. To address this, we propose LiVOS, a lightweight memory network that employs linear matching via linear attention, reformulating memory matching into a recurrent process that reduces the quadratic attention matrix to a constant-size, spatiotemporal-agnostic 2D state. To enhance selectivity, we introduce gated linear matching, where a data-dependent gate matrix is multiplied with the state matrix to control what information to retain or discard. Experiments on diverse benchmarks demonstrated the effectiveness of our method. It achieved 64.8 J&F on MOSE and 85.1 J&F on DAVIS, surpassing all non-STM methods and narrowing the gap with STM-based approaches. For longer and higher-resolution videos, it matched STM-based methods with 53% less GPU memory and supports 4096p inference on a 32G consumer-grade GPU--a previously cost-prohibitive capability--opening the door for long and high-resolution video foundation models.
Comment: Code & models: https://github.com/uncbiag/LiVOS
Published: 2024

2. multiGradICON: A Foundation Model for Multimodal Medical Image Registration

Author: Demir, Basar, Tian, Lin, Greer, Thomas Hastings, Kwitt, Roland, Vialard, Francois-Xavier, Estepar, Raul San Jose, Bouix, Sylvain, Rushmore, Richard Jarrett, Ebrahim, Ebrahim, and Niethammer, Marc
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Modern medical image registration approaches predict deformations using deep networks. These approaches achieve state-of-the-art (SOTA) registration accuracy and are generally fast. However, deep learning (DL) approaches are, in contrast to conventional non-deep-learning-based approaches, anatomy-specific. Recently, a universal deep registration approach, uniGradICON, has been proposed. However, uniGradICON focuses on monomodal image registration. In this work, we therefore develop multiGradICON as a first step towards universal *multimodal* medical image registration. Specifically, we show that 1) we can train a DL registration model that is suitable for monomodal *and* multimodal registration; 2) loss function randomization can increase multimodal registration accuracy; and 3) training a model with multimodal data helps multimodal generalization. Our code and the multiGradICON model are available at https://github.com/uncbiag/uniGradICON.
Published: 2024

3. CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models

Author: Xia, Peng, Chen, Ze, Tian, Juanxi, Gong, Yangrui, Hou, Ruibo, Xu, Yue, Wu, Zhenbang, Fan, Zhiyuan, Zhou, Yiyang, Zhu, Kangyu, Zheng, Wenhao, Wang, Zhaoyang, Wang, Xiao, Zhang, Xuchao, Bansal, Chetan, Niethammer, Marc, Huang, Junzhou, Zhu, Hongtu, Li, Yun, Sun, Jimeng, Ge, Zongyuan, Li, Gang, Zou, James, and Yao, Huaxiu
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computers and Society
Abstract: Artificial intelligence has significantly impacted medical applications, particularly with the advent of Medical Large Vision Language Models (Med-LVLMs), sparking optimism for the future of automated and personalized healthcare. However, the trustworthiness of Med-LVLMs remains unverified, posing significant risks for future model deployment. In this paper, we introduce CARES and aim to comprehensively evaluate the Trustworthiness of Med-LVLMs across the medical domain. We assess the trustworthiness of Med-LVLMs across five dimensions, including trustfulness, fairness, safety, privacy, and robustness. CARES comprises about 41K question-answer pairs in both closed and open-ended formats, covering 16 medical image modalities and 27 anatomical regions. Our analysis reveals that the models consistently exhibit concerns regarding trustworthiness, often displaying factual inaccuracies and failing to maintain fairness across different demographic groups. Furthermore, they are vulnerable to attacks and demonstrate a lack of privacy awareness. We publicly release our benchmark and code at https://cares-ai.github.io/.
Comment: NeurIPS 2024 Datasets and Benchmarks Track
Published: 2024

4. CARL: A Framework for Equivariant Image Registration

Author: Greer, Hastings, Tian, Lin, Vialard, Francois-Xavier, Kwitt, Roland, Estepar, Raul San Jose, and Niethammer, Marc
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Image registration estimates spatial correspondences between a pair of images. These estimates are typically obtained via numerical optimization or regression by a deep network. A desirable property of such estimators is that a correspondence estimate (e.g., the true oracle correspondence) for an image pair is maintained under deformations of the input images. Formally, the estimator should be equivariant to a desired class of image transformations. In this work, we present careful analyses of the desired equivariance properties in the context of multi-step deep registration networks. Based on these analyses we 1) introduce the notions of $[U,U]$ equivariance (network equivariance to the same deformations of the input images) and $[W,U]$ equivariance (where input images can undergo different deformations); we 2) show that in a suitable multi-step registration setup it is sufficient for overall $[W,U]$ equivariance if the first step has $[W,U]$ equivariance and all others have $[U,U]$ equivariance; we 3) show that common displacement-predicting networks only exhibit $[U,U]$ equivariance to translations instead of the more powerful $[W,U]$ equivariance; and we 4) show how to achieve multi-step $[W,U]$ equivariance via a coordinate-attention mechanism combined with displacement-predicting refinement layers (CARL). Overall, our approach obtains excellent practical registration performance on several 3D medical image registration tasks and outperforms existing unsupervised approaches for the challenging problem of abdomen registration.
Published: 2024

5. Rethinking Interactive Image Segmentation with Low Latency, High Quality, and Diverse Prompts

Author: Liu, Qin, Cho, Jaemin, Bansal, Mohit, and Niethammer, Marc
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The goal of interactive image segmentation is to delineate specific regions within an image via visual or language prompts. Low-latency and high-quality interactive segmentation with diverse prompts remains challenging for existing specialist and generalist models. Specialist models, with their limited prompts and task-specific designs, experience high latency because the image must be recomputed every time the prompt is updated, due to the joint encoding of image and visual prompts. Generalist models, exemplified by the Segment Anything Model (SAM), have recently excelled in prompt diversity and efficiency, lifting image segmentation to the foundation model era. However, for high-quality segmentations, SAM still lags behind state-of-the-art specialist models despite SAM being trained with 100x more segmentation masks. In this work, we delve deep into the architectural differences between the two types of models. We observe that dense representation and fusion of visual prompts are the key design choices contributing to the high segmentation quality of specialist models. In light of this, we reintroduce this dense design into the generalist models, to facilitate the development of generalist models with high segmentation quality. To densely represent diverse visual prompts, we propose to use a dense map to capture five types: clicks, boxes, polygons, scribbles, and masks. Thus, we propose SegNext, a next-generation interactive segmentation approach offering low latency, high quality, and diverse prompt support. Our method outperforms current state-of-the-art methods on HQSeg-44K and DAVIS, both quantitatively and qualitatively.
Comment: CVPR 2024. Code: https://github.com/uncbiag/SegNext
Published: 2024

6. Leveraging Near-Field Lighting for Monocular Depth Estimation from Endoscopy Videos

Author: Paruchuri, Akshay, Ehrenstein, Samuel, Wang, Shuxian, Fried, Inbar, Pizer, Stephen M., Niethammer, Marc, and Sengupta, Roni
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Monocular depth estimation in endoscopy videos can enable assistive and robotic surgery to obtain better coverage of the organ and detection of various health issues. Despite promising progress on mainstream, natural image depth estimation, techniques perform poorly on endoscopy images due to a lack of strong geometric features and challenging illumination effects. In this paper, we utilize the photometric cues, i.e., the light emitted from an endoscope and reflected by the surface, to improve monocular depth estimation. We first create two novel loss functions with supervised and self-supervised variants that utilize a per-pixel shading representation. We then propose a novel depth refinement network (PPSNet) that leverages the same per-pixel shading representation. Finally, we introduce teacher-student transfer learning to produce better depth maps from both synthetic data with supervision and clinical data with self-supervision. We achieve state-of-the-art results on the C3VD dataset while estimating high-quality depth maps from clinical data. Our code, pre-trained models, and supplementary materials can be found on our project page: https://ppsnet.github.io/
Comment: Accepted to ECCV 2024. 27 pages, 8 tables, 8 figures. Updated to include reference to clinical dataset
Published: 2024

7. A Unified Model for Longitudinal Multi-Modal Multi-View Prediction with Missingness

Author: Chen, Boqi, Oliva, Junier, and Niethammer, Marc
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Medical records often consist of different modalities, such as images, text, and tabular information. Integrating all modalities offers a holistic view of a patient's condition, while analyzing them longitudinally provides a better understanding of disease progression. However, real-world longitudinal medical records present challenges: 1) patients may lack some or all of the data for a specific timepoint, and 2) certain modalities or views might be absent for all patients during a particular period. In this work, we introduce a unified model for longitudinal multi-modal multi-view prediction with missingness. Our method allows as many timepoints as desired for input, and aims to leverage all available data, regardless of their availability. We conduct extensive experiments on the knee osteoarthritis dataset from the Osteoarthritis Initiative for pain and Kellgren-Lawrence grade prediction at a future timepoint. We demonstrate the effectiveness of our method by comparing results from our unified model to specific models that use the same modality and view combinations during training and evaluation. We also show the benefit of having extended temporal data and provide post-hoc analysis for a deeper understanding of each modality/view's importance for different tasks.
Published: 2024

8. NeuralOCT: Airway OCT Analysis via Neural Fields

Author: Jiao, Yining, Oldenburg, Amy, Xu, Yinghan, Soundararajan, Srikamal, Zdanski, Carlton, Kimbell, Julia, and Niethammer, Marc
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Optical coherence tomography (OCT) is a popular modality in ophthalmology and is also used intravascularly. Our interest in this work is OCT in the context of airway abnormalities in infants and children, where the high resolution of OCT and the fact that it is radiation-free are important. The goal of airway OCT is to provide accurate estimates of airway geometry (in 2D and 3D) to assess airway abnormalities such as subglottic stenosis. We propose NeuralOCT, a learning-based approach to process airway OCT images. Specifically, NeuralOCT extracts 3D geometries from OCT scans by robustly bridging two steps: point cloud extraction via 2D segmentation and 3D reconstruction from point clouds via neural fields. Our experiments show that NeuralOCT produces accurate and robust 3D airway reconstructions with an average A-line error smaller than 70 micrometers. Our code will be available on GitHub.
Published: 2024

9. uniGradICON: A Foundation Model for Medical Image Registration

Author: Tian, Lin, Greer, Hastings, Kwitt, Roland, Vialard, Francois-Xavier, Estepar, Raul San Jose, Bouix, Sylvain, Rushmore, Richard, and Niethammer, Marc
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Conventional medical image registration approaches directly optimize over the parameters of a transformation model. These approaches have been highly successful and are used generically for registrations of different anatomical regions. Recent deep registration networks are incredibly fast and accurate but are only trained for specific tasks. Hence, they are no longer generic registration approaches. We therefore propose uniGradICON, a first step toward a foundation model for registration providing 1) great performance *across* multiple datasets which is not feasible for current learning-based registration methods, 2) zero-shot capabilities for new registration tasks suitable for different acquisitions, anatomical regions, and modalities compared to the training dataset, and 3) a strong initialization for finetuning on out-of-distribution registration tasks. UniGradICON unifies the speed and accuracy benefits of learning-based registration algorithms with the generic applicability of conventional non-deep-learning approaches. We extensively trained and evaluated uniGradICON on twelve different public datasets. Our code and the uniGradICON model are available at https://github.com/uncbiag/uniGradICON.
Published: 2024

10. Investigating the relationship between radiographic joint space width loss and deep learning-derived magnetic resonance imaging-based cartilage thickness loss in the medial weight-bearing region of the tibiofemoral joint

Author: Minnig, Mary Catherine C., Arbeeva, Liubov, Niethammer, Marc, Nissman, Daniel, Lund, Jennifer L., Marron, J.S., Golightly, Yvonne M., and Nelson, Amanda E.
Published: 2024

11. Joint Depth Prediction and Semantic Segmentation with Multi-View SAM

Author: Shvets, Mykhailo, Zhao, Dongxu, Niethammer, Marc, Sengupta, Roni, and Berg, Alexander C.
Published: 2024
