Author: "Kooij, Julian F. P." / Topic: computer vision and pattern recognition (cs.cv) - Searchworks@Jio Institute Digital Library Search Results

Your search for '"Kooij, Julian F. P."' returned 5 results

1. Convolutional Cross-View Pose Estimation

Author: Xia, Zimin, Booij, Olaf, and Kooij, Julian F. P.
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
Abstract: We propose a novel end-to-end method for cross-view pose estimation. Given a ground-level query image and an aerial image that covers the query's local neighborhood, the 3 Degrees-of-Freedom camera pose of the query is estimated by matching its image descriptor to descriptors of local regions within the aerial image. The orientation-aware descriptors are obtained by using a translationally equivariant convolutional ground image encoder and contrastive learning. The Localization Decoder produces a dense probability distribution in a coarse-to-fine manner with a novel Localization Matching Upsampling module. A smaller Orientation Decoder produces a vector field to condition the orientation estimate on the localization. Our method is validated on the VIGOR and KITTI datasets, where it surpasses the state-of-the-art baseline by 72% and 36% in median localization error for comparable orientation estimation accuracy. The predicted probability distribution can represent localization ambiguity, and enables rejecting possibly erroneous predictions. Without re-training, the model can infer on ground images with different fields of view and utilize orientation priors if available. On the Oxford RobotCar dataset, our method can reliably estimate the ego-vehicle's pose over time, achieving a median localization error under 1 meter and a median orientation error of around 1 degree at 14 FPS.
Published: 2023

2. How do Cross-View and Cross-Modal Alignment Affect Representations in Contrastive Learning?

Author: Hehn, Thomas M., Kooij, Julian F. P., and Gavrila, Dariu M.
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine Learning (cs.LG)
Abstract: Various state-of-the-art self-supervised visual representation learning approaches take advantage of data from multiple sensors by aligning the feature representations across views and/or modalities. In this work, we investigate how aligning representations affects the visual features obtained from cross-view and cross-modal contrastive learning on images and point clouds. On five real-world datasets and on five tasks, we train and evaluate 108 models based on four pretraining variations. We find that cross-modal representation alignment discards complementary visual information, such as color and texture, and instead emphasizes redundant depth cues. The depth cues obtained from pretraining improve downstream depth prediction performance. Overall, cross-modal alignment leads to more robust encoders than pretraining by cross-view alignment, especially on depth prediction, instance segmentation, and object detection.
Published: 2022

3. Visual Cross-View Metric Localization with Dense Uncertainty Estimates

Author: Xia, Zimin, Booij, Olaf, Manfredi, Marco, and Kooij, Julian F. P.
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
Abstract: This work addresses visual cross-view metric localization for outdoor robotics. Given a ground-level color image and a satellite patch that contains the local surroundings, the task is to identify the location of the ground camera within the satellite patch. Related work addressed this task for range-sensors (LiDAR, Radar), but for vision, only as a secondary regression step after an initial cross-view image retrieval step. Since the local satellite patch could also be retrieved through any rough localization prior (e.g. from GPS/GNSS, temporal filtering), we drop the image retrieval objective and focus on the metric localization only. We devise a novel network architecture with denser satellite descriptors, similarity matching at the bottleneck (rather than at the output as in image retrieval), and a dense spatial distribution as output to capture multi-modal localization ambiguities. We compare against a state-of-the-art regression baseline that uses global image descriptors. Quantitative and qualitative experimental results on the recently proposed VIGOR and the Oxford RobotCar datasets validate our design. The produced probabilities are correlated with localization accuracy, and can even be used to roughly estimate the ground camera's heading when its orientation is unknown. Overall, our method reduces the median metric localization error by 51%, 37%, and 28% compared to the state-of-the-art when generalizing respectively in the same area, across areas, and across time. ECCV 2022
Published: 2022

4. SliceMatch: Geometry-guided Aggregation for Cross-View Pose Estimation

Author: Lentsch, Ted, Xia, Zimin, Caesar, Holger, and Kooij, Julian F. P.
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
Abstract: This work addresses cross-view camera pose estimation, i.e., determining the 3-Degrees-of-Freedom camera pose of a given ground-level image w.r.t. an aerial image of the local area. We propose SliceMatch, which consists of ground and aerial feature extractors, feature aggregators, and a pose predictor. The feature extractors extract dense features from the ground and aerial images. Given a set of candidate camera poses, the feature aggregators construct a single ground descriptor and a set of pose-dependent aerial descriptors. Notably, our novel aerial feature aggregator has a cross-view attention module for ground-view guided aerial feature selection and utilizes the geometric projection of the ground camera's viewing frustum on the aerial image to pool features. The efficient construction of aerial descriptors is achieved using precomputed masks. SliceMatch is trained using contrastive learning and pose estimation is formulated as a similarity comparison between the ground descriptor and the aerial descriptors. Compared to the state-of-the-art, SliceMatch achieves a 19% lower median localization error on the VIGOR benchmark using the same VGG16 backbone at 150 frames per second, and a 50% lower error when using a ResNet50 backbone.
Published: 2022

5. Cross-Modal Distillation for RGB-Depth Person Re-Identification

Author: Hafner, Frank, Bhuiyan, Amran, Kooij, Julian F. P., and Granger, Eric
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Image and Video Processing (eess.IV), FOS: Electrical engineering, electronic engineering, information engineering, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Person re-identification is a key challenge for surveillance across multiple sensors. Prompted by the advent of powerful deep learning models for visual recognition, and inexpensive RGB-D cameras and sensor-rich mobile robotic platforms, e.g. self-driving vehicles, we investigate the relatively unexplored problem of cross-modal re-identification of persons between RGB (color) and depth images. The considerable divergence in data distributions across different sensor modalities introduces additional challenges to the typical difficulties like distinct viewpoints, occlusions, and pose and illumination variation. While some work has investigated re-identification across RGB and infrared, we take inspiration from successes in transfer learning from RGB to depth in object detection tasks. Our main contribution is a novel method for cross-modal distillation for robust person re-identification, which learns a shared feature representation space of a person's appearance in both RGB and depth images. In addition, we propose a cross-modal attention mechanism where the gating signal from one modality can dynamically activate the most discriminant CNN filters of the other modality. The proposed distillation method is compared to conventional and deep learning approaches proposed for other cross-domain re-identification tasks. Results obtained on the public BIWI and RobotPKU datasets indicate that the proposed method can significantly outperform the state-of-the-art approaches by up to 16.1% in mean Average Precision (mAP), demonstrating the benefit of the distillation paradigm. The experimental results also indicate that using cross-modal attention considerably improves recognition accuracy with respect to the proposed distillation method and relevant state-of-the-art approaches.
Published: 2018
