Author: "Hartley, Richard" / Publication Year Range: This year - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Hartley, Richard"' showing total 12 results

1. Learning-based Multi-View Stereo: A Survey

Author: Wang, Fangjinhua, Zhu, Qingtian, Chang, Di, Gao, Quankai, Han, Junlin, Zhang, Tong, Hartley, Richard, and Pollefeys, Marc
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: 3D reconstruction aims to recover the dense 3D structure of a scene. It plays an essential role in various applications such as Augmented/Virtual Reality (AR/VR), autonomous driving and robotics. Leveraging multiple views of a scene captured from different viewpoints, Multi-View Stereo (MVS) algorithms synthesize a comprehensive 3D representation, enabling precise reconstruction in complex environments. Due to its efficiency and effectiveness, MVS has become a pivotal method for image-based 3D reconstruction. Recently, with the success of deep learning, many learning-based MVS methods have been proposed, achieving impressive performance against traditional methods. We categorize these learning-based methods as: depth map-based, voxel-based, NeRF-based, 3D Gaussian Splatting-based, and large feed-forward methods. Among these, we focus significantly on depth map-based methods, which are the main family of MVS due to their conciseness, flexibility and scalability. In this survey, we provide a comprehensive review of the literature at the time of this writing. We investigate these learning-based methods, summarize their performances on popular benchmarks, and discuss promising future research directions in this area.
Published: 2024

2. InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long Motion Generation

Author: Zhang, Zeyu, Liu, Akide, Chen, Qi, Chen, Feng, Reid, Ian, Hartley, Richard, Zhuang, Bohan, and Tang, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Text-to-motion generation holds potential for film, gaming, and robotics, yet current methods often prioritize short motion generation, making it challenging to produce long motion sequences effectively: (1) Current methods struggle to handle long motion sequences as a single input due to prohibitively high computational cost; (2) Breaking down the generation of long motion sequences into shorter segments can result in inconsistent transitions and requires interpolation or inpainting, which lacks entire sequence modeling. To solve these challenges, we propose InfiniMotion, a method that generates continuous motion sequences of arbitrary length within an autoregressive framework. We highlight its groundbreaking capability by generating a continuous 1-hour human motion with around 80,000 frames. Specifically, we introduce the Motion Memory Transformer with Bidirectional Mamba Memory, enhancing the transformer's memory to process long motion sequences effectively without overwhelming computational resources. Notably our method achieves over 30% improvement in FID and 6 times longer demonstration compared to previous state-of-the-art methods, showcasing significant advancements in long motion generation. See project webpage: https://steve-zeyu-zhang.github.io/InfiniMotion/
Published: 2024

3. Unlocking the Secrets of Linear Complexity Sequence Model from A Unified Perspective

Author: Qin, Zhen, Shen, Xuyang, Li, Dong, Sun, Weigao, Birchfield, Stan, Hartley, Richard, and Zhong, Yiran
Subjects: Computer Science - Computation and Language
Abstract: We present the Linear Complexity Sequence Model (LCSM), a comprehensive solution that unites various sequence modeling techniques with linear complexity, including linear attention, state space model, long convolution, and linear RNN, within a single framework. The goal is to enhance comprehension of these models by analyzing the impact of each component from a cohesive and streamlined viewpoint. Specifically, we segment the modeling processes of these models into three distinct stages: Expand, Oscillation, and Shrink (EOS), with each model having its own specific settings. The Expand stage involves projecting the input signal onto a high-dimensional memory state. This is followed by recursive operations performed on the memory state in the Oscillation stage. Finally, the memory state is projected back to a low-dimensional space in the Shrink stage. We perform comprehensive experiments to analyze the impact of different stage settings on language modeling and retrieval tasks. Our results show that data-driven methods are crucial for the effectiveness of the three stages in language modeling, whereas hand-crafted methods yield better performance in retrieval tasks.
Comment: Technical report. Yiran Zhong is the corresponding author
Published: 2024

4. Severity Controlled Text-to-Image Generative Model Bias Manipulation

Author: Vice, Jordan, Akhtar, Naveed, Hartley, Richard, and Mian, Ajmal
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Text-to-image (T2I) generative models are gaining wide popularity, especially in public domains. However, their intrinsic bias and potential malicious manipulations remain under-explored. Charting the susceptibility of T2I models to such manipulation, we first expose the new possibility of a dynamic and computationally efficient exploitation of model bias by targeting the embedded language models. By leveraging mathematical foundations of vector algebra, our technique enables a scalable and convenient control over the severity of output manipulation through model bias. As a by-product, this control also allows a form of precise prompt engineering to generate images which are generally implausible with regular text prompts. We also demonstrate a constructive application of our manipulation for balancing the frequency of generated classes - as in model debiasing. Our technique does not require training and is also framed as a backdoor attack with severity control using semantically-null text triggers in the prompts. With extensive analysis, we present interesting qualitative and quantitative results to expose potential manipulation possibilities for T2I models. Key-words: Text-to-Image Models, Generative Models, Backdoor Attacks, Prompt Engineering, Bias.
Comment: This research was supported by National Intelligence and Security Discovery Research Grants (project# NS220100007), funded by the Department of Defence Australia
Published: 2024

5. Motion Mamba: Efficient and Long Sequence Motion Generation

Author: Zhang, Zeyu, Liu, Akide, Reid, Ian, Hartley, Richard, Zhuang, Bohan, and Tang, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Human motion generation stands as a significant pursuit in generative computer vision, while achieving long-sequence and efficient motion generation remains challenging. Recent advancements in state space models (SSMs), notably Mamba, have showcased considerable promise in long sequence modeling with an efficient hardware-aware design, which appears to be a promising direction to build motion generation model upon it. Nevertheless, adapting SSMs to motion generation faces hurdles since the lack of a specialized design architecture to model motion sequence. To address these challenges, we propose Motion Mamba, a simple and efficient approach that presents the pioneering motion generation model utilized SSMs. Specifically, we design a Hierarchical Temporal Mamba (HTM) block to process temporal data by ensemble varying numbers of isolated SSM modules across a symmetric U-Net architecture aimed at preserving motion consistency between frames. We also design a Bidirectional Spatial Mamba (BSM) block to bidirectionally process latent poses, to enhance accurate motion generation within a temporal frame. Our proposed method achieves up to 50% FID improvement and up to 4 times faster on the HumanML3D and KIT-ML datasets compared to the previous best diffusion-based method, which demonstrates strong capabilities of high-quality long sequence motion modeling and real-time human motion generation. See project website: https://steve-zeyu-zhang.github.io/MotionMamba/
Comment: Accepted to ECCV 2024
Published: 2024

6. Variability of fallout radionuclides (FRNs) in river channels: implications for sediment tracing

Author: Muñoz-Arcos, Enrique, Millward, Geoffrey E., Clason, Caroline C., Hartley, Richard, Bravo-Linares, Claudio, and Blake, William H.
Published: 2024

7. Computer Vision

Author: Ahad, Md Atiqur Rahman, Mahbub, Upal, Turk, Matthew, and Hartley, Richard
Published: 2024

8. Human mitochondrial glutathione transferases: Kinetic parameters and accommodation of a mitochondria-targeting group in substrates

Author: Cardwell, Patrick A., Del Moro, Carlo, Murphy, Michael P., Lapthorn, Adrian J., and Hartley, Richard C.
Published: 2024

9. Drug discovery for ageing: SIMPs, NEDs and screening challenges

Author: Faragher, Richard G. A. and Hartley, Richard C.
Published: 2024

10. Physico-mechanical properties of halite and of gypsum-mudstone crushed rock backfills of the Mercia Mudstone Group

Author: Umeaghadi, Chibuike C., Lucock, Thomas, Hartley, Richard, Berry, Jessica, and Bailey, Matthew T.
Published: 2024

11. A multisite examination of women veterans in veterans treatment courts: a gendered comparison of demography, criminal history, program requirements, and substance use and mental health issues

Author: Hartley, Richard D. and Baldwin, Julie M.
Published: 2024

12. Weakly-supervised Depth Estimation and Image Deblurring via Dual-Pixel Sensors.

Author: Pan L, Hartley R, Liu L, Xu Z, Chowdhury S, Yang Y, Zhang H, Li H, and Liu M
Abstract: Dual-pixel (DP) imaging sensors are getting more popularly adopted by modern cameras. A DP camera captures a pair of images in a single snapshot by splitting each pixel in half. Several previous studies show how to recover depth information by treating the DP pair as an approximate stereo pair. However, dual-pixel disparity occurs only in image regions with defocus blur which is unlike classic stereo disparity. Heavy defocus blur in DP pairs affects the performance of depth estimation approaches based on matching. Therefore, we treat the blur removal and the depth estimation as a joint problem. We investigate the formation of the DP pair, which links the blur and depth information, rather than blindly removing the blur effect. We propose a mathematical DP model that can improve depth estimation by the blur. This exploration motivated us to propose our previous work, an end-to-end DDDNet (DP-based Depth and Deblur Network), which jointly estimates depth and restores the image in a supervised fashion. However, collecting the ground-truth (GT) depth map for the DP pair is challenging and limits the depth estimation potential of the DP sensor. Therefore, we propose an extension of the DDDNet, called WDDNet (Weakly-supervised Depth and Deblur Network), which includes an efficient reblur solver that does not require GT depth maps for training. To achieve this, we convert all-in-focus images into supervisory signals for unsupervised depth estimation in our WDDNet. We jointly estimate an all-in-focus image and a disparity map, then use a Reblur and Fstack module to regularize the disparity estimation and image restoration. We conducted extensive experiments on synthetic and real data to demonstrate the competitive performance of our method when compared to state-of-the-art (SOTA) supervised approaches. Index Terms: Dual-pixel Sensor, Weakly-supervised, Depth Estimation, Deblur and Reblur.
Published: 2024
