4,397 results on '"Gilbert, Andrew"'
Search Results
2. Interpretable Long-term Action Quality Assessment
- Author
Dong, Xu, Liu, Xinran, Li, Wanqing, Adeyemi-Ejeye, Anthony, and Gilbert, Andrew
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Long-term Action Quality Assessment (AQA) evaluates the execution of activities in videos. However, the length presents challenges in fine-grained interpretability, with current AQA methods typically producing a single score by averaging clip features, lacking detailed semantic meanings of individual clips. Long-term videos pose additional difficulty due to the complexity and diversity of actions, exacerbating interpretability challenges. While query-based transformer networks offer promising long-term modeling capabilities, their interpretability in AQA remains unsatisfactory due to a phenomenon we term Temporal Skipping, where the model skips self-attention layers to prevent output degradation. To address this, we propose an attention loss function and a query initialization method to enhance performance and interpretability. Additionally, we introduce a weight-score regression module designed to approximate the scoring patterns observed in human judgments and replace conventional single-score regression, improving the rationality of interpretability. Our approach achieves state-of-the-art results on three real-world, long-term AQA benchmarks. Our code is available at: https://github.com/dx199771/Interpretability-AQA
- Comment
Accepted to British Machine Vision Conference (BMVC) 2024
- Published
- 2024
3. EFT Workshop at Notre Dame
- Author
Smith, Nick, Spitzbart, Daniel, Dickinson, Jennet, Wilson, Jon, Gray, Lindsey, Mohrman, Kelci, Bhattacharya, Saptaparna, Piccinelli, Andrea, Roy, Titas, Paspalaki, Garyfallia, Fontes, Duarte, Martin, Adam, Shepherd, William, Cruz, Sergio Sánchez, Goncalves, Dorival, Gritsan, Andrei, Prosper, Harrison, Junk, Tom, Cranmer, Kyle, Peskin, Michael, Gilbert, Andrew, Langford, Jonathon, Petriello, Frank, Mantani, Luca, Wightman, Andrew, Knight, Charlotte, Shyamsundar, Prasanth, Basnet, Aashwin, Boldrini, Giacomo, and Lannon, Kevin
- Subjects
High Energy Physics - Experiment
- Abstract
The LPC EFT workshop was held April 25-26, 2024 at the University of Notre Dame. The workshop was organized into five thematic sessions: "how far beyond linear" discusses issues of truncation and validity in interpretation of results with an eye towards practicality; "reconstruction-level results" visits the question of how best to design analyses directly targeting inference of EFT parameters; "logistics of combining likelihoods" addresses the challenges of bringing a diverse array of measurements into a cohesive whole; "unfolded results" tackles the question of designing fiducial measurements for later use in EFT interpretations, and the benefits and limitations of unfolding; and "building a sample library" addresses how best to generate simulation samples for use in data analysis. This document serves as a summary of presentations, subsequent discussions, and actionable items identified over the course of the workshop.
- Published
- 2024
4. FILS: Self-Supervised Video Feature Prediction In Semantic Language Space
- Author
Ahmadian, Mona, Guerin, Frank, and Gilbert, Andrew
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
This paper demonstrates a self-supervised approach for learning semantic video representations. Recent vision studies show that a masking strategy for vision and natural language supervision has contributed to developing transferable visual pretraining. Our goal is to achieve a more semantic video representation by leveraging the text related to the video content during the pretraining in a fully self-supervised manner. To this end, we present FILS, a novel self-supervised video Feature prediction In semantic Language Space (FILS). The vision model can capture valuable structured information by correctly predicting masked feature semantics in language space. It is learned using a patch-wise video-text contrastive strategy, in which the text representations act as prototypes for transforming vision features into a language space, which are then used as targets for semantically meaningful feature prediction using our masked encoder-decoder structure. FILS demonstrates remarkable transferability on downstream action recognition tasks, achieving state-of-the-art on challenging egocentric datasets, like Epic-Kitchens, Something-SomethingV2, Charades-Ego, and EGTEA, using ViT-Base. Our efficient method requires less computation and smaller batches compared to previous works.
- Published
- 2024
5. Geometric approaches to Lagrangian averaging
- Author
Gilbert, Andrew D. and Vanneste, Jacques
- Subjects
Physics - Fluid Dynamics, Physics - Atmospheric and Oceanic Physics
- Abstract
Lagrangian averaging theories, most notably the Generalised Lagrangian Mean (GLM) theory of Andrews & McIntyre (1978), have been primarily developed in Euclidean space and Cartesian coordinates. We re-interpret these theories using a geometric, coordinate-free formulation. This gives central roles to the flow map, its decomposition into mean and perturbation maps, and the momentum 1-form dual to the velocity vector. In this interpretation, the Lagrangian mean of any tensorial quantity is obtained by averaging its pull back to the mean configuration. Crucially, the mean velocity is not a Lagrangian mean in this sense. It can be defined in a variety of ways, leading to alternative Lagrangian mean formulations that include GLM and Soward & Roberts' (2010) glm. These formulations share key features which the geometric approach uncovers. We derive governing equations both for the mean flow and for wave activities constraining the dynamics of the perturbations. The presentation focusses on the Boussinesq model for inviscid rotating stratified flows and reviews the necessary tools of differential geometry.
- Comment
To be published in Annual Reviews of Fluid Mechanics (2025)
- Published
- 2024
6. PLOT-TAL -- Prompt Learning with Optimal Transport for Few-Shot Temporal Action Localization
- Author
Fish, Edward, Weinbren, Jon, and Gilbert, Andrew
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
- Abstract
This paper introduces a novel approach to temporal action localization (TAL) in few-shot learning. Our work addresses the inherent limitations of conventional single-prompt learning methods that often lead to overfitting due to the inability to generalize across varying contexts in real-world videos. Recognizing the diversity of camera views, backgrounds, and objects in videos, we propose a multi-prompt learning framework enhanced with optimal transport. This design allows the model to learn a set of diverse prompts for each action, capturing general characteristics more effectively and distributing the representation to mitigate the risk of overfitting. Furthermore, by employing optimal transport theory, we efficiently align these prompts with action features, optimizing for a comprehensive representation that adapts to the multifaceted nature of video data. Our experiments demonstrate significant improvements in action localization accuracy and robustness in few-shot settings on the standard challenging datasets of THUMOS-14 and EpicKitchens100, highlighting the efficacy of our multi-prompt optimal transport approach in overcoming the challenges of conventional few-shot TAL methods.
- Comment
Under Review
- Published
- 2024
7. A Data Augmentation Pipeline to Generate Synthetic Labeled Datasets of 3D Echocardiography Images using a GAN
- Author
Tiago, Cristiana, Gilbert, Andrew, Beela, Ahmed S., Aase, Svein Arne, Snare, Sten Roar, and Sprem, Jurica
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
- Abstract
Due to privacy issues and the limited number of publicly available labeled datasets in the domain of medical imaging, we propose an image generation pipeline to synthesize 3D echocardiographic images with corresponding ground truth labels, to alleviate the need for data collection and for laborious and error-prone human labeling of images for subsequent Deep Learning (DL) tasks. The proposed method utilizes detailed anatomical segmentations of the heart as ground truth label sources. This initial dataset is combined with a second dataset made up of real 3D echocardiographic images to train a Generative Adversarial Network (GAN) to synthesize realistic 3D cardiovascular Ultrasound images paired with ground truth labels. To generate the synthetic 3D dataset, the trained GAN uses high resolution anatomical models from Computed Tomography (CT) as input. A qualitative analysis of the synthesized images showed that the main structures of the heart are well delineated and closely follow the labels obtained from the anatomical models. To assess the usability of these synthetic images for DL tasks, segmentation algorithms were trained to delineate the left ventricle, left atrium, and myocardium. A quantitative analysis of the 3D segmentations given by the models trained with the synthetic images indicated the potential use of this GAN approach to generate 3D synthetic data, use the data to train DL models for different clinical tasks, and therefore tackle the problem of scarcity of 3D labeled echocardiography datasets.
- Published
- 2024
8. On statistical zonostrophic instability and the effect of magnetic fields
- Author
Wang, Chen, Mason, Joanne, and Gilbert, Andrew D.
- Subjects
Physics - Fluid Dynamics
- Abstract
Zonal flows are mean flows in the east-west direction, which are ubiquitous on planets, and can be formed through 'zonostrophic instability': within turbulence or random waves, a weak large-scale zonal flow can grow exponentially to become prominent. In this paper, we study the statistical behaviour of the zonostrophic instability and the effect of magnetic fields. We use a stochastic white noise forcing to drive random waves, and study the growth of a mean flow in this random system. The dispersion relation for the growth rate of the expectation of the mean flow is derived, and properties of the instability are discussed. In the limits of weak and strong magnetic diffusivity, the dispersion relation reduces to manageable expressions, which provide clear insights into the effect of the magnetic field and scaling laws for the threshold of instability. The magnetic field mainly plays a stabilising role and thus impedes the formation of the zonal flow, but under certain conditions it can also have destabilising effects. Numerical simulation of the stochastic flow is performed to confirm the theory. Results indicate that the magnetic field can significantly increase the randomness of the zonal flow. It is found that the zonal flow of an individual realisation may behave very differently from the expectation. For weak magnetic diffusivity and moderate magnetic field strengths, this leads to considerable variation of the outcome, that is, whether or not zonostrophic instability takes place in individual realisations.
- Published
- 2023
9. ViscoNet: Bridging and Harmonizing Visual and Textual Conditioning for ControlNet
- Author
Cheong, Soon Yau, Mustafa, Armin, and Gilbert, Andrew
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
This paper introduces ViscoNet, a novel one-branch-adapter architecture for concurrent spatial and visual conditioning. Our lightweight model requires trainable parameters and dataset size multiple orders of magnitude smaller than the current state-of-the-art IP-Adapter. However, our method successfully preserves the generative power of the frozen text-to-image (T2I) backbone. Notably, it excels in addressing mode collapse, a pervasive issue previously overlooked. Our novel architecture demonstrates outstanding capabilities in achieving a harmonious visual-text balance, unlocking unparalleled versatility in various human image generation tasks, including pose re-targeting, virtual try-on, stylization, person re-identification, and textile transfer. Demo and code are available from the project page: https://soon-yau.github.io/visconet/
- Published
- 2023
10. Multi-Resolution Audio-Visual Feature Fusion for Temporal Action Localization
- Author
Fish, Edward, Weinbren, Jon, and Gilbert, Andrew
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Computer Science - Multimedia
- Abstract
Temporal Action Localization (TAL) aims to identify actions' start, end, and class labels in untrimmed videos. While recent advancements using transformer networks and Feature Pyramid Networks (FPN) have enhanced visual feature recognition in TAL tasks, less progress has been made in the integration of audio features into such frameworks. This paper introduces the Multi-Resolution Audio-Visual Feature Fusion (MRAV-FF), an innovative method to merge audio-visual data across different temporal resolutions. Central to our approach is a hierarchical gated cross-attention mechanism, which discerningly weighs the importance of audio information at diverse temporal scales. Such a technique not only refines the precision of regression boundaries but also bolsters classification confidence. Importantly, MRAV-FF is versatile, making it compatible with existing FPN TAL architectures and offering a significant enhancement in performance when audio data is available.
- Comment
Under Review
- Published
- 2023
11. DECORAIT -- DECentralized Opt-in/out Registry for AI Training
- Author
Balan, Kar, Black, Alex, Jenni, Simon, Gilbert, Andrew, Parsons, Andy, and Collomosse, John
- Subjects
Computer Science - Cryptography and Security, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Image and Video Processing
- Abstract
We present DECORAIT, a decentralized registry through which content creators may assert their right to opt in or out of AI training as well as receive reward for their contributions. Generative AI (GenAI) enables images to be synthesized using AI models trained on vast amounts of data scraped from public sources. Model and content creators who may wish to share their work openly without sanctioning its use for training are thus presented with a data governance challenge. Further, establishing the provenance of GenAI training data is important to creatives to ensure fair recognition and reward for such use. We report a prototype of DECORAIT, which explores hierarchical clustering and a combination of on/off-chain storage to create a scalable decentralized registry to trace the provenance of GenAI training data in order to determine training consent and reward creatives who contribute that data. DECORAIT combines distributed ledger technology (DLT) with visual fingerprinting, leveraging the emerging C2PA (Coalition for Content Provenance and Authenticity) standard to create a secure, open registry through which creatives may express consent and data ownership for GenAI.
- Comment
Proc. of the 20th ACM SIGGRAPH European Conference on Visual Media Production
- Published
- 2023
12. Exploring the Development of Preservice Teachers’ Visions of Equity through Science and Mathematics Integration
- Author
Gilbert, Andrew, Suh, Jennifer, and Choudhry, Fahima
- Published
- 2024
13. MOFO: MOtion FOcused Self-Supervision for Video Understanding
- Author
Ahmadian, Mona, Guerin, Frank, and Gilbert, Andrew
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Self-supervised learning (SSL) techniques have recently produced outstanding results in learning visual representations from unlabeled videos. Despite the importance of motion in supervised learning techniques for action recognition, SSL methods often do not explicitly consider motion information in videos. To address this issue, we propose MOFO (MOtion FOcused), a novel SSL method for focusing representation learning on the motion area of a video, for action recognition. MOFO automatically detects motion areas in videos and uses these to guide the self-supervision task. We use a masked autoencoder which randomly masks out a high proportion of the input sequence; we force a specified percentage of the inside of the motion area to be masked and the remainder from outside. We further incorporate motion information into the finetuning step to emphasise motion in the downstream task. We demonstrate that our motion-focused innovations can significantly boost the performance of the currently leading SSL method (VideoMAE) for action recognition. Our method improves the recent self-supervised Vision Transformer (ViT), VideoMAE, by achieving +2.6%, +2.1%, +1.3% accuracy on Epic-Kitchens verb, noun and action classification, respectively, and +4.7% accuracy on Something-Something V2 action classification. Our proposed approach significantly improves the performance of the current SSL method for action recognition, indicating the importance of explicitly encoding motion in SSL.
- Comment
Accepted at the NeurIPS 2023 Workshop: Self-Supervised Learning - Theory and Practice
- Published
- 2023
14. DIFF-NST: Diffusion Interleaving For deFormable Neural Style Transfer
- Author
Ruta, Dan, Tarrés, Gemma Canet, Gilbert, Andrew, Shechtman, Eli, Kolkin, Nicholas, and Collomosse, John
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Neural Style Transfer (NST) is the field of study applying neural techniques to modify the artistic appearance of a content image to match the style of a reference style image. Traditionally, NST methods have focused on texture-based image edits, affecting mostly low level information and keeping most image structures the same. However, style-based deformation of the content is desirable for some styles, especially in cases where the style is abstract or the primary concept of the style is in its deformed rendition of some content. With the recent introduction of diffusion models, such as Stable Diffusion, we can access far more powerful image generation techniques, enabling new possibilities. In our work, we propose using this new class of models to perform style transfer while enabling deformable style transfer, an elusive capability in previous models. We show how leveraging the priors of these models can expose new artistic controls at inference time, and we document our findings in exploring this new direction for the field of style transfer.
- Published
- 2023
15. UPGPT: Universal Diffusion Model for Person Image Generation, Editing and Pose Transfer
- Author
Cheong, Soon Yau, Mustafa, Armin, and Gilbert, Andrew
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
Text-to-image (T2I) models such as StableDiffusion have been used to generate high quality images of people. However, due to the random nature of the generation process, the person has a different appearance (e.g. pose, face, and clothing) in each output, despite using the same text prompt. This appearance inconsistency makes T2I unsuitable for pose transfer. We address this by proposing a multimodal diffusion model that accepts text, pose, and visual prompting. Our model is the first unified method to perform all person image tasks: generation, pose transfer, and mask-less edit. We also pioneer using small dimensional 3D body model parameters directly to demonstrate a new capability: simultaneous pose and camera view interpolation while maintaining the person's appearance.
- Published
- 2023
16. ALADIN-NST: Self-supervised disentangled representation learning of artistic style through Neural Style Transfer
- Author
Ruta, Dan, Tarres, Gemma Canet, Black, Alexander, Gilbert, Andrew, and Collomosse, John
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Representation learning aims to discover individual salient features of a domain in a compact and descriptive form that strongly identifies the unique characteristics of a given sample respective to its domain. Existing works in visual style representation literature have tried to disentangle style from content during training explicitly. A complete separation between these has yet to be fully achieved. Our paper aims to learn a representation of visual artistic style more strongly disentangled from the semantic content depicted in an image. We use Neural Style Transfer (NST) to measure and drive the learning signal and achieve state-of-the-art representation learning on explicitly disentangled metrics. We show that strongly addressing the disentanglement of style and content leads to large gains in style-specific metrics, encoding far less semantic information and achieving state-of-the-art accuracy in downstream multimodal applications.
- Published
- 2023
17. NeAT: Neural Artistic Tracing for Beautiful Style Transfer
- Author
Ruta, Dan, Gilbert, Andrew, Collomosse, John, Shechtman, Eli, and Kolkin, Nicholas
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
- Abstract
Style transfer is the task of reproducing the semantic contents of a source image in the artistic style of a second target image. In this paper, we present NeAT, a new state-of-the-art feed-forward style transfer method. We re-formulate feed-forward style transfer as image editing, rather than image generation, resulting in a model which improves over the state-of-the-art in both preserving the source content and matching the target style. An important component of our model's success is identifying and fixing "style halos", a commonly occurring artefact across many style transfer techniques. In addition to training and testing on standard datasets, we introduce the BBST-4M dataset, a new, large scale, high resolution dataset of 4M images. As a component of curating this data, we present a novel model able to classify if an image is stylistic. We use BBST-4M to improve and measure the generalization of NeAT across a huge variety of styles. Not only does NeAT offer state-of-the-art quality and generalization, it is designed and trained for fast inference at high resolution.
- Published
- 2023
18. EKILA: Synthetic Media Provenance and Attribution for Generative Art
- Author
Balan, Kar, Agarwal, Shruti, Jenni, Simon, Parsons, Andy, Gilbert, Andrew, and Collomosse, John
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
We present EKILA, a decentralized framework that enables creatives to receive recognition and reward for their contributions to generative AI (GenAI). EKILA proposes a robust visual attribution technique and combines this with an emerging content provenance standard (C2PA) to address the problem of synthetic image provenance: determining the generative model and training data responsible for an AI-generated image. Furthermore, EKILA extends the non-fungible token (NFT) ecosystem to introduce a tokenized representation for rights, enabling a triangular relationship between the asset's Ownership, Rights, and Attribution (ORA). Leveraging the ORA relationship enables creators to express agency over training consent and, through our attribution model, to receive apportioned credit, including royalty payments for the use of their assets in GenAI.
- Comment
Proc. CVPR Workshop on Media Forensics 2023
- Published
- 2023
19. Zonostrophic instabilities in magnetohydrodynamic Kolmogorov flow
- Author
Algatheem, Azza M, Gilbert, Andrew D, and Hillier, Andrew S
- Subjects
Physics - Fluid Dynamics
- Abstract
This paper concerns the stability of Kolmogorov flow u = (0, sin x) in the infinite (x,y)-plane. A mean magnetic field of strength B0 is introduced and the MHD linear stability problem studied for modes with wavenumber k in the y-direction, and Bloch wavenumber l in the x-direction. The parameters governing the problem are Reynolds number 1/nu, magnetic Prandtl number P, and dimensionless magnetic field strength B0. The mean magnetic field can be taken to have an arbitrary direction in the (x,y)-plane and a mean x-directed flow U0 can be incorporated. First the paper considers Kolmogorov flow with y-directed mean magnetic field, referred to as vertical. Taking l=0, the suppression of the pure hydrodynamic instability is observed with increasing field strength B0. A branch of strong-field instabilities occurs for magnetic Prandtl number P less than unity, as found by A.E. Fraser, I.G. Cresser and P. Garaud (J. Fluid Mech. 949, A43, 2022). Analytical results using eigenvalue perturbation theory in the limit k->0 support the numerics for both weak- and strong-field instabilities, and originate in the coupling of large-scale modes with x-wavenumber n=0, to smaller-scale modes. The paper then considers the case of horizontal or x-directed mean magnetic field. The unperturbed state consists of steady, wavy magnetic field lines. As the magnetic field is increased, the purely hydrodynamic instability is suppressed again, but for stronger fields a new branch of instabilities appears. Allowing a non-zero Bloch wavenumber l allows further instability, and in some circumstances when the system is hydrodynamically stable, arbitrarily weak magnetic fields can give growing modes. Numerical results are presented together with eigenvalue perturbation theory in the limits k,l->0. The theory gives analytical approximations for growth rates and thresholds in good agreement with those computed.
- Comment
29 pages, 11 figures
- Published
- 2023
20. SVS: Adversarial refinement for sparse novel view synthesis
- Author
González, Violeta Menéndez, Gilbert, Andrew, Phillipson, Graeme, Jolly, Stephen, and Hadfield, Simon
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Graphics, Computer Science - Machine Learning
- Abstract
This paper proposes Sparse View Synthesis. This is a view synthesis problem where the number of reference views is limited, and the baseline between target and reference view is significant. Under these conditions, current radiance field methods fail catastrophically due to inescapable artifacts such as 3D floating blobs, blurring and structural duplication, whenever the number of reference views is limited, or the target view diverges significantly from the reference views. Advances in network architecture and loss regularisation are unable to satisfactorily remove these artifacts. The occlusions within the scene ensure that the true contents of these regions are simply not available to the model. In this work, we instead focus on hallucinating plausible scene contents within such regions. To this end we unify radiance field models with adversarial learning and perceptual losses. The resulting system provides up to 60% improvement in perceptual accuracy compared to current state-of-the-art radiance field models on this problem.
- Comment
BMVC 2022
- Published
- 2022
21. An analytical study of the MHD clamshell instability on a sphere
- Author
Wang, Chen, Gilbert, Andrew D., and Mason, Joanne
- Subjects
Physics - Fluid Dynamics, Astrophysics - Solar and Stellar Astrophysics, Physics - Plasma Physics
- Abstract
This paper studies the instability of two-dimensional magnetohydrodynamic (MHD) systems on a sphere using analytical methods. The underlying flow consists of a zonal differential rotation, and a toroidal magnetic field is present. Semicircle rules that prescribe the possible domain of the wave velocity in the complex plane for general flow and field profiles are derived. The paper then sets out an analytical study of the 'clamshell instability', which features field lines on the two hemispheres tilting in opposite directions (Cally 2001, Sol. Phys. vol. 199, pp. 231-249). An asymptotic solution for the instability problem is derived for the limit of weak shear of the zonal flow, via the method of matched asymptotic expansions. It is shown that when the zonal flow is solid body rotation, there exists a neutral mode that tilts the magnetic field lines, referred to as the 'tilting mode'. A weak shear of the zonal flow excites the critical layer of the tilting mode, which reverses the tilting direction to form the clamshell pattern and induces the instability. The asymptotic solution provides insights into properties of the instability for a range of flow and field profiles. A remarkable feature is that the magnetic field affects the instability only through its local behaviour in the critical layer.
- Published
- 2022
22. HyperNST: Hyper-Networks for Neural Style Transfer
- Author
Ruta, Dan, Gilbert, Andrew, Motiian, Saeid, Faieta, Baldo, Lin, Zhe, and Collomosse, John
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
We present HyperNST, a neural style transfer (NST) technique for the artistic stylization of images, based on Hyper-networks and the StyleGAN2 architecture. Our contribution is a novel method for inducing style transfer parameterized by a metric space, pre-trained for style-based visual search (SBVS). We show for the first time that such a space may be used to drive NST, enabling the application and interpolation of styles from an SBVS system. The technical contribution is a hyper-network that predicts weight updates to a StyleGAN2 pre-trained over a diverse gamut of artistic content (portraits), tailoring the style parameterization on a per-region basis using a semantic map of the facial regions. We show HyperNST to exceed state of the art in content preservation for our stylized content while retaining good style transfer performance.
- Published
- 2022
23. Two-Stream Transformer Architecture for Long Video Understanding
- Author
Fish, Edward, Weinbren, Jon, and Gilbert, Andrew
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Computer Science - Multimedia
- Abstract
Pure vision transformer architectures are highly effective for short video classification and action recognition tasks. However, due to the quadratic complexity of self attention and lack of inductive bias, transformers are resource intensive and suffer from data inefficiencies. Long form video understanding tasks amplify data and memory efficiency problems in transformers making current approaches unfeasible to implement on data or memory restricted domains. This paper introduces an efficient Spatio-Temporal Attention Network (STAN) which uses a two-stream transformer architecture to model dependencies between static image features and temporal contextual features. Our proposed approach can classify videos up to two minutes in length on a single GPU, is data efficient, and achieves SOTA performance on several long video understanding tasks.
- Published
- 2022
24. Light-weight spatio-temporal graphs for segmentation and ejection fraction prediction in cardiac ultrasound
- Author
Thomas, Sarina, Gilbert, Andrew, and Ben-Yosef, Guy
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Accurate and consistent predictions of echocardiography parameters are important for cardiovascular diagnosis and treatment. In particular, segmentations of the left ventricle can be used to derive ventricular volume, ejection fraction (EF) and other relevant measurements. In this paper we propose a new automated method called EchoGraphs for predicting ejection fraction and segmenting the left ventricle by detecting anatomical keypoints. Models for direct coordinate regression based on Graph Convolutional Networks (GCNs) are used to detect the keypoints. GCNs can learn to represent the cardiac shape based on local appearance of each keypoint, as well as global spatial and temporal structures of all keypoints combined. We evaluate our EchoGraphs model on the EchoNet benchmark dataset. Compared to semantic segmentation, GCNs show accurate segmentation and improvements in robustness and inference runtime. EF is computed simultaneously with the segmentations, and our method also obtains state-of-the-art ejection fraction estimation. Source code is available online: https://github.com/guybenyosef/EchoGraphs
- Comment
Accepted to MICCAI 2022
- Published
- 2022
25. Face Validity of Four Preference-Weighted Quality-of-Life Measures in Residential Aged Care: A Think-Aloud Study
- Author
-
Engel, Lidia, Kosowicz, Leona, Bogatyreva, Ekaterina, Batchelor, Frances, Devlin, Nancy, Dow, Briony, Gilbert, Andrew S., Mulhern, Brendan, Peasgood, Tessa, and Viney, Rosalie
- Published
- 2023
26. SaiNet: Stereo aware inpainting behind objects with generative networks
- Author
-
González, Violeta Menéndez, Gilbert, Andrew, Phillipson, Graeme, Jolly, Stephen, and Hadfield, Simon
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Graphics, Computer Science - Machine Learning
- Abstract
In this work, we present an end-to-end network for stereo-consistent image inpainting with the objective of inpainting large missing regions behind objects. The proposed model consists of an edge-guided UNet-like network using Partial Convolutions. We enforce multi-view stereo consistency by introducing a disparity loss. More importantly, we develop a training scheme where the model is learned from realistic stereo masks representing object occlusions, instead of the more common random masks. The technique is trained in a supervised way. Our evaluation shows competitive results compared to previous state-of-the-art techniques. Comment: Presented at AI4CC workshop at CVPR
- Published
- 2022
27. StyleBabel: Artistic Style Tagging and Captioning
- Author
-
Ruta, Dan, Gilbert, Andrew, Aggarwal, Pranav, Marri, Naveen, Kale, Ajinkya, Briggs, Jo, Speed, Chris, Jin, Hailin, Faieta, Baldo, Filipkowski, Alex, Lin, Zhe, and Collomosse, John
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language
- Abstract
We present StyleBabel, a unique open access dataset of natural language captions and free-form tags describing the artistic style of over 135K digital artworks, collected via a novel participatory method from experts studying at specialist art and design schools. StyleBabel was collected via an iterative method, inspired by 'Grounded Theory': a qualitative approach that enables annotation while co-evolving a shared language for fine-grained artistic style attribute description. We demonstrate several downstream tasks for StyleBabel, adapting the recent ALADIN architecture for fine-grained style similarity, to train cross-modal embeddings for: 1) free-form tag generation; 2) natural language description of artistic style; 3) fine-grained text search of style. To do so, we extend ALADIN with recent advances in Visual Transformer (ViT) and cross-modal representation learning, achieving state-of-the-art accuracy in fine-grained style retrieval.
- Published
- 2022
28. KPE: Keypoint Pose Encoding for Transformer-based Image Generation
- Author
-
Cheong, Soon Yau, Mustafa, Armin, and Gilbert, Andrew
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
- Abstract
Transformers have recently been shown to generate high quality images from text input. However, the existing method of pose conditioning using skeleton image tokens is computationally inefficient and generates low quality images. Therefore we propose a new method, Keypoint Pose Encoding (KPE). KPE is 10 times more memory efficient and over 73% faster at generating high quality images from text input conditioned on the pose. The pose constraint improves the image quality and reduces errors on body extremities such as arms and legs. The additional benefits include invariance to changes in the target image domain and image resolution, making it easily scalable to higher resolution images. We demonstrate the versatility of KPE by generating photorealistic multiperson images derived from the DeepFashion dataset. We also introduce an evaluation method, People Count Error (PCE), that is effective in detecting errors in generated human images.
- Published
- 2022
29. Deep learning for automated left ventricular outflow tract diameter measurements in 2D echocardiography
- Author
-
Zha, Sigurd Zijun, Rogstadkjernet, Magnus, Klæboe, Lars Gunnar, Skulstad, Helge, Singstad, Bjørn-Jostein, Gilbert, Andrew, Edvardsen, Thor, Samset, Eigil, and Brekke, Pål Haugar
- Published
- 2023
30. Critical-layer instability of shallow water magnetohydrodynamic shear flows
- Author
-
Wang, Chen, Gilbert, Andrew, and Mason, Joanne
- Subjects
Physics - Fluid Dynamics
- Abstract
In this paper, the instability of shallow water shear flow with a sheared parallel magnetic field is studied. Waves propagating in such magnetic shear flows encounter critical levels where the phase velocity relative to the basic flow $c-U(y)$ matches the Alfvén wave velocities $\pm B(y)/\sqrt{\mu\rho}$, based on the local magnetic field $B(y)$, the magnetic permeability $\mu$ and the mass density of the fluid $\rho$. It is shown that when the two critical levels are close to each other, the critical layer can generate an instability. The instability problem is solved, combining asymptotic solutions at large wavenumbers and numerical solutions, and the mechanism of instability is explained using the conservation of momentum. For the shallow water MHD system, the paper gives the general form of the local differential equation governing such coalescing critical layers for any generic field and flow profiles, and determines precisely how the magnetic field modifies the purely hydrodynamic stability criterion based on the potential vorticity gradient in the critical layer. The curvature of the magnetic field profile, or equivalently the electric current gradient, $J' = - B''/\mu$ in the critical layer is found to play a complementary role in the instability.
- Published
- 2022
31. 'I Realized That Science Isn't Scary': In-Service Teacher Insights Regarding Science Focused Partnerships
- Author
-
Gilbert, Andrew, Hobbs, Linda, Kenny, John, Jones, Mellita, Campbell, Coral, Chittleborough, Gail, Herbert, Sandra, and Redman, Christine
- Abstract
In primary science education, we face an ongoing concern of helping classroom teachers overcome negative associations with science content, teaching and learning. These associations can often impact how they view the value of science in their classroom teaching and impede the development of innovative teaching practice. This research effort investigated in-service teachers' perceptions, reflections and considerations that resulted from their direct involvement within a science-focused school university partnership. Utilizing a multiple case study design, this research effort analyzed partnership efforts across five established science-focused partnership programs in the Australian states of Victoria and Tasmania. Analysis of interview data with 80 in-service teachers from across partner sites indicated an increased valuing of science, where teachers viewed working with pre-service teachers as a professional development opportunity, resulting in additional time spent on developing and teaching through inquiry-based science practices.
- Published
- 2020
32. Successful Online Learning: What Does Learner Interaction with Peers, Instructors and Parents Look Like?
- Author
-
Keaton, Whitney and Gilbert, Andrew
- Abstract
The student perspective in research of K-12 online and STEM education is largely absent but is important for understanding how both of these areas can come together to best serve students. This study used teacher ratings, school data and student interviews to investigate the perceptions students in online STEM courses have of their past and current educational experiences. Using an adaptation of Moore's Framework of Interactions (Moore, 1989), the academic and extracurricular behaviors of these students were examined in relation to their interactions with others, specifically instructors, parents, and peers. It was found that the interactions that students have with these stakeholders are different in this setting as compared to a traditional learning environment. Teachers in online schools serve the role of a facilitator that students felt was important to their success, but was not their only source of instruction. Parents took on many roles in this setting, including monitoring, motivating, instructing and organizing. Learner-learner interaction looked the most different compared to traditional schools because these participants generally had little interaction with peers due to time and distance constraints. Implications of these findings for students, schools, education, and research are given.
- Published
- 2020
33. Human-like Relational Models for Activity Recognition in Video
- Author
-
Chrol-Cannon, Joseph, Gilbert, Andrew, Lazic, Ranko, Madhusoodanan, Adithya, and Guerin, Frank
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Video activity recognition by deep neural networks is impressive for many classes. However, it falls short of human performance, especially for activities that are challenging to discriminate. Humans differentiate these complex activities by recognising critical spatio-temporal relations among explicitly recognised objects and parts, for example, an object entering the aperture of a container. Deep neural networks can struggle to learn such critical relationships effectively. Therefore we propose a more human-like approach to activity recognition, which interprets a video in sequential temporal phases and extracts specific relationships among objects and hands in those phases. Random forest classifiers are learnt from these extracted relationships. We apply the method to a challenging subset of the something-something dataset and achieve a more robust performance against neural network baselines on challenging activities.
- Published
- 2021
34. Neutron resonance transmission analysis prototype system for thorium fuel cycle safeguards
- Author
-
McDonald, Benjamin S., Danagoulian, Areg, Gilbert, Andrew J., Klein, Ethan A., Kulisek, Jonathan A., Moore, Michael E., Rahon, Jill M., and Zalavadia, Mital A.
- Published
- 2024
35. ALADIN: All Layer Adaptive Instance Normalization for Fine-grained Style Similarity
- Author
-
Ruta, Dan, Motiian, Saeid, Faieta, Baldo, Lin, Zhe, Jin, Hailin, Filipkowski, Alex, Gilbert, Andrew, and Collomosse, John
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
We present ALADIN (All Layer AdaIN), a novel architecture for searching images based on the similarity of their artistic style. Representation learning is critical to visual search, where distance in the learned search embedding reflects image similarity. Learning an embedding that discriminates fine-grained variations in style is hard, due to the difficulty of defining and labelling style. ALADIN takes a weakly supervised approach to learning a representation for fine-grained style similarity of digital artworks, leveraging BAM-FG, a novel large-scale dataset of user-generated content groupings gathered from the web. ALADIN sets a new state-of-the-art accuracy for style-based visual search over both coarse labelled style data (BAM) and BAM-FG, a new 2.62 million image dataset of 310,000 fine-grained style groupings also contributed by this work.
- Published
- 2021
36. Case: But Miss, There are Six Oceans, Not Five?
- Author
-
Gilbert, Andrew, Erickson, Valery, Jeong, Sophia, editor, Bryan, Lynn A., editor, Tippins, Deborah J., editor, and Sexton, Chelsea M., editor
- Published
- 2023
37. HyperNST: Hyper-Networks for Neural Style Transfer
- Author
-
Ruta, Dan, Gilbert, Andrew, Motiian, Saeid, Faieta, Baldo, Lin, Zhe, Collomosse, John, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Karlinsky, Leonid, editor, Michaeli, Tomer, editor, and Nishino, Ko, editor
- Published
- 2023
38. Rethinking movie genre classification with fine-grained semantic clustering
- Author
-
Fish, Edward, Weinbren, Jon, and Gilbert, Andrew
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Information Retrieval, Computer Science - Machine Learning, Computer Science - Multimedia
- Abstract
Movie genre classification is an active research area in machine learning. However, due to the limited labels available, there can be large semantic variations between movies within a single genre definition. We expand these 'coarse' genre labels by identifying 'fine-grained' semantic information within the multi-modal content of movies. By leveraging pre-trained 'expert' networks, we learn the influence of different combinations of modes for multi-label genre classification. Using a contrastive loss, we continue to fine-tune this 'coarse' genre classification network to identify high-level intertextual similarities between the movies across all genre labels. This leads to a more 'fine-grained' and detailed clustering, based on semantic similarities while still retaining some genre information. Our approach is demonstrated on a newly introduced multi-modal 37,866,450 frame, 8,800 movie trailer dataset, MMX-Trailer-20, which includes pre-computed audio, location, motion, and image embeddings.
- Published
- 2020
39. “Keeping our distance”: Older adults' experiences during year one of the COVID-19 pandemic and lockdown in Australia
- Author
-
Gilbert, Andrew S., Garratt, Stephanie M., Brijnath, Bianca, Ostaszkiewicz, Joan, Batchelor, Frances, Dang, Christa, Dow, Briony, and Goh, Anita M.Y.
- Published
- 2023
40. Software for the frontiers of quantum chemistry: An overview of developments in the Q-Chem 5 package
- Author
-
Epifanovsky, Evgeny, Gilbert, Andrew TB, Feng, Xintian, Lee, Joonho, Mao, Yuezhi, Mardirossian, Narbe, Pokhilko, Pavel, White, Alec F, Coons, Marc P, Dempwolff, Adrian L, Gan, Zhengting, Hait, Diptarka, Horn, Paul R, Jacobson, Leif D, Kaliman, Ilya, Kussmann, Jörg, Lange, Adrian W, Lao, Ka Un, Levine, Daniel S, Liu, Jie, McKenzie, Simon C, Morrison, Adrian F, Nanda, Kaushik D, Plasser, Felix, Rehn, Dirk R, Vidal, Marta L, You, Zhi-Qiang, Zhu, Ying, Alam, Bushra, Albrecht, Benjamin J, Aldossary, Abdulrahman, Alguire, Ethan, Andersen, Josefine H, Athavale, Vishikh, Barton, Dennis, Begam, Khadiza, Behn, Andrew, Bellonzi, Nicole, Bernard, Yves A, Berquist, Eric J, Burton, Hugh GA, Carreras, Abel, Carter-Fenk, Kevin, Chakraborty, Romit, Chien, Alan D, Closser, Kristina D, Cofer-Shabica, Vale, Dasgupta, Saswata, de Wergifosse, Marc, Deng, Jia, Diedenhofen, Michael, Do, Hainam, Ehlert, Sebastian, Fang, Po-Tung, Fatehi, Shervin, Feng, Qingguo, Friedhoff, Triet, Gayvert, James, Ge, Qinghui, Gidofalvi, Gergely, Goldey, Matthew, Gomes, Joe, González-Espinoza, Cristina E, Gulania, Sahil, Gunina, Anastasia O, Hanson-Heine, Magnus WD, Harbach, Phillip HP, Hauser, Andreas, Herbst, Michael F, Vera, Mario Hernández, Hodecker, Manuel, Holden, Zachary C, Houck, Shannon, Huang, Xunkun, Hui, Kerwin, Huynh, Bang C, Ivanov, Maxim, Jász, Ádám, Ji, Hyunjun, Jiang, Hanjie, Kaduk, Benjamin, Kähler, Sven, Khistyaev, Kirill, Kim, Jaehoon, Kis, Gergely, Klunzinger, Phil, Koczor-Benda, Zsuzsanna, Koh, Joong Hoon, Kosenkov, Dimitri, Koulias, Laura, Kowalczyk, Tim, Krauter, Caroline M, Kue, Karl, Kunitsa, Alexander, Kus, Thomas, Ladjánszki, István, Landau, Arie, Lawler, Keith V, Lefrancois, Daniel, and Lehtola, Susi
- Subjects
Physical Sciences, Chemical Sciences, Atomic, Molecular and Optical Physics, Physical Chemistry, Engineering, Chemical Physics, Chemical sciences, Physical sciences
- Abstract
This article summarizes technical advances contained in the fifth major release of the Q-Chem quantum chemistry program package, covering developments since 2015. A comprehensive library of exchange-correlation functionals, along with a suite of correlated many-body methods, continues to be a hallmark of the Q-Chem software. The many-body methods include novel variants of both coupled-cluster and configuration-interaction approaches along with methods based on the algebraic diagrammatic construction and variational reduced density-matrix methods. Methods highlighted in Q-Chem 5 include a suite of tools for modeling core-level spectroscopy, methods for describing metastable resonances, methods for computing vibronic spectra, the nuclear-electronic orbital method, and several different energy decomposition analysis techniques. High-performance capabilities including multithreaded parallelism and support for calculations on graphics processing units are described. Q-Chem boasts a community of well over 100 active academic developers, and the continuing evolution of the software is supported by an "open teamware" model and an increasingly modular design.
- Published
- 2021
41. Neural Architecture Search for Deep Image Prior
- Author
-
Ho, Kary, Gilbert, Andrew, Jin, Hailin, and Collomosse, John
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
We present a neural architecture search (NAS) technique to enhance the performance of unsupervised image de-noising, in-painting and super-resolution under the recently proposed Deep Image Prior (DIP). We show that evolutionary search can automatically optimize the encoder-decoder (E-D) structure and meta-parameters of the DIP network, which serves as a content-specific prior to regularize these single image restoration tasks. Our binary representation encodes the design space for an asymmetric E-D network that typically converges to yield a content-specific DIP within 10-20 generations using a population size of 500. The optimized architectures consistently improve upon the visual quality of classical DIP for a diverse range of photographic and artistic content.
- Published
- 2020
42. A geometric look at momentum flux and stress in fluid mechanics
- Author
-
Gilbert, Andrew D. and Vanneste, Jacques
- Subjects
Physics - Fluid Dynamics
- Abstract
We develop a geometric formulation of fluid dynamics, valid on arbitrary Riemannian manifolds, that regards the momentum-flux and stress tensors as 1-form valued 2-forms, and their divergence as a covariant exterior derivative. We review the necessary tools of differential geometry and obtain the corresponding coordinate-free form of the equations of motion for a variety of inviscid fluid models (compressible and incompressible Euler equations, Lagrangian-averaged Euler-$\alpha$ equations, magnetohydrodynamics and shallow-water models) using a variational derivation which automatically yields a symmetric momentum flux. We also consider dissipative effects and discuss the geometric form of the Navier-Stokes equations for viscous fluids and of the Oldroyd-B model for visco-elastic fluids.
- Published
- 2019
43. A geometric look at MHD and the Braginsky dynamo
- Author
-
Gilbert, Andrew D. and Vanneste, Jacques
- Subjects
Physics - Fluid Dynamics
- Abstract
This paper considers magnetohydrodynamics (MHD) and some of its applications from the perspective of differential geometry, considering the dynamics of an ideal fluid flow and magnetic field on a general three-dimensional manifold, equipped with a metric and an induced volume form. The benefit of this level of abstraction is that it clarifies basic aspects of fluid dynamics such as how certain quantities are transported, how they transform under the action of mappings (for example the flow map between Lagrangian labels and Eulerian positions), how conservation laws arise, and the origin of certain approximations that preserve the mathematical structure of classical mechanics. First, the governing equations for ideal MHD are derived in a general setting by means of an action principle, and making use of Lie derivatives. The way in which these equations transform under a pull back, by the map taking the position of a fluid parcel to a background location, is detailed. This is then used to parameterise Alfvén waves using concepts of pseudomomentum and pseudofield, in parallel with the development of Generalised Lagrangian Mean theory in hydrodynamics. Finally, non-ideal MHD is considered with a sketch of the development of the Braginsky $\alpha\omega$-dynamo in a general setting. Expressions for the $\alpha$-tensor are obtained, including a novel geometric formulation in terms of connection coefficients, and related to formulae found elsewhere in the literature.
- Published
- 2019
44. A filamentary cascade model of the inertial range
- Author
-
Childress, Stephen and Gilbert, Andrew G.
- Subjects
Physics - Fluid Dynamics
- Abstract
This paper develops a simple model of the inertial range of turbulent flow, based on a cascade of vortical filaments. A binary branching structure is proposed, involving the splitting of filaments at each step into pairs of daughter filaments with differing properties, in effect two distinct simultaneous cascades. Neither of these cascades has the Richardson-Kolmogorov exponent of 1/3. This bimodal structure is also different from bifractal models as vorticity volume is conserved. If cascades are assumed to be initiated continuously and throughout space we obtain a model of the inertial range of stationary turbulence. We impose the constraint associated with Kolmogorov's four-fifths law and then adjust the splitting to achieve good agreement with the observed structure exponents $\zeta_p$. The presence of two elements to the cascade is responsible for the nonlinear dependence of $\zeta_p$ upon $p$. A single cascade provides a model for the initial-value problem of the Navier-Stokes equations in the limit of vanishing viscosity. To simulate this limit we let the cascade continue indefinitely, energy removal occurring in the limit. We are thus able to compute the decay of energy in the model. Comment: 35 pages, 14 figures
- Published
- 2019
45. Automated Left Ventricle Dimension Measurement in 2D Cardiac Ultrasound via an Anatomically Meaningful CNN Approach
- Author
-
Gilbert, Andrew, Holden, Marit, Eikvil, Line, Aase, Svein Arne, Samset, Eigil, and McLeod, Kristin
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
- Abstract
Two-dimensional echocardiography (2DE) measurements of left ventricle (LV) dimensions are highly significant markers of several cardiovascular diseases. These measurements are often used in clinical care despite suffering from large variability between observers. This variability is due to the challenging nature of accurately finding the correct temporal and spatial location of measurement endpoints in ultrasound images. These images often contain fuzzy boundaries and varying reflection patterns between frames. In this work, we present a convolutional neural network (CNN) based approach to automate 2DE LV measurements. Treating the problem as a landmark detection problem, we propose a modified U-Net CNN architecture to generate heatmaps of likely coordinate locations. To improve the network performance we use anatomically meaningful heatmaps as labels and train with a multi-component loss function. Our network achieves 13.4%, 6%, and 10.8% mean percent error on intraventricular septum (IVS), LV internal dimension (LVID), and LV posterior wall (LVPW) measurements respectively. The design outperforms other networks and matches or approaches intra-analyser expert error. Comment: Best paper award at Smart Ultrasound Imaging Workshop (SUSI) MICCAI 2019
- Published
- 2019
46. Doppler Spectrum Classification with CNNs via Heatmap Location Encoding and a Multi-head Output Layer
- Author
-
Gilbert, Andrew, Holden, Marit, Eikvil, Line, Rakhmail, Mariia, Babic, Aleksandar, Aase, Svein Arne, Samset, Eigil, and McLeod, Kristin
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
- Abstract
Spectral Doppler measurements are an important part of the standard echocardiographic examination. These measurements give important insight into myocardial motion and blood flow providing clinicians with parameters for diagnostic decision making. Many of these measurements can currently be performed automatically with high accuracy, increasing the efficiency of the diagnostic pipeline. However, full automation is not yet available because the user must manually select which measurement should be performed on each image. In this work we develop a convolutional neural network (CNN) to automatically classify cardiac Doppler spectra into measurement classes. We show how the multi-modal information in each spectral Doppler recording can be combined using a meta parameter post-processing mapping scheme and heatmaps to encode coordinate locations. Additionally, we experiment with several state-of-the-art network architectures to examine the tradeoff between accuracy and memory usage for resource-constrained environments. Finally, we propose a confidence metric using the values in the last fully connected layer of the network. We analyze example images that fall outside of our proposed classes to show our confidence metric can prevent many misclassifications. Our algorithm achieves 96% accuracy on a test set drawn from a separate clinical site, indicating that the proposed method is suitable for clinical adoption and enabling a fully automatic pipeline from acquisition to Doppler spectrum measurements. Comment: copyright 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
- Published
- 2019
47. The impact of elder abuse training on subacute health providers and older adults: study protocol for a randomized control trial
- Author
-
Cavuoto, Marina G., Markusevska, Simona, Stevens, Catriona, Reyes, Patricia, Renshaw, Gianna, Peters, Micah D. J., Dow, Briony, Feldman, Peter, Gilbert, Andrew, Manias, Elizabeth, Mortimer, Duncan, Enticott, Joanne, Cooper, Claudia, Antoniades, Josefine, Appleton, Brenda, Nakrem, Sigrid, O’Brien, Meghan, Ostaszkiewicz, Joan, Eckert, Marion, Durston, Cheryl, and Brijnath, Bianca
- Published
- 2024
48. The ENJOY Seniors Exercise Park IMP-ACT project: IMProving older people’s health through physical ACTivity: a hybrid II implementation design study protocol
- Author
-
Levinger, Pazit, Fearn, Marcia, Dreher, Bronwyn, Bauman, Adrian, Brusco, Natasha K., Gilbert, Andrew, Soh, Sze-Ee, Burton, Elissa, James, Lisa, and Hill, Keith D.
- Published
- 2024
49. Incorporating a polygenic risk score-triaged coronary calcium score into cardiovascular disease examinations to identify subclinical coronary artery disease (ESCALATE): Protocol for a prospective, nonrandomized implementation trial
- Author
-
Barlow, John E., Bauer, Denis, Bradford, Dana, Bottá, Giordano, Figtree, Gemma A., Gilbert, Andrew, Gray, Michael P., Grieve, Stuart M., Ho, Amy, Hu, Jessica, Hyun, Karice, Jennings, Garry, Kilov, Gary, Levesque, Jean-Frederic, Meikle, Peter, Nicholls, Stephen J., Redfern, Julie, Stavreski, Bill, Suthers, Graeme, Usherwood, Tim, Wilson, Andrew, Thackway, Stephen, Rogers, Caroline, Berman, Yemima, Bottà, Giordano, Ingles, Jodie, and Vernon, Stephen T.
- Published
- 2023
50. Impact of California Statute on Naloxone Availability and Opioid Overdose Rates
- Author
-
Gallant, Tara L., Gilbert, Andrew R., Zargham, Sina, Lorenzo, Michael F. Di, Puglisi, Jose L., Nicholas, Zachary R., and Gerriets, Valerie A.
- Published
- 2023