1,108 results for "Holmes, Chris"
Search Results
2. On Subjective Uncertainty Quantification and Calibration in Natural Language Generation
- Author
- Wang, Ziyu and Holmes, Chris
- Subjects
- Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Statistics - Machine Learning
- Abstract
Applications of large language models often involve the generation of free-form responses, in which case uncertainty quantification becomes challenging. This is due to the need to identify task-specific uncertainties (e.g., about the semantics), which appear difficult to define in general cases. This work addresses these challenges from the perspective of Bayesian decision theory, starting from the assumption that our utility is characterized by a similarity measure that compares a generated response with a hypothetical true response. We discuss how this assumption enables principled quantification of the model's subjective uncertainty and its calibration. We further derive a measure for epistemic uncertainty, based on a missing data perspective and its characterization as an excess risk. The proposed measures can be applied to black-box language models. We demonstrate the proposed methods on question answering and machine translation tasks, where they extract broadly meaningful uncertainty estimates from GPT and Gemini models and quantify their calibration.
- Published
- 2024
3. Is In-Context Learning in Large Language Models Bayesian? A Martingale Perspective
- Author
- Falck, Fabian, Wang, Ziyu, and Holmes, Chris
- Subjects
- Statistics - Machine Learning, Computer Science - Machine Learning
- Abstract
In-context learning (ICL) has emerged as a particularly remarkable characteristic of Large Language Models (LLMs): given a pretrained LLM and an observed dataset, LLMs can make predictions for new data points from the same distribution without fine-tuning. Numerous works have postulated ICL as approximately Bayesian inference, rendering this a natural hypothesis. In this work, we analyse this hypothesis from a new angle through the martingale property, a fundamental requirement of a Bayesian learning system for exchangeable data. We show that the martingale property is a necessary condition for unambiguous predictions in such scenarios, and enables a principled, decomposed notion of uncertainty vital in trustworthy, safety-critical systems. We derive actionable checks with corresponding theory and test statistics which must hold if the martingale property is satisfied. We also examine if uncertainty in LLMs decreases as expected in Bayesian learning when more data is observed. In three experiments, we provide evidence for violations of the martingale property, and deviations from a Bayesian scaling behaviour of uncertainty, falsifying the hypothesis that ICL is Bayesian.
- Comment: Accepted at International Conference on Machine Learning (ICML) 2024
- Published
- 2024
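The martingale property at the heart of the abstract above can be illustrated with a toy conjugate model. This is a minimal sketch, not the paper's actual checks or test statistics: in a Beta(1,1)-Bernoulli model, the sequence of posterior predictive means is exactly a martingale, which we can verify without simulation.

```python
# Toy illustration (not the paper's test statistics): for a genuinely
# Bayesian learner on exchangeable data, the posterior predictive mean
# is a martingale. We verify this exactly for a Beta(1,1)-Bernoulli model.

def posterior_mean(successes, n):
    """Posterior predictive P(next obs = 1) under a Beta(1,1) prior."""
    return (1 + successes) / (2 + n)

def expected_next_mean(successes, n):
    """E[m_{n+1} | data]: average the next posterior mean over the next
    observation, drawn from the current posterior predictive."""
    p = posterior_mean(successes, n)               # predictive prob. of a 1
    m_if_one = posterior_mean(successes + 1, n + 1)
    m_if_zero = posterior_mean(successes, n + 1)
    return p * m_if_one + (1 - p) * m_if_zero

# Martingale property: E[m_{n+1} | data] = m_n for every possible history.
for n in range(20):
    for s in range(n + 1):
        assert abs(expected_next_mean(s, n) - posterior_mean(s, n)) < 1e-12
print("martingale property holds exactly for the Beta-Bernoulli model")
```

A non-Bayesian predictor (e.g. one whose predictive mean drifts with the order of the data) would violate this identity, which is the kind of deviation the paper's checks are designed to detect.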
4. On Uncertainty Quantification for Near-Bayes Optimal Algorithms
- Author
- Wang, Ziyu and Holmes, Chris
- Subjects
- Statistics - Machine Learning, Computer Science - Machine Learning
- Abstract
Bayesian modelling allows for the quantification of predictive uncertainty which is crucial in safety-critical applications. Yet for many machine learning (ML) algorithms, it is difficult to construct or implement their Bayesian counterpart. In this work we present a promising approach to address this challenge, based on the hypothesis that commonly used ML algorithms are efficient across a wide variety of tasks and may thus be near Bayes-optimal w.r.t. an unknown task distribution. We prove that it is possible to recover the Bayesian posterior defined by the task distribution, which is unknown but optimal in this setting, by building a martingale posterior using the algorithm. We further propose a practical uncertainty quantification method that applies to general ML algorithms. Experiments based on a variety of non-NN and NN algorithms demonstrate the efficacy of our method.
- Published
- 2024
5. Approximations to the Fisher Information Metric of Deep Generative Models for Out-Of-Distribution Detection
- Author
- Dauncey, Sam, Holmes, Chris, Williams, Christopher, and Falck, Fabian
- Subjects
- Statistics - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
- Abstract
Likelihood-based deep generative models such as score-based diffusion models and variational autoencoders are state-of-the-art machine learning models approximating high-dimensional distributions of data such as images, text, or audio. One of many downstream tasks they can be naturally applied to is out-of-distribution (OOD) detection. However, seminal work by Nalisnick et al., which we reproduce, showed that deep generative models consistently infer higher log-likelihoods for OOD data than data they were trained on, marking an open problem. In this work, we analyse the use of the gradient of a data point with respect to the parameters of the deep generative model for OOD detection, based on the simple intuition that OOD data should have larger gradient norms than training data. We formalise measuring the size of the gradient as approximating the Fisher information metric. We show that the Fisher information matrix (FIM) has large absolute diagonal values, motivating the use of chi-square distributed, layer-wise gradient norms as features. We combine these features to make a simple, model-agnostic and hyperparameter-free method for OOD detection which estimates the joint density of the layer-wise gradient norms for a given data point. We find that these layer-wise gradient norms are weakly correlated, rendering their combined usage informative, and prove that the layer-wise gradient norms satisfy the principle of (data representation) invariance. Our empirical results indicate that this method outperforms the Typicality test for most deep generative models and image dataset pairings.
- Published
- 2024
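The intuition in the abstract above can be sketched in the simplest possible setting. This is only an illustration, not the paper's method: the "deep generative model" is replaced by a univariate Gaussian with parameters (mu, sigma), for which the parameter-gradient of the log-likelihood has a closed form, and its norm grows for points far from the training distribution.

```python
import math

# Minimal sketch of the gradient-norm intuition for OOD detection (the
# paper approximates the Fisher information metric for deep generative
# models; here the "model" is just N(mu, sigma^2) with mu=0, sigma=1).

def grad_log_likelihood(x, mu=0.0, sigma=1.0):
    """Gradient of log N(x; mu, sigma^2) with respect to (mu, sigma)."""
    d_mu = (x - mu) / sigma**2
    d_sigma = -1.0 / sigma + (x - mu) ** 2 / sigma**3
    return d_mu, d_sigma

def grad_norm(x):
    """Euclidean norm of the parameter gradient at a data point x."""
    d_mu, d_sigma = grad_log_likelihood(x)
    return math.hypot(d_mu, d_sigma)

in_dist = grad_norm(0.3)   # a typical point under N(0, 1)
out_dist = grad_norm(5.0)  # far out in the tail: "OOD" for this model
assert out_dist > in_dist
print(f"in-distribution grad norm {in_dist:.2f} < OOD grad norm {out_dist:.2f}")
```

In the paper's setting the analogous quantities are layer-wise gradient norms of a deep model, combined via their estimated joint density rather than compared one point at a time.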
6. Hierarchical Bias-Driven Stratification for Interpretable Causal Effect Estimation
- Author
- Ter-Minassian, Lucile, Szlak, Liran, Karavani, Ehud, Holmes, Chris, and Shimoni, Yishai
- Subjects
- Statistics - Methodology, Computer Science - Machine Learning, Statistics - Machine Learning
- Abstract
Interpretability and transparency are essential for incorporating causal effect models from observational data into policy decision-making. They can provide trust for the model in the absence of ground truth labels to evaluate the accuracy of such models. To date, attempts at transparent causal effect estimation consist of applying post hoc explanation methods to black-box models, which are not interpretable. Here, we present BICauseTree: an interpretable balancing method that identifies clusters where natural experiments occur locally. Our approach builds on decision trees with a customized objective function to improve balancing and reduce treatment allocation bias. Consequently, it can additionally detect subgroups presenting positivity violations, exclude them, and provide a covariate-based definition of the target population we can infer from and generalize to. We evaluate the method's performance using synthetic and realistic datasets, explore its bias-interpretability tradeoff, and show that it is comparable with existing approaches.
- Published
- 2024
7. Explainable AI for survival analysis: a median-SHAP approach
- Author
- Ter-Minassian, Lucile, Ghalebikesabi, Sahra, Diaz-Ordaz, Karla, and Holmes, Chris
- Subjects
- Computer Science - Machine Learning, Statistics - Methodology, Statistics - Machine Learning
- Abstract
With the adoption of machine learning into routine clinical practice comes the need for Explainable AI methods tailored to medical applications. Shapley values have sparked wide interest for locally explaining models. Here, we demonstrate that their interpretation strongly depends on both the summary statistic and the estimator for it, which in turn define what we identify as an 'anchor point'. We show that the convention of using a mean anchor point may generate misleading interpretations for survival analysis and introduce median-SHAP, a method for explaining black-box models predicting individual survival times.
- Comment: Accepted to the Interpretable Machine Learning for Healthcare (IMLH) workshop of the ICML 2022 Conference
- Published
- 2024
8. Targeting Relative Risk Heterogeneity with Causal Forests
- Author
- Shirvaikar, Vik and Holmes, Chris
- Subjects
- Statistics - Methodology, Computer Science - Machine Learning, Statistics - Machine Learning
- Abstract
Treatment effect heterogeneity (TEH), or variability in treatment effect for different subgroups within a population, is of significant interest in clinical trial analysis. Causal forests (Wager and Athey, 2018) is a highly popular method for this problem, but like many other methods for detecting TEH, its criterion for separating subgroups focuses on differences in absolute risk. This can dilute statistical power by masking nuance in the relative risk, which is often a more appropriate quantity of clinical interest. In this work, we propose and implement a methodology for modifying causal forests to target relative risk using a novel node-splitting procedure based on generalized linear model (GLM) comparison. We present results on simulated and real-world data that suggest relative risk causal forests can capture otherwise unobserved sources of heterogeneity.
- Comment: 10 pages, 4 figures
- Published
- 2023
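The absolute-versus-relative risk distinction motivating the abstract above reduces to simple arithmetic. The subgroup event rates below are invented for illustration: two subgroups with identical absolute risk differences can have very different relative risks, so a split criterion driven by absolute risk sees no heterogeneity between them.

```python
# Hypothetical event rates (invented for illustration) showing how
# absolute-risk splits can mask relative-risk heterogeneity.

subgroups = {
    # name: (control event rate, treated event rate)
    "high baseline risk": (0.50, 0.40),
    "low baseline risk":  (0.15, 0.05),
}

for name, (p_control, p_treated) in subgroups.items():
    abs_risk_diff = p_control - p_treated  # what absolute-risk criteria target
    relative_risk = p_treated / p_control  # often the clinical quantity
    print(f"{name}: ARD={abs_risk_diff:.2f}, RR={relative_risk:.2f}")

# Both subgroups show an absolute risk reduction of 0.10, yet the relative
# risk is 0.80 in one and 0.33 in the other: heterogeneity that an
# absolute-risk splitting rule cannot see.
```

The paper's GLM-based node-splitting procedure is designed to separate subgroups on exactly this kind of relative-risk difference.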
9. Differentially Private Statistical Inference through $\beta$-Divergence One Posterior Sampling
- Author
- Jewson, Jack, Ghalebikesabi, Sahra, and Holmes, Chris
- Subjects
- Statistics - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security, Computer Science - Machine Learning, Mathematics - Statistics Theory
- Abstract
Differential privacy guarantees allow the results of a statistical analysis involving sensitive data to be released without compromising the privacy of any individual taking part. Achieving such guarantees generally requires the injection of noise, either directly into parameter estimates or into the estimation process. Instead of artificially introducing perturbations, sampling from Bayesian posterior distributions has been shown to be a special case of the exponential mechanism, producing consistent and efficient private estimates without altering the data generative process. The application of current approaches has, however, been limited by their strong bounding assumptions which do not hold for basic models, such as simple linear regressors. To ameliorate this, we propose $\beta$D-Bayes, a posterior sampling scheme from a generalised posterior targeting the minimisation of the $\beta$-divergence between the model and the data generating process. This provides private estimation that is generally applicable without requiring changes to the underlying model and consistently learns the data generating parameter. We show that $\beta$D-Bayes produces more precise estimation for the same privacy guarantees, and further facilitates differentially private estimation via posterior sampling for complex classifiers and continuous regression models such as neural networks for the first time.
- Published
- 2023
10. Challenges and Opportunities of Shapley values in a Clinical Context
- Author
- Ter-Minassian, Lucile, Ghalebikesabi, Sahra, Diaz-Ordaz, Karla, and Holmes, Chris
- Subjects
- Statistics - Methodology
- Abstract
With the adoption of machine learning-based solutions in routine clinical practice, the need for reliable interpretability tools has become pressing. Shapley values, which provide local explanations, have gained popularity in recent years. Here, we reveal current misconceptions about the ``true to the data'' or ``true to the model'' trade-off and demonstrate its importance in a clinical context. We show that the interpretation of Shapley values, which strongly depends on the choice of a reference distribution for modeling feature removal, is often misunderstood. We further advocate that for applications in medicine, the reference distribution should be tailored to the underlying clinical question. Finally, we advise on the right reference distributions for specific medical use cases.
- Published
- 2023
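The reference-distribution dependence described in the abstract above can be made concrete with a tiny exact computation. The model and joint distribution here are invented for illustration: the model depends only on x1, but x1 and x2 are perfectly correlated (both 0 or both 1, each with probability 1/2), and we explain the prediction at x = (1, 1) under two common feature-removal schemes.

```python
# Exact two-feature Shapley values under two reference distributions:
# "interventional" (true to the model: removed features drawn from their
# marginals) vs "conditional" (true to the data: removed features drawn
# conditionally on the kept ones). Toy model and distribution, invented
# for illustration.

def f(x1, x2):
    return x1  # the model ignores x2 entirely

def value_interventional(S, x):
    """E[f] with features outside S drawn independently from their
    marginals (each feature is marginally Bernoulli(0.5))."""
    total = 0.0
    for x1 in (0, 1):
        for x2 in (0, 1):
            z1 = x[0] if 1 in S else x1
            z2 = x[1] if 2 in S else x2
            total += 0.25 * f(z1, z2)
    return total

def value_conditional(S, x):
    """E[f | X_S = x_S] under the joint where x1 == x2 always."""
    support = [(0, 0), (1, 1)]  # the two equally likely joint outcomes
    kept = [p for p in support
            if (1 not in S or p[0] == x[0]) and (2 not in S or p[1] == x[1])]
    return sum(f(*p) for p in kept) / len(kept)

def shapley(value, x):
    """Exact Shapley values for two features."""
    v0, v1 = value(set(), x), value({1}, x)
    v2, v12 = value({2}, x), value({1, 2}, x)
    phi1 = 0.5 * ((v1 - v0) + (v12 - v2))
    phi2 = 0.5 * ((v2 - v0) + (v12 - v1))
    return phi1, phi2

x = (1, 1)
print("interventional:", shapley(value_interventional, x))  # (0.5, 0.0)
print("conditional:   ", shapley(value_conditional, x))     # (0.25, 0.25)
```

Under the interventional reference x2 receives zero attribution (the model never uses it); under the conditional reference it receives half the credit, because it carries the same information as x1. Which answer is "right" depends on the clinical question, which is the paper's point.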
11. PWSHAP: A Path-Wise Explanation Model for Targeted Variables
- Author
- Ter-Minassian, Lucile, Clivio, Oscar, Diaz-Ordaz, Karla, Evans, Robin J., and Holmes, Chris
- Subjects
- Statistics - Machine Learning, Computer Science - Machine Learning
- Abstract
Predictive black-box models can exhibit high accuracy but their opaque nature hinders their uptake in safety-critical deployment environments. Explanation methods (XAI) can provide confidence for decision-making through increased transparency. However, existing XAI methods are not tailored towards models in sensitive domains where one predictor is of special interest, such as a treatment effect in a clinical model, or ethnicity in policy models. We introduce Path-Wise Shapley effects (PWSHAP), a framework for assessing the targeted effect of a binary (e.g.~treatment) variable from a complex outcome model. Our approach augments the predictive model with a user-defined directed acyclic graph (DAG). The method then uses the graph alongside on-manifold Shapley values to identify effects along causal pathways whilst maintaining robustness to adversarial attacks. We establish error bounds for the identified path-wise Shapley effects and for Shapley values. We show PWSHAP can perform local bias and mediation analyses with faithfulness to the model. Further, if the targeted variable is randomised we can quantify local effect modification. We demonstrate the resolution, interpretability, and true locality of our approach on examples and a real-world experiment.
- Published
- 2023
12. Semiparametric posterior corrections
- Author
- Yiu, Andrew, Fong, Edwin, Holmes, Chris, and Rousseau, Judith
- Subjects
- Statistics - Methodology, Mathematics - Statistics Theory
- Abstract
We present a new approach to semiparametric inference using corrected posterior distributions. The method allows us to leverage the adaptivity, regularization and predictive power of nonparametric Bayesian procedures to estimate low-dimensional functionals of interest without being restricted by the holistic Bayesian formalism. Starting from a conventional nonparametric posterior, we target the functional of interest by transforming the entire distribution with a Bayesian bootstrap correction. We provide conditions for the resulting $\textit{one-step posterior}$ to possess calibrated frequentist properties and specialize the results for several canonical examples: the integrated squared density, the mean of a missing-at-random outcome, and the average causal treatment effect on the treated. The procedure is computationally attractive, requiring only a simple, efficient post-processing step that can be attached onto any arbitrary posterior sampling algorithm. Using the ACIC 2016 causal data analysis competition, we illustrate that our approach can outperform the existing state-of-the-art through the propagation of Bayesian uncertainty.
- Comment: 53 pages
- Published
- 2023
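The Bayesian bootstrap ingredient mentioned in the abstract above is simple to sketch. This is a minimal illustration of the Bayesian bootstrap alone (the paper's one-step posterior correction involves considerably more): each posterior draw for a functional, here just the population mean, reweights the observed data with Dirichlet(1, ..., 1) weights, generated by normalising Exponential(1) variables.

```python
import random

# Minimal Bayesian bootstrap for the mean functional (illustration only;
# the data values below are arbitrary).

random.seed(0)
data = [1.2, 0.7, 2.4, 1.9, 0.3, 1.1, 1.6, 0.9]

def bayesian_bootstrap_draw(xs):
    """One posterior draw of the mean via Dirichlet(1,...,1) weights,
    generated as normalised Exponential(1) variables."""
    gaps = [random.expovariate(1.0) for _ in xs]
    total = sum(gaps)
    return sum(g / total * x for g, x in zip(gaps, xs))

draws = [bayesian_bootstrap_draw(data) for _ in range(4000)]
bb_posterior_mean = sum(draws) / len(draws)
sample_mean = sum(data) / len(data)

# The Bayesian bootstrap posterior for the mean is centred at the
# sample mean, with spread reflecting sampling uncertainty.
assert abs(bb_posterior_mean - sample_mean) < 0.05
```

In the paper this reweighting is applied as a post-processing correction to draws from a conventional nonparametric posterior, which is why it can be attached to any posterior sampling algorithm.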
13. A Unified Framework for U-Net Design and Analysis
- Author
- Williams, Christopher, Falck, Fabian, Deligiannidis, George, Holmes, Chris, Doucet, Arnaud, and Syed, Saifuddin
- Subjects
- Statistics - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Image and Video Processing
- Abstract
U-Nets are a go-to, state-of-the-art neural architecture across numerous tasks for continuous signals on a square such as images and Partial Differential Equations (PDEs); however, their design and architecture are understudied. In this paper, we provide a framework for designing and analysing general U-Net architectures. We present theoretical results which characterise the role of the encoder and decoder in a U-Net, their high-resolution scaling limits and their conjugacy to ResNets via preconditioning. We propose Multi-ResNets, U-Nets with a simplified, wavelet-based encoder without learnable parameters. Further, we show how to design novel U-Net architectures which encode function constraints, natural bases, or the geometry of the data. In diffusion models, our framework enables us to identify that high-frequency information is dominated by noise exponentially faster, and show how U-Nets with average pooling exploit this. In our experiments, we demonstrate how Multi-ResNets achieve competitive and often superior performance compared to classical U-Nets in image segmentation, PDE surrogate modelling, and generative modelling with diffusion models. Our U-Net framework paves the way to study the theoretical properties of U-Nets and design natural, scalable neural architectures for a multitude of problems beyond the square.
- Published
- 2023
14. Learning from data with structured missingness
- Author
- Mitra, Robin, McGough, Sarah F., Chakraborti, Tapabrata, Holmes, Chris, Copping, Ryan, Hagenbuch, Niels, Biedermann, Stefanie, Noonan, Jack, Lehmann, Brieuc, Shenvi, Aditi, Doan, Xuan Vinh, Leslie, David, Bianconi, Ginestra, Sanchez-Garcia, Ruben, Davies, Alisha, Mackintosh, Maxine, Andrinopoulou, Eleni-Rosalina, Basiri, Anahid, Harbron, Chris, and MacArthur, Ben D.
- Subjects
- Statistics - Machine Learning, Computer Science - Machine Learning
- Abstract
Missing data are an unavoidable complication in many machine learning tasks. When data are `missing at random' there exist a range of tools and techniques to deal with the issue. However, as machine learning studies become more ambitious, and seek to learn from ever-larger volumes of heterogeneous data, an increasingly encountered problem arises in which missing values exhibit an association or structure, either explicitly or implicitly. Such `structured missingness' raises a range of challenges that have not yet been systematically addressed, and presents a fundamental hindrance to machine learning at scale. Here, we outline the current literature and propose a set of grand challenges in learning from data with structured missingness.
- Published
- 2023
15. Characterising the genetic architecture of changes in adiposity during adulthood using electronic health records
- Author
- Venkatesh, Samvida S., Ganjgahi, Habib, Palmer, Duncan S., Coley, Kayesha, Linchangco, Jr., Gregorio V., Hui, Qin, Wilson, Peter, Ho, Yuk-Lam, Cho, Kelly, Arumäe, Kadri, Wittemans, Laura B. L., Nellåker, Christoffer, Vainik, Uku, Sun, Yan V., Holmes, Chris, Lindgren, Cecilia M., and Nicholson, George
- Published
- 2024
16. A large-scale and PCR-referenced vocal audio dataset for COVID-19
- Author
- Budd, Jobie, Baker, Kieran, Karoune, Emma, Coppock, Harry, Patel, Selina, Payne, Richard, Tendero Cañadas, Ana, Titcomb, Alexander, Hurley, David, Egglestone, Sabrina, Butler, Lorraine, Mellor, Jonathon, Nicholson, George, Kiskin, Ivan, Koutra, Vasiliki, Jersakova, Radka, McKendry, Rachel A., Diggle, Peter, Richardson, Sylvia, Schuller, Björn W., Gilmour, Steven, Pigoli, Davide, Roberts, Stephen, Packham, Josef, Thornley, Tracey, and Holmes, Chris
- Published
- 2024
17. To do no harm — and the most good — with AI in health care
- Author
- Goldberg, Carey Beth, Adams, Laura, Blumenthal, David, Brennan, Patricia Flatley, Brown, Noah, Butte, Atul J., Cheatham, Morgan, deBronkart, Dave, Dixon, Jennifer, Drazen, Jeffrey, Evans, Barbara J., Hoffman, Sara M., Holmes, Chris, Lee, Peter, Manrai, Arjun Kumar, Omenn, Gilbert S., Perlin, Jonathan B., Ramoni, Rachel, Sapiro, Guillermo, Sarkar, Rupa, Sood, Harpreet, Vayena, Effy, and Kohane, Isaac S.
- Published
- 2024
18. Audio-based AI classifiers show no evidence of improved COVID-19 screening over simple symptoms checkers
- Author
- Coppock, Harry, Nicholson, George, Kiskin, Ivan, Koutra, Vasiliki, Baker, Kieran, Budd, Jobie, Payne, Richard, Karoune, Emma, Hurley, David, Titcomb, Alexander, Egglestone, Sabrina, Tendero Cañadas, Ana, Butler, Lorraine, Jersakova, Radka, Mellor, Jonathon, Patel, Selina, Thornley, Tracey, Diggle, Peter, Richardson, Sylvia, Packham, Josef, Schuller, Björn W., Pigoli, Davide, Gilmour, Steven, Roberts, Stephen, and Holmes, Chris
- Published
- 2024
19. On the Stability of General Bayesian Inference
- Author
- Jewson, Jack, Smith, Jim Q., and Holmes, Chris
- Subjects
- Statistics - Methodology
- Abstract
We study the stability of posterior predictive inferences to the specification of the likelihood model and perturbations of the data generating process. In modern big data analyses, useful broad structural judgements may be elicited from the decision-maker but a level of interpolation is required to arrive at a likelihood model. As a result, an often computationally convenient canonical form is used in place of the decision-maker's true beliefs. Equally, in practice, observational datasets often contain unforeseen heterogeneities and recording errors and therefore do not necessarily correspond to how the process was idealised by the decision-maker. Acknowledging such imprecisions, a faithful Bayesian analysis should ideally be stable across reasonable equivalence classes of such inputs. We are able to guarantee that traditional Bayesian updating provides stability across only a very strict class of likelihood models and data generating processes, requiring the decision-maker to elicit their beliefs and understand how the data was generated with an unreasonable degree of accuracy. On the other hand, a generalised Bayesian alternative using the $\beta$-divergence loss function is shown to be stable across practical and interpretable neighbourhoods, providing assurances that posterior inferences are not overly dependent on accidentally introduced spurious specifications or data collection errors. We illustrate this in linear regression, binary classification, and mixture modelling examples, showing that stable updating does not compromise the ability to learn about the data generating process. These stability results provide a compelling justification for using generalised Bayes to facilitate inference under simplified canonical models.
- Comment: 29 pages, 7 figures
- Published
- 2023
20. A Multi-Resolution Framework for U-Nets with Applications to Hierarchical VAEs
- Author
- Falck, Fabian, Williams, Christopher, Danks, Dominic, Deligiannidis, George, Yau, Christopher, Holmes, Chris, Doucet, Arnaud, and Willetts, Matthew
- Subjects
- Statistics - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Signal Processing
- Abstract
U-Net architectures are ubiquitous in state-of-the-art deep learning; however, their regularisation properties and relationship to wavelets are understudied. In this paper, we formulate a multi-resolution framework which identifies U-Nets as finite-dimensional truncations of models on an infinite-dimensional function space. We provide theoretical results which prove that average pooling corresponds to projection within the space of square-integrable functions and show that U-Nets with average pooling implicitly learn a Haar wavelet basis representation of the data. We then leverage our framework to identify state-of-the-art hierarchical VAEs (HVAEs), which have a U-Net architecture, as a type of two-step forward Euler discretisation of multi-resolution diffusion processes which flow from a point mass, introducing sampling instabilities. We also demonstrate that HVAEs learn a representation of time which allows for improved parameter efficiency through weight-sharing. We use this observation to achieve state-of-the-art HVAE performance with half the number of parameters of existing models, exploiting the properties of our continuous-time formulation.
- Comment: NeurIPS 2022 (selected as oral)
- Published
- 2023
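The claim in the abstract above that average pooling corresponds to an orthogonal projection can be checked directly in the simplest 1D setting, pooling over pairs. This sketch only illustrates the projection property; the signal values are arbitrary and the paper's result concerns the general function-space setting.

```python
# Check that average pooling acts as the L2 projection onto signals that
# are constant on the pooling windows (the simplest Haar / piecewise-
# constant case, pooling over pairs in 1D).

signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]

# Average pooling: replace each pair by its mean...
pooled = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
# ...and form the corresponding piecewise-constant reconstruction.
reconstruction = [m for m in pooled for _ in range(2)]

residual = [s - r for s, r in zip(signal, reconstruction)]

# Projection property: the residual is orthogonal to every function that
# is constant on the pooling windows, i.e. it sums to zero on each pair.
for i in range(0, len(residual), 2):
    assert abs(residual[i] + residual[i + 1]) < 1e-12
print("average pooling = L2 projection onto pairwise-constant signals")
```

Iterating this pooling across resolutions yields exactly the Haar multi-resolution decomposition, which is the connection the paper develops for U-Nets.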
21. Causal Falsification of Digital Twins
- Author
- Cornish, Rob, Taufiq, Muhammad Faaiz, Doucet, Arnaud, and Holmes, Chris
- Subjects
- Statistics - Methodology, Computer Science - Computational Engineering, Finance, and Science, Computer Science - Machine Learning, Statistics - Applications
- Abstract
Digital twins are virtual systems designed to predict how a real-world process will evolve in response to interventions. This modelling paradigm holds substantial promise in many applications, but rigorous procedures for assessing their accuracy are essential for safety-critical settings. We consider how to assess the accuracy of a digital twin using real-world data. We formulate this as a causal inference problem, which leads to a precise definition of what it means for a twin to be "correct" that is appropriate for many applications. Unfortunately, fundamental results from causal inference mean observational data cannot be used to certify that a twin is correct in this sense unless potentially tenuous assumptions are made, such as that the data are unconfounded. To avoid these assumptions, we propose instead to find situations in which the twin is not correct, and present a general-purpose statistical procedure for doing so. Our approach yields reliable and actionable information about the twin under only the assumption of an i.i.d. dataset of observational trajectories, and remains sound even if the data are confounded. We apply our methodology to a large-scale, real-world case study involving sepsis modelling within the Pulse Physiology Engine, which we assess using the MIMIC-III dataset of ICU patients.
- Published
- 2023
22. A large-scale and PCR-referenced vocal audio dataset for COVID-19
- Author
- Budd, Jobie, Baker, Kieran, Karoune, Emma, Coppock, Harry, Patel, Selina, Cañadas, Ana Tendero, Titcomb, Alexander, Payne, Richard, Hurley, David, Egglestone, Sabrina, Butler, Lorraine, Mellor, Jonathon, Nicholson, George, Kiskin, Ivan, Koutra, Vasiliki, Jersakova, Radka, McKendry, Rachel A., Diggle, Peter, Richardson, Sylvia, Schuller, Björn W., Gilmour, Steven, Pigoli, Davide, Roberts, Stephen, Packham, Josef, Thornley, Tracey, and Holmes, Chris
- Subjects
- Computer Science - Sound, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
The UK COVID-19 Vocal Audio Dataset is designed for the training and evaluation of machine learning models that classify SARS-CoV-2 infection status or associated respiratory symptoms using vocal audio. The UK Health Security Agency recruited voluntary participants through the national Test and Trace programme and the REACT-1 survey in England from March 2021 to March 2022, during dominant transmission of the Alpha and Delta SARS-CoV-2 variants and some Omicron variant sublineages. Audio recordings of volitional coughs, exhalations, and speech were collected in the 'Speak up to help beat coronavirus' digital survey alongside demographic, self-reported symptom and respiratory condition data, and linked to SARS-CoV-2 test results. The UK COVID-19 Vocal Audio Dataset represents the largest collection of SARS-CoV-2 PCR-referenced audio recordings to date. PCR results were linked to 70,794 of 72,999 participants and 24,155 of 25,776 positive cases. Respiratory symptoms were reported by 45.62% of participants. This dataset has additional potential uses for bioacoustics research, with 11.30% of participants reporting asthma, and 27.20% with linked influenza PCR test results.
- Comment: 39 pages, 4 figures
- Published
- 2022
23. Audio-based AI classifiers show no evidence of improved COVID-19 screening over simple symptoms checkers
- Author
- Coppock, Harry, Nicholson, George, Kiskin, Ivan, Koutra, Vasiliki, Baker, Kieran, Budd, Jobie, Payne, Richard, Karoune, Emma, Hurley, David, Titcomb, Alexander, Egglestone, Sabrina, Cañadas, Ana Tendero, Butler, Lorraine, Jersakova, Radka, Mellor, Jonathon, Patel, Selina, Thornley, Tracey, Diggle, Peter, Richardson, Sylvia, Packham, Josef, Schuller, Björn W., Pigoli, Davide, Gilmour, Steven, Roberts, Stephen, and Holmes, Chris
- Subjects
- Computer Science - Sound, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
Recent work has reported that AI classifiers trained on audio recordings can accurately predict severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection status. Here, we undertake a large-scale study of audio-based deep learning classifiers, as part of the UK government's pandemic response. We collect and analyse a dataset of audio recordings from 67,842 individuals with linked metadata, including reverse transcription polymerase chain reaction (PCR) test outcomes, of whom 23,514 tested positive for SARS-CoV-2. Subjects were recruited via the UK government's National Health Service Test-and-Trace programme and the REal-time Assessment of Community Transmission (REACT) randomised surveillance survey. In an unadjusted analysis of our dataset, AI classifiers predict SARS-CoV-2 infection status with high accuracy (Receiver Operating Characteristic Area Under the Curve (ROC-AUC) 0.846 [0.838, 0.854]), consistent with the findings of previous studies. However, after matching on measured confounders, such as age, gender, and self-reported symptoms, our classifiers' performance is much weaker (ROC-AUC 0.619 [0.594, 0.644]). Upon quantifying the utility of audio-based classifiers in practical settings, we find them to be outperformed by simple predictive scores based on user-reported symptoms.
- Published
- 2022
24. Statistical Design and Analysis for Robust Machine Learning: A Case Study from COVID-19
- Author
- Pigoli, Davide, Baker, Kieran, Budd, Jobie, Butler, Lorraine, Coppock, Harry, Egglestone, Sabrina, Gilmour, Steven G., Holmes, Chris, Hurley, David, Jersakova, Radka, Kiskin, Ivan, Koutra, Vasiliki, Mellor, Jonathon, Nicholson, George, Packham, Joe, Patel, Selina, Payne, Richard, Roberts, Stephen J., Schuller, Björn W., Tendero-Cañadas, Ana, Thornley, Tracey, and Titcomb, Alexander
- Subjects
- Computer Science - Sound, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing, Statistics - Applications
- Abstract
Since early in the coronavirus disease 2019 (COVID-19) pandemic, there has been interest in using artificial intelligence methods to predict COVID-19 infection status based on vocal audio signals, for example cough recordings. However, existing studies have limitations in terms of data collection and of the assessment of the performances of the proposed predictive models. This paper rigorously assesses state-of-the-art machine learning techniques used to predict COVID-19 infection status based on vocal audio signals, using a dataset collected by the UK Health Security Agency. This dataset includes acoustic recordings and extensive study participant meta-data. We provide guidelines on testing the performance of methods to classify COVID-19 infection status based on acoustic features and we discuss how these can be extended more generally to the development and assessment of predictive methods based on public health datasets.
- Published
- 2022
25. Generating the right evidence at the right time: Principles of a new class of flexible augmented clinical trial designs
- Author
- Dunger-Baldauf, Cornelia, Hemmings, Rob, Bretz, Frank, Jones, Byron, Schiel, Anja, and Holmes, Chris
- Subjects
- Statistics - Methodology
- Abstract
The past few years have seen an increasing number of initiatives aimed at integrating information generated outside of confirmatory randomised clinical trials (RCTs) into drug development. However, data generated non-concurrently and through observational studies can provide results that are difficult to compare with randomised trial data. Moreover, the scientific questions these data can serve to answer often remain vague. Our starting point is to use clearly defined objectives for evidence generation, which are formulated towards early discussion with health technology assessment (HTA) bodies and are additional to regulatory requirements for authorisation of a new treatment. We propose FACTIVE (Flexible Augmented Clinical Trial for Improved eVidencE generation), a new class of study designs enabling flexible augmentation of confirmatory randomised controlled trials with concurrent and close-to-real-world elements. These enabling designs facilitate estimation of certain treatment effects in the confirmatory part and other, complementary treatment effects in a concurrent real-world part. Each stakeholder should use the evidence that is relevant within their own decision-making framework. High quality data are generated under one single protocol and the use of randomisation ensures rigorous statistical inference and interpretation within and between the different parts of the experiment. Evidence for the decision-making of HTA bodies could be available earlier than is currently the case.
- Published
- 2022
26. Assessments and developments in constructing a National Health Index for policy making, in the United Kingdom
- Author
-
Freni-Sterrantino, Anna, Prescott, Thomas P, Ceely, Greg, Glickman, Myer, and Holmes, Chris
- Subjects
Statistics - Applications - Abstract
Composite indicators are a useful tool to summarize, measure and compare changes among different communities. The UK Office for National Statistics has created an annual England Health Index (starting from 2015) comprising three main health domains - lives, places and people - to monitor health measures over time and across different geographical areas (149 Upper Tier Local Authorities, 9 regions and an overall national index) and to evaluate the health of the nation. The composite indicator is defined as a weighted average (linear combination) of indicators within subdomains, subdomains within domains, and domains within the overall index. The Health Index was designed to be comparable over time, geographically harmonized, and to serve as a tool for policy implementation and assessment. We evaluated the steps taken in its construction, reviewing the conceptual coherence and statistical requirements of the Health Index data for 2015-2018. To assess these, we focused on three main steps: correlation analysis at the different index levels; comparison of the implemented weights, derived from factor analysis, with two alternative weightings based on principal components analysis and optimized system weights; and a sensitivity and uncertainty analysis to assess to what extent rankings depend on the selected set of methodological choices. Based on the results, we highlight features that have improved the statistical requirements of the forthcoming UK Health Index., Comment: 25 pages, 11 figures
- Published
- 2022
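The hierarchical weighted-average construction described in the abstract above (indicators within subdomains, subdomains within domains) can be illustrated with a toy calculation; all numbers and weights below are hypothetical, not the ONS values:

```python
import numpy as np

# Hypothetical layout: 4 normalised indicators -> 2 subdomains -> 1 domain.
indicators = np.array([0.7, 0.4, 0.9, 0.6])
w_within_sub = [np.array([0.5, 0.5]), np.array([0.3, 0.7])]  # weights within each subdomain
subdomains = np.array([
    w_within_sub[0] @ indicators[:2],   # 0.5*0.7 + 0.5*0.4 = 0.55
    w_within_sub[1] @ indicators[2:],   # 0.3*0.9 + 0.7*0.6 = 0.69
])
w_within_dom = np.array([0.6, 0.4])     # weights within the domain
domain_score = w_within_dom @ subdomains
print(round(float(domain_score), 3))    # -> 0.606
```

The paper's sensitivity analysis amounts to varying the weight vectors (factor-analysis vs. principal-components vs. optimized weights) and observing how rankings of such domain scores change.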
27. Causal predictive inference and target trial emulation
- Author
-
Yiu, Andrew, Fong, Edwin, Walker, Stephen, and Holmes, Chris
- Subjects
Statistics - Methodology - Abstract
Causal inference from observational data can be viewed as a missing data problem arising from a hypothetical population-scale randomized trial matched to the observational study. This links a target trial protocol with a corresponding generative predictive model for inference, providing a complete framework for transparent communication of causal assumptions and statistical uncertainty on treatment effects, without the need for counterfactuals. The intuitive foundation for the work is that a whole population randomized trial would provide answers to any observable causal question with certainty. Thus, our fundamental problem of causal inference is the missingness of the hypothetical target trial data, which we solve through repeated imputation from a generative predictive model conditioned on the observational data. Causal assumptions map to intuitive conditions on the transportability of predictive models across populations and conditions. We demonstrate our approach on a real data application to studying the effects of maternal smoking on birthweights using extensions of Bayesian additive regression trees and inverse probability weighting.
- Published
- 2022
28. Age-dependent topic modeling of comorbidities in UK Biobank identifies disease subtypes with differential genetic risk
- Author
-
Jiang, Xilin, Zhang, Martin Jinye, Zhang, Yidong, Durvasula, Arun, Inouye, Michael, Holmes, Chris, Price, Alkes L., and McVean, Gil
- Published
- 2023
- Full Text
- View/download PDF
29. Bayesian Lesion Estimation with a Structured Spike-and-Slab Prior
- Author
-
Menacher, Anna, Nichols, Thomas E., Holmes, Chris, and Ganjgahi, Habib
- Subjects
Statistics - Methodology ,Statistics - Applications - Abstract
Neural demyelination and brain damage accumulated in white matter appear as hyperintense areas on T2-weighted MRI scans in the form of lesions. Modeling binary images at the population level, where each voxel represents the existence of a lesion, plays an important role in understanding aging and inflammatory diseases. We propose a scalable hierarchical Bayesian spatial model, called BLESS, capable of handling binary responses by placing continuous spike-and-slab mixture priors on spatially-varying parameters and enforcing spatial dependency on the parameter dictating the amount of sparsity within the probability of inclusion. The use of mean-field variational inference with dynamic posterior exploration, which is an annealing-like strategy that improves optimization, allows our method to scale to large sample sizes. Our method also accounts for underestimation of posterior variance due to variational inference by providing an approximate posterior sampling approach based on Bayesian bootstrap ideas and spike-and-slab priors with random shrinkage targets. Besides accurate uncertainty quantification, this approach is capable of producing novel cluster-size-based imaging statistics, such as credible intervals of cluster size, and measures of reliability of cluster occurrence. Lastly, we validate our results via simulation studies and an application to the UK Biobank, a large-scale lesion mapping study with a sample size of 40,000 subjects., Comment: For supplementary materials, see https://drive.google.com/file/d/1vr154MEsxv00OMeZQR8R4ecpd5V35qCa/view?usp=sharing . For code, see https://github.com/annamenacher/BLESS
- Published
- 2022
30. Quasi-Bayesian Nonparametric Density Estimation via Autoregressive Predictive Updates
- Author
-
Ghalebikesabi, Sahra, Holmes, Chris, Fong, Edwin, and Lehmann, Brieuc
- Subjects
Statistics - Machine Learning ,Computer Science - Machine Learning ,Statistics - Methodology - Abstract
Bayesian methods are a popular choice for statistical inference in small-data regimes due to the regularization effect induced by the prior. In the context of density estimation, the standard nonparametric Bayesian approach is to target the posterior predictive of the Dirichlet process mixture model. In general, direct estimation of the posterior predictive is intractable and so methods typically resort to approximating the posterior distribution as an intermediate step. The recent development of quasi-Bayesian predictive copula updates, however, has made it possible to perform tractable predictive density estimation without the need for posterior approximation. Although these estimators are computationally appealing, they tend to struggle on non-smooth data distributions. This is due to the comparatively restrictive form of the likelihood models from which the proposed copula updates were derived. To address this shortcoming, we consider a Bayesian nonparametric model with an autoregressive likelihood decomposition and a Gaussian process prior. While the predictive update of such a model is typically intractable, we derive a quasi-Bayesian predictive update that achieves state-of-the-art results in small-data regimes.
- Published
- 2022
31. Neural Score Matching for High-Dimensional Causal Inference
- Author
-
Clivio, Oscar, Falck, Fabian, Lehmann, Brieuc, Deligiannidis, George, and Holmes, Chris
- Subjects
Statistics - Machine Learning ,Computer Science - Machine Learning - Abstract
Traditional methods for matching in causal inference are impractical for high-dimensional datasets. They suffer from the curse of dimensionality: exact matching and coarsened exact matching find exponentially fewer matches as the input dimension grows, and propensity score matching may match highly unrelated units together. To overcome this problem, we develop theoretical results which motivate the use of neural networks to obtain non-trivial, multivariate balancing scores of a chosen level of coarseness, in contrast to the classical, scalar propensity score. We leverage these balancing scores to perform matching for high-dimensional causal inference and call this procedure neural score matching. We show that our method is competitive against other matching approaches on semi-synthetic high-dimensional datasets, both in terms of treatment effect estimation and reducing imbalance., Comment: To appear in AISTATS 2022
- Published
- 2022
32. A Graph Based Neural Network Approach to Immune Profiling of Multiplexed Tissue Samples
- Author
-
Martin, Natalia Garcia, Malacrino, Stefano, Wojciechowska, Marta, Campo, Leticia, Jones, Helen, Wedge, David C., Holmes, Chris, Sirinukunwattana, Korsuk, Sailem, Heba, Verrill, Clare, and Rittscher, Jens
- Subjects
Computer Science - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition ,Electrical Engineering and Systems Science - Image and Video Processing ,Quantitative Biology - Cell Behavior ,Quantitative Biology - Quantitative Methods - Abstract
Multiplexed immunofluorescence provides an unprecedented opportunity for studying specific cell-to-cell and cell microenvironment interactions. We employ graph neural networks to combine features obtained from tissue morphology with measurements of protein expression to profile the tumour microenvironment associated with different tumour stages. Our framework presents a new approach to analysing and processing these complex multi-dimensional datasets that overcomes some of the key challenges in analysing these data and opens up the opportunity to abstract biologically meaningful interactions.
- Published
- 2022
- Full Text
- View/download PDF
33. Interoperability of statistical models in pandemic preparedness: principles and reality
- Author
-
Nicholson, George, Blangiardo, Marta, Briers, Mark, Diggle, Peter J., Fjelde, Tor Erlend, Ge, Hong, Goudie, Robert J. B., Jersakova, Radka, King, Ruairidh E., Lehmann, Brieuc C. L., Mallon, Ann-Marie, Padellini, Tullia, Teh, Yee Whye, Holmes, Chris, and Richardson, Sylvia
- Subjects
Statistics - Methodology ,Statistics - Applications ,62P10 - Abstract
We present "interoperability" as a guiding framework for statistical modelling to assist policy makers asking multiple questions using diverse datasets in the face of an evolving pandemic response. Interoperability provides an important set of principles for future pandemic preparedness, through the joint design and deployment of adaptable systems of statistical models for disease surveillance using probabilistic reasoning. We illustrate this through case studies for inferring spatial-temporal coronavirus disease 2019 (COVID-19) prevalence and reproduction numbers in England., Comment: 26 pages, 10 figures, for associated mpeg file Movie 1 please see https://www.dropbox.com/s/kn9y1v6zvivfla1/Interoperability_of_models_Movie_1.mp4?dl=0
- Published
- 2021
34. Mitigating Statistical Bias within Differentially Private Synthetic Data
- Author
-
Ghalebikesabi, Sahra, Wilde, Harrison, Jewson, Jack, Doucet, Arnaud, Vollmer, Sebastian, and Holmes, Chris
- Subjects
Statistics - Machine Learning ,Computer Science - Cryptography and Security ,Computer Science - Machine Learning - Abstract
Increasing interest in privacy-preserving machine learning has led to new and evolved approaches for generating private synthetic data from undisclosed real data. However, mechanisms of privacy preservation can significantly reduce the utility of synthetic data, which in turn impacts downstream tasks such as learning predictive models or inference. We propose several re-weighting strategies using privatised likelihood ratios that not only mitigate statistical bias of downstream estimators but also have general applicability to differentially private generative models. Through large-scale empirical evaluation, we show that private importance weighting provides simple and effective privacy-compliant augmentation for general applications of synthetic data.
- Published
- 2021
35. On Locality of Local Explanation Models
- Author
-
Ghalebikesabi, Sahra, Ter-Minassian, Lucile, Diaz-Ordaz, Karla, and Holmes, Chris
- Subjects
Computer Science - Machine Learning ,Statistics - Computation ,Statistics - Methodology ,Statistics - Machine Learning - Abstract
Shapley values provide model-agnostic feature attributions for model outcome at a particular instance by simulating feature absence under a global population distribution. The use of a global population can lead to potentially misleading results when local model behaviour is of interest. Hence we consider the formulation of neighbourhood reference distributions that improve the local interpretability of Shapley values. By doing so, we find that the Nadaraya-Watson estimator, a well-studied kernel regressor, can be expressed as a self-normalised importance sampling estimator. Empirically, we observe that Neighbourhood Shapley values identify meaningful sparse feature relevance attributions that provide insight into local model behaviour, complementing conventional Shapley analysis. They also increase on-manifold explainability and robustness to the construction of adversarial classifiers., Comment: Submitted to NeurIPS 2021
- Published
- 2021
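The abstract's observation that the Nadaraya-Watson estimator can be written as a self-normalised importance sampling estimator is visible in a few lines: kernel evaluations act as unnormalised importance weights, and the prediction is the weight-normalised average of the responses. A sketch on synthetic data (bandwidth and data are arbitrary illustrative choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.uniform(-2.0, 2.0, size=200)
y = np.sin(2.0 * X) + 0.1 * rng.normal(size=200)

def nadaraya_watson(x0, X, y, bandwidth=0.2):
    """Kernel regressor in self-normalised importance sampling form:
    the Gaussian kernel evaluations are the unnormalised weights,
    and the estimate is their normalised weighted average of y."""
    w = np.exp(-0.5 * ((X - x0) / bandwidth) ** 2)
    return np.sum(w * y) / np.sum(w)

print(nadaraya_watson(0.5, X, y))   # should be close to sin(1.0) ~ 0.84
```

Replacing the global kernel with a neighbourhood reference distribution around the instance of interest is the step the paper takes to localise Shapley attributions.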
36. Conformal Bayesian Computation
- Author
-
Fong, Edwin and Holmes, Chris
- Subjects
Statistics - Methodology ,Statistics - Computation - Abstract
We develop scalable methods for producing conformal Bayesian predictive intervals with finite sample calibration guarantees. Bayesian posterior predictive distributions, $p(y \mid x)$, characterize subjective beliefs on outcomes of interest, $y$, conditional on predictors, $x$. Bayesian prediction is well-calibrated when the model is true, but the predictive intervals may exhibit poor empirical coverage when the model is misspecified, under the so called ${\cal{M}}$-open perspective. In contrast, conformal inference provides finite sample frequentist guarantees on predictive confidence intervals without the requirement of model fidelity. Using 'add-one-in' importance sampling, we show that conformal Bayesian predictive intervals are efficiently obtained from re-weighted posterior samples of model parameters. Our approach contrasts with existing conformal methods that require expensive refitting of models or data-splitting to achieve computational efficiency. We demonstrate the utility on a range of examples including extensions to partially exchangeable settings such as hierarchical models., Comment: 19 pages, 4 figures, 12 tables; added references and fixed typos
- Published
- 2021
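A minimal sketch of the 'add-one-in' idea described above, assuming a toy Gaussian model with known variance and a flat prior (illustrative only; it omits the efficiency refinements of the paper): for each candidate outcome, posterior samples are re-weighted by the candidate's likelihood, and a conformal p-value is computed from the re-weighted predictive density.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: y_i ~ N(theta, 1) with a flat prior, so theta | y ~ N(ybar, 1/n).
y = rng.normal(1.0, 1.0, size=30)
n = len(y)
theta = rng.normal(y.mean(), 1.0 / np.sqrt(n), size=2000)  # posterior samples

def lik(z, th):
    """N(z | th, 1) density, vectorised to a (len(z), len(th)) array."""
    return np.exp(-0.5 * (np.atleast_1d(z)[:, None] - th[None, :]) ** 2) / np.sqrt(2 * np.pi)

alpha = 0.1
included = []
for y_new in np.linspace(-3.0, 5.0, 400):
    w = lik(y_new, theta)[0]          # 'add-one-in' importance weights
    w /= w.sum()
    score_obs = lik(y, theta) @ w     # re-weighted predictive density at each y_i
    score_new = (lik(y_new, theta) @ w)[0]
    pval = (1 + np.sum(score_obs <= score_new)) / (n + 1)  # conformal rank test
    if pval > alpha:
        included.append(y_new)

lo, hi = min(included), max(included)
print(f"90% conformal Bayes predictive interval: [{lo:.2f}, {hi:.2f}]")
```

Because the posterior is sampled once and only re-weighted per candidate, no model refitting is needed, which is the computational point of the method.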
37. Multi-Facet Clustering Variational Autoencoders
- Author
-
Falck, Fabian, Zhang, Haoting, Willetts, Matthew, Nicholson, George, Yau, Christopher, and Holmes, Chris
- Subjects
Statistics - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Machine Learning ,Statistics - Methodology - Abstract
Work in deep clustering focuses on finding a single partition of data. However, high-dimensional data, such as images, typically feature multiple interesting characteristics one could cluster over. For example, images of objects against a background could be clustered over the shape of the object and separately by the colour of the background. In this paper, we introduce Multi-Facet Clustering Variational Autoencoders (MFCVAE), a novel class of variational autoencoders with a hierarchy of latent variables, each with a Mixture-of-Gaussians prior, that learns multiple clusterings simultaneously, and is trained fully unsupervised and end-to-end. MFCVAE uses a progressively-trained ladder architecture which leads to highly stable performance. We provide novel theoretical results for optimising the ELBO analytically with respect to the categorical variational posterior distribution, correcting earlier influential theoretical work. On image benchmarks, we demonstrate that our approach separates out and clusters over different aspects of the data in a disentangled manner. We also show other advantages of our model: the compositionality of its latent space and that it provides controlled generation of samples., Comment: Advances in Neural Information Processing Systems 34 (NeurIPS 2021)
- Published
- 2021
38. Development and validation of a risk model for hospital-acquired venous thrombosis: the Medical Inpatients Thrombosis and Hemostasis study
- Author
-
Zakai, Neil A., Wilkinson, Katherine, Sparks, Andrew D., Packer, Ryan T., Koh, Insu, Roetker, Nicholas S., Repp, Allen B., Thomas, Ryan, Holmes, Chris E., Cushman, Mary, Plante, Timothy B., Al-Samkari, Hanny, Pishko, Allyson M., Wood, William A., Masias, Camila, Gangaraju, Radhika, Li, Ang, Garcia, David, Wiggins, Kerri L., Schaefer, Jordan K., Hooper, Craig, Smith, Nicholas L., and McClure, Leslie A.
- Published
- 2024
- Full Text
- View/download PDF
39. Martingale posterior distributions
- Author
-
Fong, Edwin, Holmes, Chris, and Walker, Stephen G.
- Subjects
Statistics - Methodology ,Mathematics - Statistics Theory - Abstract
The prior distribution on parameters of a sampling distribution is the usual starting point for Bayesian uncertainty quantification. In this paper, we present a different perspective which focuses on missing observations as the source of statistical uncertainty, with the parameter of interest being known precisely given the entire population. We argue that the foundation of Bayesian inference is to assign a distribution on missing observations conditional on what has been observed. In the conditionally i.i.d. setting with an observed sample of size $n$, the Bayesian would thus assign a predictive distribution on the missing $Y_{n+1:\infty}$ conditional on $Y_{1:n}$, which then induces a distribution on the parameter. In a classical application of martingales, Doob showed that choosing the Bayesian predictive distribution returns the conventional posterior as the distribution of the parameter. Taking this as our cue, we relax the predictive machine, avoiding the need for the predictive to be derived solely from the usual prior to posterior to predictive density formula. We introduce the \textit{martingale posterior distribution}, which returns Bayesian uncertainty directly on any statistic of interest without the need for the likelihood and prior, and this distribution can be sampled through a computational scheme we name \textit{predictive resampling}. To that end, we introduce new predictive methodologies for multivariate density estimation, regression and classification that build upon recent work on bivariate copulas., Comment: 62 pages, 22 figures, 3 tables; added discussion on frequentist consistency, included more context, added references
- Published
- 2021
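Predictive resampling, as described above, can be sketched in its simplest form with a Pólya-urn predictive (the Bayesian-bootstrap limit of a Dirichlet process); the forward-sampling length and the choice of statistic below are illustrative, not those of the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
y_obs = rng.normal(0.0, 1.0, size=50)   # observed sample Y_{1:n}

def predictive_resample(y, n_forward=500, n_draws=1000):
    """Impute Y_{n+1:n+N} by forward-sampling a Polya-urn predictive
    (each new point is a uniform draw from the growing pool), then
    record the statistic of interest -- here the population mean."""
    stats = np.empty(n_draws)
    for b in range(n_draws):
        pool = list(y)
        for _ in range(n_forward):
            pool.append(pool[rng.integers(len(pool))])
        stats[b] = np.mean(pool)        # statistic of the completed population
    return stats

post = predictive_resample(y_obs)
print(f"martingale posterior for the mean: {post.mean():.3f} +/- {post.std():.3f}")
```

Each forward pass completes the population under the predictive rule, so the spread of the recorded statistic across passes is the martingale posterior uncertainty, obtained without specifying a likelihood or prior.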
40. Bayesian imputation of COVID-19 positive test counts for nowcasting under reporting lag
- Author
-
Jersakova, Radka, Lomax, James, Hetherington, James, Lehmann, Brieuc, Nicholson, George, Briers, Mark, and Holmes, Chris
- Subjects
Statistics - Applications - Abstract
Obtaining up to date information on the number of UK COVID-19 regional infections is hampered by the reporting lag in positive test results for people with COVID-19 symptoms. In the UK, for "Pillar 2" swab tests for those showing symptoms, it can take up to five days for results to be collated. We make use of the stability of the under-reporting process over time to motivate a statistical temporal model that infers the final total count given the partial count information as it arrives. We adopt a Bayesian approach that provides for subjective priors on parameters and a hierarchical structure for an underlying latent intensity process for the infection counts. This results in a smoothed time-series representation nowcasting the expected number of daily counts of positive tests with uncertainty bands that can be used to aid decision making. Inference is performed using sequential Monte Carlo.
- Published
- 2021
41. Deep Generative Pattern-Set Mixture Models for Nonignorable Missingness
- Author
-
Ghalebikesabi, Sahra, Cornish, Rob, Kelly, Luke J., and Holmes, Chris
- Subjects
Statistics - Machine Learning ,Computer Science - Machine Learning - Abstract
We propose a variational autoencoder architecture to model both ignorable and nonignorable missing data using pattern-set mixtures as proposed by Little (1993). Our model explicitly learns to cluster the missing data into missingness pattern sets based on the observed data and missingness masks. Underpinning our approach is the assumption that the data distribution under missingness is probabilistically semi-supervised by samples from the observed data distribution. Our setup trades off the characteristics of ignorable and nonignorable missingness and can thus be applied to data of both types. We evaluate our method on a wide range of data sets with different types of missingness and achieve state-of-the-art imputation performance. Our model outperforms many common imputation algorithms, especially when the amount of missing data is high and the missingness mechanism is nonignorable., Comment: International Conference on Artificial Intelligence and Statistics (AISTATS)
- Published
- 2021
42. Asymmetric Heavy Tails and Implicit Bias in Gaussian Noise Injections
- Author
-
Camuto, Alexander, Wang, Xiaoyu, Zhu, Lingjiong, Holmes, Chris, Gürbüzbalaban, Mert, and Şimşekli, Umut
- Subjects
Statistics - Machine Learning ,Computer Science - Machine Learning - Abstract
Gaussian noise injections (GNIs) are a family of simple and widely-used regularisation methods for training neural networks, where one injects additive or multiplicative Gaussian noise to the network activations at every iteration of the optimisation algorithm, which is typically chosen as stochastic gradient descent (SGD). In this paper we focus on the so-called `implicit effect' of GNIs, which is the effect of the injected noise on the dynamics of SGD. We show that this effect induces an asymmetric heavy-tailed noise on SGD gradient updates. In order to model this modified dynamics, we first develop a Langevin-like stochastic differential equation that is driven by a general family of asymmetric heavy-tailed noise. Using this model we then formally prove that GNIs induce an `implicit bias', which varies depending on the heaviness of the tails and the level of asymmetry. Our empirical results confirm that different types of neural networks trained with GNIs are well-modelled by the proposed dynamics and that the implicit effect of these injections induces a bias that degrades the performance of networks., Comment: Main paper of 12 pages, followed by appendix
- Published
- 2021
43. Foundations of Bayesian Learning from Synthetic Data
- Author
-
Wilde, Harrison, Jewson, Jack, Vollmer, Sebastian, and Holmes, Chris
- Subjects
Computer Science - Machine Learning ,Statistics - Applications ,Statistics - Methodology ,Statistics - Machine Learning - Abstract
There is significant growth and interest in the use of synthetic data as an enabler for machine learning in environments where the release of real data is restricted due to privacy or availability constraints. Despite a large number of methods for synthetic data generation, there are comparatively few results on the statistical properties of models learnt on synthetic data, and fewer still for situations where a researcher wishes to augment real data with another party's synthesised data. We use a Bayesian paradigm to characterise the updating of model parameters when learning in these settings, demonstrating that caution should be taken when applying conventional learning algorithms without appropriate consideration of the synthetic data generating process and learning task. Recent results from general Bayesian updating support a novel and robust approach to Bayesian synthetic-learning founded on decision theory that outperforms standard approaches across repeated experiments on supervised learning and inference problems., Comment: 43 pages (10 main text, 33 supplement), 32 figures (4 main text, 28 supplement)
- Published
- 2020
44. Towards a Theoretical Understanding of the Robustness of Variational Autoencoders
- Author
-
Camuto, Alexander, Willetts, Matthew, Roberts, Stephen, Holmes, Chris, and Rainforth, Tom
- Subjects
Statistics - Machine Learning ,Computer Science - Machine Learning - Abstract
We make inroads into understanding the robustness of Variational Autoencoders (VAEs) to adversarial attacks and other input perturbations. While previous work has developed algorithmic approaches to attacking and defending VAEs, there remains a lack of formalization for what it means for a VAE to be robust. To address this, we develop a novel criterion for robustness in probabilistic models: $r$-robustness. We then use this to construct the first theoretical results for the robustness of VAEs, deriving margins in the input space for which we can provide guarantees about the resulting reconstruction. Informally, we are able to define a region within which any perturbation will produce a reconstruction that is similar to the original reconstruction. To support our analysis, we show that VAEs trained using disentangling methods not only score well under our robustness metrics, but that the reasons for this can be interpreted through our theoretical results., Comment: 8 pages
- Published
- 2020
45. Relaxed-Responsibility Hierarchical Discrete VAEs
- Author
-
Willetts, Matthew, Miscouridou, Xenia, Roberts, Stephen, and Holmes, Chris
- Subjects
Statistics - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Machine Learning - Abstract
Successfully training Variational Autoencoders (VAEs) with a hierarchy of discrete latent variables remains an area of active research. Vector-Quantised VAEs are a powerful approach to discrete VAEs, but naive hierarchical extensions can be unstable when training. Leveraging insights from classical methods of inference we introduce \textit{Relaxed-Responsibility Vector-Quantisation}, a novel way to parameterise discrete latent variables, a refinement of relaxed Vector-Quantisation that gives better performance and more stable training. This enables a novel approach to hierarchical discrete variational autoencoders with numerous layers of latent variables (here up to 32) that we train end-to-end. Within hierarchical probabilistic deep generative models with discrete latent variables trained end-to-end, we achieve state-of-the-art bits-per-dim results for various standard datasets. Unlike discrete VAEs with a single layer of latent variables, we can produce samples by ancestral sampling: it is not essential to train a second autoregressive generative model over the learnt latent representations to then sample from and then decode. Moreover, that latter approach in these deep hierarchical models would require thousands of forward passes to generate a single sample. Further, we observe different layers of our model become associated with different aspects of the data., Comment: 10 Pages
- Published
- 2020
46. Explicit Regularisation in Gaussian Noise Injections
- Author
-
Camuto, Alexander, Willetts, Matthew, Şimşekli, Umut, Roberts, Stephen, and Holmes, Chris
- Subjects
Statistics - Machine Learning ,Computer Science - Machine Learning - Abstract
We study the regularisation induced in neural networks by Gaussian noise injections (GNIs). Though such injections have been extensively studied when applied to data, there have been few studies on understanding the regularising effect they induce when applied to network activations. Here we derive the explicit regulariser of GNIs, obtained by marginalising out the injected noise, and show that it penalises functions with high-frequency components in the Fourier domain; particularly in layers closer to a neural network's output. We show analytically and empirically that such regularisation produces calibrated classifiers with large classification margins.
- Published
- 2020
47. Inferring proximity from Bluetooth Low Energy RSSI with Unscented Kalman Smoothers
- Author
-
Lovett, Tom, Briers, Mark, Charalambides, Marcos, Jersakova, Radka, Lomax, James, and Holmes, Chris
- Subjects
Electrical Engineering and Systems Science - Signal Processing ,Computer Science - Computers and Society ,Statistics - Applications ,Statistics - Machine Learning - Abstract
The COVID-19 pandemic has resulted in a variety of approaches for managing infection outbreaks in international populations. One example is mobile phone applications, which attempt to alert infected individuals and their contacts by automatically inferring two key components of infection risk: the proximity to an individual who may be infected, and the duration of proximity. The former component, proximity, relies on Bluetooth Low Energy (BLE) Received Signal Strength Indicator (RSSI) as a distance sensor, and this has been shown to be problematic, not least because of unpredictable variations caused by different device types, device location on-body, device orientation, the local environment and the general noise associated with radio frequency propagation. In this paper, we present an approach that infers posterior probabilities over distance given sequences of RSSI values. Using a single-dimensional Unscented Kalman Smoother (UKS) for non-linear state space modelling, we outline several Gaussian process observation transforms, including: a generative model that directly captures sources of variation; and a discriminative model that learns a suitable observation function from training data using both distance and infection risk as optimisation objective functions. Our results show that good risk prediction can be achieved in $\mathcal{O}(n)$ time on real-world data sets, with the UKS outperforming more traditional classification methods learned from the same training data.
- Published
- 2020
48. Neural Ensemble Search for Uncertainty Estimation and Dataset Shift
- Author
-
Zaidi, Sheheryar, Zela, Arber, Elsken, Thomas, Holmes, Chris, Hutter, Frank, and Teh, Yee Whye
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
Ensembles of neural networks achieve superior performance compared to stand-alone networks in terms of accuracy, uncertainty calibration and robustness to dataset shift. \emph{Deep ensembles}, a state-of-the-art method for uncertainty estimation, only ensemble random initializations of a \emph{fixed} architecture. Instead, we propose two methods for automatically constructing ensembles with \emph{varying} architectures, which implicitly trade-off individual architectures' strengths against the ensemble's diversity and exploit architectural variation as a source of diversity. On a variety of classification tasks and modern architecture search spaces, we show that the resulting ensembles outperform deep ensembles not only in terms of accuracy but also uncertainty calibration and robustness to dataset shift. Our further analysis and ablation studies provide evidence of higher ensemble diversity due to architectural variation, resulting in ensembles that can outperform deep ensembles, even when having weaker average base learners. To foster reproducibility, our code is available: \url{https://github.com/automl/nes}, Comment: Accepted at NeurIPS 2021; earlier version of this work was accepted for oral presentation at ICML 2020 Workshop on Uncertainty & Robustness in Deep Learning
- Published
- 2020
49. Risk scoring calculation for the current NHSx contact tracing app
- Author
-
Briers, Mark, Charalambides, Marcos, and Holmes, Chris
- Subjects
Computer Science - Computers and Society ,Physics - Physics and Society ,Quantitative Biology - Populations and Evolution - Abstract
We consider how the NHS COVID-19 application will initially calculate a risk score for an individual based on their recent contact with people who report that they have coronavirus symptoms., Comment: 13 pages, 8 figures
- Published
- 2020
50. Learning Bijective Feature Maps for Linear ICA
- Author
-
Camuto, Alexander, Willetts, Matthew, Paige, Brooks, Holmes, Chris, and Roberts, Stephen
- Subjects
Computer Science - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition ,Statistics - Machine Learning - Abstract
Separating high-dimensional data like images into independent latent factors, i.e. independent component analysis (ICA), remains an open research problem. As we show, existing probabilistic deep generative models (DGMs), which are tailor-made for image data, underperform on non-linear ICA tasks. To address this, we propose a DGM which combines bijective feature maps with a linear ICA model to learn interpretable latent structures for high-dimensional data. Given the complexities of jointly training such a hybrid model, we introduce novel theory that constrains linear ICA to lie close to the manifold of orthogonal rectangular matrices, the Stiefel manifold. By doing so we create models that converge quickly, are easy to train, and achieve better unsupervised latent factor discovery than flow-based models, linear ICA, and Variational Autoencoders on images., Comment: 8 pages
- Published
- 2020