Should We Learn Most Likely Functions or Parameters?
Tim G. J. Rudner, Sanyam Kapoor, Shikai Qiu, and Andrew Gordon Wilson
Advances in Neural Information Processing Systems (NeurIPS), 2023
Standard regularization methods for parametric models explicitly penalize certain parameter values. However, in most modern machine learning models, model parameters bear little to no physical meaning and the primary goal of training is to find the most likely function under the model that can fit the data. In this work, we investigate the implications of directly estimating the most likely function implied by the model and the data. We show that doing so can lead to pathological solutions under certain parameterizations and provide necessary conditions under which the objective function is well-behaved. We show that, under these conditions, function-space maximum a posteriori estimation leads to flatter minima, better generalization, and improved robustness to overfitting, compared to explicit parameter regularization. We verify these properties via empirical evaluations in idealized synthetic data settings as well as on commonly used benchmarking datasets.
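To make the contrast concrete, here is a minimal sketch (our own toy construction, not the paper's method in full) of the two estimation targets: a standard parameter-space MAP loss, and a function-space variant that evaluates the model on a finite set of points and adds a Jacobian-based change-of-variables correction, the term whose behavior the well-posedness conditions above concern. The model, data, and the J^T J form of the correction are illustrative assumptions.

import torch
from torch.autograd.functional import jacobian

torch.manual_seed(0)
X = torch.linspace(-1.0, 1.0, 8)            # evaluation points (toy)
y = torch.sin(3.0 * X)                      # toy regression targets

def f(params, x=X):
    # Tiny nonlinear model: f(x) = a * tanh(b * x + c).
    a, b, c = params
    return a * torch.tanh(b * x + c)

def parameter_space_map_loss(params):
    nll = ((f(params) - y) ** 2).sum()      # Gaussian negative log-likelihood
    log_prior = -0.5 * (params ** 2).sum()  # N(0, I) prior over parameters
    return nll - log_prior

def function_space_map_loss(params, jitter=1e-6):
    # Same fit and prior terms, plus a change-of-variables correction that
    # accounts for how the parameter-to-function map warps densities; this is
    # the term that can misbehave under unfavorable parameterizations.
    J = jacobian(f, params, create_graph=True)          # (n_points, n_params)
    correction = 0.5 * torch.logdet(J.T @ J + jitter * torch.eye(3))
    return parameter_space_map_loss(params) + correction

theta = torch.randn(3, requires_grad=True)
print(parameter_space_map_loss(theta).item(), function_space_map_loss(theta).item())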
@inproceedings{rudner2023fsmap,title={Should We Learn Most Likely Functions or Parameters?},author={Rudner, Tim G. J. and Kapoor, Sanyam and Qiu, Shikai and Wilson, Andrew Gordon},booktitle={Advances in Neural Information Processing Systems 36},booktitle_show={Advances in Neural Information Processing Systems},year={2023},placeholder={},area={Function-Space Inference}}
Visual Explanations of Image-Text Representations via Multi-Modal Information Bottleneck Attribution
Ying Wang, Tim G. J. Rudner, and Andrew Gordon Wilson
Advances in Neural Information Processing Systems (NeurIPS), 2023
Vision-language pretrained models have seen remarkable success, but their application to high-impact safety-critical settings is limited by their lack of interpretability. To improve the interpretability of vision-language models, we propose a multi-modal information bottleneck (M2IB) objective that compresses irrelevant and noisy information while preserving relevant visual and textual features. We demonstrate how M2IB can be applied to attribution analysis of vision-language pretrained models, increasing attribution accuracy and improving the interpretability of such models when applied to safety-critical domains such as medical diagnosis. Unlike commonly used unimodal attribution methods, M2IB does not require ground truth labels, making it possible to audit representations of vision-language pretrained models when multiple modalities but no ground truth data is available. Using CLIP as an example, we demonstrate the effectiveness of M2IB attribution and show that it outperforms CAM-based attribution methods both qualitatively and quantitatively.
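As a rough illustration only (our own single-vector simplification, not the released M2IB code), the bottleneck idea can be sketched as learning a per-feature mask that mixes an internal representation with noise, trading cross-modal alignment against a KL compression term; h and text_emb below stand in for real CLIP activations.

import torch
import torch.nn.functional as F

torch.manual_seed(0)
h = torch.randn(1, 16)                     # internal image representation (assumed)
text_emb = torch.randn(1, 16)              # matching text embedding (assumed)
logit_lam = torch.zeros(1, 16, requires_grad=True)
opt = torch.optim.Adam([logit_lam], lr=0.1)
beta = 0.1                                 # compression strength

for _ in range(100):
    lam = torch.sigmoid(logit_lam)         # in (0, 1): how much signal to keep
    eps = torch.randn_like(h)
    z = lam * h + (1 - lam) * eps          # noisy bottleneck representation
    fit = F.cosine_similarity(z, text_emb).mean()
    # Compression: KL between N(lam*h, (1-lam)^2) and N(0, 1), per feature.
    var = (1 - lam) ** 2
    kl = 0.5 * ((lam * h) ** 2 + var - var.log() - 1).sum()
    loss = -fit + beta * kl
    opt.zero_grad(); loss.backward(); opt.step()

attribution = torch.sigmoid(logit_lam).detach()  # high = feature kept = relevant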
@inproceedings{wang2023m2ib,title={Visual Explanations of Image-Text Representations via Multi-Modal Information Bottleneck Attribution},author={Wang, Ying and Rudner, Tim G. J. and Wilson, Andrew Gordon},booktitle={Advances in Neural Information Processing Systems 36},booktitle_show={Advances in Neural Information Processing Systems},year={2023},}
An Information-Theoretic Perspective on Variance-Invariance-Covariance Regularization
Ravid Shwartz-Ziv, Randall Balestriero, Kenji Kawaguchi, Tim G. J. Rudner, and Yann LeCun
Advances in Neural Information Processing Systems (NeurIPS), 2023
In this paper, we provide an information-theoretic perspective on Variance-Invariance-Covariance Regularization (VICReg) for self-supervised learning. To do so, we first demonstrate how information-theoretic quantities can be obtained for deterministic networks as an alternative to the commonly used unrealistic stochastic networks assumption. Next, we relate the VICReg objective to mutual information maximization and use it to highlight the underlying assumptions of the objective. Based on this relationship, we derive a generalization bound for VICReg that provides generalization guarantees for downstream supervised learning tasks, and we present new self-supervised learning methods, derived from a mutual information maximization objective, that outperform existing methods. This work provides a new information-theoretic perspective on self-supervised learning and on Variance-Invariance-Covariance Regularization in particular, and it paves the way for improved transfer learning via information-theoretic self-supervised learning objectives.
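For reference, the VICReg objective analyzed in the paper decomposes into the three named terms; the compact implementation below follows the standard published formulation (the coefficients are common defaults, not values tuned here).

import torch
import torch.nn.functional as F

def vicreg_loss(z_a, z_b, sim_coef=25.0, var_coef=25.0, cov_coef=1.0, eps=1e-4):
    n, d = z_a.shape
    # Invariance: embeddings of two views of the same input should match.
    sim = F.mse_loss(z_a, z_b)
    # Variance: keep each embedding dimension's std above 1 (avoids collapse).
    std_a = torch.sqrt(z_a.var(dim=0) + eps)
    std_b = torch.sqrt(z_b.var(dim=0) + eps)
    var = torch.relu(1 - std_a).mean() + torch.relu(1 - std_b).mean()
    # Covariance: decorrelate embedding dimensions (off-diagonals toward 0).
    za = z_a - z_a.mean(dim=0)
    zb = z_b - z_b.mean(dim=0)
    cov_a = (za.T @ za) / (n - 1)
    cov_b = (zb.T @ zb) / (n - 1)
    off_diag = lambda m: m - torch.diag(torch.diag(m))
    cov = off_diag(cov_a).pow(2).sum() / d + off_diag(cov_b).pow(2).sum() / d
    return sim_coef * sim + var_coef * var + cov_coef * cov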
@inproceedings{shwartz2023vicreg,title={An Information-Theoretic Perspective on Variance-Invariance-Covariance Regularization},author={Shwartz-Ziv, Ravid and Balestriero, Randall and Kawaguchi, Kenji and Rudner, Tim G. J. and LeCun, Yann},booktitle={Advances in Neural Information Processing Systems 36},booktitle_show={Advances in Neural Information Processing Systems},year={2023},}
Protein Design with Guided Discrete Diffusion
Nate Gruver, Samuel Stanton, Nathan C. Frey, Tim G. J. Rudner, Isidro Hotzel, Julien Lafrance-Vanasse, Arvind Rajpal, Kyunghyun Cho, and Andrew Gordon Wilson
Advances in Neural Information Processing Systems (NeurIPS), 2023
A popular approach to protein design is to combine a generative model with a discriminative model for conditional sampling. The generative model samples plausible sequences while the discriminative model guides a search for sequences with high fitness. Given its broad success in conditional sampling, classifier-guided diffusion modeling is a promising foundation for protein design, leading many to develop guided diffusion models for structure with inverse folding to recover sequences. In this work, we propose diffusioN Optimized Sampling (NOS), a guidance method for discrete diffusion models that follows gradients in the hidden states of the denoising network. NOS makes it possible to perform design directly in sequence space, circumventing significant limitations of structure-based methods, including scarce data and challenging inverse design. Moreover, we use NOS to generalize LaMBO, a Bayesian optimization procedure for sequence design that facilitates multiple objectives and edit-based constraints. The resulting method, LaMBO-2, enables discrete diffusions and stronger performance with limited edits through a novel application of saliency maps. We apply LaMBO-2 to a real-world protein design task, optimizing antibodies for higher expression yield and binding affinity to a therapeutic target under locality and liability constraints, with 97% expression rate and 25% binding rate in exploratory in vitro experiments.
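A schematic sketch (our paraphrase of the guidance idea, not the released LaMBO-2 code) of one guided step: hidden states of the denoising network are nudged along the gradient of an assumed learned value model, with a proximity penalty keeping them close to the original states.

import torch

def guided_hidden_step(h, value_model, n_steps=5, step_size=0.1, prox=1.0):
    # h: hidden states of the denoiser at the current step (any shape);
    # value_model: assumed learned scorer mapping hidden states to fitness.
    h0 = h.detach()
    h = h0.clone().requires_grad_(True)
    for _ in range(n_steps):
        # Ascend the value while staying close to the unguided hidden states.
        obj = value_model(h).sum() - prox * ((h - h0) ** 2).sum()
        (grad,) = torch.autograd.grad(obj, h)
        h = (h + step_size * grad).detach().requires_grad_(True)
    return h.detach()  # fed back into the denoiser's output head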
@inproceedings{gruver2023nos,title={Protein Design with Guided Discrete Diffusion},author={Gruver, Nate and Stanton, Samuel and Frey, Nathan C. and Rudner, Tim G. J. and Hotzel, Isidro and Lafrance-Vanasse, Julien and Rajpal, Arvind and Cho, Kyunghyun and Wilson, Andrew Gordon},booktitle={Advances in Neural Information Processing Systems 36},booktitle_show={Advances in Neural Information Processing Systems},year={2023},}
A Study of Bayesian Neural Network Surrogates for Bayesian Optimization
Yucen Lily Li, Tim G. J. Rudner, and Andrew Gordon Wilson
Symposium on Advances in Approximate Bayesian Inference (AABI), 2023
Bayesian optimization is a highly efficient approach to optimizing objective functions which are expensive to query. These objectives are typically represented by Gaussian process (GP) surrogate models which are easy to optimize and support exact inference. While standard GP surrogates have been well-established in Bayesian optimization, Bayesian neural networks (BNNs) have recently become practical function approximators, with many benefits over standard GPs such as the ability to naturally handle non-stationarity and learn representations for high-dimensional data. In this paper, we study BNNs as alternatives to standard GP surrogates for optimization. We consider a variety of approximate inference procedures for finite-width BNNs, including high-quality Hamiltonian Monte Carlo, low-cost stochastic MCMC, and heuristics such as deep ensembles. We also consider infinite-width BNNs and partially stochastic models such as deep kernel learning. We evaluate this collection of surrogate models on diverse problems with varying dimensionality, number of objectives, non-stationarity, and discrete and continuous inputs. We find: (i) the ranking of methods is highly problem dependent, suggesting the need for tailored inductive biases; (ii) HMC is the most successful approximate inference procedure for fully stochastic BNNs; (iii) full stochasticity may be unnecessary as deep kernel learning is relatively competitive; (iv) infinite-width BNNs are particularly promising, especially in high dimensions.
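A toy end-to-end sketch (ours) of the kind of comparison run in the paper: a deep-ensemble surrogate, one of the BNN-style models studied, inside a Bayesian optimization loop with an expected-improvement acquisition; the black-box objective is a stand-in.

import torch

def fit_member(X, y, steps=200):
    # One ensemble member: a small MLP regressor fit to the observed data.
    net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                              torch.nn.Linear(32, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(steps):
        loss = ((net(X) - y) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    return net

def expected_improvement(mu, sigma, best, xi=0.01):
    # Minimization convention: improvement over the best observed value.
    z = (best - mu - xi) / sigma
    n = torch.distributions.Normal(0.0, 1.0)
    return (best - mu - xi) * n.cdf(z) + sigma * torch.exp(n.log_prob(z))

torch.manual_seed(0)
f = lambda x: torch.sin(5 * x) + 0.5 * x      # expensive black box (toy stand-in)
X = torch.rand(4, 1); y = f(X)
for _ in range(10):                           # BO loop
    ensemble = [fit_member(X, y) for _ in range(5)]
    Xc = torch.linspace(0, 1, 200).unsqueeze(-1)        # candidate pool
    with torch.no_grad():
        preds = torch.stack([m(Xc) for m in ensemble])  # (members, cands, 1)
    mu, sigma = preds.mean(0), preds.std(0) + 1e-6
    x_next = Xc[expected_improvement(mu, sigma, y.min()).argmax()]
    X = torch.cat([X, x_next.view(1, 1)]); y = torch.cat([y, f(x_next).view(1, 1)])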
@inproceedings{li2023bostudy,title={A Study of Bayesian Neural Network Surrogates for Bayesian Optimization},author={Li, Yucen Lily and Rudner, Tim G. J. and Wilson, Andrew Gordon},booktitle={Fifth Symposium on Advances in Approximate Bayesian Inference},booktitle_show={Symposium on Advances in Approximate Bayesian Inference},year={2023},}
Attacking Bayes: Are Bayesian Neural Networks Inherently Robust?
Yunzhen Feng, Tim G. J. Rudner, Nikolaos Tsilivis, and Julia Kempe
Symposium on Advances in Approximate Bayesian Inference (AABI), 2023
This work examines the claim, made in recent publications, that Bayesian neural networks (BNNs) are inherently robust to adversarial perturbations. To study this question, we investigate whether it is possible to successfully break state-of-the-art BNN inference methods and prediction pipelines using even relatively unsophisticated attacks for three tasks: (1) label prediction under the posterior predictive mean, (2) adversarial example detection with Bayesian predictive uncertainty, and (3) semantic shift detection. We find that BNNs trained with state-of-the-art approximate inference methods, and even with HMC inference, are highly susceptible to adversarial attacks and identify various conceptual and experimental errors in previous works that claimed inherent adversarial robustness of BNNs. We conclusively demonstrate that BNNs and uncertainty-aware Bayesian prediction pipelines are not inherently robust against adversarial attacks and open up avenues for the development of Bayesian defenses for Bayesian prediction pipelines.
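One of the relatively unsophisticated attacks referred to above can be sketched in a few lines (our illustration): an FGSM-style step whose gradient is taken through the average prediction of several posterior samples rather than through a single network.

import torch

def attack_predictive_mean(posterior_nets, x, y, eps=8 / 255):
    # posterior_nets: classifiers built from samples of the (approximate)
    # posterior; the attack targets their averaged predictive distribution.
    x_adv = x.clone().requires_grad_(True)
    probs = torch.stack([net(x_adv).softmax(-1) for net in posterior_nets]).mean(0)
    loss = torch.nn.functional.nll_loss(probs.clamp_min(1e-12).log(), y)
    loss.backward()
    # Single signed-gradient step, projected back into valid pixel range.
    return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()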
@inproceedings{feng2023attackingbayes,title={Attacking Bayes: Are Bayesian Neural Networks Inherently Robust?},author={Feng, Yunzhen and Rudner, Tim G. J. and Tsilivis, Nikolaos and Kempe, Julia},booktitle={Fifth Symposium on Advances in Approximate Bayesian Inference},booktitle_show={Symposium on Advances in Approximate Bayesian Inference},year={2023},}
Function-Space Regularization in Neural Networks: A Probabilistic Perspective
Tim G. J. Rudner, Sanyam Kapoor, Shikai Qiu, and Andrew Gordon Wilson
International Conference on Machine Learning (ICML), 2023
Parameter-space regularization in neural network optimization is a fundamental tool for improving generalization. However, standard parameter-space regularization methods make it challenging to encode explicit preferences about desired predictive functions into neural network training. In this work, we approach regularization in neural networks from a probabilistic perspective and show that by viewing parameter-space regularization as specifying an empirical prior distribution over the model parameters, we can derive a probabilistically well-motivated regularization technique that allows explicitly encoding information about desired predictive functions into neural network training. This method—which we refer to as function-space empirical Bayes (FS-EB)—includes both parameter- and function-space regularization, is mathematically simple, easy to implement, and incurs only minimal computational overhead compared to standard regularization techniques. We evaluate the utility of this regularization technique empirically and demonstrate that the proposed method leads to near-perfect semantic shift detection, highly calibrated predictive uncertainty estimates, successful task adaptation from pre-trained models, and improved generalization under covariate shift.
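A minimal sketch (our reading of the general recipe, not the released code): combine the usual data-fit and parameter-space terms with a function-space term evaluated at auxiliary context inputs; here the encoded preference is "predict close to uniform away from the data", one simple choice of reference function.

import math
import torch
import torch.nn.functional as F

def fs_regularized_loss(net, x, y, x_context, wd=1e-4, lam=1.0):
    # net: any classifier; x_context: auxiliary inputs (e.g., off-distribution).
    ce = F.cross_entropy(net(x), y)                       # data fit
    l2 = sum((p ** 2).sum() for p in net.parameters())    # parameter-space term
    logq = net(x_context).log_softmax(-1)                 # function-space term:
    k = logq.shape[-1]                                    # KL(uniform || model),
    fs = (-logq.mean(-1) - math.log(k)).mean()            # "be uncertain off-data"
    return ce + wd * l2 + lam * fs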
@inproceedings{rudner2023fseb,title={{F}unction-{S}pace {R}egularization in {N}eural {N}etworks: {A} {P}robabilistic {P}erspective},author={Rudner, Tim G. J. and Kapoor, Sanyam and Qiu, Shikai and Wilson, Andrew Gordon},booktitle={Proceedings of the 40th International Conference on Machine Learning},booktitle_show={International Conference on Machine Learning},year={2023},series={Proceedings of Machine Learning Research},publisher={PMLR},talk={https://icml.cc/virtual/2023/spotlight/24608},placeholder={},area={Informative Priors for Neural Networks}}
Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions
Leo Klarner, Tim G. J. Rudner, Michael Reutlinger, Torsten Schindler, Garrett M. Morris, Charlotte Deane, and Yee Whye Teh
International Conference on Machine Learning (ICML), 2023
Accelerating the discovery of novel and more effective therapeutics is an important pharmaceutical problem in which deep learning is playing an increasingly significant role. However, real-world drug discovery tasks are often characterized by a scarcity of labeled data and significant covariate shift—a setting that poses a challenge to standard deep learning methods. In this paper, we present Q-SAVI, a probabilistic model able to address these challenges by encoding explicit prior knowledge of the data-generating process into a prior distribution over functions, presenting researchers with a transparent and probabilistically principled way to encode data-driven modeling preferences. Building on a novel, gold-standard bioactivity dataset that facilitates a meaningful comparison of models in an extrapolative regime, we explore different approaches to induce data shift and construct a challenging evaluation setup. We then demonstrate that using Q-SAVI to integrate contextualized prior knowledge of drug-like chemical space into the modeling process affords substantial gains in predictive accuracy and calibration, outperforming a broad range of state-of-the-art self-supervised pre-training and domain adaptation techniques.
@inproceedings{klarner2023qsavi,title={{D}rug {D}iscovery {u}nder {C}ovariate {S}hift {w}ith {D}omain-{I}nformed {P}rior {D}istributions {o}ver {F}unctions},author={Klarner, Leo and Rudner, Tim G. J. and Reutlinger, Michael and Schindler, Torsten and Morris, Garrett M. and Deane, Charlotte and Teh, Yee Whye},booktitle={Proceedings of the 40th International Conference on Machine Learning},booktitle_show={International Conference on Machine Learning},year={2023},series={Proceedings of Machine Learning Research},publisher={PMLR},talk={https://icml.cc/virtual/2023/spotlight/25037},placeholder={},area={Informative Priors for Neural Networks}}
Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations
Cong Lu, Philip J. Ball, Tim G. J. Rudner, Jack Parker-Holder, Michael A. Osborne, and Yee Whye Teh
Transactions on Machine Learning Research (TMLR), 2023
Offline reinforcement learning has shown great promise in leveraging large pre-collected datasets for policy learning, allowing agents to forgo often-expensive online data collection. However, offline reinforcement learning from visual observations with continuous action spaces remains under-explored, with a limited understanding of the key challenges in this complex domain. In this paper, we establish simple baselines for continuous control in the visual domain and introduce a suite of benchmarking tasks for offline reinforcement learning from visual observations designed to better represent the data distributions present in real-world offline RL problems and guided by a set of desiderata for offline RL from visual observations, including robustness to visual distractions and visually identifiable changes in dynamics. Using this suite of benchmarking tasks, we show that simple modifications to two popular vision-based online reinforcement learning algorithms, DreamerV2 and DrQ-v2, suffice to outperform existing offline RL methods and establish competitive baselines for continuous control in the visual domain. We rigorously evaluate these algorithms and perform an empirical evaluation of the differences between state-of-the-art model-based and model-free offline RL methods for continuous control from visual observations. All code and data used in this evaluation are open-sourced to facilitate progress in this domain.
@article{lu2023challenges,title={{C}hallenges and {O}pportunities in {O}ffline {R}einforcement {L}earning from {V}isual {O}bservations},author={Lu, Cong and Ball, Philip J. and Rudner, Tim G. J. and Parker-Holder, Jack and Osborne, Michael A. and Teh, Yee Whye},year={2023},journal={Transactions on Machine Learning Research},booktitle_show={Transactions on Machine Learning Research},issn={2835-8856},}
Can Active Sampling Reduce Causal Confusion in Offline Reinforcement Learning?
Gunshi Gupta, Tim G. J. Rudner, Rowan Thomas McAllister, Adrien Gaidon, and Yarin Gal
Conference on Causal Learning and Reasoning (CLeaR), 2023
@inproceedings{gupta2023activesampling,title={{C}an {A}ctive {S}ampling {R}educe {C}ausal {C}onfusion in {O}ffline {R}einforcement {L}earning?},author={Gupta, Gunshi and Rudner, Tim G. J. and McAllister, Rowan Thomas and Gaidon, Adrien and Gal, Yarin},booktitle={Proceedings of the 2nd Conference on Causal Learning and Reasoning},booktitle_show={Conference on Causal Learning and Reasoning},year={2023},series={Proceedings of Machine Learning Research},publisher={PMLR},talk={https://www.youtube.com/watch?v=tCViJVdd9aY}}
2022
Tractable Function-Space Variational Inference in Bayesian Neural Networks
Tim G. J. Rudner, Zonghao Chen, Yee Whye Teh, and Yarin Gal
Advances in Neural Information Processing Systems (NeurIPS), 2022
Reliable predictive uncertainty estimation plays an important role in enabling the deployment of neural networks to safety-critical settings. A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters, infer an approximate posterior distribution, and use it to make stochastic predictions. However, explicit inference over neural network parameters makes it difficult to incorporate meaningful prior information about the data-generating process into the model. In this paper, we pursue an alternative approach. Recognizing that the primary object of interest in most settings is the distribution over functions induced by the posterior distribution over neural network parameters, we frame Bayesian inference in neural networks explicitly as inferring a posterior distribution over functions and propose a scalable function-space variational inference method that allows incorporating prior information and results in reliable predictive uncertainty estimates. We show that the proposed method leads to state-of-the-art uncertainty estimation and predictive performance on a range of prediction tasks and demonstrate that it performs well on a challenging safety-critical medical diagnosis task in which reliable uncertainty estimation is essential.
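The shape of the resulting objective can be sketched under strong simplifying assumptions (ours): predictive distributions are treated as independent 1-D Gaussians per input (e.g., obtained from a linearized network), and the parameter-space KL of a standard ELBO is replaced by a KL between predictive distributions at a batch of measurement inputs.

import torch

def gaussian_kl(mu_q, var_q, mu_p, var_p):
    # KL( N(mu_q, var_q) || N(mu_p, var_p) ) for diagonal Gaussians.
    return 0.5 * (var_q / var_p + (mu_q - mu_p) ** 2 / var_p
                  - 1.0 + torch.log(var_p / var_q))

def function_space_elbo(mu_train, var_train, y, mu_meas, var_meas,
                        prior_mu=0.0, prior_var=1.0, noise_var=0.1):
    # Expected log-likelihood of targets under the predictive Gaussians
    # (up to an additive constant, Gaussian observation model).
    ell = -0.5 * (((y - mu_train) ** 2 + var_train) / noise_var).sum()
    # Function-space KL to a GP-like prior, evaluated at measurement points
    # only; this replaces the parameter-space KL of a standard ELBO.
    kl = gaussian_kl(mu_meas, var_meas,
                     torch.tensor(prior_mu), torch.tensor(prior_var)).sum()
    return ell - kl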
@inproceedings{rudner2022fsvi,title={{T}ractable {F}unction-{S}pace {V}ariational {I}nference in {B}ayesian {N}eural {N}etworks},author={Rudner, Tim G. J. and Chen, Zonghao and Teh, Yee Whye and Gal, Yarin},booktitle={Advances in Neural Information Processing Systems 35},booktitle_show={Advances in Neural Information Processing Systems},year={2022},placeholder={},area={Function-Space Inference}}
Continual Learning via Sequential Function-Space Variational Inference
Tim G. J. Rudner, Freddie Bickford Smith, Qixuan Feng, Yee Whye Teh, and Yarin Gal
International Conference on Machine Learning (ICML), 2022
Sequential Bayesian inference over predictive functions is a natural framework for continual learning from streams of data. However, applying it to neural networks has proved challenging in practice. Addressing the drawbacks of existing techniques, we propose an optimization objective derived by formulating continual learning as sequential function-space variational inference. In contrast to existing methods that regularize neural network parameters directly, this objective allows parameters to vary widely during training, enabling better adaptation to new tasks. Compared to objectives that directly regularize neural network predictions, the proposed objective allows for more flexible variational distributions and more effective regularization. We demonstrate that, across a range of task sequences, neural networks trained via sequential function-space variational inference achieve better predictive accuracy than networks trained with related methods while depending less on maintaining a set of representative points from previous tasks.
@inproceedings{rudner2022sfsvi,author={Rudner, Tim G. J. and Smith, Freddie Bickford and Feng, Qixuan and Teh, Yee Whye and Gal, Yarin},title={{C}ontinual {L}earning via {S}equential {F}unction-{S}pace {V}ariational {I}nference},booktitle={Proceedings of the 39th International Conference on Machine Learning},booktitle_show={International Conference on Machine Learning},year={2022},series={Proceedings of Machine Learning Research},publisher={PMLR},talk={https://icml.cc/virtual/2022/spotlight/18270},placeholder={},area={Sequential Decision-Making}}
Plex: Towards Reliability Using Pretrained Large Model Extensions
Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek, and Balaji Lakshminarayanan
ICML 2022 Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward, 2022
A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models’ abilities in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive performance but also performs well consistently over many decision-making tasks involving uncertainty (e.g., selective prediction, open set recognition), robust generalization (e.g., accuracy and proper scoring rules such as log-likelihood on in- and out-of-distribution datasets), and adaptation (e.g., active learning, few-shot uncertainty). We devise 10 types of tasks over 40 datasets in order to evaluate different aspects of reliability on both vision and language domains. To improve reliability, we developed ViT-Plex and T5-Plex, pretrained large model extensions for vision and language modalities, respectively. Plex greatly improves the state-of-the-art across reliability tasks, and simplifies the traditional protocol as it improves the out-of-the-box performance and does not require designing scores or tuning the model for each task. We demonstrate scaling effects over model sizes up to 1B parameters and pretraining dataset sizes up to 4B examples. We also demonstrate Plex’s capabilities on challenging tasks including zero-shot open set recognition, active learning, and uncertainty in conversational language understanding.
@inproceedings{tran2022plex,author={Tran, Dustin and Liu, Jeremiah and Dusenberry, Michael W. and Phan, Du and Collier, Mark and Ren, Jie and Han, Kehang and Wang, Zi and Mariet, Zelda and Hu, Huiyi and Band, Neil and Rudner, Tim G. J. and Singhal, Karan and Nado, Zachary and van Amersfoort, Joost and Kirsch, Andreas and Jenatton, Rodolphe and Thain, Nithum and Yuan, Honglin and Buchanan, Kelly and Murphy, Kevin and Sculley, D. and Gal, Yarin and Ghahramani, Zoubin and Snoek, Jasper and Lakshminarayanan, Balaji},title={{P}lex: {T}owards {R}eliability {U}sing {P}retrained {L}arge {M}odel {E}xtensions},year={2022},booktitle={ICML 2022 Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward},booktitle_show={ICML 2022 Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward},placeholder={},area={Reliable Machine Learning}}
2021
Outcome-Driven Reinforcement Learning via Variational Inference
Tim G. J. Rudner, Vitchyr H. Pong, Rowan McAllister, Yarin Gal, and Sergey Levine
Advances in Neural Information Processing Systems (NeurIPS), 2021
While reinforcement learning algorithms provide automated acquisition of optimal policies, practical application of such methods requires a number of design decisions, such as manually designing reward functions that not only define the task, but also provide sufficient shaping to accomplish it. In this paper, we view reinforcement learning as inferring policies that achieve desired outcomes, rather than as a problem of maximizing rewards. To solve this inference problem, we establish a novel variational inference formulation that allows us to derive a well-shaped reward function which can be learned directly from environment interactions. From the corresponding variational objective, we also derive a new probabilistic Bellman backup operator and use it to develop an off-policy algorithm to solve goal-directed tasks. We empirically demonstrate that this method eliminates the need to hand-craft reward functions for a suite of diverse manipulation and locomotion tasks and leads to effective goal-directed behaviors.
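The flavor of the derived reward can be sketched as follows (our reading, not the paper's full algorithm): a transition is scored by the log-likelihood of the desired outcome under a learned Gaussian model, so the reward is shaped and improves smoothly as states approach the outcome.

import torch

def outcome_driven_reward(next_state, goal, log_var):
    # log_var: learned per-dimension log-variance; the reward is the Gaussian
    # log-likelihood of the desired outcome `goal` given the reached state.
    var = log_var.exp()
    return -0.5 * (((goal - next_state) ** 2) / var + log_var
                   + torch.log(torch.tensor(2 * torch.pi))).sum(-1)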
@inproceedings{rudner2021odrl,title={{O}utcome-{D}riven {R}einforcement {L}earning via {V}ariational {I}nference},author={Rudner, Tim G. J. and Pong, Vitchyr H. and McAllister, Rowan and Gal, Yarin and Levine, Sergey},booktitle={Advances in Neural Information Processing Systems 34},booktitle_show={Advances in Neural Information Processing Systems},year={2021},placeholder={},area={Sequential Decision-Making}}
On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations
Tim G. J. Rudner, Cong Lu, Michael A. Osborne, Yarin Gal, and Yee Whye Teh
Advances in Neural Information Processing Systems (NeurIPS), 2021
KL-regularized reinforcement learning from expert demonstrations has proved successful in improving the sample efficiency of deep reinforcement learning algorithms, allowing them to be applied to challenging physical real-world tasks. However, we show that KL-regularized reinforcement learning with behavioral reference policies derived from expert demonstrations can suffer from pathological training dynamics that can lead to slow, unstable, and suboptimal online learning. We show empirically that the pathology occurs for commonly chosen behavioral policy classes and demonstrate its impact on sample efficiency and online policy performance. Finally, we show that the pathology can be remedied by non-parametric behavioral reference policies and that this allows KL-regularized reinforcement learning to significantly outperform state-of-the-art approaches on a variety of challenging locomotion and dexterous hand manipulation tasks.
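For context, the objective under study has the standard form sketched below (sketch ours); the pathology arises when the behavioral prior assigns vanishing variance away from the demonstrations, which makes the KL term explode during online learning.

import torch

def kl_regularized_actor_loss(policy_dist, prior_dist, q_values, alpha=0.1):
    # policy_dist, prior_dist: torch.distributions objects over actions; the
    # prior is distilled from expert demonstrations. The actor maximizes Q
    # while staying close to the behavioral prior.
    kl = torch.distributions.kl_divergence(policy_dist, prior_dist)
    return (-q_values + alpha * kl).mean()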
@inproceedings{rudner2021pathologies,title={{O}n {P}athologies in {KL}-{R}egularized {R}einforcement {L}earning from {E}xpert {D}emonstrations},author={Rudner, Tim G. J. and Lu, Cong and Osborne, Michael A. and Gal, Yarin and Teh, Yee Whye},booktitle={Advances in Neural Information Processing Systems 34},booktitle_show={Advances in Neural Information Processing Systems},year={2021},placeholder={},area={Informative Priors for Neural Networks}}
Benchmarking Bayesian Deep Learning on Diabetic Retinopathy Detection Tasks
Neil Band, Tim G. J. Rudner, Qixuan Feng, Angelos Filos, Zachary Nado, Michael W. Dusenberry, Ghassen Jerfel, Dustin Tran, and Yarin Gal
Advances in Neural Information Processing Systems (NeurIPS), 2021
Bayesian deep learning seeks to equip deep neural networks with the ability to precisely quantify their predictive uncertainty, and has promised to make deep learning more reliable for safety-critical real-world applications. Yet, existing Bayesian deep learning methods fall short of this promise; new methods continue to be evaluated on unrealistic test beds that do not reflect the complexities of downstream real-world tasks that would benefit most from reliable uncertainty quantification. We propose the RETINA Benchmark, a set of real-world tasks that accurately reflect such complexities and are designed to assess the reliability of predictive models in safety-critical scenarios. Specifically, we curate two publicly available datasets of high-resolution human retina images exhibiting varying degrees of diabetic retinopathy, a medical condition that can lead to blindness, and use them to design a suite of automated diagnosis tasks that require reliable predictive uncertainty quantification. We use these tasks to benchmark well-established and state-of-the-art Bayesian deep learning methods on task-specific evaluation metrics. We provide an easy-to-use codebase for fast and easy benchmarking following reproducibility and software design principles. We provide implementations of all methods included in the benchmark as well as results computed over 100 TPU days, 20 GPU days, 400 hyperparameter configurations, and evaluation on at least 6 random seeds each.
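The benchmark's core selective-prediction protocol is easy to state in code (sketch ours): refer the most uncertain cases to an expert and track accuracy on the retained ones.

import torch

def accuracy_vs_referral(probs, labels, referral_rates):
    # probs: (N, C) predictive probabilities; labels: (N,) ground truth.
    uncertainty = -(probs * probs.clamp_min(1e-12).log()).sum(-1)  # pred. entropy
    order = uncertainty.argsort()                 # most confident cases first
    correct = (probs.argmax(-1) == labels).float()[order]
    accs = []
    for r in referral_rates:
        keep = int((1 - r) * len(correct))        # refer the top-r most uncertain
        accs.append(correct[:keep].mean().item() if keep > 0 else float("nan"))
    return accs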
@inproceedings{band2021benchmarking,title={{B}enchmarking {B}ayesian {D}eep {L}earning {o}n {D}iabetic {R}etinopathy {D}etection {T}asks},author={Band, Neil and Rudner, Tim G. J. and Feng, Qixuan and Filos, Angelos and Nado, Zachary and Dusenberry, Michael W. and Jerfel, Ghassen and Tran, Dustin and Gal, Yarin},booktitle={Advances in Neural Information Processing Systems 34},booktitle_show={Advances in Neural Information Processing Systems},year={2021},placeholder={},area={Reliable Machine Learning}}
Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning
Zachary Nado, Neil Band, Mark Collier, Josip Djolonga, Michael W. Dusenberry, Sebastian Farquhar, Angelos Filos, Marton Havasi, Rodolphe Jenatton, Ghassen Jerfel, Jeremiah Liu, Zelda Mariet, Jeremy Nixon, Shreyas Padhy, Jie Ren, Tim G. J. Rudner, Yeming Wen, Florian Wenzel, Kevin Murphy, D. Sculley, Balaji Lakshminarayanan, Jasper Snoek, Yarin Gal, and Dustin Tran
NeurIPS 2021 Workshop on Bayesian Deep Learning, 2021
High-quality estimates of uncertainty and robustness are crucial for numerous real-world applications, especially for deep learning which underlies many deployed ML systems. The ability to compare techniques for improving these estimates is therefore very important for research and practice alike. Yet, competitive comparisons of methods are often lacking due to a range of reasons, including: compute availability for extensive tuning, incorporation of sufficiently many baselines, and concrete documentation for reproducibility. In this paper we introduce Uncertainty Baselines: high-quality implementations of standard and state-of-the-art deep learning methods on a variety of tasks. As of this writing, the collection spans 19 methods across 9 tasks, each with at least 5 metrics. Each baseline is a self-contained experiment pipeline with easily reusable and extendable components. Our goal is to provide immediate starting points for experimentation with new methods or applications. Additionally we provide model checkpoints, experiment outputs as Python notebooks, and leaderboards for comparing results.
@inproceedings{nado2021uncertaintybaselines,author={Nado, Zachary and Band, Neil and Collier, Mark and Djolonga, Josip and Dusenberry, Michael W. and Farquhar, Sebastian and Filos, Angelos and Havasi, Marton and Jenatton, Rodolphe and Jerfel, Ghassen and Liu, Jeremiah and Mariet, Zelda and Nixon, Jeremy and Padhy, Shreyas and Ren, Jie and Rudner, Tim G. J. and Wen, Yeming and Wenzel, Florian and Murphy, Kevin and Sculley, D. and Lakshminarayanan, Balaji and Snoek, Jasper and Gal, Yarin and Tran, Dustin},title={{U}ncertainty {B}aselines: {B}enchmarks {f}or {U}ncertainty {&} {R}obustness {i}n {D}eep {L}earning},year={2021},booktitle={NeurIPS 2021 Workshop on Bayesian Deep Learning},booktitle_show={NeurIPS 2021 Workshop on Bayesian Deep Learning},}
On Signal-to-Noise Ratio Issues in Variational Inference for Deep Gaussian Processes
Tim G. J. Rudner, Oscar Key, Yarin Gal, and Tom Rainforth
International Conference on Machine Learning (ICML), 2021
We show that the gradient estimates used in training Deep Gaussian Processes (DGPs) with importance-weighted variational inference are susceptible to signal-to-noise ratio (SNR) issues. Specifically, we show both theoretically and via an extensive empirical evaluation that the SNR of the gradient estimates for the latent variable’s variational parameters decreases as the number of importance samples increases. As a result, these gradient estimates degrade to pure noise if the number of importance samples is too large. To address this pathology, we show how doubly reparameterized gradient estimators, originally proposed for training variational autoencoders, can be adapted to the DGP setting and that the resultant estimators completely remedy the SNR issue, thereby providing more reliable training. Finally, we demonstrate that our fix can lead to consistent improvements in the predictive performance of DGP models.
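The diagnostic underlying these results can be sketched directly (ours): resample a stochastic gradient estimator many times and compare the magnitude of its mean to its standard deviation; iwvi_grad below is an assumed estimator, not a real API.

import torch

def gradient_snr(grad_estimator, n_repeats=1000):
    # grad_estimator: zero-argument callable returning one stochastic gradient
    # sample for the parameters of interest.
    grads = torch.stack([grad_estimator() for _ in range(n_repeats)])
    return grads.mean(0).abs() / (grads.std(0) + 1e-12)

# Usage: compare gradient_snr(lambda: iwvi_grad(K=1)) against K=64; for the
# latent variational parameters, larger K should *lower* the measured SNR.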
@inproceedings{rudner2021snrissues,author={Rudner, Tim G. J. and Key, Oscar and Gal, Yarin and Rainforth, Tom},title={{O}n {S}ignal-to-{N}oise {R}atio {I}ssues in {V}ariational {I}nference for {D}eep {G}aussian {P}rocesses},booktitle={Proceedings of the 38th International Conference on Machine Learning},booktitle_show={International Conference on Machine Learning},year={2021},series={Proceedings of Machine Learning Research},address={Online},publisher={PMLR},placeholder={},area={Function-Space Inference}}
2020
Inter-domain Deep Gaussian Processes
Tim G. J. Rudner, Dino Sejdinovic, and Yarin Gal
International Conference on Machine Learning (ICML), 2020
Inter-domain Gaussian processes (GPs) allow for high flexibility and low computational cost when performing approximate inference in GP models. They are particularly suitable for modeling data exhibiting global structure but are limited to stationary covariance functions and thus fail to model non-stationary data effectively. We propose Inter-domain Deep Gaussian Processes, an extension of inter-domain shallow GPs that combines the advantages of inter-domain and deep Gaussian processes (DGPs), and demonstrate how to leverage existing approximate inference methods to perform simple and scalable approximate inference using inter-domain features in DGPs. We assess the performance of our method on a range of regression tasks and demonstrate that it outperforms inter-domain shallow GPs and conventional DGPs on challenging large-scale real-world datasets exhibiting both global structure and a high degree of non-stationarity.
@inproceedings{rudner2020interdomaindgps,author={Rudner, Tim G. J. and Sejdinovic, Dino and Gal, Yarin},title={{I}nter-domain {D}eep {G}aussian {P}rocesses},booktitle={Proceedings of the 37th International Conference on Machine Learning},booktitle_show={International Conference on Machine Learning},year={2020},volume={119},series={Proceedings of Machine Learning Research},address={Online},publisher={PMLR},}
2019
VIREL: A Variational Inference Framework for Reinforcement Learning
Matthew Fellows, Anuj Mahajan, Tim G. J. Rudner, and Shimon Whiteson
Advances in Neural Information Processing Systems (NeurIPS), 2019
Applying probabilistic models to reinforcement learning (RL) enables the application of powerful optimisation tools such as variational inference to RL. However, existing inference frameworks and their algorithms pose significant challenges for learning optimal policies, e.g., the absence of mode capturing behaviour in pseudo-likelihood methods and difficulties learning deterministic policies in maximum entropy RL based approaches. We propose VIREL, a novel, theoretically grounded probabilistic inference framework for RL that utilises a parametrised action-value function to summarise future dynamics of the underlying MDP. This gives VIREL a mode-seeking form of KL divergence, the ability to learn deterministic optimal policies naturally from inference, and the ability to optimise value functions and policies in separate, iterative steps. In applying variational expectation-maximisation to VIREL, we thus show that the actor-critic algorithm can be reduced to expectation-maximisation, with policy improvement equivalent to an E-step and policy evaluation to an M-step. We then derive a family of actor-critic methods from VIREL, including a scheme for adaptive exploration. Finally, we demonstrate that actor-critic algorithms from this family outperform state-of-the-art methods based on soft value functions in several domains.
@inproceedings{fellows2019virel,author={Fellows, Matthew and Mahajan, Anuj and Rudner, Tim G. J. and Whiteson, Shimon},title={{VIREL}: {A} {V}ariational {I}nference {F}ramework for {R}einforcement {L}earning},booktitle={Advances in Neural Information Processing Systems 32},booktitle_show={Advances in Neural Information Processing Systems},year={2019},}
The Natural Neural Tangent Kernel: Neural Network Training Dynamics under Natural Gradient Descent
Tim G. J. Rudner, Florian Wenzel, Yee Whye Teh, and Yarin Gal
NeurIPS 2019 Workshop on Bayesian Deep Learning, 2019
@inproceedings{rudner2019nntk,author={Rudner, Tim G. J. and Wenzel, Florian and Teh, Yee Whye and Gal, Yarin},title={{T}he {N}atural {N}eural {T}angent {K}ernel: {N}eural {N}etwork {T}raining {D}ynamics under {N}atural {G}radient {D}escent},booktitle={NeurIPS 2019 Workshop on Bayesian Deep Learning},booktitle_show={NeurIPS 2019 Workshop on Bayesian Deep Learning},year={2019},}
A Systematic Comparison of Bayesian Deep Learning Robustness in Diabetic Retinopathy Tasks
Angelos Filos, Sebastian Farquhar, Aidan N. Gomez, Tim G. J. Rudner, Zachary Kenton, Lewis Smith, Milad Alizadeh, Arnoud de Kroon, and Yarin Gal
NeurIPS 2019 Workshop on Bayesian Deep Learning, 2019
Evaluation of Bayesian deep learning (BDL) methods is challenging. We often seek to evaluate the methods’ robustness and scalability, assessing whether new tools give ‘better’ uncertainty estimates than old ones. These evaluations are paramount for practitioners when choosing BDL tools on top of which they build their applications. Current popular evaluations of BDL methods, such as the UCI experiments, are lacking: methods that excel on these experiments often fail when used in applications such as medical or automotive, suggesting a pertinent need for new benchmarks in the field. We propose a new BDL benchmark with a diverse set of tasks, inspired by a real-world medical imaging application on diabetic retinopathy diagnosis. Visual inputs (512x512 RGB images of retinas) are considered, where model uncertainty is used for medical pre-screening—i.e., to refer patients to an expert when model diagnosis is uncertain. Methods are then ranked according to metrics derived from expert domain knowledge to reflect real-world use of model uncertainty in automated diagnosis. We develop multiple tasks that fall under this application, including out-of-distribution detection and robustness to distribution shift. We then perform a systematic comparison of well-tuned BDL techniques on the various tasks. From our comparison we conclude that some current techniques which solve benchmarks such as UCI ‘overfit’ their uncertainty to the dataset—when evaluated on our benchmark these underperform in comparison to simpler baselines. The code for the benchmark, its baselines, and a simple API for evaluating new BDL tools are made available.
@inproceedings{filos2019bdlb,author={Filos, Angelos and Farquhar, Sebastian and Gomez, Aidan N. and Rudner, Tim G. J. and Kenton, Zachary and Smith, Lewis and Alizadeh, Milad and de Kroon, Arnoud and Gal, Yarin},title={{A} {S}ystematic {C}omparison {o}f {B}ayesian {D}eep {L}earning {R}obustness in {D}iabetic {R}etinopathy {T}asks},year={2019},booktitle={NeurIPS 2019 Workshop on Bayesian Deep Learning},booktitle_show={NeurIPS 2019 Workshop on Bayesian Deep Learning},}
Multi³Net: Segmenting Flooded Buildings via Fusion of Multiresolution, Multisensor, and Multitemporal Satellite Imagery
Tim G. J. Rudner, Marc Rußwurm, Jakub Fil, Ramona Pelich, Benjamin Bischke, Veronika Kopackova, and Piotr Bilinski
AAAI Conference on Artificial Intelligence (AAAI), 2019
@inproceedings{rudner2019multi3net,author={Rudner, Tim G. J. and Rußwurm, Marc and Fil, Jakub and Pelich, Ramona and Bischke, Benjamin and Kopackova, Veronika and Bilinski, Piotr},title={{M}ulti{³}{N}et: {S}egmenting {F}looded {B}uildings via {F}usion of {M}ultiresolution, {M}ultisensor, and {M}ultitemporal {S}atellite {I}magery},booktitle={Proceedings of the Thirty-Third {AAAI} Conference on Artificial Intelligence},booktitle_show={{AAAI} Conference on Artificial Intelligence},year={2019},}
The StarCraft Multi-Agent Challenge
Mikayel Samvelyan, Tabish Rashid, Christian Schroeder de Witt, Gregory Farquhar, Nantas Nardelli, Tim G. J. Rudner, Chia-Man Hung, Philip H. S. Torr, Jakob Foerster, and Shimon Whiteson
International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), 2019
In the last few years, deep multi-agent reinforcement learning (RL) has become a highly active area of research. A particularly challenging class of problems in this area is partially observable, cooperative, multi-agent learning, in which teams of agents must learn to coordinate their behaviour while conditioning only on their private observations. This is an attractive research area since such problems are relevant to a large number of real-world systems and are also more amenable to evaluation than general-sum problems. Standardised environments such as the ALE and MuJoCo have allowed single-agent RL to move beyond toy domains, such as grid worlds. However, there is no comparable benchmark for cooperative multi-agent RL. As a result, most papers in this field use one-off toy problems, making it difficult to measure real progress. In this paper, we propose the StarCraft Multi-Agent Challenge (SMAC) as a benchmark problem to fill this gap. SMAC is based on the popular real-time strategy game StarCraft II and focuses on micromanagement challenges where each unit is controlled by an independent agent that must act based on local observations. We offer a diverse set of challenge maps and recommendations for best practices in benchmarking and evaluations. We also open-source a deep multi-agent RL learning framework including state-of-the-art algorithms. We believe that SMAC can provide a standard benchmark environment for years to come. Videos of our best agents for several SMAC scenarios are available online.
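For orientation, the environment exposes a decentralized loop in which each agent acts on local observations from its currently available actions; the minimal random-agent episode below closely follows the project's README (requires a StarCraft II installation and the smac package).

from smac.env import StarCraft2Env
import numpy as np

env = StarCraft2Env(map_name="8m")            # 8 Marines vs 8 Marines scenario
env_info = env.get_env_info()
n_agents = env_info["n_agents"]

env.reset()
terminated, episode_reward = False, 0.0
while not terminated:
    obs = env.get_obs()                       # per-agent local observations
    state = env.get_state()                   # global state (centralized training)
    actions = []
    for agent_id in range(n_agents):
        # Each agent may only choose among its currently available actions.
        avail = np.nonzero(env.get_avail_agent_actions(agent_id))[0]
        actions.append(np.random.choice(avail))
    reward, terminated, _ = env.step(actions)
    episode_reward += reward
env.close()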
@inproceedings{samvelyan19smac,author={Samvelyan, Mikayel and Rashid, Tabish and Schroeder de Witt, Christian and Farquhar, Gregory and Nardelli, Nantas and Rudner, Tim G. J. and Hung, Chia-Man and Torr, Philip H. S. and Foerster, Jakob and Whiteson, Shimon},title={{T}he {S}tar{C}raft {M}ulti-{A}gent {C}hallenge},booktitle={Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems},booktitle_show={International Conference on Autonomous Agents and MultiAgent Systems},year={2019},}
2018
On the Connection between Neural Processes and Gaussian Processes with Deep Kernels
Tim G. J. Rudner, Vincent Fortuin, Yee Whye Teh, and Yarin Gal
NeurIPS 2018 Workshop on Bayesian Deep Learning, 2018
@inproceedings{rudner2018npsasgps,author={Rudner, Tim G. J. and Fortuin, Vincent and Teh, Yee Whye and Gal, Yarin},title={{O}n the {C}onnection between {N}eural {P}rocesses and {G}aussian {P}rocesses with {D}eep {K}ernels},booktitle={NeurIPS 2018 Workshop on Bayesian Deep Learning},booktitle_show={NeurIPS 2018 Workshop on Bayesian Deep Learning},year={2018},}