The goal of my research is to create trustworthy machine learning models with a focus on developing methods and theoretical insights that improve the reliability, safety, and transparency of machine learning systems deployed in safety-critical settings.
In pursuit of this goal, my research uses probabilistic methods to improve reliable uncertainty quantification [1,2,3], robustness to distribution shifts [1,2,3], interpretability [1,2], and sequential decision-making [1,2,3], with an emphasis on problems in healthcare [1,2,3].
Bio: I am a Data Science Faculty Fellow at New York University. Before joining New York University, I conducted PhD research on probabilistic machine learning in the Department of Computer Science at the University of Oxford, where I was advised by Yarin Gal and Yee Whye Teh. For my work on safe decision-making under uncertainty, I received the 2021 Qualcomm Innovation Fellowship. I care deeply about equitable access to education and was an Equality, Diversity & Inclusion Fellow at the University of Oxford. For further details, please see my CV.
I am also an AI Fellow at Georgetown’s Center for Security & Emerging Technology and a Rhodes Scholar.
Mentoring: I was the first in my family to attend college, and I know that navigating higher education can be challenging for first-generation low-income students. If you identify as a first-generation low-income student and are looking for mentorship, please feel free to get in touch using this form.
News
May '24
I published three papers at ICML [1,2,3], one paper at ICLR [1], and one paper (Oral) at AISTATS [1]!
Fine-tuning off-the-shelf pre-trained neural networks has become the default starting point for a wide range of challenging prediction tasks—especially in computer vision and natural language processing, where pre-trained models trained on millions or even billions of data points are publicly available and can be fine-tuned with a moderate compute budget. However, while fine-tuned models have been shown to significantly improve predictive performance in several respects compared to models trained from scratch, they can exhibit poor calibration and fail to reliably identify challenging distribution shifts. In this paper, we improve uncertainty quantification in fine-tuned models by constructing an uncertainty-aware fine-tuning prior and deriving a tractable variational objective for inference. The prior assigns high probability density to parameters that induce predictive functions with high uncertainty on data points that are meaningfully different from the data used for fine-tuning. We evaluate models trained with this prior on different transfer learning tasks and show that fine-tuning with uncertainty-aware priors significantly improves calibration, selective prediction, and semantic shift detection on computer vision and natural language classification tasks.
@inproceedings{rudner2024uap,title={Uncertainty-Aware Priors for Fine-Tuning Pre-trained Vision and Language Models},author={Rudner, Tim G. J. and Pan, Xiang and Li, Yucen Lily and Shwartz-Ziv, Ravid and Wilson, Andrew Gordon},booktitle={Preprint},year={2024},}
Function-Space Regularization in Neural Networks: A Probabilistic Perspective
Tim G. J. Rudner, Sanyam Kapoor, Shikai Qiu, and Andrew Gordon Wilson
International Conference on Machine Learning(ICML), 2023
Parameter-space regularization in neural network optimization is a fundamental tool for improving generalization. However, standard parameter-space regularization methods make it challenging to encode explicit preferences about desired predictive functions into neural network training. In this work, we approach regularization in neural networks from a probabilistic perspective and show that by viewing parameter-space regularization as specifying an empirical prior distribution over the model parameters, we can derive a probabilistically well-motivated regularization technique that allows explicitly encoding information about desired predictive functions into neural network training. This method—which we refer to as function-space empirical Bayes (FS-EB)—includes both parameter- and function-space regularization, is mathematically simple, easy to implement, and incurs only minimal computational overhead compared to standard regularization techniques. We evaluate the utility of this regularization technique empirically and demonstrate that the proposed method leads to near-perfect semantic shift detection, highly-calibrated predictive uncertainty estimates, successful task adaption from pre-trained models, and improved generalization under covariate shift
@inproceedings{rudner2023fseb,title={{F}unction-{S}pace {R}egularization in {N}eural {N}etworks: {A} {P}robabilistic {P}erspective},author={Rudner, Tim G. J. and Kapoor, Sanyam and Qiu, Shikai and Wilson, Andrew Gordon},booktitle={Proceedings of the 40th International Conference on Machine Learning},year={2023},series={Proceedings of Machine Learning Research},publisher={PMLR},}
Tractable Function-Space Variational Inference in Bayesian Neural Networks
Tim G. J. Rudner, Zonghao Chen, Yee Whye Teh, and Yarin Gal
Advances in Neural Information Processing Systems(NeurIPS), 2022
Reliable predictive uncertainty estimation plays an important role in enabling the deployment of neural networks to safety-critical settings. A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters, infer an approximate posterior distribution, and use it to make stochastic predictions. However, explicit inference over neural network parameters makes it difficult to incorporate meaningful prior information about the data-generating process into the model. In this paper, we pursue an alternative approach. Recognizing that the primary object of interest in most settings is the distribution over functions induced by the posterior distribution over neural network parameters, we frame Bayesian inference in neural networks explicitly as inferring a posterior distribution over functions and propose a scalable function-space variational inference method that allows incorporating prior information and results in reliable predictive uncertainty estimates. We show that the proposed method leads to state-of-the-art uncertainty estimation and predictive performance on a range of prediction tasks and demonstrate that it performs well on a challenging safety-critical medical diagnosis task in which reliable uncertainty estimation is essential.
@inproceedings{rudner2022fsvi,title={{T}ractable {F}unction-{S}pace {V}ariational {I}nference in {B}ayesian {N}eural {N}etworks},author={Rudner, Tim G. J. and Chen, Zonghao and Teh, Yee Whye and Gal, Yarin},booktitle={Advances in Neural Information Processing Systems 35},year={2022},}
Robustness to Distribution Shifts
Mind the GAP: Improving Robustness to Subpopulation Shifts with Group-Aware Priors
Tim G. J. Rudner, Ya Shi Zhang, Andrew Gordon Wilson, and Julia Kempe
International Conference on Artificial Intelligence and Statistics(AISTATS), 2024
Machine learning models often perform poorly under subpopulation shifts in the data distribution. Developing methods that allow machine learning models to better generalize to such shifts is crucial for safe deployment in real-world settings. In this paper, we develop a family of group-aware prior (GAP) distributions over neural network parameters that explicitly favor models that generalize well under subpopulation shifts. We design a simple group-aware prior that only requires access to a small set of data with group information and demonstrate that training with this prior yields state-of-the-art performance—even when only retraining the final layer of a previously trained non-robust model. Group aware-priors are conceptually simple, complementary to existing approaches, such as attribute pseudo labeling and data reweighting, and open up promising new avenues for harnessing Bayesian inference to enable robustness to subpopulation shifts.
@inproceedings{rudner2024gap,title={Mind the GAP: Improving Robustness to Subpopulation Shifts with Group-Aware Priors},author={Rudner, Tim G. J. and Zhang, Ya Shi and Wilson, Andrew Gordon and Kempe, Julia},booktitle={Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},year={2024},}
Plex: Towards Reliability Using Pretrained Large Model Extensions
Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek, and Balaji Lakshminarayanan
ICML Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward, 2022
A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models’ abilities in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive performance but also performs well consistently over many decision-making tasks involving uncertainty (e.g., selective prediction, open set recognition), robust generalization (e.g., accuracy and proper scoring rules such as log-likelihood on in- and out-of-distribution datasets), and adaptation (e.g., active learning, few-shot uncertainty). We devise 10 types of tasks over 40 datasets in order to evaluate different aspects of reliability on both vision and language domains. To improve reliability, we developed ViT-Plex and T5-Plex, pretrained large model extensions for vision and language modalities, respectively. Plex greatly improves the state-of-the-art across reliability tasks, and simplifies the traditional protocol as it improves the out-of-the-box performance and does not require designing scores or tuning the model for each task. We demonstrate scaling effects over model sizes up to 1B parameters and pretraining dataset sizes up to 4B examples. We also demonstrate Plex’s capabilities on challenging tasks including zero-shot open set recognition, active learning, and uncertainty in conversational language understanding.
@inproceedings{tran2022plex,author={Tran, Dustin and Liu, Jeremiah and Dusenberry, Michael W. and Phan, Du and Collier, Mark and Ren, Jie and Han, Kehang and Wang, Zi and Mariet, Zelda and Hu, Huiyi and Band, Neil and Rudner, Tim G. J. and Singhal, Karan and Nado, Zachary and van Amersfoort, Joost and Kirsch, Andreas and Jenatton, Rodolphe and Thain, Nithum and Yuan, Honglin and Buchanan, Kelly and Murphy, Kevin and Sculley, D. and Gal, Yarin and Ghahramani, Zoubin and Snoek, Jasper and Lakshminarayanan, Balaji},title={{P}lex: {T}owards {R}eliability {U}sing {P}retrained {L}arge {M}odel {E}xtensions},year={2022},booktitle={ICML Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward},}
Benchmarking Bayesian Deep Learning on Diabetic Retinopathy Detection Tasks
Neil Band*, Tim G. J. Rudner*, Qixuan Feng, Angelos Filos, Zachary Nado, Michael W. Dusenberry, Ghassen Jerfel, Dustin Tran, and Yarin Gal
Advances in Neural Information Processing Systems(NeurIPS), 2021
Bayesian deep learning seeks to equip deep neural networks with the ability to precisely quantify their predictive uncertainty, and has promised to make deep learning more reliable for safety-critical real-world applications. Yet, existing Bayesian deep learning methods fall short of this promise; new methods continue to be evaluated on unrealistic test beds that do not reflect the complexities of downstream real-world tasks that would benefit most from reliable uncertainty quantification. We propose the RETINA Benchmark, a set of real-world tasks that accurately reflect such complexities and are designed to assess the reliability of predictive models in safety-critical scenarios. Specifically, we curate two publicly available datasets of high-resolution human retina images exhibiting varying degrees of diabetic retinopathy, a medical condition that can lead to blindness, and use them to design a suite of automated diagnosis tasks that require reliable predictive uncertainty quantification. We use these tasks to benchmark well-established and state-of-the-art Bayesian deep learning methods on task-specific evaluation metrics. We provide an easy-to-use codebase for fast and easy benchmarking following reproducibility and software design principles. We provide implementations of all methods included in the benchmark as well as results computed over 100 TPU days, 20 GPU days, 400 hyperparameter configurations, and evaluation on at least 6 random seeds each.
@inproceedings{band2021benchmarking,title={{B}enchmarking {B}ayesian {D}eep {L}earning {o}n {D}iabetic {R}etinopathy {D}etection {T}asks},author={Band, Neil and Rudner, Tim G. J. and Feng, Qixuan and Filos, Angelos and Nado, Zachary and Dusenberry, Michael W. and Jerfel, Ghassen and Tran, Dustin and Gal, Yarin},booktitle={Advances in Neural Information Processing Systems 34},year={2021},}
Interpretability
Should We Learn Most Likely Functions or Parameters?
Shikai Qiu*, Tim G. J. Rudner*, Sanyam Kapoor, and Andrew Gordon Wilson
Advances in Neural Information Processing Systems(NeurIPS), 2023
Standard regularized training procedures correspond to maximizing a posterior distribution over parameters, known as maximum a posteriori (MAP) estimation. However, model parameters are of interest only insomuch as they combine with the functional form of a model to provide a function that can make good predictions. Moreover, the most likely parameters under the parameter posterior do not generally correspond to the most likely function induced by the parameter posterior. In fact, we can re-parametrize a model such that any setting of parameters can maximize the parameter posterior. As an alternative, we investigate the benefits and drawbacks of directly estimating the most likely function implied by the model and the data. We show that this procedure leads to pathological solutions when using neural networks and prove conditions under which the procedure is well-behaved, as well as a scalable approximation. Under these conditions, we find that function-space MAP estimation can lead to flatter minima, better generalization, and improved robustness to overfitting.
@inproceedings{rudner2023fsmap,title={Should We Learn Most Likely Functions or Parameters?},author={Qiu, Shikai and Rudner, Tim G. J. and Kapoor, Sanyam and Wilson, Andrew Gordon},booktitle={Advances in Neural Information Processing Systems 36},year={2023},}
Visual Explanations of Image-Text Representations via Multi-Modal Information Bottleneck Attribution
Ying Wang*, Tim G. J. Rudner*, and Andrew Gordon Wilson
Advances in Neural Information Processing Systems(NeurIPS), 2023
Vision-language pretrained models have seen remarkable success, but their application to safety-critical settings is limited by their lack of interpretability. To improve the interpretability of vision-language models such as CLIP, we propose a multi-modal information bottleneck (M2IB) approach that learns latent representations that compress irrelevant information while preserving relevant visual and textual features. We demonstrate how M2IB can be applied to attribution analysis of vision-language pretrained models, increasing attribution accuracy and improving the interpretability of such models when applied to safety-critical domains such as healthcare. Crucially, unlike commonly used unimodal attribution methods, M2IB does not require ground truth labels, making it possible to audit representations of vision-language pretrained models when multiple modalities but no ground truth data is available. Using CLIP as an example, we demonstrate the effectiveness of M2IB attribution and show that it outperforms gradient-based, perturbation-based, and attention-based attribution methods both qualitatively and quantitatively.
@inproceedings{wang2023m2ib,title={Visual Explanations of Image-Text Representations via Multi-Modal Information Bottleneck Attribution},author={Wang, Ying and Rudner, Tim G. J. and Wilson, Andrew Gordon},booktitle={Advances in Neural Information Processing Systems 36},year={2023},}
Probabilistic Sequential Decision-Making
Continual Learning via Sequential Function-Space Variational Inference
Tim G. J. Rudner, Freddie Bickford Smith, Qixuan Feng, Yee Whye Teh, and Yarin Gal
International Conference on Machine Learning(ICML), 2022
Sequential Bayesian inference over predictive functions is a natural framework for continual learning from streams of data. However, applying it to neural networks has proved challenging in practice. Addressing the drawbacks of existing techniques, we propose an optimization objective derived by formulating continual learning as sequential function-space variational inference. In contrast to existing methods that regularize neural network parameters directly, this objective allows parameters to vary widely during training, enabling better adaptation to new tasks. Compared to objectives that directly regularize neural network predictions, the proposed objective allows for more flexible variational distributions and more effective regularization. We demonstrate that, across a range of task sequences, neural networks trained via sequential function-space variational inference achieve better predictive accuracy than networks trained with related methods while depending less on maintaining a set of representative points from previous tasks.
@inproceedings{rudner2022sfsvi,author={Rudner, Tim G. J. and Smith, Freddie Bickford and Feng, Qixuan and Teh, Yee Whye and Gal, Yarin},title={{C}ontinual {L}earning via {S}equential {F}unction-{S}pace {V}ariational {I}nference},booktitle={Proceedings of the 39th International Conference on Machine Learning},year={2022},series={Proceedings of Machine Learning Research},publisher={PMLR},}
Outcome-Driven Reinforcement Learning via Variational Inference
Tim G. J. Rudner*, Vitchyr H. Pong*, Rowan McAllister, Yarin Gal, and Sergey Levine
Advances in Neural Information Processing Systems(NeurIPS), 2021
While reinforcement learning algorithms provide automated acquisition of optimal policies, practical application of such methods requires a number of design decisions, such as manually designing reward functions that not only define the task, but also provide sufficient shaping to accomplish it. In this paper, we view reinforcement learning as inferring policies that achieve desired outcomes, rather than as a problem of maximizing rewards. To solve this inference problem, we establish a novel variational inference formulation that allows us to derive a well-shaped reward function which can be learned directly from environment interactions. From the corresponding variational objective, we also derive a new probabilistic Bellman backup operator and use it to develop an off-policy algorithm to solve goal-directed tasks. We empirically demonstrate that this method eliminates the need to hand-craft reward functions for a suite of diverse manipulation and locomotion tasks and leads to effective goal-directed behaviors.
@inproceedings{rudner2021odrl,title={{O}utcome-{D}riven {R}einforcement {L}earning via {V}ariational {I}nference},author={Rudner, Tim G. J. and Pong, Vitchyr H. and McAllister, Rowan and Gal, Yarin and Levine, Sergey},booktitle={Advances in Neural Information Processing Systems 34},year={2021},}
On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations
Tim G. J. Rudner*, Cong Lu*, Michael A. Osborne, Yarin Gal, and Yee Whye Teh
Advances in Neural Information Processing Systems(NeurIPS), 2021
KL-regularized reinforcement learning from expert demonstrations has proved successful in improving the sample efficiency of deep reinforcement learning algorithms, allowing them to be applied to challenging physical real-world tasks. However, we show that KL-regularized reinforcement learning with behavioral reference policies derived from expert demonstrations can suffer from pathological training dynamics that can lead to slow, unstable, and suboptimal online learning. We show empirically that the pathology occurs for commonly chosen behavioral policy classes and demonstrate its impact on sample efficiency and online policy performance. Finally, we show that the pathology can be remedied by non-parametric behavioral reference policies and that this allows KL-regularized reinforcement learning to significantly outperform state-of-the-art approaches on a variety of challenging locomotion and dexterous hand manipulation tasks.
@inproceedings{rudner2021pathologies,title={{O}n {P}athologies in {KL}-{R}egularized {R}einforcement {L}earning from {E}xpert {D}emonstrations},author={Rudner, Tim G. J. and Lu, Cong and Osborne, Michael A. and Gal, Yarin and Teh, Yee Whye},booktitle={Advances in Neural Information Processing Systems 34},year={2021},}
Clinical Decision-Making and Drug Discovery
Domain-Aware Guidance for Out-of-Distribution Molecular and Protein Design
Leo Klarner, Tim G. J. Rudner, and Yee Whye Teh. Garrett M. Morris
International Conference on Machine Learning(ICML), 2024
Generative models have the potential to accelerate key steps in the discovery of novel molecular therapeutics and materials. Diffusion models have recently emerged as a powerful approach, excelling at unconditional sample generation and, with data-driven guidance, conditional generation within their training distribution. Reliably sampling from optimal regions beyond the training data, however, remains an open challenge—with current methods predominantly focusing on modifying the diffusion process itself. Here, we explore a different approach and present a simple plug-and-play regularization framework that leverages unlabeled data and smoothness constraints to improve the out-of-distribution generalization of guided diffusion models. Our method is probabilistically motivated and leads to substantial performance gains across various settings, including continuous, discrete, and graph-structured diffusion processes. We demonstrate significant improvements in performance for applications in chemistry, materials science, and protein design.
@inproceedings{klarner2024guided,title={Domain-Aware Guidance for Out-of-Distribution Molecular and Protein Design},author={Klarner, Leo and Rudner, Tim G. J. and Garrett M. Morris, Charlotte Deane, Yee Whye Teh.},booktitle={Proceedings of the 41th International Conference on Machine Learning},year={2024},series={Proceedings of Machine Learning Research},publisher={PMLR},}
Protein Design with Guided Discrete Diffusion
Nate Gruver, Samuel Stanton, Nathan C. Frey, Tim G. J. Rudner, Isidro Hotzel, Julien Lafrance-Vanasse, Arvind Rajpal, Kyunghyun Cho, and Andrew Gordon Wilson
Advances in Neural Information Processing Systems(NeurIPS), 2023
A popular approach to protein design is to combine a generative model with a discriminative model for conditional sampling. The generative model samples plausible sequences while the discriminative model guides a search for sequences with high fitness. Given its broad success in conditional sampling, classifier-guided diffusion modeling is a promising foundation for protein design, leading many to develop guided diffusion models for structure with inverse folding to recover sequences. In this work, we propose diffusioN Optimized Sampling (NOS), a guidance method for discrete diffusion models that follows gradients in the hidden states of the denoising network. NOS makes it possible to perform design directly in sequence space, circumventing significant limitations of structure-based methods, including scarce data and challenging inverse design. Moreover, we use NOS to generalize LaMBO, a Bayesian optimization procedure for sequence design that facilitates multiple objectives and edit-based constraints. The resulting method, LaMBO-2, enables discrete diffusions and stronger performance with limited edits through a novel application of saliency maps. We apply LaMBO-2 to a real-world protein design task, optimizing antibodies for higher expression yield and binding affinity to several therapeutic targets under locality and developability constraints, attaining a 99% expression rate and 40% binding rate in exploratory in vitro experiments.
@inproceedings{gruver2023nos,title={Protein Design with Guided Discrete Diffusion},author={Gruver, Nate and Stanton, Samuel and Frey, Nathan C. and Rudner, Tim G. J. and Hotzel, Isidro and Lafrance-Vanasse, Julien and Rajpal, Arvind and Cho, Kyunghyun and Wilson, Andrew Gordon},booktitle={Advances in Neural Information Processing Systems 36},year={2023},}
Informative Priors Improve the Reliability of Multimodal Clinical Data Classification
Julian Lechuga Lopez, Tim G. J. Rudner, and Farah Shamout
Machine Learning for Health Symposium Findings(ML4H), 2023
Machine learning-aided clinical decision support has the potential to significantly improve patient care. However, existing efforts in this domain for principled quantification of uncertainty have largely been limited to applications of ad-hoc solutions that do not consistently improve reliability. In this work, we consider stochastic neural networks and design a tailor-made multimodal data-driven (M2D2) prior distribution over network parameters. We use simple and scalable Gaussian mean-field variational inference to train a Bayesian neural network using the M2D2 prior. We train and evaluate the proposed approach using clinical time-series data in MIMIC-IV and corresponding chest X-ray images in MIMIC-CXR for the classification of acute care conditions. Our empirical results show that the proposed method produces a more reliable predictive model compared to deterministic and Bayesian neural network baselines.
@inproceedings{lechuga2023m2d2,title={Informative Priors Improve the Reliability of Multimodal Clinical Data Classification},author={Lopez, Julian Lechuga and Rudner, Tim G. J. and Shamout, Farah},booktitle={Machine Learning for Health Symposium Findings},year={2023},}
Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions
Leo Klarner, Tim G. J. Rudner, Michael Reutlinger, Torsten Schindler, Garrett M. Morris, Charlotte Deane, and Yee Whye Teh
International Conference on Machine Learning(ICML), 2023
Accelerating the discovery of novel and more effective therapeutics is an important pharmaceutical problem in which deep learning is playing an increasingly significant role. However, real-world drug discovery tasks are often characterized by a scarcity of labeled data and significant covariate shift—a setting that poses a challenge to standard deep learning methods. In this paper, we present Q-SAVI, a probabilistic model able to address these challenges by encoding explicit prior knowledge of the data-generating process into a prior distribution over functions, presenting researchers with a transparent and probabilistically principled way to encode data-driven modeling preferences. Building on a novel, gold-standard bioactivity dataset that facilitates a meaningful comparison of models in an extrapolative regime, we explore different approaches to induce data shift and construct a challenging evaluation setup. We then demonstrate that using Q-SAVI to integrate contextualized prior knowledge of drug-like chemical space into the modeling process affords substantial gains in predictive accuracy and calibration, outperforming a broad range of state-of-the-art self-supervised pre-training and domain adaptation techniques.
@inproceedings{klarner2023qsavi,title={{D}rug {D}iscovery {u}nder {C}ovariate {S}hift {w}ith {D}omain-{I}nformed {P}rior {D}istributions {o}ver {F}unctions},author={Klarner, Leo and Rudner, Tim G. J. and Reutlinger, Michael and Schindler, Torsten and Morris, Garrett M. and Deane, Charlotte and Teh, Yee Whye},booktitle={Proceedings of the 40th International Conference on Machine Learning},year={2023},series={Proceedings of Machine Learning Research},publisher={PMLR},}
Policy Reports & Issue Briefs
Key Concepts in AI Safety: Reliable Uncertainty Quantification in Machine Learning
@inproceedings{rudner2023uncertainty,author={Rudner, Tim G. J. and Toner, Helen},title={{K}ey {C}oncepts in {AI} {S}afety: {R}eliable {U}ncertainty {Q}uantification in {M}achine {L}earning},booktitle={CSET Issue Briefs (Forthcoming)},year={2024},}
OECD Framework for the Classification of AI systems
As artificial intelligence (AI) integrates all sectors at a rapid pace, different AI systems bring different benefits and risks. In comparing virtual assistants, self-driving vehicles and video recommendations for children, it is easy to see that the benefits and risks of each are very different. Their specificities will require different approaches to policy making and governance. To help policy makers, regulators, legislators and others characterise AI systems deployed in specific contexts, the OECD has developed a user-friendly tool to evaluate AI systems from a policy perspective. It can be applied to the widest range of AI systems across the following dimensions: People & Planet; Economic Context; Data & Input; AI Model; and Task & Output. Each of the framework’s dimensions has a subset of properties and attributes to define and assess policy implications and to guide an innovative and trustworthy approach to AI as outlined in the OECD AI Principles.
@inproceedings{oecd2022classification,author={{(as a contributing author)}, OECD},title={OECD Framework for the Classification of AI systems},booktitle={{OECD} {D}igital {E}conomy {P}apers},year={2022},number={323},doi={https://doi.org/https://doi.org/10.1787/cb6d9eca-en},}
Key Concepts in AI Safety: Specification in Machine Learning
This paper is the fourth installment in a series on "AI safety," an area of machine learning research that aims to identify causes of unintended behavior in machine learning systems and develop tools to ensure these systems work safely and reliably. The first paper in the series, “Key Concepts in AI Safety: An Overview,” outlined three categories of AI safety issues—problems of robustness, assurance, and specification—and the subsequent two papers described problems of robustness and assurance, respectively. This paper introduces specification as a key element in designing modern machine learning systems that operate as intended.
@inproceedings{rudner2021specification,author={Rudner, Tim G. J. and Toner, Helen},title={{K}ey {C}oncepts in {AI} {S}afety: {S}pecification in {M}achine {L}earning},booktitle={CSET Issue Briefs},year={2021},}
Key Concepts in AI Safety: Interpretability in Machine Learning
This paper is the third installment in a series on "AI safety," an area of machine learning research that aims to identify causes of unintended behavior in machine learning systems and develop tools to ensure these systems work safely and reliably. The first paper in the series, “Key Concepts in AI Safety: An Overview,” described three categories of AI safety issues: problems of robustness, assurance, and specification. This paper introduces interpretability as a means to enable assurance in modern machine learning systems.
@inproceedings{rudner2021interpretability,author={Rudner, Tim G. J. and Toner, Helen},title={{K}ey {C}oncepts in {AI} {S}afety: {I}nterpretability in {M}achine {L}earning},booktitle={CSET Issue Briefs},year={2021},}
Key Concepts in AI Safety: Robustness and Adversarial Examples
This paper is the second installment in a series on "AI safety," an area of machine learning research that aims to identify causes of unintended behavior in machine learning systems and develop tools to ensure these systems work safely and reliably. The first paper in the series, “Key Concepts in AI Safety: An Overview,” described three categories of AI safety issues: problems of robustness, assurance, and specification. This paper introduces adversarial examples, a major challenge to robustness in modern machine learning systems.
@inproceedings{rudner2021robustness,author={Rudner, Tim G. J. and Toner, Helen},title={{K}ey {C}oncepts in {AI} {S}afety: {R}obustness and {A}dversarial {E}xamples},booktitle={CSET Issue Briefs},year={2021},}
This paper is the first installment in a series on "AI safety," an area of machine learning research that aims to identify causes of unintended behavior in machine learning systems and develop tools to ensure these systems work safely and reliably. In it, the authors introduce three categories of AI safety issues: problems of robustness, assurance, and specification. Other papers in this series elaborate on these and further key concepts.
@inproceedings{rudner2021aisafety,author={Rudner, Tim G. J. and Toner, Helen},title={{K}ey {C}oncepts in {AI} {S}afety: {A}n {O}verview},booktitle={CSET Issue Briefs},year={2021},}