My research focuses on resource-efficient methods for large-scale probabilistic machine learning. Much of my work views numerical algorithms through the lens of probabilistic inference. This perspective enables the acceleration of learning algorithms via an explicit trade-off between computational efficiency and predictive precision.
I received my PhD in Computer Science from the University of Tübingen, where I was advised by Philipp Hennig, and I was an IMPRS-IS fellow at the Max Planck Institute for Intelligent Systems.
Computation-Aware Gaussian Processes: Model Selection and Linear-Time Inference
Jonathan Wenger, Kaiwen Wu, Philipp Hennig, Jacob R. Gardner, Geoff Pleiss, and John P. Cunningham
In Advances in Neural Information Processing Systems (NeurIPS), 2024
Model selection in Gaussian processes scales prohibitively with the size of the training dataset, both in time and memory. While many approximations exist, all incur inevitable approximation error. Recent work accounts for this error in the form of computational uncertainty, which enables – at the cost of quadratic complexity – an explicit trade-off between computation and precision. Here we extend this development to model selection, which requires significant enhancements to the existing approach, including linear-time scaling in the size of the dataset. We propose a novel training loss for hyperparameter optimization and demonstrate empirically that the resulting method can outperform SGPR, CGGP and SVGP, state-of-the-art methods for GP model selection, on medium to large-scale datasets. Our experiments show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU. As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty – a fundamental prerequisite for optimal decision-making.
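For context, "model selection" here refers to maximizing the log marginal likelihood with respect to kernel and noise hyperparameters. Below is a minimal sketch of the standard exact-GP version of that loop in GPyTorch, with placeholder data and model names; it is not the computation-aware method from the paper, but it shows the objective whose cubic per-step cost motivates it.

# Minimal sketch of standard GP model selection (hyperparameter optimization) in GPyTorch.
# This is NOT the paper's computation-aware method; it only illustrates the exact
# marginal-likelihood objective, whose O(n^3) per-step cost is the bottleneck being addressed.
import torch
import gpytorch

class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )

# Placeholder training data (any (n, d) inputs and (n,) targets would do).
train_x = torch.linspace(0, 1, 500).unsqueeze(-1)
train_y = torch.sin(train_x * 6.0).squeeze() + 0.1 * torch.randn(500)

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGPModel(train_x, train_y, likelihood)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

model.train()
likelihood.train()
for _ in range(100):
    optimizer.zero_grad()
    # Exact log marginal likelihood: requires solving with the n x n kernel matrix each step.
    loss = -mll(model(train_x), train_y)
    loss.backward()
    optimizer.step()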
@inproceedings{Wenger2024ComputationAwareGaussian,
  title     = {Computation-{Aware} {Gaussian} {Processes}: {Model} {Selection} {And} {Linear}-{Time} {Inference}},
  author    = {Wenger, Jonathan and Wu, Kaiwen and Hennig, Philipp and Gardner, Jacob R. and Pleiss, Geoff and Cunningham, John P.},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  year      = {2024},
  doi       = {10.48550/arXiv.2411.01036},
}
Posterior and Computational Uncertainty in Gaussian Processes
Jonathan Wenger, Geoff Pleiss, Marvin Pförtner, Philipp Hennig, and John P. Cunningham
In Advances in Neural Information Processing Systems (NeurIPS), 2022
Gaussian processes scale prohibitively with the size of the dataset. In response, many approximation methods have been developed, which inevitably introduce approximation error. This additional source of uncertainty, due to limited computation, is entirely ignored when using the approximate posterior. Therefore in practice, GP models are often as much about the approximation method as they are about the data. Here, we develop a new class of methods that provides consistent estimation of the combined uncertainty arising from both the finite number of data observed and the finite amount of computation expended. The most common GP approximations map to an instance in this class, such as methods based on the Cholesky factorization, conjugate gradients, and inducing points. For any method in this class, we prove (i) convergence of its posterior mean in the associated RKHS, (ii) decomposability of its combined posterior covariance into mathematical and computational covariances, and (iii) that the combined variance is a tight worst-case bound for the squared error between the method’s posterior mean and the latent function. Finally, we empirically demonstrate the consequences of ignoring computational uncertainty and show how implicitly modeling it improves generalization performance on benchmark datasets.
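To make claim (ii) concrete, the display below sketches the decomposition using notation introduced here rather than taken verbatim from the paper: K̂ = K + σ²I denotes the regularized kernel matrix of the training data X, and C_i ≈ K̂⁻¹ is the low-rank approximation of its inverse that the chosen approximation method has constructed after i units of computation (e.g. i conjugate gradient steps or i inducing points).

\[
  \underbrace{k(x, x') - k(x, X)\, C_i\, k(X, x')}_{\text{combined covariance}}
  = \underbrace{k(x, x') - k(x, X)\, \hat{K}^{-1} k(X, x')}_{\text{mathematical covariance}}
  + \underbrace{k(x, X)\, \bigl(\hat{K}^{-1} - C_i\bigr)\, k(X, x')}_{\text{computational covariance}}
\]

The identity follows by adding and subtracting k(x, X) K̂⁻¹ k(X, x'): the mathematical term is the exact GP posterior covariance, and the computational term shrinks to zero as C_i approaches K̂⁻¹, i.e. as more computation is expended.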
@inproceedings{Wenger2022PosteriorComputational,
  title     = {Posterior and Computational Uncertainty in {Gaussian} Processes},
  author    = {Wenger, Jonathan and Pleiss, Geoff and Pf{\"o}rtner, Marvin and Hennig, Philipp and Cunningham, John P.},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  year      = {2022},
  doi       = {10.48550/arXiv.2205.15449},
}
Non-Parametric Calibration for Classification
Jonathan Wenger, Hedvig Kjellström, and Rudolph Triebel
In International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
Many applications of classification methods not only require high accuracy but also reliable estimation of predictive uncertainty. However, while many current classification frameworks, in particular deep neural networks, achieve high accuracy, they tend to incorrectly estimate uncertainty. In this paper, we propose a method that adjusts the confidence estimates of a general classifier such that they approach the probability of classifying correctly. In contrast to existing approaches, our calibration method employs a non-parametric representation using a latent Gaussian process, and is specifically designed for multi-class classification. It can be applied to any classifier that outputs confidence estimates and is not limited to neural networks. We also provide a theoretical analysis regarding the over- and underconfidence of a classifier and its relationship to calibration, as well as an empirical outlook for calibrated active learning. In experiments we show the universally strong performance of our method across different classifiers and benchmark data sets, in particular for state-of-the-art neural network architectures.
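As a point of reference for what "reliable estimation of predictive uncertainty" means quantitatively, the sketch below computes the expected calibration error (ECE), a standard diagnostic for the miscalibration this paper addresses. It is not the paper's latent-GP calibration method, and the function name and 15-bin default are illustrative choices only.

# Minimal sketch of expected calibration error (ECE): the average gap between a classifier's
# confidence and its accuracy, taken over equally spaced confidence bins. This is a standard
# diagnostic, not the calibration method proposed in the paper.
import numpy as np

def expected_calibration_error(confidences, predictions, labels, n_bins=15):
    confidences = np.asarray(confidences, dtype=float)
    correct = (np.asarray(predictions) == np.asarray(labels)).astype(float)
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # Weight each bin's |accuracy - confidence| gap by the fraction of points it holds.
            ece += in_bin.mean() * abs(correct[in_bin].mean() - confidences[in_bin].mean())
    return ece

# Example: three test points, each with a predicted class, its confidence, and the true label.
print(expected_calibration_error([0.9, 0.8, 0.6], [1, 0, 2], [1, 0, 1]))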
@inproceedings{Wenger2020NonParametricCalibration,
  title     = {Non-{Parametric} {Calibration} for {Classification}},
  author    = {Wenger, Jonathan and Kjellstr{\"o}m, Hedvig and Triebel, Rudolph},
  booktitle = {International Conference on Artificial Intelligence and Statistics (AISTATS)},
  year      = {2020},
  doi       = {10.48550/arXiv.1906.04933},
}