Abstract: The Bayesian treatment of neural networks dictates that a prior distribution is specified over their weight and bias parameters. This poses a challenge because modern neural networks are characterized by a large number of parameters and nonlinearities, and the choice of these priors has an unpredictable effect on the distribution of the functions that these models can represent. In contrast, Gaussian processes offer a rigorous nonparametric framework for defining prior distributions over the space of functions. In this talk, I will introduce a novel and robust framework for imposing such functional priors on modern neural networks by minimizing the Wasserstein distance between samples of stochastic processes. I will then show experiments demonstrating that coupling these priors with scalable Markov chain Monte Carlo sampling offers systematically large performance improvements over alternative choices of priors and state-of-the-art approximate Bayesian deep learning approaches.
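To make the idea concrete, below is a minimal sketch (not the speaker's implementation) of matching a Bayesian neural network prior to a Gaussian process prior. It replaces the full Wasserstein objective over stochastic processes with a simple sample-based proxy, the average 1-D Wasserstein distance between the two priors' marginals at a grid of inputs, and tunes a single prior hyperparameter by grid search rather than gradient-based optimization. The names `sigma_w`, `bnn_prior_samples`, and `marginal_w1` are hypothetical and chosen for illustration.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

# Input locations at which the two priors are compared.
x = np.linspace(-3, 3, 50)

# Target: samples from a zero-mean GP prior with an RBF kernel.
def rbf_kernel(x, lengthscale=1.0, variance=1.0):
    d = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

K = rbf_kernel(x) + 1e-8 * np.eye(len(x))
gp_samples = rng.multivariate_normal(np.zeros(len(x)), K, size=500)

# Candidate: function samples from a one-hidden-layer BNN prior;
# sigma_w is the weight/bias prior std we want to tune.
def bnn_prior_samples(sigma_w, n_samples=500, width=100):
    W1 = rng.normal(0, sigma_w, (n_samples, width, 1))
    b1 = rng.normal(0, sigma_w, (n_samples, width, 1))
    W2 = rng.normal(0, sigma_w / np.sqrt(width), (n_samples, 1, width))
    h = np.tanh(W1 @ x[None, None, :] + b1)  # hidden activations, (n, width, 50)
    return (W2 @ h)[:, 0, :]                 # function values, (n, 50)

# Proxy objective: average 1-D Wasserstein distance between the
# marginals of the two priors at each input location.
def marginal_w1(samples_a, samples_b):
    return np.mean([wasserstein_distance(samples_a[:, i], samples_b[:, i])
                    for i in range(samples_a.shape[1])])

# Crude grid search over the prior std in place of gradient-based tuning.
grid = np.linspace(0.1, 3.0, 30)
best = min(grid, key=lambda s: marginal_w1(bnn_prior_samples(s), gp_samples))
print(f"selected prior std: {best:.2f}")
```

In the framework presented in the talk, the distance is defined between the stochastic processes themselves and the prior is tuned by optimization; the selected prior would then serve as the starting point for posterior inference, e.g. with scalable MCMC.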
Speaker: Maurizio Filippone is an associate professor and AXA Chair of Computational Statistics at EURECOM, France. He has over 70 publications on Bayesian statistics and inference for Gaussian processes and neural networks.
Place of Seminar: Otaniemi, T3 (Zoom)