1 code implementation • NeurIPS 2021 • Guy Lorberbom, Daniel D. Johnson, Chris J. Maddison, Daniel Tarlow, Tamir Hazan

To perform counterfactual reasoning in Structural Causal Models (SCMs), one needs to know the causal mechanisms, which provide factorizations of conditional distributions into noise sources and deterministic functions mapping realizations of noise to samples.

no code implementations • NeurIPS Workshop ICBINB 2021 • Wouter Kool, Chris J. Maddison, Andriy Mnih

Training large-scale mixture of experts models efficiently on modern hardware requires assigning datapoints in a batch to different experts, each with a limited capacity.
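
To make the assignment problem concrete, here is a minimal sketch of a naive greedy baseline: each datapoint goes to its highest-scoring expert that still has room. This is not the method of the paper above; the function name, the score matrix, and the choice to return `None` for dropped datapoints are all assumptions made for illustration.

```python
def greedy_capacity_assignment(scores, capacity):
    """Assign each datapoint to one expert without exceeding expert capacity.

    scores[i][j] is a (hypothetical) affinity of datapoint i for expert j.
    Returns a list with the chosen expert index per datapoint, or None when
    every expert is already full.
    """
    num_experts = len(scores[0])
    load = [0] * num_experts
    assignment = []
    for row in scores:
        # Rank experts from highest to lowest score for this datapoint.
        ranked = sorted(range(num_experts), key=lambda j: row[j], reverse=True)
        # Take the best-ranked expert that still has spare capacity.
        choice = next((j for j in ranked if load[j] < capacity), None)
        if choice is not None:
            load[choice] += 1
        assignment.append(choice)
    return assignment


scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]
print(greedy_capacity_assignment(scores, capacity=2))
```

A greedy pass like this depends on the order of the batch and can leave later datapoints with poor experts, which is one reason more careful balanced-assignment schemes are of interest.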

1 code implementation • NeurIPS 2021 • Yann Dubois, Benjamin Bloem-Reddy, Karen Ullrich, Chris J. Maddison

Most data is automatically collected and only ever "seen" by algorithms.

Ranked #1 on Image Compression on ImageNet (using extra training data)

no code implementations • 28 May 2021 • Xuechen Li, Chris J. Maddison, Daniel Tarlow

Source code spends most of its time in a broken or incomplete state during software development.

1 code implementation • ICLR Workshop Neural_Compression 2021 • Yangjun Ruan, Karen Ullrich, Daniel Severo, James Townsend, Ashish Khisti, Arnaud Doucet, Alireza Makhzani, Chris J. Maddison

Naively applied, our schemes would require more initial bits than the standard bits-back coder, but we show how to drastically reduce this additional cost with couplings in the latent space.

1 code implementation • 8 Feb 2021 • Will Grathwohl, Kevin Swersky, Milad Hashemi, David Duvenaud, Chris J. Maddison

We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables.

4 code implementations • ICLR 2021 • Max B. Paulus, Chris J. Maddison, Andreas Krause

Gradient estimation in models with discrete latent variables is a challenging problem, because the simplest unbiased estimators tend to have high variance.

no code implementations • 7 Jul 2020 • Pashootan Vaezipoor, Gil Lederman, Yuhuai Wu, Chris J. Maddison, Roger Grosse, Edward Lee, Sanjit A. Seshia, Fahiem Bacchus

Propositional model counting, or #SAT, is the problem of computing the number of satisfying assignments of a Boolean formula. Many discrete probabilistic inference problems can be translated into model counting problems and solved by #SAT solvers.
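
As a small illustration of what model counting computes, the sketch below counts satisfying assignments of a CNF formula by brute-force enumeration. It is exponential in the number of variables and is not how practical #SAT solvers work; the DIMACS-style literal encoding and the function name are assumptions chosen for the example.

```python
from itertools import product


def count_models(clauses, num_vars):
    """Count satisfying assignments of a CNF formula by enumeration.

    clauses is a list of clauses; each clause is a list of non-zero ints,
    where k means "variable k is true" and -k means "variable k is false".
    """
    count = 0
    for assignment in product([False, True], repeat=num_vars):
        # A clause is satisfied if at least one of its literals holds.
        if all(
            any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            count += 1
    return count


# (x1 OR x2) AND (NOT x1 OR x3) over three variables
formula = [[1, 2], [-1, 3]]
print(count_models(formula, 3))  # prints 4
```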

1 code implementation • NeurIPS 2020 • Max B. Paulus, Dami Choi, Daniel Tarlow, Andreas Krause, Chris J. Maddison

The Gumbel-Max trick is the basis of many relaxed gradient estimators.
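
For readers unfamiliar with the trick: adding independent Gumbel(0, 1) noise to unnormalized log-probabilities and taking the argmax yields an exact sample from the corresponding categorical distribution. The sketch below uses only the Python standard library; the function names are illustrative and are not taken from any of the listed implementations.

```python
import math
import random


def sample_gumbel(rng):
    """Draw a standard Gumbel(0, 1) variate as -log(-log(U)), U ~ Uniform(0, 1)."""
    u = rng.random()
    # rng.random() can return exactly 0.0, where log is undefined; redraw.
    while u == 0.0:
        u = rng.random()
    return -math.log(-math.log(u))


def gumbel_max_sample(log_weights, rng):
    """Return an index i with probability proportional to exp(log_weights[i])."""
    perturbed = [lw + sample_gumbel(rng) for lw in log_weights]
    return max(range(len(perturbed)), key=perturbed.__getitem__)


rng = random.Random(0)
probs = [0.1, 0.2, 0.7]
log_weights = [math.log(p) for p in probs]
draws = [gumbel_max_sample(log_weights, rng) for _ in range(20000)]
freqs = [draws.count(i) / len(draws) for i in range(len(probs))]
print(freqs)  # empirical frequencies, expected to be close to probs
```

Because the argmax is not differentiable, relaxed estimators typically replace it with a smooth surrogate such as a temperature-controlled softmax.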

no code implementations • 11 Oct 2019 • Dami Choi, Christopher J. Shallue, Zachary Nado, Jaehoon Lee, Chris J. Maddison, George E. Dahl

In particular, we find that the popular adaptive gradient methods never underperform momentum or gradient descent.

no code implementations • NeurIPS 2020 • Guy Lorberbom, Chris J. Maddison, Nicolas Heess, Tamir Hazan, Daniel Tarlow

A main benefit of DirPG algorithms is that they allow the insertion of domain knowledge in the form of upper bounds on return-to-go at training time, as is done in heuristic search, while still directly computing a policy gradient.

3 code implementations • NeurIPS 2019 • Emile Mathieu, Charline Le Lan, Chris J. Maddison, Ryota Tomioka, Yee Whye Teh

We therefore endow VAEs with a Poincaré ball model of hyperbolic geometry as a latent space and rigorously derive the necessary methods to work with two main Gaussian generalisations on that space.

3 code implementations • ICLR 2019 • George Tucker, Dieterich Lawson, Shixiang Gu, Chris J. Maddison

Burda et al. (2015) introduced a multi-sample variational bound, IWAE, that is at least as tight as the standard variational lower bound and becomes increasingly tight as the number of samples increases.
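
For reference, the multi-sample bound can be written as follows, using the usual notation of a generative model p(x, z) and an inference network q(z | x) with K samples; the symbols are the standard ones and are not defined in the excerpt above.

```latex
\mathcal{L}_K(x)
  = \mathbb{E}_{z_1, \dots, z_K \sim q(z \mid x)}
    \left[ \log \frac{1}{K} \sum_{k=1}^{K} \frac{p(x, z_k)}{q(z_k \mid x)} \right],
\qquad
\mathcal{L}_1(x) \le \mathcal{L}_K(x) \le \mathcal{L}_{K+1}(x) \le \log p(x).
```

Here the K = 1 case is the standard variational lower bound (ELBO), which matches the statement that the bound is at least as tight and tightens as K grows.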

4 code implementations • 13 Sep 2018 • Chris J. Maddison, Daniel Paulin, Yee Whye Teh, Brendan O'Donoghue, Arnaud Doucet

Yet, crucially, the kinetic gradient map can be designed to incorporate information about the convex conjugate in a fashion that allows for linear convergence on convex functions that may be non-smooth or non-strongly convex.

16 code implementations • ICML 2018 • Marta Garnelo, Dan Rosenbaum, Chris J. Maddison, Tiago Ramalho, David Saxton, Murray Shanahan, Yee Whye Teh, Danilo J. Rezende, S. M. Ali Eslami

Deep neural networks excel at function approximation, yet they are typically trained from scratch for each new function.

1 code implementation • ICML 2018 • Tom Rainforth, Adam R. Kosiorek, Tuan Anh Le, Chris J. Maddison, Maximilian Igl, Frank Wood, Yee Whye Teh

We provide theoretical and empirical evidence that using tighter evidence lower bounds (ELBOs) can be detrimental to the process of learning an inference network by reducing the signal-to-noise ratio of the gradient estimator.

3 code implementations • NeurIPS 2017 • Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Mohammad Norouzi, Andriy Mnih, Arnaud Doucet, Yee Whye Teh

When used as a surrogate objective for maximum likelihood estimation in latent variable models, the evidence lower bound (ELBO) produces state-of-the-art results.

3 code implementations • NeurIPS 2017 • George Tucker, Andriy Mnih, Chris J. Maddison, Dieterich Lawson, Jascha Sohl-Dickstein

Learning in models with discrete latent variables is challenging due to high variance gradient estimators.

no code implementations • 16 Mar 2017 • Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Arnaud Doucet, Andriy Mnih, Yee Whye Teh

The policy gradients of the expected return objective can react slowly to rare rewards.

4 code implementations • 2 Nov 2016 • Chris J. Maddison, Andriy Mnih, Yee Whye Teh

The essence of the trick is to refactor each stochastic node into a differentiable function of its parameters and a random variable with fixed distribution.
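
A minimal sketch of this refactoring for the simplest continuous case, a Gaussian node, may help: the sample is written as z = mu + sigma * eps with eps drawn from a fixed N(0, 1), so derivatives with respect to mu and sigma pass through z. The Gaussian case, the test function f(z) = z^2, and all names below are assumptions chosen for illustration; the paper itself concerns discrete nodes.

```python
import random


def reparameterized_gaussian(mu, sigma, eps):
    """Deterministic, differentiable map from parameters and fixed noise to a sample."""
    return mu + sigma * eps


def pathwise_grad_of_square(mu, sigma, eps):
    """Single-sample gradient of f(z) = z**2 with respect to (mu, sigma)."""
    z = reparameterized_gaussian(mu, sigma, eps)
    df_dz = 2.0 * z
    # Chain rule through z = mu + sigma * eps: dz/dmu = 1, dz/dsigma = eps.
    return df_dz * 1.0, df_dz * eps


rng = random.Random(0)
mu, sigma, n = 1.0, 0.5, 100000
total_mu = total_sigma = 0.0
for _ in range(n):
    g_mu, g_sigma = pathwise_grad_of_square(mu, sigma, rng.gauss(0.0, 1.0))
    total_mu += g_mu
    total_sigma += g_sigma

# E[z**2] = mu**2 + sigma**2, so the exact gradient is (2 * mu, 2 * sigma).
print(total_mu / n, total_sigma / n)
```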

1 code implementation • 20 Dec 2014 • Chris J. Maddison, Aja Huang, Ilya Sutskever, David Silver

The game of Go is more challenging than other board games, due to the difficulty of constructing a position or move evaluation function.

no code implementations • NeurIPS 2014 • Chris J. Maddison, Daniel Tarlow, Tom Minka

The problem of drawing samples from a discrete distribution can be converted into a discrete optimization problem.

no code implementations • 2 Jan 2014 • Chris J. Maddison, Daniel Tarlow

We study the problem of building generative models of natural source code (NSC); that is, source code written and understood by humans.

no code implementations • NeurIPS 2013 • Roger B. Grosse, Chris J. Maddison, Ruslan R. Salakhutdinov

Many powerful Monte Carlo techniques for estimating partition functions, such as annealed importance sampling (AIS), are based on sampling from a sequence of intermediate distributions which interpolate between a tractable initial distribution and an intractable target distribution.
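
As a concrete instance of such a sequence, the most common choice in AIS is the geometric-averages path, stated here as background in standard notation (the symbols f_0, f_K and the schedule β_k are not defined in the excerpt above):

```latex
f_k(x) = f_0(x)^{1 - \beta_k} \, f_K(x)^{\beta_k},
\qquad
0 = \beta_0 < \beta_1 < \dots < \beta_K = 1,
\qquad
p_k(x) = \frac{f_k(x)}{Z_k},
```

where f_0 is the unnormalized tractable initial density, f_K is the unnormalized target, and the quantity to estimate is the ratio Z_K / Z_0.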

Papers With Code is a free resource with all data licensed under CC-BY-SA.