Bayesian data analysis is cool, Markov chain Monte Carlo is the cool technique that makes Bayesian data analysis possible, and wouldn’t it be coolness if you could do all of this in the browser? That was what I thought, at least, and I’ve now made bayes.js: A small JavaScript library that implements an adaptive MCMC sampler and a couple of probability distributions, and that makes it relatively easy to implement simple Bayesian models in JavaScript.
Here is a motivating example: Say that you have the heights of the last ten American presidents…
// The heights of the last ten American presidents in cm, from Kennedy to Obama
var heights = [183, 192, 182, 183, 177, 185, 188, 188, 182, 185];
… and that you would like to fit a Bayesian model assuming a Normal distribution to this data. Well, you can do that right now by clicking “Start sampling” below! This will run an MCMC sampler in your browser implemented in JavaScript.
If this doesn’t seem to work in your browser, for some reason, then try this version of the demo.
Here is the model you just sampled from…
$$\mu \sim \text{Normal}(0, 1000) \\ \sigma \sim \text{Uniform}(0, 1000) \\ \text{heights}_i \sim \text{Normal}(\mu, \sigma) ~~~ \text{for} ~ i ~ \text{in} 1..n $$
… and this is how it is implemented in JavaScript:
/* The code below assumes that you have loaded the two modules of bayes.js:
* - mcmc.js which implements the sampler and creates the global
* object mcmc.
* - distributions.js which implements a number of log density functions
* for common probability distributions and that creates the global object
* ld (as in log density).
*/
// The data
var heights = [183, 192, 182, 183, 177, 185, 188, 188, 182, 185];
// Parameter definitions
var params = {
mu: {type: "real"},
sigma: {type: "real", lower: 0}};
// Model definition
var log_post = function(state, heights) {
var log_post = 0;
// Priors (here sloppy and vague...)
log_post += ld.norm(state.mu, 0, 1000);
log_post += ld.unif(state.sigma, 0, 1000);
// Likelihood
for(var i = 0; i < heights.length; i++) {
log_post += ld.norm(heights[i], state.mu, state.sigma);
}
return log_post;
};
// Initializing the sampler, burning some draws to the MCMC gods,
// and generating a sample of size 1000.
var sampler = new mcmc.AmwgSampler(params, log_post, heights);
sampler.burn(1000);
var samples = sampler.sample(1000);
I’ve implemented a JavaScript MCMC procedure for fitting a Bayesian model before, but that was just for a specific model (I also implemented a MCMC procedure in BASIC, but don’t ask me why…). The idea with bayes.js is to make it easier for me (and maybe for you) to make demos of Bayesian procedures that are easy to put online. If you would like to know more about bayes.js just head over to it’s GitHub page where you will find the code and a README file full of details. You can also check out a couple of interactive demos that I’ve made:
- A Normal distribution
- A Bernouli distribution with a Beta prior.
- A Bernouli model with a spike-n-slab prior
- An analysis of capture-recapture data
- Correlation analysis / bivariate Normal model
- Simple linear regression
- Multivariate logistic regression
- the “Pump” example from the BUGS project
-
An hierarchical model with varying slope and intercept from Gelman and Hill (2006) (This is stretching the limits of what the simple
AmwgSampler
can do…)
These demos rely on the plotly library and I haven’t tested them extensively on different platforms/browsers. You should be able to change the data and model definition on the fly (but if you change some stuff, like adding multidimensional variables, the plotting might stop working).
What’s included in bayes.js?
The two major files in bayes.js are:
- mcmc.js - Implements a MCMC framework which can be used to fit Bayesian model with both discrete and continuous parameters. Currently the only algorithm that is implemented is a version of the adaptive Metropolis within Gibbs (
AmwgSampler
) algorithm presented by Roberts and Rosenthal (2009) . Loading this file in the browser creates the global objectmcmc
. - distributions.js - A collection of log density functions that can be used to construct Bayesian models. Follows the naming scheme
ld.*
(for example,ld.norm
andld.pois
) and uses the same parameters as thed*
density functions in R. Loading this file in the browser creates the global objectld
.
In addition to this the whole thing is wrapped in an Rstudio project as I’ve use R and JAGS to write some tests.
FAQ
- When is a JavaScript MCMC sampler useful?
- Well, for starters, it’s not particularly useful if you want to do serious Bayesian data analysis. Then you should use a serious tool like JAGS or STAN. It could, however, be useful if you would want to put a demo of a Bayesian model online, but don’t want to / can’t run the computations on a server. It could also be useful as a part of a JavaScript application making use of Bayesian computation at some point.
- How good is the sampler that bayes.js uses?
- bayes.js implements the adaptive Metropolis within Gibbs described by Roberts and Rosenthal (2009) which is a good algorithm in that (1) it’s adaptive and works out-of-the-box without you having to set a lot of tuning parameters, (2) it can handle both continuous and discrete parameters, (3) it is easy to implement. The downside with the sampler is that (1) it only works well with a small number of parameters, (2) it’s a Gibbs sampler so it’s going to struggle with correlated parameters.
- What is “a small number of parameters”?
- Depends heavily on context but <10 is probably “small” here. But depending on the model you’re trying to run you might get away with 100+ parameters.
- How fast is it?
- Also super context dependent. On simple models it’s pretty fast, for example, fitting a standard Normal model on 1000 datapoints producing a sample of 20,000 draws takes ~0.5 s. on my computer. Also, when I’ve been playing around with different browsers I’ve seen order-of-magnitude changes in performance when changing seemingly arbitrary things. For example, inlining the definition of the Normal density in the function calculating the log posterior rather than using
ld.norm
defined in distributions.js resulted in 10x slower sampling on Firefox 37.
- Also super context dependent. On simple models it’s pretty fast, for example, fitting a standard Normal model on 1000 datapoints producing a sample of 20,000 draws takes ~0.5 s. on my computer. Also, when I’ve been playing around with different browsers I’ve seen order-of-magnitude changes in performance when changing seemingly arbitrary things. For example, inlining the definition of the Normal density in the function calculating the log posterior rather than using
References
Roberts, G. O., & Rosenthal, J. S. (2009). Examples of adaptive MCMC. Journal of Computational and Graphical Statistics, 18(2), 349-367. pdf