MCMC is a great framework for sampling from the posterior distributions of complex statistical models. To sample from these distributions, the Markov chains need to reach stationarity. Reaching this condition is often computationally intensive, and (as far as I know) there is no way to speed up this stage by adding independent chains (which can be parallelised).
Once stationarity is reached, however, we can sample from independent chains simultaneously using threads, multicore processors, or even independent computers. One way to do this is shown in the following example:
import numpy as np
from pystan import StanModel
from pystan.constants import MAX_UINT

# bernoulli model
model_code = """
data {
    int<lower=0> N;
    int<lower=0,upper=1> y[N];
}
parameters {
    real<lower=0,upper=1> theta;
}
model {
    theta ~ beta(0.5, 0.5);  // Jeffreys' prior
    for (n in 1:N)
        y[n] ~ bernoulli(theta);
}
"""
data = dict(N=10, y=[0, 1, 0, 1, 0, 1, 0, 1, 1, 1])
sm = StanModel(model_code=model_code)

# initial seed can also be chosen by the user
# MAX_UINT = 2147483647
seed = np.random.randint(0, MAX_UINT, size=1)
fit = sm.sampling(data=data, seed=seed)

# reuse tuned parameters
stepsize = fit.get_stepsize()
# by default .get_inv_metric returns a list
inv_metric = fit.get_inv_metric(as_dict=True)
init = fit.get_last_position()

# increment seed by 1
seed2 = seed + 1

control = {"stepsize": stepsize,
           "inv_metric": inv_metric,
           "adapt_engaged": False}
fit2 = sm.sampling(data=data,
                   warmup=0,
                   iter=1000,
                   control=control,
                   init=init,
                   seed=seed2)
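To resume a chain on a different computer, the tuned quantities can be serialized to disk and restored later. Below is a minimal sketch using `pickle` with stand-in values; in a real run, `stepsize`, `inv_metric`, `init`, and the seed would come from the fitted model as shown above:

```python
import pickle

# Stand-in values; in practice these come from
# fit.get_stepsize(), fit.get_inv_metric(as_dict=True),
# fit.get_last_position() and the chosen seed.
state = {
    "stepsize": [0.8],
    "inv_metric": {0: [1.2]},
    "init": [{"theta": 0.6}],
    "seed": 12345,
}

# save the tuned state to disk
with open("chain_state.pkl", "wb") as f:
    pickle.dump(state, f)

# later, possibly on another computer: restore and resume
with open("chain_state.pkl", "rb") as f:
    restored = pickle.load(f)
```

Note that in PyStan 2 a fit object can only be unpickled when its compiled model is available, so the `StanModel` is usually pickled alongside (or recompiled from the same `model_code`).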
This example was taken from here.
Stan supports independent chains in parallel out of the box, so why not use that? Each chain needs to warm up independently though (probably a good thing).
Hi Peter,
Indeed, Stan supports running parallel chains on multicore (threaded) processors, and that is very useful. However, this does not stop you from also saving and resuming MCMC chains for future use.
I can share with you the motivation behind this post.
I was fitting a model to a large volume of data. This made sampling slow, and it was impossible to run the thousands of iterations I wanted. To get around this, I obtained access to a high-performance computer (HPC), but I could only use it for a limited time.
What I did was run the burn-in (warmup) phase on the HPC until I considered the chains had reached stationarity. After that, I migrated the model (as described above) and ran it on my own computer to sample the actual posterior.
Another example is Gaussian processes. Suppose you fit the model at time t and then, days later, you want to make some predictions conditional on the fitted observations.
You can load the model and sample from the conditional distribution without refitting the model (which is usually expensive). Moreover, you can share this “premodel” with other users, and they can make their own predictions.
I suppose there are other cases where saving and resuming chains and models is useful.
Thanks a lot for your comment.
Hi there,
I’ve been facing a similar challenge: checking whether my model has converged and, if so, returning the results to the user, or otherwise performing some more sampling steps. This also means that, as you pointed out, the user can resume the chain(s) later on a different machine, which is also what I wanted, and a pretty cool feature to have.
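For reference, a rough sketch of the kind of convergence check I have in mind (split R-hat over the posterior draws), assuming the draws are arranged as a (chains, iterations) NumPy array; the function name and threshold are illustrative, not from any particular library:

```python
import numpy as np

def split_rhat(draws):
    """Split R-hat for a single parameter.

    draws: array of shape (n_chains, n_iter). Each chain is
    split in half, so 2 * n_chains sequences enter the statistic.
    """
    n_chains, n_iter = draws.shape
    half = n_iter // 2
    # split each chain into two consecutive halves
    chains = draws[:, :2 * half].reshape(n_chains * 2, half)
    n = chains.shape[1]
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)        # between-sequence variance
    W = chains.var(axis=1, ddof=1).mean()  # within-sequence variance
    var_hat = (n - 1) / n * W + B / n      # pooled variance estimate
    return float(np.sqrt(var_hat / W))

# well-mixed chains: all drawn from the same distribution
rng = np.random.default_rng(0)
draws = rng.normal(size=(4, 1000))
print(split_rhat(draws))  # close to 1.0 -> looks converged
```

If the value is close to 1 (say, below 1.01) I return the results; otherwise I resume the chains for more iterations.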
However, this guide seems to use the now-deprecated PyStan 2, which appears to have more functionality than the current PyStan 3.
How should one approach this problem in pystan3 to achieve similar behavior?
Also, even though PyStan 2 is no longer supported, and considering all of its additional features, are there any current or potential future issues arising from using this package?
Thanks in advance,
José Ferreira
Hola José,
Thanks a lot for reaching out and sorry for the long delay in my response.
You’re right. At the time of writing this post I was working with PyStan 2.x, with no idea that PyStan 3 was on its way.
Honestly, I haven’t checked this feature in PyStan 3. Indeed, I’ve found it quite annoying to migrate my old code to the new implementation.
I’ll let you know of any advances on this matter (via this blog).
Cheers,
Juan