See the bottom of this page for some scaffolding code to get you started; fill in the appropriate places to make the algorithm work. Alternatively, there is a complete implementation in the GitHub repository.
Recall the German tank problem presented in lecture. Use the following captured serial numbers:
s = c(147, 126, 183, 88, 9, 203, 16, 10, 112, 205)
Your goal is to estimate a single parameter, \(N\), the highest possible serial number (indicating the number of tanks actually produced).
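As a point of reference for everything below, the classical frequentist estimate for this problem (the sample maximum scaled up by the average gap between serials) is quick to compute with the `s` defined above:

```r
k = length(s)
max(s) * (1 + 1/k) - 1  # classical point estimate: about 224.5
```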
Combine a and b above into a Stan statement for the `model` block. It will look something like `s ~ ...`

Fit the model using the `optimizing` function. What's the MAP estimate?
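A minimal sketch of these two steps, assuming a continuous \(N\) with a flat prior (your prior from part b may differ; `tank_code` and `mod` are illustrative names):

```r
library(rstan)

tank_code = "
data {
  int<lower=1> k;          // number of captured tanks
  vector<lower=0>[k] s;    // observed serial numbers
}
parameters {
  real<lower=max(s)> N;    // N cannot be below the largest observed serial
}
model {
  s ~ uniform(0, N);       // likelihood; add your prior from part b here
}
"
mod = stan_model(model_code = tank_code)

# MAP estimate via optimisation
map = optimizing(mod, data = list(k = length(s), s = s))
map$par
```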
This model uses the `vector` datatype, which we haven't seen yet. Look it up in the Stan manual to see if you can understand how and where to use it. Use `sampling()` to get 5000 samples from the posterior distribution. Alternatively, if you finished part 1 above, try this with your own sampler.
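Continuing the sketch above (with rstan's default warmup of half the iterations, 4 chains of 2500 iterations leave 4 × 1250 = 5000 post-warmup draws):

```r
fit = sampling(mod, data = list(k = length(s), s = s),
               chains = 4, iter = 2500)  # 5000 post-warmup draws in total
print(fit)
```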
Plot the chains with `mcmc_trace()` and `mcmc_hist()` from the bayesplot package (or implement your own versions). You might need `as.array` or `as.matrix` to convert the samples from Stan into something bayesplot can use. Compare the histogram of samples to the posterior density plot you made in 2d.
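A sketch of the diagnostics, assuming the `fit` object from the sampling step:

```r
library(bayesplot)

draws = as.array(fit)         # iterations x chains x parameters
mcmc_trace(draws, pars = "N") # trace plot: chains should mix and overlap
mcmc_hist(draws, pars = "N")  # histogram of the posterior samples
```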
You have probably produced a model in 2 that treats \(N\) as a continuous variable, resulting in estimates that say something like “1457.3 tanks were produced.” This is of course impossible: \(N\) and \(s\) are both discrete. Can you design a model that respects this constraint? How do the results differ?
#' Simple single-parameter metropolis algorithm
#' @param target Target function (returning log unnormalised posterior density);
#' this function should take the parameter as its first argument and a data list as its second
#' @param initial Initial value of the parameter
#' @param data Data to pass to the target
#' @param iter Number of iterations
#' @param scale Scale for the proposal distribution; defaults to 1
#'
#' @return A list with three components: 'chain' is the Markov chain, 'scale'
#' is the scale parameter used, and 'accept' is the acceptance rate
metropolis = function(target, initial, data, iter = 5000, scale = 1) {
  ##### OPTIONAL
  ## Here you can run an adaptation phase to set the scale.
  ## The steps are a repeat of everything below, with one addition:
  ## if you accept the proposal, set scale = scale * 1.1 (or some other constant);
  ## if you reject, set scale = scale / 1.1.
  ## At the end of adaptation, discard the chain; you can't use those samples.

  # Set up the Markov chain: preallocate a vector to hold its states
  chain = numeric(iter)
  # Keep track of how many proposals we accept;
  # the acceptance rate is an important diagnostic
  accept = 0
  # the first step in the chain gets the initial value
  chain[1] = initial
  for (i in 2:iter) {
    ## STEPS FOR THE ALGORITHM
    ## 1. Generate a proposal for chain[i].
    ##    The proposal should be drawn from a proposal distribution centred on
    ##    chain[i-1], with the scale determining how wide the distribution is.
    ##
    ## 2. Compute the acceptance probability of the proposal.
    ##    Remember that this is the ratio of probabilities from the target
    ##    distribution: target(proposal, data) / target(chain[i-1], data).
    ##    If your target returns a log probability (it should), you need to
    ##    convert from the log scale back to the probability scale.
    ##
    ## 3. Do a Bernoulli trial: on a success, accept the proposal; on a failure, reject it.
    ##
    ## 4. Save the result. If you accepted, chain[i] gets the proposal;
    ##    if not, chain[i] is the same as chain[i-1]. Don't forget to track acceptances.
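    ## One possible implementation of steps 1-4 (a sketch, assuming a Gaussian
    ## random-walk proposal; try writing your own before reading this):
    proposal = rnorm(1, mean = chain[i-1], sd = scale)
    # exp() converts the log-posterior difference into the acceptance ratio
    r = exp(target(proposal, data) - target(chain[i-1], data))
    if (runif(1) < r) {  # Bernoulli trial: accept with probability min(1, r)
      chain[i] = proposal
      accept = accept + 1
    } else {
      chain[i] = chain[i-1]
    }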
  }
  return(list(chain = chain, accept = accept / iter, scale = scale))
}
log_posterior = function(params, data) {
  ## Fill in the log posterior for the problem you are working on here
}
# fill in initial values, data, and a guess at the scale
## fit = metropolis(log_posterior, initial = , data = , scale = )
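## For example, a hypothetical log posterior for this problem, assuming a
## continuous N with a flat prior ('tank_log_posterior' is an illustrative
## name; your prior from part b may differ):
tank_log_posterior = function(params, data) {
  N = params
  if (N < max(data$s)) return(-Inf)  # impossible: N below the largest observed serial
  -length(data$s) * log(N)           # log of the uniform likelihood, up to a constant
}
s = c(147, 126, 183, 88, 9, 203, 16, 10, 112, 205)
fit = metropolis(tank_log_posterior, initial = max(s), data = list(s = s), scale = 20)
fit$accept  # a moderate acceptance rate suggests the proposal scale is reasonable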