Markov Chain Monte Carlo

Lauren Talluto

26.11.2024

The German Tank Problem

  • During WW2, the Allies wanted to know the rate of tank production in German factories
  • Known: factories stamped serial numbers on the tanks in order, starting with 1
  • I have captured a single tank, with (for example) the serial number \(s = 200\).
  • How many tanks have been produced? \((N)\)

The German Tank Problem

Knowns:

  • Captured serial number \(s = 200\)
  • Number of tanks \(N \ge s\)
  • Assume tanks are captured completely at random, so all tanks have the same probability of capture
  • This sounds like a uniform distribution!
  • Minimum of the distribution is 1; the maximum will be \(N\)
  • Hypothesis 1: We got the biggest tank; \(s = N\)

The German Tank Problem

Hypothesis 2: \(N = 400\)

  • The data \(s\) have not changed, just the model
  • All numbers are still equally likely to be captured
    • But now there are twice as many possibilities
    • Each individual tank is half as likely to be captured

Hypothesis 3: \(N = 800\)

  • The same logic applies!

… and so on.

What is the MLE? Is there a logical problem with that?

\[ s \sim Uniform(min = 1, max = N)\]
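
We can see the answer by plotting the likelihood of \(s = 200\) as a function of \(N\) (a quick sketch; the upper limit of 1000 is arbitrary):

s = 200
N = s:1000
lik = dunif(s, min = 1, max = N)  ## pr(s|N): zero until N >= s, then 1/(N-1)
plot(N, lik, type = "l", ylab = "pr(s|N)")
## the likelihood is maximized at N = s: the MLE implies we happened
## to capture the very last tank produced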

Rejection sampling

  • There are general algorithms for sampling from unknown distributions
    • Here, the target distribution, \(t(N)\), is difficult to sample from

\[ t(N) = pr(s | N) = \mathrm{Uniform}(s | \mathrm{min} = 1, \mathrm{max} = N) \]

  • However, a uniform distribution with a fixed minimum and maximum is easy to sample from
    • We will call this the proposal distribution \(p(x)\)
    • We will rescale it so that it is always taller than \(t(x)\)

Rejection sampling

Algorithm

  1. Define the target distribution \(t(x)\)
  2. Define the proposal distribution \(p(x)\), rescaled by a constant \(M\) so that \(M p(x) \ge t(x)\) everywhere
  3. Draw a random sample \(y\) from \(p(x)\)
  4. Compute the acceptance probability \(r = \frac{t(y)}{M p(y)}\)
  5. Accept the sample with probability \(r\), reject with probability \(1-r\)
  6. Repeat until desired number of samples obtained

This is incredibly inefficient

s = 200

## target: likelihood of the captured serial s as a function of N
## dunif(s, 1, x) is 1/(x - 1) when x >= s, and 0 otherwise
target = function(x) dunif(s, 1, x)
proposal = function(x) dunif(x, 1, 1e6)

## this scale will make sure scale * p(x) >= t(x)
scale = target(s) / proposal(s)
## start at 2, because dunif(s, 1, 1) is undefined (NaN)
candidates = as.integer(runif(1e6, 2, 1e6))
r = target(candidates) / (proposal(candidates) * scale)

## uniform numbers between 0 and 1
test = runif(length(r))

## TRUE if r > test; WHY?
accept = r > test
samples = candidates[accept]
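
A quick check of the inefficiency (a sketch, reusing the objects above):

mean(accept)     ## acceptance rate: a tiny fraction of the 1e6 candidates
length(samples)  ## only this many usable draws survive
hist(samples)    ## approximate posterior for N, concentrated just above s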

Markov Chains

  • Markov chains are defined by a state vector \(S\)
    • In this case, the value of \(S\) represents some parameter
    • For a model with \(k\) parameters, \(S\) is a matrix with \(k\) columns
  • The model is stochastic and has a memory
    • Moves via random walk: \(S_{t+1} = f(S_t)\)
    • Chain must be irreducible: it must be possible to (eventually) reach any possible value from any other possible value
  • Markov models are commonly used to model stochastic processes happening in discrete time steps (e.g., population growth)

\[ S_t = S_{t-1} + \text{fecundity} \times S_{t-1} - \text{mortality} \times S_{t-1} \]
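
For example, a stochastic version of this growth model takes a few lines of R (the parameter values here are invented for illustration):

n_steps = 50
S = numeric(n_steps)
S[1] = 10                                  ## initial population size
fecundity = 0.5
mortality = 0.4
for (t in 2:n_steps) {
    births = rpois(1, fecundity * S[t-1])  ## stochastic recruitment
    deaths = rbinom(1, S[t-1], mortality)  ## stochastic mortality
    ## the next state depends only on the current state: a Markov chain
    S[t] = S[t-1] + births - deaths
}
plot(S, type = "l", xlab = "time step", ylab = "population size")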

Markov Chain Monte Carlo

  • MCMC applies Markov chains to increase the acceptance rate of rejection sampling
    • With rejection sampling, we jump randomly anywhere in the state space
    • With MCMC, we take small steps through the space, centred around the most recently accepted value (see the sketch after this list)
    • Target acceptance rates are around 20–50%
  • Individual samples are not independent
  • If run for long enough, the chain approximates the shape of the posterior
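
The difference in the proposal step, in code (a sketch; the scale of 10 is an arbitrary choice):

current = 250                                 ## most recently accepted value
candidate_rejection = runif(1, 1, 1e6)        ## rejection sampling: jump anywhere
candidate_mcmc = rnorm(1, mean = current, sd = 10)  ## MCMC: a small local step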

Metropolis-Hastings

  • For an unknown (unnormalized) target distribution \(t(x)\) where we can compute the (proportional) height
    • For example, a posterior distribution

\[ pr(\theta | X) \propto pr(X | \theta)pr(\theta) \]

  1. Choose a starting value \(S_0\)
  2. Propose a candidate \(S_{cand}\) by sampling from a proposal distribution centred around the current state \(S_t\)
    • Frequently, \(S_{cand} \sim \mathcal{N}(S_t, \sigma_p)\)
    • \(\sigma_p\) is the proposal scale (more on this later)
  3. Compute an acceptance probability \(r = \frac{t(S_{cand})}{t(S_t)}\)
    • Accept or reject as in rejection sampling
    • If the candidate is better (\(r > 1\)), always accept
  4. Continue until the state of the chain converges on the target distribution

Metropolis algorithm summary

Algorithm

Define t(x): the log unnormalized posterior (i.e., "target") distribution
Define p(x): the proposal distribution
    common: rnorm(n = 1, mean = x, sd = proposal_scale)
Choose state[0] (the starting value)
for i in 1:n_samples
   candidate = p(state[i-1])
   r = exp(t(candidate) - t(state[i-1]))  ## acceptance probability
   if r > runif(1)  ## coin flip to see if we accept or not
      state[i] = candidate
   else
      state[i] = state[i-1]
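
A runnable R version of this pseudocode, applied to the tank problem (a minimal sketch: the bounded flat prior on \(N\), the proposal scale of 100, and treating \(N\) as continuous are all simplifying assumptions):

s = 200
N_max = 1e6   ## same upper bound used by the rejection sampler

## log unnormalized posterior: flat prior on [s, N_max] plus log-likelihood
log_target = function(N) {
    if (N < s || N > N_max) return(-Inf)
    dunif(s, min = 1, max = N, log = TRUE)  ## log pr(s|N)
}

n_samples = 5000
proposal_scale = 100
state = numeric(n_samples)
state[1] = s  ## starting value

for (i in 2:n_samples) {
    ## random walk proposal centred on the current state
    candidate = rnorm(1, mean = state[i-1], sd = proposal_scale)
    ## acceptance probability on the log scale; impossible candidates get r = 0
    r = exp(log_target(candidate) - log_target(state[i-1]))
    if (r > runif(1)) {
        state[i] = candidate      ## accept
    } else {
        state[i] = state[i-1]     ## reject: repeat the current value
    }
}

hist(state[-(1:1000)])  ## discard burn-in, inspect the posterior for N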

Multivariate Metropolis-Hastings
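
The slides above describe the state as a matrix with \(k\) columns; a sketch of the simplest multivariate extension (all names here are illustrative) proposes a step in every dimension at once:

k = 2
state = c(0, 0)               ## current value of all k parameters
proposal_scale = c(0.5, 0.5)  ## one proposal scale per parameter
candidate = rnorm(k, mean = state, sd = proposal_scale)
## r is then computed from the joint (log) posterior, exactly as before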

Tuning Metropolis-Hastings
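
One common tuning heuristic (a sketch of an assumed approach, not the only one): during warmup, adjust \(\sigma_p\) until the acceptance rate falls in the 20–50% target range mentioned earlier.

tune = function(proposal_scale, accept_rate) {
    if (accept_rate < 0.2)
        proposal_scale * 0.8  ## rejecting too often: take smaller steps
    else if (accept_rate > 0.5)
        proposal_scale * 1.2  ## accepting too often: take bigger steps
    else
        proposal_scale        ## acceptance rate is in the target range
}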

Traceplots

library(bayesplot)
## load previously saved posterior samples
samples = readRDS("../assets/misc/traceplt_ex.rds")
## plot the sampled value of each parameter at every iteration
mcmc_trace(samples)
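
For a well-tuned, converged sampler, each trace should look like stationary noise (a "fuzzy caterpillar"), with all chains overlapping.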

MCMC in Stan