Lauren Talluto
03.02.2026
Dataset: Palmer penguins

Dataset: Dr. Kristen Gorman, University of Alaska (Gorman et al 2014)
R package
palmerpenguinsArtwork by @allison_horst

Dataset: Palmer penguins

Dataset: Dr. Kristen Gorman, University of Alaska (Gorman et al 2014)
R package
palmerpenguinsArtwork by @allison_horst
Dataset: Palmer penguins

Dataset: Dr. Kristen Gorman, University of Alaska (Gorman et al 2014)
R package
palmerpenguinsArtwork by @allison_horst
data(penguins, package = "palmerpenguins")
penguins = as.data.frame(penguins[complete.cases(penguins),])
penguins = subset(penguins, species == "Gentoo")
peng_dat = with(penguins, list(
x = bill_length_mm,
y = bill_depth_mm,
n = length(bill_length_mm)
))
# refresh = 0 just turns off the status display to keep the slides neat
# you can omit it in your own code!
peng_fit = sampling(penguin_lm, data = peng_dat, refresh = 0)# samples are a bit easier to deal with in a matrix
peng_samps = as.matrix(peng_fit)
head(peng_samps)
## parameters
## iterations intercept slope s lp__
## [1,] 4.327070 0.2240388 0.6671229 -26.35499
## [2,] 4.417360 0.2207002 0.6792690 -26.56142
## [3,] 4.893192 0.2111910 0.6604217 -26.76978
## [4,] 4.902920 0.2113316 0.6599114 -26.57599
## [5,] 3.651149 0.2364696 0.8081511 -26.70807
## [6,] 5.829585 0.1946857 0.7548675 -25.50562summary(lm(...))tab = data.frame(
mean = colMeans(peng_samps[, 1:3]),
median = apply(peng_samps[, 1:3], 2, median),
## need optimizing to get the posterior mode
mode = optimizing(penguin_lm, data = peng_dat)$par)| mean | median | mode | |
|---|---|---|---|
| intercept | 5.16935 | 5.15127 | 5.12090 |
| slope | 0.20658 | 0.20708 | 0.20761 |
| s | 0.75635 | 0.75220 | 0.74274 |
| intercept | slope | s | |
|---|---|---|---|
| 5.15 | 0.21 | 0.75 |

If \(\hat{\theta}\) is a set of samples approximating the posterior distribution of \(\theta\), and if some desired variable \(Z = f(\theta)\), then \(f(\hat{\theta})\) approximates the posterior distribution of \(Z\)
| intercept | slope | s | |
|---|---|---|---|
| 5.15 | 0.21 | 0.75 |

If \(\hat{\theta}\) is a set of samples approximating the posterior distribution of \(\theta\), and if some desired variable \(Z = f(\theta)\), then \(f(\hat{\theta})\) approximates the posterior distribution of \(Z\)
| intercept | slope | s | |
|---|---|---|---|
| 5.15 | 0.21 | 0.75 |

If \(\hat{\theta}\) is a set of samples approximating the posterior distribution of \(\theta\), and if some desired variable \(Z = f(\theta)\), then \(f(\hat{\theta})\) approximates the posterior distribution of \(Z\)
| intercept | slope | s | |
|---|---|---|---|
| 5.15 | 0.21 | 0.75 |


lm# predict from many x, one sample
pry = function(samp, x) samp[1] + samp[2] * x
test_x = seq(40, 60, length.out = 200)
# test_x is a vector, so result is a matrix
# each row in this matrix is the prediction for a single x
# each column a single sample
test_y = apply(peng_samps, 1, pry, x = test_x)
colnames(test_y) = paste0("samp", 1:nrow(peng_samps))
rownames(test_y) = paste0("bill_len=", round(test_x, 2))
test_y[1:3, 1:6]
## iterations
## samp1 samp2 samp3 samp4 samp5 samp6
## bill_len=40 13.28862 13.24537 13.34083 13.35618 13.10993 13.61701
## bill_len=40.1 13.31114 13.26755 13.36206 13.37742 13.13370 13.63658
## bill_len=40.2 13.33365 13.28973 13.38328 13.39866 13.15747 13.65614lmlm
# from a single sample, generate a new prediction dataset from xhat
sim = function(samp, xhat) {
eta = samp[1] + samp[2] * xhat
rnorm(length(xhat), eta, samp[3])
}
test_x = seq(40, 60, length.out = 200)
pr_test_y = matrix(0, ncol = nrow(peng_samps), nrow = length(test_x))
# for clarity, using a for loop. could (should) do this instead with mapply
for(i in 1:nrow(peng_samps))
pr_test_y[,i] = sim(peng_samps[i,], test_x)
# now get quantiles for each value in x
pr_int_y = apply(pr_test_y, 1, quantile, c(0.05, 0.95))