Julia Piaskowski
March 24, 2026
\[Y_i = \beta_0 + \beta_1 X_i + \epsilon_i\]
\(Y_i\) = dependent variable (there is only 1)
\(X_i\) = independent variable(s) (there may be many)
\(B_0\) = model intercept, the overall mean of \(Y\)
\(B_1\) = how \(Y\) changes with \(X\)
\(\epsilon_i\) = model residual, the gap between the predicted value for \(Y_i\) and its observed value
\[\hat{Y_{i}} = \beta_0 + \beta_i X\]
\[SE(Y_i) = \mathbf{X \Sigma X^T}\]
\(\mathbf{X}\) = design matrix for X.
\(\mathbf{\Sigma}\) = variance-covariance matrix for fixed effects, \(\mathbf{\beta}\)
[A confidence interval percentage] is the frequency with which other unobserved intervals will contain the true effect….if all the assumptions used to compute the intervals were correct.
– Greenland et al, 2016
The confidence level instead reflects the long-run reliability of the method used to generate the interval….if the same sampling procedure were repeated 100 times from the same population, approximately 95 of the resulting intervals would be expected to contain the true population mean. The frequentist approach sees the true population mean as a fixed unknown constant, while the confidence interval is calculated using data from a random sample.
…a p-value is the probability under a specified statistical model that a statistical summary of the data (e.g., the sample mean difference between two compared groups) would be equal to or more extreme than its observed value.
Note: for this and all other analyses in this presentation, the data and model inspected in previous lectures and found to meet linear model assumptions of normality, homoscedasticity and independence.
Type III Analysis of Variance Table with Satterthwaite's method
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
male 12.3538 4.1179 3 81 20.5460 5.393e-10 ***
female 6.4977 1.0829 6 81 5.4032 9.740e-05 ***
male:female 13.4609 0.7478 18 81 3.7312 2.266e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
male emmean SE df lower.CL upper.CL
M17 3.80 0.104 11.5 3.58 4.03
M19 2.91 0.104 11.5 2.68 3.14
M22 3.57 0.104 11.5 3.34 3.80
M58 3.31 0.104 11.5 3.08 3.54
Results are averaged over the levels of: female
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
female emmean SE df lower.CL upper.CL
F193 3.74 0.127 23.3 3.52 3.96
F195 3.28 0.127 23.3 3.06 3.50
F197 3.58 0.127 23.3 3.36 3.80
F201 3.15 0.127 23.3 2.93 3.37
F203 3.57 0.127 23.3 3.35 3.78
F204 3.01 0.127 23.3 2.79 3.23
F208 3.46 0.127 23.3 3.24 3.68
Results are averaged over the levels of: male
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.9
Cannot use mode = "kenward-roger" because *pbkrtest* package is not installed
NOTE: Results may be misleading due to involvement in interactions
female male emmean SE df lower.CL upper.CL
F197 M17 4.7225 0.2318164 74.77 4.260675 5.184325
F193 M22 4.2350 0.2318164 74.77 3.773175 4.696825
F203 M17 4.1525 0.2318164 74.77 3.690675 4.614325
F208 M17 3.9775 0.2318164 74.77 3.515675 4.439325
F203 M19 3.9750 0.2318164 74.77 3.513175 4.436825
F193 M17 3.9675 0.2318164 74.77 3.505675 4.429325
F201 M22 3.7525 0.2318164 74.77 3.290675 4.214325
F208 M22 3.6450 0.2318164 74.77 3.183175 4.106825
F193 M58 3.6300 0.2318164 74.77 3.168175 4.091825
F195 M17 3.6225 0.2318164 74.77 3.160675 4.084325
F195 M58 3.5700 0.2318164 74.77 3.108175 4.031825
F208 M58 3.5175 0.2318164 74.77 3.055675 3.979325
F197 M22 3.5150 0.2318164 74.77 3.053175 3.976825
F197 M58 3.4850 0.2318164 74.77 3.023175 3.946825
F204 M22 3.4125 0.2318164 74.77 2.950675 3.874325
F203 M22 3.2300 0.2318164 74.77 2.768175 3.691825
F204 M17 3.2050 0.2318164 74.77 2.743175 3.666825
F195 M22 3.2000 0.2318164 74.77 2.738175 3.661825
F193 M19 3.1225 0.2318164 74.77 2.660675 3.584325
F201 M58 3.1025 0.2318164 74.77 2.640675 3.564325
F201 M17 2.9825 0.2318164 74.77 2.520675 3.444325
F204 M58 2.9450 0.2318164 74.77 2.483175 3.406825
F203 M58 2.9075 0.2318164 74.77 2.445675 3.369325
F201 M19 2.7550 0.2318164 74.77 2.293175 3.216825
F195 M19 2.7375 0.2318164 74.77 2.275675 3.199325
F208 M19 2.6950 0.2318164 74.77 2.233175 3.156825
F197 M19 2.5975 0.2318164 74.77 2.135675 3.059325
F204 M19 2.4850 0.2318164 74.77 2.023175 2.946825
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
emmip() documentation
Pairwise is possible, but with 28 levels (7 female * 4 male parents) that will be very messy since that means 378 total contrasts. Note that functions for compact letter display, e.g. multcomp::cld(), are doing this in order to generate letters.
For the white pine example, there may be no need to conduct hypothesis tests, getting the estimates may be enough for parental selection.
data("bachmaier.nitrogen", package = "agridat")
bachmaier.nitrogen$nitro <- as.factor(bachmaier.nitrogen$nitro)
m2 <- lm(yield ~ nitro*zone, data = bachmaier.nitrogen)
(bach_emm1 <- emmeans(m2, ~ zone | nitro, at = list(nitro = c("0", "100", "200"))))nitro = 0:
zone emmean SE df lower.CL upper.CL
high 5.35 0.354 66 4.64 6.06
low 3.47 0.354 66 2.76 4.17
nitro = 100:
zone emmean SE df lower.CL upper.CL
high 8.68 0.354 66 7.97 9.39
low 6.91 0.354 66 6.20 7.62
nitro = 200:
zone emmean SE df lower.CL upper.CL
high 10.22 0.354 66 9.52 10.93
low 8.06 0.354 66 7.35 8.77
Confidence level used: 0.95
[1] "emmGrid"
attr(,"package")
[1] "emmeans"
nitro = 0:
contrast estimate SE df t.ratio p.value
high - low 1.88 0.501 66 3.753 0.0004
nitro = 100:
contrast estimate SE df t.ratio p.value
high - low 1.77 0.501 66 3.531 0.0008
nitro = 200:
contrast estimate SE df t.ratio p.value
high - low 2.16 0.501 66 4.312 <0.0001
| and *# pairs(emmeans(m2, ~ zone | nitro, at = list(nitro = c("0", "100", "200"))))
pairs(emmeans(m2, ~ zone*nitro, at = list(nitro = c("0", "100", "200")))) contrast estimate SE df t.ratio p.value
high nitro0 - low nitro0 1.882 0.501 66 3.753 0.0048
high nitro0 - high nitro100 -3.335 0.501 66 -6.652 <0.0001
high nitro0 - low nitro100 -1.565 0.501 66 -3.121 0.0307
high nitro0 - high nitro200 -4.876 0.501 66 -9.726 <0.0001
high nitro0 - low nitro200 -2.714 0.501 66 -5.414 <0.0001
low nitro0 - high nitro100 -5.216 0.501 66 -10.405 <0.0001
low nitro0 - low nitro100 -3.446 0.501 66 -6.875 <0.0001
low nitro0 - high nitro200 -6.757 0.501 66 -13.479 <0.0001
low nitro0 - low nitro200 -4.596 0.501 66 -9.167 <0.0001
high nitro100 - low nitro100 1.770 0.501 66 3.531 0.0095
high nitro100 - high nitro200 -1.541 0.501 66 -3.074 0.0348
high nitro100 - low nitro200 0.621 0.501 66 1.238 0.8164
low nitro100 - high nitro200 -3.311 0.501 66 -6.605 <0.0001
low nitro100 - low nitro200 -1.149 0.501 66 -2.292 0.2119
high nitro200 - low nitro200 2.162 0.501 66 4.312 0.0008
P value adjustment: tukey method for comparing a family of 6 estimates
zone = low:
nitro emmean SE df lower.CL upper.CL
0 3.47 0.354 66 2.76 4.17
80 6.27 0.354 66 5.57 6.98
100 6.91 0.354 66 6.20 7.62
120 7.37 0.354 66 6.66 8.08
140 6.93 0.354 66 6.23 7.64
160 7.81 0.354 66 7.10 8.52
180 7.13 0.354 66 6.42 7.84
200 8.06 0.354 66 7.35 8.77
220 7.92 0.354 66 7.21 8.62
240 8.28 0.354 66 7.57 8.99
260 8.06 0.354 66 7.35 8.77
Confidence level used: 0.95
zone = low:
contrast estimate SE df t.ratio p.value
nitro80 - nitro0 2.809 0.501 66 5.602 <0.0001
nitro100 - nitro80 0.638 0.501 66 1.272 0.8445
nitro120 - nitro100 0.457 0.501 66 0.911 0.9734
nitro140 - nitro120 -0.435 0.501 66 -0.867 0.9805
nitro160 - nitro140 0.877 0.501 66 1.749 0.5192
nitro180 - nitro160 -0.683 0.501 66 -1.362 0.7923
nitro200 - nitro180 0.933 0.501 66 1.861 0.4393
nitro220 - nitro200 -0.145 0.501 66 -0.289 1.0000
nitro240 - nitro220 0.362 0.501 66 0.722 0.9942
nitro260 - nitro240 -0.219 0.501 66 -0.436 0.9999
P value adjustment: mvt method for 10 tests
data(cochran.crd, package= "agridat")
cochran.crd$trt <- factor(cochran.crd$trt, levels = c("O", "F3", "F6", "F12", "S3", "S6", "S12"))
m3 <- lm(inf ~ trt, data = cochran.crd)
(inf_emm1 <-emmeans(m3, "trt")) trt emmean SE df lower.CL upper.CL
O 22.62 2.37 25 17.74 27.5
F3 9.50 3.35 25 2.60 16.4
F6 15.50 3.35 25 8.60 22.4
F12 5.75 3.35 25 -1.15 12.7
S3 16.75 3.35 25 9.85 23.7
S6 18.25 3.35 25 11.35 25.2
S12 14.25 3.35 25 7.35 21.2
Confidence level used: 0.95
contrast estimate SE df t.ratio p.value
F3 - O -13.12 4.1 25 -3.198 0.0190
F6 - O -7.12 4.1 25 -1.736 0.3475
F12 - O -16.88 4.1 25 -4.112 0.0020
S3 - O -5.88 4.1 25 -1.432 0.5203
S6 - O -4.38 4.1 25 -1.066 0.7405
S12 - O -8.38 4.1 25 -2.041 0.2127
P value adjustment: dunnettx method for 6 tests
contrast estimate SE df t.ratio p.value
F3 - avg(O,S6) -10.94 3.93 25 -2.784 0.0421
F6 - avg(O,S6) -4.94 3.93 25 -1.257 0.5853
F12 - avg(O,S6) -14.69 3.93 25 -3.738 0.0044
S3 - avg(O,S6) -3.69 3.93 25 -0.938 0.7737
S12 - avg(O,S6) -6.19 3.93 25 -1.575 0.3967
P value adjustment: dunnettx method for 5 tests
contrast estimate SE df t.ratio p.value
O effect 7.964 2.35 25 3.396 0.0160
F3 effect -5.161 3.08 25 -1.673 0.2490
F6 effect 0.839 3.08 25 0.272 0.8951
F12 effect -8.911 3.08 25 -2.889 0.0275
S3 effect 2.089 3.08 25 0.677 0.7060
S6 effect 3.589 3.08 25 1.164 0.4470
S12 effect -0.411 3.08 25 -0.133 0.8951
P value adjustment: fdr method for 7 tests
trt emmean SE df lower.CL upper.CL null t.ratio p.value
O 22.62 2.37 25 -Inf 26.7 10 5.328 1.0000
F3 9.50 3.35 25 -Inf 15.2 10 -0.149 0.4413
F6 15.50 3.35 25 -Inf 21.2 10 1.641 0.9434
F12 5.75 3.35 25 -Inf 11.5 10 -1.268 0.1082
S3 16.75 3.35 25 -Inf 22.5 10 2.014 0.9726
S6 18.25 3.35 25 -Inf 24.0 10 2.462 0.9895
S12 14.25 3.35 25 -Inf 20.0 10 1.268 0.8918
Confidence level used: 0.95
P values are left-tailed
NULL
clist = list("Spring.v.Fall" = c(0, -1/3, -1/3, -1/3, 1/3, 1/3, 1/3),
"F3.v.S3" = c(0, 1, 0, 0, -1, 0, 0),
"F12.v.Spr" = c(0, 0, 0, 1, -1/3, -1/3, -1/3))
contrast(inf_emm1, clist) contrast estimate SE df t.ratio p.value
Spring.v.Fall 6.17 2.74 25 2.254 0.0332
F3.v.S3 -7.25 4.74 25 -1.530 0.1386
F12.v.Spr -10.67 3.87 25 -2.757 0.0107
zone = low:
nitro emmean SE df lower.CL upper.CL .group
0 3.47 0.354 66 2.76 4.17 A
80 6.27 0.354 66 5.57 6.98 B
100 6.91 0.354 66 6.20 7.62 BC
140 6.93 0.354 66 6.23 7.64 BC
180 7.13 0.354 66 6.42 7.84 BC
120 7.37 0.354 66 6.66 8.08 BC
160 7.81 0.354 66 7.10 8.52 BC
220 7.92 0.354 66 7.21 8.62 BC
260 8.06 0.354 66 7.35 8.77 C
200 8.06 0.354 66 7.35 8.77 C
240 8.28 0.354 66 7.57 8.99 C
Confidence level used: 0.95
P value adjustment: tukey method for comparing a family of 11 estimates
significance level used: alpha = 0.05
NOTE: If two or more means share the same grouping symbol,
then we cannot show them to be different.
But we also did not show them to be the same.
contrast estimate SE df t.ratio p.value
M17 - M19 0.895 0.12 81 3.298 1.0000
M17 - M22 0.234 0.12 81 -2.221 0.0844
M17 - M58 0.496 0.12 81 -0.033 0.9818
M19 - M22 -0.660 0.12 81 1.340 1.0000
M19 - M58 -0.399 0.12 81 -0.848 0.7370
M22 - M58 0.262 0.12 81 -1.991 0.1406
Results are averaged over the levels of: female
Degrees-of-freedom method: kenward-roger
P value adjustment: sidak method for 6 tests
Statistics are tests of equivalence with a threshold of 0.5
P values are left-tailed
contrast estimate SE df t.ratio p.value
F197 M17 effect 1.3245 0.22 81 3.751 0.9998
F193 M22 effect 0.8370 0.22 81 1.533 0.9701
F203 M17 effect 0.7545 0.22 81 1.158 0.9421
F208 M17 effect 0.5795 0.22 81 0.362 0.7176
F203 M19 effect 0.5770 0.22 81 0.350 0.7176
F193 M17 effect 0.5695 0.22 81 0.316 0.7176
F201 M22 effect 0.3545 0.22 81 -0.662 0.3244
F208 M22 effect 0.2470 0.22 81 -1.151 0.1687
F193 M58 effect 0.2320 0.22 81 -1.219 0.1584
F195 M17 effect 0.2245 0.22 81 -1.254 0.1574
F195 M58 effect 0.1720 0.22 81 -1.492 0.1085
F208 M58 effect 0.1195 0.22 81 -1.731 0.0718
F197 M22 effect 0.1170 0.22 81 -1.743 0.0718
F197 M58 effect 0.0870 0.22 81 -1.879 0.0596
F204 M22 effect 0.0145 0.22 81 -2.209 0.0300
F203 M22 effect -0.1680 0.22 81 -3.039 0.0034
F204 M17 effect -0.1930 0.22 81 -3.153 0.0026
F195 M22 effect -0.1980 0.22 81 -3.176 0.0026
F193 M19 effect -0.2755 0.22 81 -3.528 0.0010
F201 M58 effect -0.2955 0.22 81 -3.619 0.0008
F201 M17 effect -0.4155 0.22 81 -4.165 0.0001
F204 M58 effect -0.4530 0.22 81 -4.336 <0.0001
F203 M58 effect -0.4905 0.22 81 -4.506 <0.0001
F201 M19 effect -0.6430 0.22 81 -5.200 <0.0001
F195 M19 effect -0.6605 0.22 81 -5.280 <0.0001
F208 M19 effect -0.7030 0.22 81 -5.473 <0.0001
F197 M19 effect -0.8005 0.22 81 -5.917 <0.0001
F204 M19 effect -0.9130 0.22 81 -6.428 <0.0001
Degrees-of-freedom method: kenward-roger
P value adjustment: fdr method for 28 tests
Statistics are tests of nonsuperiority with a threshold of 0.5
P values are left-tailed
data(theobald.covariate, package = "agridat")
m4 <- lmer(yield ~ gen + chu:gen + (1|gen:year), data = theobald.covariate)
l1 <- emtrends(m4, var = "chu")
summary(l1) gen chu chu.trend SE df
G01 2.55 5.65 1.46 225
G02 2.55 3.92 1.46 225
G03 2.55 3.73 1.46 225
G04 2.55 3.38 1.46 225
G05 2.55 3.64 1.76 231
G06 2.55 4.83 1.46 225
G07 2.55 4.17 1.58 222
G08 2.55 9.42 1.82 220
G09 2.55 3.22 1.57 224
G10 2.55 3.46 1.57 224
Degrees-of-freedom method: kenward-roger
[1] 2.552773
Calculated at average value for the covariate
gen emmean SE df lower.CL upper.CL
G01 6.48 0.270 32.7 5.93 7.03
G02 6.62 0.270 32.7 6.07 7.17
G03 5.70 0.270 32.7 5.15 6.25
G04 6.79 0.270 32.7 6.24 7.34
G05 6.30 0.305 34.1 5.68 6.92
G06 6.26 0.270 32.7 5.71 6.81
G07 6.62 0.303 33.1 6.01 7.24
G08 6.97 0.347 31.9 6.26 7.68
G09 6.08 0.300 31.9 5.46 6.69
G10 6.70 0.300 31.9 6.09 7.31
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
If desired, set a range of values for estimation:
gen chu emmean SE df lower.CL upper.CL
G08 2.3 4.59 0.582 152.1 3.44 5.74
G08 2.4 5.53 0.449 78.5 4.64 6.42
G08 2.5 6.47 0.362 37.3 5.74 7.21
G08 2.6 7.41 0.355 35.1 6.69 8.14
G08 2.7 8.36 0.433 71.4 7.49 9.22
G08 2.8 9.30 0.561 144.7 8.19 10.41
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
emmip() nitro zone emmean SE df lower.CL upper.CL
0 high 5.34725 0.3544908 66 4.639486 6.055014
80 high 7.98450 0.3544908 66 7.276736 8.692264
100 high 8.68200 0.3544908 66 7.974236 9.389764
120 high 8.37225 0.3544908 66 7.664486 9.080014
Confidence level used: 0.95
ggplot(bach_emm3_df, aes(x = nitro, y = emmean, group = zone, color = zone)) +
geom_line(linewidth = 0.7) +
geom_errorbar(aes(ymin = emmean-SE, ymax = emmean+SE), width = 0.15, linewidth = 1) +
ylab("yield") + xlab("nitrogen applied") +
scale_color_manual(values = c("brown", "cyan3")) + theme_bw(base_size = 18) Emmeans package documentation: comprehensive guide
Use ?summary.emmGrid for extensive information on emmeans objects including p-value adjustments
Statistical Programs’ philosophy on analysis
Data Analysis Using Regression and Multilevel/Hierarchical Models (2007) by Andrew Gelman and Jennifer Hill. This is a comprehensive and readable explanation of linear mixed models (available in the UI library).
The scientific method is the most rigorous path to knowledge, but it’s also messy and tough. Science [Statistics] deserves respect exactly because it is difficult — not because it gets everything correct on the first try. The uncertainty inherent in science doesn’t mean that we can’t use it to make important policies or decisions. It just means that we should remain cautious and adopt a mindset that’s open to changing course if new data arises. We should make the best decisions we can with the current evidence and take care not to lose sight of its strength and degree of certainty. It’s no accident that every good paper includes the phrase “more study is needed” — there is always more to learn.
– Christie Aschwanden/538, Science Isn’t Broken