  x1 x2 x3 x4   y1   y2    y3   y4
1 10 10 10  8 8.04 9.14  7.46 6.58
2  8  8  8  8 6.95 8.14  6.77 5.76
3 13 13 13  8 7.58 8.74 12.74 7.71
4  9  9  9  8 8.81 8.77  7.11 8.84
Julia Piaskowski
March 10, 2026
\[Y_i = \beta_0 + \beta_1 X_i + \epsilon_i\]
\(Y_i\) = dependent variable (there is only 1)
\(X_i\) = independent variable(s) (there may be many)
\(\beta_0\) = model intercept, the expected value of \(Y\) when \(X = 0\)
\(\beta_1\) = slope, how \(Y\) changes with each unit change in \(X\)
\(\epsilon_i\) = model residual, the gap between the predicted value for \(Y_i\) and its observed value
\[ \epsilon_i \sim N(0, \sigma)\]
The residuals are normally distributed with a mean of zero and some standard deviation that we will estimate during the model-fitting process.
All model residuals are independent and identically distributed ("i.i.d."): they are uncorrelated with each other and drawn from the same distribution, with a single shared variance
\[\hat{Y} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_k X_k\]
\[\hat{Y} = \mathbf{XB} \]
\(\hat{Y}\) = predicted/fitted value
\(\beta_0\) = model intercept
\(\beta_1, \beta_2,...\) = model coefficients
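The matrix form \(\mathbf{XB}\) can be seen directly in R; a minimal sketch using the built-in anscombe data (used later in these slides):

```r
# the design matrix X for a simple regression: a column of ones (the intercept)
# followed by the predictor values
head(model.matrix(~ x1, data = anscombe))
```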
…a p-value is the probability under a specified statistical model that a statistical summary of the data (e.g., the sample mean difference between two compared groups) would be equal to or more extreme than its observed value.
[A confidence interval percentage] is the frequency with which other unobserved intervals will contain the true effect… if all the assumptions used to compute the intervals were correct.
– Greenland et al., 2016
The confidence level instead reflects the long-run reliability of the method used to generate the interval… if the same sampling procedure were repeated 100 times from the same population, approximately 95 of the resulting intervals would be expected to contain the true population mean. The frequentist approach sees the true population mean as a fixed unknown constant, while the confidence interval is calculated using data from a random sample.
Simple linear regression: one continuous independent variable
- print(m1) is a pretty printing of some output
- summary(m1) is a function with pre-selected and formatted output
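The output below comes from fitting the first Anscombe x-y pair; a sketch of the calls (the model name m1 is assumed from the surrounding slides):

```r
m1 <- lm(y1 ~ x1, data = anscombe)  # first x-y pair of the Anscombe quartet

print(m1)    # brief: the call and the coefficients
summary(m1)  # coefficients with SEs and t-tests, R-squared, F-statistic
```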
Call:
lm(formula = y1 ~ x1, data = anscombe)
Residuals:
Min 1Q Median 3Q Max
-1.92127 -0.45577 -0.04136 0.70941 1.83882
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.0001 1.1247 2.667 0.02573 *
x1 0.5001 0.1179 4.241 0.00217 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.237 on 9 degrees of freedom
Multiple R-squared: 0.6665, Adjusted R-squared: 0.6295
F-statistic: 17.99 on 1 and 9 DF, p-value: 0.00217
Model coefficients
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.0000909 1.1247468 2.667348 0.025734051
x1 0.5000909 0.1179055 4.241455 0.002169629
Other valuable output
[1] 1.236603
[1] 0.6665425
[1] 0.6294916
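The three values above (residual standard error, \(R^2\), adjusted \(R^2\)) can be extracted directly from the fitted model; a sketch, assuming the model object is named m1:

```r
m1 <- lm(y1 ~ x1, data = anscombe)

sigma(m1)                  # residual standard error: 1.2366
summary(m1)$r.squared      # multiple R-squared: 0.6665
summary(m1)$adj.r.squared  # adjusted R-squared: 0.6295
```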
\(R^2_{Adj} = 1 - \frac {(1-R^2)(n-1)} {n - k - 1}\)
Other valuable output: F-statistic
- overall indicator of model fit
- comparing to a null model: y ~ 1 (intercept-only model)
(I tend not to need these unless doing something very specific)
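The comparison to the intercept-only model can be run explicitly with anova(); a sketch (the names m0 and m1 are assumptions):

```r
m0 <- lm(y1 ~ 1, data = anscombe)   # null model: intercept only
m1 <- lm(y1 ~ x1, data = anscombe)

anova(m0, m1)  # F = 17.99 on 1 and 9 df, matching the summary() output
```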
From the model object
[1] "lm"
[1] add1 alias anova case.names coerce
[6] confint cooks.distance deviance dfbeta dfbetas
[11] drop1 dummy.coef effects extractAIC family
[16] formula fortify hatvalues influence initialize
[21] kappa labels logLik model.frame model.matrix
[26] nobs plot predict print proj
[31] qqnorm qr residuals rstandard rstudent
[36] show simulate slotsFromS3 summary variable.names
[41] vcov
see '?methods' for accessing help and source code
These are built-in methods for this class of object; which methods are available depends on the packages loaded
- confint() provides confidence intervals of the parameters
- extractAIC() and logLik() return fit statistics
- hatvalues(), dfbetas(), cooks.distance(), and influence() are model diagnostics tools
- drop1() and add1() drop/add terms sequentially and evaluate model fit statistics (not applicable for simple linear regression)
- plot() provides diagnostic plots (= plot.lm())
- simulate() simulates data under the estimated model parameters
- residuals(), rstandard(), rstudent() produce raw, Pearson and studentized residuals, respectively
- predict() is for predicting a new data set

Confidence Intervals
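For example, confidence intervals for the intercept and slope (m1 assumed from the earlier slides):

```r
m1 <- lm(y1 ~ x1, data = anscombe)

confint(m1, level = 0.95)  # 95% CIs for (Intercept) and x1
```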
Influential data points
$hat
1 2 3 4 5 6 7
0.10000000 0.10000000 0.23636364 0.09090909 0.12727273 0.31818182 0.17272727
8 9 10 11
0.31818182 0.17272727 0.12727273 0.23636364
$coefficients
(Intercept) x1
1 0.0003939394 3.939394e-04
2 -0.0097529844 5.133150e-04
3 0.5946796537 -9.148918e-02
4 0.1309090909 -4.763503e-19
5 0.0142575758 -3.564394e-03
6 0.0193030303 -2.757576e-03
7 0.5039170829 -4.085814e-02
8 -0.5430000000 4.936364e-02
9 -0.3435154845 6.062038e-02
10 -0.4902121212 3.501515e-02
11 0.0982727273 -8.545455e-03
$sigma
1 2 3 4 5 6 7 8
1.311535 1.311479 1.056460 1.218483 1.310017 1.311496 1.219936 1.272721
9 10 11
1.099742 1.147055 1.309605
$wt.res
1 2 3 4 5 6
0.03900000 -0.05081818 -1.92127273 1.30909091 -0.17109091 -0.04136364
7 8 9 10 11
1.23936364 -0.74045455 1.83881818 -1.68072727 0.17945455
How much single data points influence model parameters
(DF refers to difference)
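The $hat, $coefficients, $sigma, and $wt.res components shown above are the return value of influence(); a sketch of the diagnostic calls (m1 assumed):

```r
m1 <- lm(y1 ~ x1, data = anscombe)

influence(m1)       # returns $hat, $coefficients, $sigma, and $wt.res, as above
cooks.distance(m1)  # overall influence of each point on the fitted values
dfbetas(m1)         # standardized change in each coefficient if a point is dropped
```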
\(R^2\) coefficient of determination \[R^2 = 1 - \frac {SS_{error}}{SS_{reg} + SS_{error}} = \frac {SS_{reg}}{SS_{reg} + SS_{error}}\]
\[0 \leq R^2 \leq 1\]
For measuring the strength of a regression
\(R^2\) exists, \(R\) does not
\(r\) coefficient of correlation:
\[r_{xy} = \frac{s_{xy}}{s_x s_y}\]
\[-1 \leq r \leq 1\]
For understanding pairwise relationships
\(r^2\) = \(R^2\) only in the case of simple linear regression
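This equivalence is easy to check on the Anscombe data from earlier:

```r
r <- cor(anscombe$x1, anscombe$y1)  # pairwise correlation
r^2                                 # equals the Multiple R-squared (0.6665) above

summary(lm(y1 ~ x1, data = anscombe))$r.squared
```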
Several continuous variables affecting differences in a continuous dependent variable
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Call:
lm(formula = mpg ~ disp + hp + drat, data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-5.1225 -1.8454 -0.4456 1.1342 6.4958
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 19.344293 6.370882 3.036 0.00513 **
disp -0.019232 0.009371 -2.052 0.04960 *
hp -0.031229 0.013345 -2.340 0.02663 *
drat 2.714975 1.487366 1.825 0.07863 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3.008 on 28 degrees of freedom
Multiple R-squared: 0.775, Adjusted R-squared: 0.7509
F-statistic: 32.15 on 3 and 28 DF, p-value: 3.28e-09
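A sketch of the call that produces the summary above (the model name m2 is an assumption):

```r
m2 <- lm(mpg ~ disp + hp + drat, data = mtcars)
summary(m2)
```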
(Regression with categorical variables)
(ignoring block)
N: 2 levels, 0 and 1
P: 2 levels, 0 and 1
K: 2 levels, 0 and 1
\[ Y_{ijk} = \beta_0 + \beta_1 N + \beta_2 P + \beta_3 K \]
(with block)
6 levels of block
block 1: 0 only
block 2: 0 and 1
block 3: 0 and 1
….(blocks 4, 5, 6): 0 and 1
(set one level as level zero or reference level)
| ID | block1 | block2 | block3 | block4 | block5 | block6 |
|---|---|---|---|---|---|---|
| A | 0 | 1 | 0 | 0 | 0 | 0 |
| B | 0 | 0 | 1 | 0 | 0 | 0 |
| C | 0 | 0 | 0 | 1 | 0 | 0 |
| D | 0 | 0 | 0 | 0 | 1 | 0 |
| E | 0 | 0 | 0 | 0 | 0 | 1 |
\[ Y_{npk} = \beta_0 +...+ \beta_4 Bl2 + \beta_5 Bl3 + \beta_6 Bl4 + \beta_7 Bl5 + \beta_8 Bl6\]
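R builds this dummy coding automatically; model.matrix() shows it for the npk data used in this example:

```r
# block has 6 levels; block1 is the reference level, absorbed into the intercept
head(model.matrix(yield ~ N + P + K + block, data = npk))
```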
Call:
lm(formula = yield ~ N + P + K + block, data = npk)
Residuals:
Min 1Q Median 3Q Max
-7.0000 -1.7083 -0.0833 2.2458 6.4833
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 53.800 2.450 21.955 8.13e-13 ***
N1 5.617 1.634 3.438 0.00366 **
P1 -1.183 1.634 -0.724 0.47999
K1 -3.983 1.634 -2.438 0.02767 *
block2 3.425 2.830 1.210 0.24483
block3 6.750 2.830 2.386 0.03068 *
block4 -3.900 2.830 -1.378 0.18831
block5 -3.500 2.830 -1.237 0.23512
block6 2.325 2.830 0.822 0.42412
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.002 on 15 degrees of freedom
Multiple R-squared: 0.7259, Adjusted R-squared: 0.5798
F-statistic: 4.966 on 8 and 15 DF, p-value: 0.003761
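A sketch of the call behind this output (the model name m3 is an assumption):

```r
m3 <- lm(yield ~ N + P + K + block, data = npk)
summary(m3)
```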
Type 1 sums of squares
Analysis of Variance Table
Response: yield
Df Sum Sq Mean Sq F value Pr(>F)
N 1 189.28 189.282 11.8210 0.00366 **
P 1 8.40 8.402 0.5247 0.47999
K 1 95.20 95.202 5.9455 0.02767 *
block 5 343.29 68.659 4.2879 0.01272 *
Residuals 15 240.19 16.012
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type 3 sums of squares
Single term deletions
Model:
yield ~ N + P + K + block
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 240.19 73.281
N 1 189.28 429.47 85.228 11.8210 0.00366 **
P 1 8.40 248.59 72.106 0.5247 0.47999
K 1 95.20 335.39 79.294 5.9455 0.02767 *
block 5 343.29 583.48 84.583 4.2879 0.01272 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type 1: depends on order of independent variables
Type 3: best for factorials
Type 2: use if you are sure there are no interactions
These only matter if the data are unbalanced
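The two tables above come from anova() (Type 1, sequential) and drop1() (Type 3, marginal); a sketch (m3 assumed):

```r
m3 <- lm(yield ~ N + P + K + block, data = npk)

anova(m3)              # Type 1: each term tested sequentially, in formula order
drop1(m3, test = "F")  # Type 3: each term tested after all other terms
```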
Linear mixed model fit by REML ['lmerModLmerTest']
Formula: yield ~ N + P + K + (1 | block)
Data: npk
REML criterion at convergence: 128.057
Random effects:
Groups Name Std.Dev.
block (Intercept) 3.628
Residual 4.002
Number of obs: 24, groups: block, 6
Fixed Effects:
(Intercept) N1 P1 K1
54.650 5.617 -1.183 -3.983
[1] 4.001541
Groups Name Variance Std.Dev.
block (Intercept) 13.162 3.6279
Residual 16.012 4.0015
2.5 % 97.5 %
.sig01 1.201048 7.3624368
.sigma 2.720672 5.2696730
(Intercept) 50.396060 58.9039408
N1 2.530694 8.7026395
P1 -4.269306 1.9026395
K1 -7.069306 -0.8973605
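A sketch of the mixed-model fit behind this output (the model name m4 is an assumption; use of the lmerTest package is inferred from the 'lmerModLmerTest' class shown above):

```r
library(lmerTest)  # extends lme4::lmer; fitted models get class 'lmerModLmerTest'

m4 <- lmer(yield ~ N + P + K + (1 | block), data = npk)
sigma(m4)    # residual standard deviation
VarCorr(m4)  # variance components for block and residual
confint(m4)  # profile CIs: .sig01 = block SD, .sigma = residual SD
```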
Many of these functions report conditions of the model fit (e.g. isREML(), isSingular()) rather than estimated quantities.
So many methods
[1] anova as.function coef confint cooks.distance
[6] deviance df.residual drop1 extractAIC family
[11] fitted fixef formula fortify getData
[16] getL getME hatvalues influence isGLMM
[21] isLMM isNLMM isREML isSingular logLik
[26] model.frame model.matrix na.action ngrps nobs
[31] plot predict print profile ranef
[36] refit refitML rePCA residuals rstudent
[41] show sigma simulate summary terms
[46] update VarCorr vcov weights
see '?methods' for accessing help and source code
[1] anova coerce coerce<- contest contest1D contestMD
[7] difflsmeans drop1 getL isSingular ls_means lsmeansLT
[13] show step summary update
see '?methods' for accessing help and source code
methods() to explore this content

An R Companion to Applied Regression (2019), \(3^{rd}\) Ed., by John Fox and Sanford Weisberg.
Learn more about functions for detecting influential data
Linear Mixed Model Guide: details on implementation of linear mixed models in R