Linear Model Outputs: What Does It All Mean?



Julia Piaskowski

March 10, 2026

Linear Model Review

\[Y_i = \beta_0 + \beta_1 X_i + \epsilon_i\]

\(Y_i\) = dependent variable (there is only 1)

\(X_i\) = independent variable(s) (there may be many)

\(\beta_0\) = model intercept, the expected value of \(Y\) when \(X\) is zero

\(\beta_1\) = how \(Y\) changes with \(X\) (the slope)

\(\epsilon_i\) = model residual, the gap between the predicted value for \(Y_i\) and its observed value

The Residual Is Important

\[ \epsilon_i \sim N(0, \sigma)\]

  • The residuals are normally distributed with a mean of zero and some standard deviation that we will estimate during the model-fitting process.

  • All model residuals are independent and identically distributed, “i.i.d.” – they are uncorrelated with each other and drawn from the same distribution (that has exactly one variance)

Some Notes on Estimation

\[\hat{Y} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + ... + \beta_k X_k\]

\[\hat{Y} = \mathbf{XB} \]

\(\hat{Y}\) = predicted/fitted value
\(\beta_0\) = model intercept
\(\beta_1, \beta_2,...\) = model coefficients

Some Notes on Estimation

Maximum Likelihood Estimation (MLE)

  • How most frequentist statistical models are “solved”
  • An iterative approach: find the set of model parameters (coefficients and variance terms) that maximizes a likelihood function
  • If a data set is perfectly balanced, maximum likelihood gives the same results as ordinary least squares
  • REML (restricted maximum likelihood) is a special type of MLE used for mixed models
  • Ordinary least squares (OLS) is the original method, but it is less flexible than MLE
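A minimal sketch of the last two points, using the built-in anscombe data (which appears again later in these slides): for a Gaussian response, glm() fits by maximum likelihood and reproduces the OLS coefficients from lm().

```r
# Sketch: for a Gaussian response, ML estimation (glm) matches OLS (lm)
data(anscombe)
fit_ols <- lm(y1 ~ x1, data = anscombe)                      # ordinary least squares
fit_ml  <- glm(y1 ~ x1, data = anscombe, family = gaussian)  # maximum likelihood
all.equal(coef(fit_ols), coef(fit_ml))  # TRUE: identical coefficient estimates
```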

Definitions

P-value

…a p-value is the probability under a specified statistical model that a statistical summary of the data (e.g., the sample mean difference between two compared groups) would be equal to or more extreme than its observed value.


Confidence Interval

[A confidence interval percentage] is the frequency with which other unobserved intervals will contain the true effect….if all the assumptions used to compute the intervals were correct.


– Greenland et al, 2016

The confidence level instead reflects the long-run reliability of the method used to generate the interval….if the same sampling procedure were repeated 100 times from the same population, approximately 95 of the resulting intervals would be expected to contain the true population mean. The frequentist approach sees the true population mean as a fixed unknown constant, while the confidence interval is calculated using data from a random sample.


Wikipedia
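That long-run interpretation can be illustrated with a small simulation (the sample size, mean, and standard deviation below are arbitrary choices):

```r
# Simulate the long-run coverage of 95% confidence intervals:
# draw many samples from a known population and count how often
# the interval for the mean contains the true value
set.seed(42)                     # arbitrary seed for reproducibility
true_mean <- 10                  # hypothetical "true" population mean
covered <- replicate(1000, {
  x  <- rnorm(30, mean = true_mean, sd = 2)    # one random sample
  ci <- t.test(x)$conf.int                     # its 95% CI
  ci[1] <= true_mean && true_mean <= ci[2]     # did the CI capture the truth?
})
mean(covered)  # close to 0.95
```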

Regression

Regression Example

Simple linear regression: one continuous independent variable

data(anscombe); head(anscombe, n = 4)
  x1 x2 x3 x4   y1   y2    y3   y4
1 10 10 10  8 8.04 9.14  7.46 6.58
2  8  8  8  8 6.95 8.14  6.77 5.76
3 13 13 13  8 7.58 8.74 12.74 7.71
4  9  9  9  8 8.81 8.77  7.11 8.84

Regression Example

m1 <- lm(y1 ~ x1, data = anscombe)
plot(anscombe$x1, anscombe$y1, pch = 21, bg = "blue3"); abline(m1)

Regression Example

  • print(m1) pretty-prints a small portion of the model output
  • summary(m1) is a function with pre-selected and formatted output
(s1 <- summary(m1))

Call:
lm(formula = y1 ~ x1, data = anscombe)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.92127 -0.45577 -0.04136  0.70941  1.83882 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)   3.0001     1.1247   2.667  0.02573 * 
x1            0.5001     0.1179   4.241  0.00217 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.237 on 9 degrees of freedom
Multiple R-squared:  0.6665,    Adjusted R-squared:  0.6295 
F-statistic: 17.99 on 1 and 9 DF,  p-value: 0.00217

Regression Output

Model coefficients

s1$coefficients
             Estimate Std. Error  t value    Pr(>|t|)
(Intercept) 3.0000909  1.1247468 2.667348 0.025734051
x1          0.5000909  0.1179055 4.241455 0.002169629
  • the “(Intercept)” is \(\beta_0\) and “x1” is \(\beta_1\)
  • the standard error of each estimate is given
  • the t value comes from Student’s t-distribution
  • the p-value tests whether the estimate is different from zero
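These columns can be reproduced by hand (the 9 residual degrees of freedom come from the summary output above):

```r
# Reproduce the t value and p-value for the x1 coefficient by hand
data(anscombe)
s1 <- summary(lm(y1 ~ x1, data = anscombe))
est  <- s1$coefficients["x1", "Estimate"]
se   <- s1$coefficients["x1", "Std. Error"]
tval <- est / se                                       # t value = estimate / SE
pval <- 2 * pt(abs(tval), df = 9, lower.tail = FALSE)  # two-tailed test, 9 residual df
round(c(t = tval, p = pval), 5)  # matches the summary() coefficient table
```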

Regression Output

Other valuable output

s1$sigma # standard deviation of the residuals
[1] 1.236603
s1$r.squared # model R^2
[1] 0.6665425
s1$adj.r.squared # adjusted model R^2
[1] 0.6294916

\(R^2_{Adj} = 1 - \frac {(1-R^2)(n-1)} {n - k - 1}\)
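The adjusted \(R^2\) formula can be verified against summary()'s value (here n = 11 observations and k = 1 predictor):

```r
# Verify the adjusted R^2 formula by hand against summary()'s value
data(anscombe)
s1 <- summary(lm(y1 ~ x1, data = anscombe))
n <- nrow(anscombe)  # 11 observations
k <- 1               # one predictor
r2_adj <- 1 - (1 - s1$r.squared) * (n - 1) / (n - k - 1)
all.equal(r2_adj, s1$adj.r.squared)  # TRUE
```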

Regression Output

Other valuable output: F-statistic
- overall indicator of model fit
- comparing to a null model: y ~ 1 (intercept-only model)

s1$fstatistic
   value    numdf    dendf 
17.98994  1.00000  9.00000 
pf(s1$fstatistic[1], s1$fstatistic[2], s1$fstatistic[3], lower.tail = FALSE)
      value 
0.002169629 
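Equivalently, the overall F-test can be run as an explicit comparison against the intercept-only null model:

```r
# The overall F-test is a comparison against the intercept-only model
data(anscombe)
m1 <- lm(y1 ~ x1, data = anscombe)
m0 <- lm(y1 ~ 1, data = anscombe)  # null model: intercept only
anova(m0, m1)  # same F (17.99) and p-value (0.00217) as summary(m1)
```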

Other Regression Output

s1$call # model reminder
s1$terms # model terms object
s1$residuals # raw residuals
s1$df # degrees of freedom
s1$aliased # check that no variables are linearly dependent
s1$fstatistic # model F-test

(I tend not to need these unless doing something very specific)

More Regression Output

From the model object

coef(m1)
(Intercept)          x1 
  3.0000909   0.5000909 
residuals(m1)
          1           2           3           4           5           6 
 0.03900000 -0.05081818 -1.92127273  1.30909091 -0.17109091 -0.04136364 
          7           8           9          10          11 
 1.23936364 -0.74045455  1.83881818 -1.68072727  0.17945455 

Possible Regression Output

class(m1)
[1] "lm"
methods(class = "lm")
 [1] add1           alias          anova          case.names     coerce        
 [6] confint        cooks.distance deviance       dfbeta         dfbetas       
[11] drop1          dummy.coef     effects        extractAIC     family        
[16] formula        fortify        hatvalues      influence      initialize    
[21] kappa          labels         logLik         model.frame    model.matrix  
[26] nobs           plot           predict        print          proj          
[31] qqnorm         qr             residuals      rstandard      rstudent      
[36] show           simulate       slotsFromS3    summary        variable.names
[41] vcov          
see '?methods' for accessing help and source code

These are built-in methods for this class of object, and are dependent on the libraries loaded

Possible Regression Output

  • confint() provides confidence intervals for the parameters
  • extractAIC() and logLik() return fit statistics
  • hatvalues(), dfbetas(), cooks.distance(), and influence() are model diagnostic tools
  • drop1() and add1() drop/add terms sequentially and evaluate model fit statistics (not applicable for simple linear regression)
  • plot() provides diagnostic plots (= plot.lm())
  • simulate() simulates data under the estimated model parameters
  • residuals(), rstandard(), rstudent() produce raw, standardized, and studentized residuals, respectively
  • predict() generates predictions for new data
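As one example of these methods, a sketch of predict() with hypothetical new x values (7 and 12 are arbitrary choices):

```r
# predict() on new data, with confidence and prediction intervals
data(anscombe)
m1 <- lm(y1 ~ x1, data = anscombe)
newdat <- data.frame(x1 = c(7, 12))  # hypothetical new x values
predict(m1, newdata = newdat, interval = "confidence")  # CI for the mean response
predict(m1, newdata = newdat, interval = "prediction")  # wider: for a new observation
```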

Regression Output

Confidence Intervals

confint(m1)
                2.5 %    97.5 %
(Intercept) 0.4557369 5.5444449
x1          0.2333701 0.7668117
confint(m1, level = 0.90)
                  5 %     95 %
(Intercept) 0.9383030 5.061879
x1          0.2839568 0.716225
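These intervals follow estimate ± t × SE; a quick check for the x1 slope:

```r
# A 95% CI by hand: estimate +/- t(0.975, df) * SE matches confint()
data(anscombe)
m1 <- lm(y1 ~ x1, data = anscombe)
est <- coef(summary(m1))["x1", "Estimate"]
se  <- coef(summary(m1))["x1", "Std. Error"]
est + c(-1, 1) * qt(0.975, df = 9) * se  # 0.2333701, 0.7668117
```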

Regression Output

Influential data points

influence(m1)
$hat
         1          2          3          4          5          6          7 
0.10000000 0.10000000 0.23636364 0.09090909 0.12727273 0.31818182 0.17272727 
         8          9         10         11 
0.31818182 0.17272727 0.12727273 0.23636364 

$coefficients
     (Intercept)            x1
1   0.0003939394  3.939394e-04
2  -0.0097529844  5.133150e-04
3   0.5946796537 -9.148918e-02
4   0.1309090909 -4.763503e-19
5   0.0142575758 -3.564394e-03
6   0.0193030303 -2.757576e-03
7   0.5039170829 -4.085814e-02
8  -0.5430000000  4.936364e-02
9  -0.3435154845  6.062038e-02
10 -0.4902121212  3.501515e-02
11  0.0982727273 -8.545455e-03

$sigma
       1        2        3        4        5        6        7        8 
1.311535 1.311479 1.056460 1.218483 1.310017 1.311496 1.219936 1.272721 
       9       10       11 
1.099742 1.147055 1.309605 

$wt.res
          1           2           3           4           5           6 
 0.03900000 -0.05081818 -1.92127273  1.30909091 -0.17109091 -0.04136364 
          7           8           9          10          11 
 1.23936364 -0.74045455  1.83881818 -1.68072727  0.17945455 

Influence Measures

How much single data points influence model parameters

(“DF” refers to the difference in an estimate when that observation is deleted)
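For instance, dfbetas() scales those per-observation coefficient changes by the standard error; a common rule of thumb (a convention, not a strict test) flags values above 2/sqrt(n):

```r
# Flag observations whose deletion shifts the slope noticeably
data(anscombe)
m1 <- lm(y1 ~ x1, data = anscombe)
db <- dfbetas(m1)                   # scaled DFBETA values, one row per observation
cutoff <- 2 / sqrt(nrow(anscombe))  # rule-of-thumb threshold
which(abs(db[, "x1"]) > cutoff)     # observations with large influence on the slope
```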

\(R^2\) versus \(r\)

  • \(R^2\) coefficient of determination

\(R^2\) versus \(r\)

\(R^2\) coefficient of determination \[R^2 = 1 - \frac {SS_{error}}{SS_{reg} + SS_{error}} = \frac {SS_{reg}}{SS_{reg} + SS_{error}}\]

\[0 \leq R^2 \leq 1\]

  • For measuring the strength of a regression

  • \(R^2\) exists as its own quantity; there is no corresponding \(R\) statistic

\(R^2\) versus \(r\)

\(r\) coefficient of correlation:

\[r_{xy} = \frac{s_{xy}}{s_x s_y}\]

\[-1 \leq r \leq 1\]

  • For understanding pairwise relationships

  • \(r^2\) = \(R^2\) only in the case of simple linear regression
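That last point can be checked directly:

```r
# In simple linear regression, Pearson's r squared equals the model R^2
data(anscombe)
r  <- cor(anscombe$x1, anscombe$y1)                    # correlation coefficient
R2 <- summary(lm(y1 ~ x1, data = anscombe))$r.squared  # coefficient of determination
all.equal(r^2, R2)  # TRUE
```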

Multiple Linear Regression

Several continuous independent variables explaining variation in a continuous dependent variable

data(mtcars)
head(mtcars, n = 4)
                mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4      21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag  21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710     22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
m2 <- lm(mpg ~ disp + hp + drat, data = mtcars)

Multiple Linear Regression

summary(m2)

Call:
lm(formula = mpg ~ disp + hp + drat, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.1225 -1.8454 -0.4456  1.1342  6.4958 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
(Intercept) 19.344293   6.370882   3.036  0.00513 **
disp        -0.019232   0.009371  -2.052  0.04960 * 
hp          -0.031229   0.013345  -2.340  0.02663 * 
drat         2.714975   1.487366   1.825  0.07863 . 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.008 on 28 degrees of freedom
Multiple R-squared:  0.775, Adjusted R-squared:  0.7509 
F-statistic: 32.15 on 3 and 28 DF,  p-value: 3.28e-09

Multiple Linear Regression

drop1(m2)
Single term deletions

Model:
mpg ~ disp + hp + drat
       Df Sum of Sq    RSS    AIC
<none>              253.35 74.209
disp    1    38.107 291.45 76.693
hp      1    49.550 302.90 77.925
drat    1    30.148 283.49 75.806

ANOVA

(Regression with categorical variables)

ANOVA

data(npk)
head(npk, n = 10)
   block N P K yield
1      1 0 1 1  49.5
2      1 1 1 0  62.8
3      1 0 0 0  46.8
4      1 1 0 1  57.0
5      2 1 0 0  59.8
6      2 1 1 1  58.5
7      2 0 0 1  55.5
8      2 0 1 0  56.0
9      3 0 1 0  62.8
10     3 1 1 1  55.8

ANOVA

(ignoring block)

N: 2 levels, 0 and 1
P: 2 levels, 0 and 1
K: 2 levels, 0 and 1

\[ Y_{ijk} = \beta_0 + \beta_1N +\beta_2P + \beta_3K \]

ANOVA

(with block)

6 levels of block
block 1: 0 only (the reference level)
blocks 2–6: each coded 0 and 1

(set one level as the zero or reference level)

ID  block1  block2  block3  block4  block5  block6
A     0       1       0       0       0       0
B     0       0       1       0       0       0
C     0       0       0       1       0       0
D     0       0       0       0       1       0
E     0       0       0       0       0       1

\[ Y_{npk} = \beta_0 +...+ \beta_4 Bl2 + \beta_5 Bl3 + \beta_6 Bl4 + \beta_7 Bl5 + \beta_8 Bl6\]
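R builds this dummy coding automatically for factors; model.matrix() shows the 0/1 columns it creates for the npk data:

```r
# model.matrix() reveals the dummy columns R creates for factors
data(npk)
head(model.matrix(~ N + P + K + block, data = npk))
# columns: (Intercept), N1, P1, K1, block2 ... block6
```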

ANOVA

m3 <- lm(yield ~ N + P + K + block, data = npk)
summary(m3)

Call:
lm(formula = yield ~ N + P + K + block, data = npk)

Residuals:
    Min      1Q  Median      3Q     Max 
-7.0000 -1.7083 -0.0833  2.2458  6.4833 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   53.800      2.450  21.955 8.13e-13 ***
N1             5.617      1.634   3.438  0.00366 ** 
P1            -1.183      1.634  -0.724  0.47999    
K1            -3.983      1.634  -2.438  0.02767 *  
block2         3.425      2.830   1.210  0.24483    
block3         6.750      2.830   2.386  0.03068 *  
block4        -3.900      2.830  -1.378  0.18831    
block5        -3.500      2.830  -1.237  0.23512    
block6         2.325      2.830   0.822  0.42412    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.002 on 15 degrees of freedom
Multiple R-squared:  0.7259,    Adjusted R-squared:  0.5798 
F-statistic: 4.966 on 8 and 15 DF,  p-value: 0.003761

ANOVA

Type 1 sums of squares

anova(m3) 
Analysis of Variance Table

Response: yield
          Df Sum Sq Mean Sq F value  Pr(>F)   
N          1 189.28 189.282 11.8210 0.00366 **
P          1   8.40   8.402  0.5247 0.47999   
K          1  95.20  95.202  5.9455 0.02767 * 
block      5 343.29  68.659  4.2879 0.01272 * 
Residuals 15 240.19  16.012                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

ANOVA

Type 3 sums of squares

drop1(m3, test = "F") 
Single term deletions

Model:
yield ~ N + P + K + block
       Df Sum of Sq    RSS    AIC F value  Pr(>F)   
<none>              240.19 73.281                   
N       1    189.28 429.47 85.228 11.8210 0.00366 **
P       1      8.40 248.59 72.106  0.5247 0.47999   
K       1     95.20 335.39 79.294  5.9455 0.02767 * 
block   5    343.29 583.48 84.583  4.2879 0.01272 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sums of Squares

Type 1: depends on order of independent variables
Type 3: best for factorials
Type 2: use if you are sure there are no interactions

These only matter if the data are unbalanced

Decent explanation
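A quick check with npk, where N, P, and K form a balanced factorial: reordering the terms leaves the Type 1 (sequential) table unchanged, which would not hold for unbalanced data.

```r
# With balanced data, Type 1 (sequential) SS do not depend on term order
data(npk)
a1 <- anova(lm(yield ~ N + P + K, data = npk))
a2 <- anova(lm(yield ~ K + P + N, data = npk))
a1["N", "Sum Sq"]; a2["N", "Sum Sq"]  # identical: 189.28
```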

Mixed Models

Mixed Models

library(lme4); library(lmerTest)
m4 <- lmer(yield ~ N + P + K + (1|block), data = npk)
m4
Linear mixed model fit by REML ['lmerModLmerTest']
Formula: yield ~ N + P + K + (1 | block)
   Data: npk
REML criterion at convergence: 128.057
Random effects:
 Groups   Name        Std.Dev.
 block    (Intercept) 3.628   
 Residual             4.002   
Number of obs: 24, groups:  block, 6
Fixed Effects:
(Intercept)           N1           P1           K1  
     54.650        5.617       -1.183       -3.983  

Mixed Models

fixef(m4) # fixed effects
(Intercept)          N1          P1          K1 
  54.650000    5.616667   -1.183333   -3.983333 
ranef(m4) # random effects
$block
  (Intercept)
1   -0.651767
2    1.974470
3    4.524029
4   -3.642227
5   -3.335513
6    1.131007

with conditional variances for "block" 

Mixed Models

sigma(m4)
[1] 4.001541
print(VarCorr(m4), comp = c("Variance", "Std.Dev"))
 Groups   Name        Variance Std.Dev.
 block    (Intercept) 13.162   3.6279  
 Residual             16.012   4.0015  
confint(m4)
                2.5 %     97.5 %
.sig01       1.201048  7.3624368
.sigma       2.720672  5.2696730
(Intercept) 50.396060 58.9039408
N1           2.530694  8.7026395
P1          -4.269306  1.9026395
K1          -7.069306 -0.8973605

Mixed Models

Many of the available functions return the model’s settings (how it was fit), not estimated quantities.

family(m4) # distribution (Gaussian, etc)
na.action(m4) # how missing data were handled
isREML(m4) # yes/no was REML used?
formula(m4) # formula used
getData(m4) # return data set 
nobs(m4) # data set size
ngrps(m4) # number of random groups

Mixed Models

So many methods

methods(class = "merMod")
 [1] anova          as.function    coef           confint        cooks.distance
 [6] deviance       df.residual    drop1          extractAIC     family        
[11] fitted         fixef          formula        fortify        getData       
[16] getL           getME          hatvalues      influence      isGLMM        
[21] isLMM          isNLMM         isREML         isSingular     logLik        
[26] model.frame    model.matrix   na.action      ngrps          nobs          
[31] plot           predict        print          profile        ranef         
[36] refit          refitML        rePCA          residuals      rstudent      
[41] show           sigma          simulate       summary        terms         
[46] update         VarCorr        vcov           weights       
see '?methods' for accessing help and source code
methods(class = "lmerModLmerTest")
 [1] anova       coerce      coerce<-    contest     contest1D   contestMD  
 [7] difflsmeans drop1       getL        isSingular  ls_means    lsmeansLT  
[13] show        step        summary     update     
see '?methods' for accessing help and source code

Mixed Models

  • Test random effects with a log-likelihood ratio test
ranova(m4)
ANOVA-like table for random-effects: Single term deletions

Model:
yield ~ N + P + K + (1 | block)
            npar  logLik    AIC    LRT Df Pr(>Chisq)  
<none>         6 -64.029 140.06                       
(1 | block)    5 -66.388 142.78 4.7194  1    0.02982 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
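The same idea can be sketched by hand as a likelihood ratio test between ML fits with and without the random block term (note that ranova() works on the REML fit, so its numbers differ slightly; this assumes lme4 is installed):

```r
# By-hand LRT for the random block effect, using ML fits
library(lme4)
data(npk)
m_full <- lmer(yield ~ N + P + K + (1 | block), data = npk, REML = FALSE)
m_null <- lm(yield ~ N + P + K, data = npk)  # same model without the random effect
anova(m_full, m_null)  # chi-square LRT on 1 df
```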

Final Thoughts

  • \(R^2\), log likelihood, and AIC/BIC are ways to evaluate model fit
  • regression coefficients can be interpreted directly; ANOVA coefficients are more complex (they are differences from a reference level)
  • use type 3 sums of squares if you are unsure
  • linear modeling objects contain a large amount of output; use methods() to explore this content
  • ANOVA is a special case of regression; both are linear models

Additional Resources