\Large \boldsymbol{Y} \sim \mathcal{N}(\boldsymbol{\hat{Y}}, \sigma^2\boldsymbol{I})

\Large \boldsymbol{\hat{Y}} = \boldsymbol{X\beta}

This is the matrix formulation of \beta_0 + \sum\beta_jx_{ij}
This equation is huge. X can be anything - categorical, continuous, squared, sine, etc.
There can be straight additivity, or interactions
So far, the only model we've used with >1 predictor is ANOVA
Mixing Categorical and Continuous Variables
Nonlinearities
Interaction Effects
\Large \boldsymbol{\hat{Y}} = \boldsymbol{X\beta}
translates to
\widehat{y_i} = \beta_0 + \beta_1x_{i1} + \sum_{j=2}^{k}\beta_jx_{ij}

x_{ij} = 0,1
Here, we have many levels of a group in x_{ij}
Often used to correct for a gradient or some continuous variable affecting outcome
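A minimal sketch of what this dummy coding looks like in R: `model.matrix()` shows the 0/1 columns that `lm()` builds from a factor (toy, hypothetical values here, not the dataset used below).

```r
# Toy data (hypothetical values) to show how R dummy-codes a factor
d <- data.frame(species = factor(c("neanderthal", "recent", "recent")),
                lnmass  = c(4.0, 4.2, 4.1))

# model.matrix() builds the X matrix lm() uses: an intercept column,
# a 0/1 "speciesrecent" dummy, and the continuous lnmass column
model.matrix(~ species + lnmass, data = d)
```

With R's default treatment contrasts, the first factor level (alphabetically, "neanderthal") is the reference and gets no column of its own.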
Who had a bigger brain: Neanderthals or us?
lnbrain_i \sim \mathcal{N}(\widehat{lnbrain_i}, \sigma^2)

\widehat{lnbrain_i} = \beta_0 + \beta_1lnmass_i + \sum_{j=2}^{k}\beta_jspecies_{ij}

species_{ij} = 0\; if\; Neanderthal,\; 1\; if\; recent
Evaluate a categorical effect(s), controlling for a covariate (parallel lines)
Groups modify the intercept.
library(broom) # for augment()

neand_lm <- lm(lnbrain ~ species + lnmass, data = neand)
neand_dat <- augment(neand_lm)
term | sumsq | df | statistic | p.value |
---|---|---|---|---|
species | 0.028 | 1 | 6.204 | 0.017 |
lnmass | 0.130 | 1 | 29.276 | 0.000 |
Residuals | 0.160 | 36 | NA | NA |
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 5.188 | 0.395 | 13.126 | 0.000 |
speciesrecent | 0.070 | 0.028 | 2.491 | 0.017 |
lnmass | 0.496 | 0.092 | 5.411 | 0.000 |
Intercept is the log brain mass of a Neanderthal at a log body mass of 0
speciesrecent is the difference in log brain mass for a recent human, still at a log body mass of 0
lnmass is the slope of log brain mass on log body mass, shared by both species
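The parallel-lines interpretation can be made concrete by plugging the table's estimates into the fitted equation; the example log body mass of 4.0 is arbitrary.

```r
# Coefficients copied from the table above
b0       <- 5.188  # (Intercept): the Neanderthal line
b_recent <- 0.070  # speciesrecent: vertical shift for recent humans
b_lnmass <- 0.496  # shared slope on log body mass

lnmass <- 4.0  # an arbitrary example log body mass
c(neanderthal = b0 + b_lnmass * lnmass,
  recent      = b0 + b_recent + b_lnmass * lnmass)
```

At any body mass, the two fitted lines differ by exactly b_recent - that is what "groups modify the intercept" means.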
Mixing Categorical and Continuous Variables
Nonlinearities
Interaction Effects
y_{i} \sim \mathcal{N}(\widehat{y_{i}}, \sigma^{2})

\widehat{y_{i}} = \beta_{0} + \sum \beta_{j}x_{ij}
What if x_1 is a linear term and x_2 = x_1^2?
Adding nonlinear terms is just adding more predictors to a multiple linear regression model
Sometimes, to reduce collinearity between x and x^2 and make our model more interpretable, we need to center x - i.e., use x - \bar{x} - which reduces the SE of our parameters
mod_sq <- lm(rich ~ cover + I(cover^2), data = keeley)
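A sketch of the collinearity problem and the centering fix, using simulated stand-in data (the keeley data themselves are not shown here):

```r
# Simulated stand-in for the keeley data so this sketch runs on its own
set.seed(1)
keeley_sim <- data.frame(cover = runif(90, 0, 1))
keeley_sim$rich <- 40 + 30 * keeley_sim$cover -
  20 * keeley_sim$cover^2 + rnorm(90, sd = 5)

# Uncentered: cover and cover^2 are highly correlated
with(keeley_sim, cor(cover, cover^2))

# Centered: the correlation between (x - xbar) and (x - xbar)^2 collapses
keeley_sim$cover_c <- keeley_sim$cover - mean(keeley_sim$cover)
with(keeley_sim, cor(cover_c, cover_c^2))

# Same fitted curve, better-behaved standard errors
mod_sq_c <- lm(rich ~ cover_c + I(cover_c^2), data = keeley_sim)
```

The fitted curve is identical either way; centering only reparameterizes the model.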
order | mse |
---|---|
1 | 210.27 |
2 | 200.16 |
3 | 208.85 |
4 | 196.36 |
5 | 211.24 |
6 | 1323.11 |
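A table like this can be produced by cross-validating over polynomial order. A hedged sketch using leave-one-out CV on simulated stand-in data (the slide's numbers come from the keeley data):

```r
# Leave-one-out CV MSE for polynomial fits of order 1 through 6
# (simulated stand-in data so the sketch runs on its own)
set.seed(2)
dat <- data.frame(x = runif(90, 0, 1))
dat$y <- 40 + 30 * dat$x - 20 * dat$x^2 + rnorm(90, sd = 5)

cv_mse <- sapply(1:6, function(ord) {
  sq_err <- sapply(seq_len(nrow(dat)), function(i) {
    fit <- lm(y ~ poly(x, ord), data = dat[-i, ])  # fit without point i
    (dat$y[i] - predict(fit, newdata = dat[i, ]))^2  # error at point i
  })
  mean(sq_err)
})
round(cv_mse, 2)
```

As in the order-6 row of the table, high-order polynomials tend to blow up under cross-validation even as they fit the training data ever more closely.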
Mixing Categorical and Continuous Variables
Nonlinearities
Interaction Effects
The effect of one predictor cannot be known without knowing the level of another
Common in nature!
Some are inevitable (e.g., the effects of disturbance depend on something being there to disturb)
Some are... quite tricky, but the spice of biological life!
Exercise: Think of an interaction effect you have encountered in biology or elsewhere!
y_{i} \sim \mathcal{N}(\widehat{y_{i}}, \sigma^{2})

\widehat{y_{i}} = \beta_{0} + \beta_{1}x_{1i} + \beta_{2}x_{2i} + \beta_{3}x_{1i}x_{2i}

x_{2i} = 0,1

If we consider x_1x_2 a variable in its own right - i.e., x_3 - then... this is just...
\widehat{y_{i}} = \beta_{0} + \sum\beta_{j}x_{ij}
IT'S ALL THE SAME THING!
mod_int <- lm(rich ~ firesev * age_break, data = keeley)
term | sumsq | df | statistic | p.value |
---|---|---|---|---|
firesev | 1918.314 | 1 | 10.122 | 0.002 |
age_break | 202.176 | 1 | 1.067 | 0.305 |
firesev:age_break | 1060.005 | 1 | 5.593 | 0.020 |
Residuals | 16298.526 | 86 | NA | NA |
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 53.464 | 6.581 | 8.123 | 0.000 |
firesev | 0.117 | 1.626 | 0.072 | 0.943 |
age_breakold stand | 16.348 | 8.961 | 1.824 | 0.072 |
firesev:age_breakold stand | -4.730 | 2.000 | -2.365 | 0.020 |
The intercept is the number of species when fire severity is 0 and stand age is young
The firesev effect is the effect of fire for young stands
The age effect is the increase in species in an old stand - but only if fire severity is 0
The interaction is the change in the fire severity slope when the stand is old
y_{i} \sim \mathcal{N}(\widehat{y_{i}}, \sigma^2)

\widehat{y_{i}} = \beta_{0} + \beta_{1}x_{1i} + \beta_{2}x_{2i} + \beta_{3}x_{1i}x_{2i}
mod_int_c <- lm(rich ~ firesev * age, data = keeley)
term | sumsq | df | statistic | p.value |
---|---|---|---|---|
firesev | 1361.509 | 1 | 7.877 | 0.006 |
age | 466.926 | 1 | 2.701 | 0.104 |
firesev:age | 2228.687 | 1 | 12.894 | 0.001 |
Residuals | 14865.094 | 86 | NA | NA |
Modnames | K | AIC | Delta_AIC | AICWt |
---|---|---|---|---|
fire*age | 5 | 725.035 | 0.000 | 0.991 |
fire + age | 4 | 735.608 | 10.573 | 0.005 |
age | 3 | 736.034 | 10.998 | 0.004 |
fire | 3 | 740.506 | 15.470 | 0.000 |
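The table's format resembles `AICcmodavg::aictab()` output; a base-R sketch of building the four candidate models and comparing them by AIC, on simulated stand-in data so it runs on its own:

```r
# Simulated stand-in for the keeley data, with a real interaction built in
set.seed(4)
sim <- data.frame(firesev = runif(90, 1, 9), age = runif(90, 3, 60))
sim$rich <- 43 + 3 * sim$firesev + 0.8 * sim$age -
  0.23 * sim$firesev * sim$age + rnorm(90, sd = 13)

mods <- list(
  `fire*age`   = lm(rich ~ firesev * age, data = sim),
  `fire + age` = lm(rich ~ firesev + age, data = sim),
  age          = lm(rich ~ age, data = sim),
  fire         = lm(rich ~ firesev, data = sim)
)
sort(sapply(mods, AIC))  # lowest AIC = best-supported model
```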
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 42.935 | 7.851 | 5.469 | 0.000 |
firesev | 3.106 | 1.863 | 1.667 | 0.099 |
age | 0.827 | 0.313 | 2.641 | 0.010 |
firesev:age | -0.230 | 0.064 | -3.591 | 0.001 |
X_i - \bar{X}
Additive coefficients are now the effect of a predictor at the mean value of the other predictors
Intercepts are at the mean value of all predictors
Visualization will keep you from getting confused!
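A sketch of centering before fitting a continuous-by-continuous interaction, on simulated stand-in data: the interaction coefficient is unchanged, while the main effects become the slope of each predictor at the mean of the other.

```r
# Simulated stand-in for the keeley data
set.seed(5)
sim <- data.frame(firesev = runif(90, 1, 9), age = runif(90, 3, 60))
sim$rich <- 43 + 3 * sim$firesev + 0.8 * sim$age -
  0.23 * sim$firesev * sim$age + rnorm(90, sd = 13)

# Center both predictors
sim$firesev_c <- sim$firesev - mean(sim$firesev)
sim$age_c     <- sim$age - mean(sim$age)

mod_raw <- lm(rich ~ firesev * age, data = sim)
mod_cen <- lm(rich ~ firesev_c * age_c, data = sim)
coef(mod_cen)  # main effects now evaluated at the mean of the other predictor
```

Centering is a pure reparameterization: the fitted values and the interaction coefficient are identical in the two models.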
Whew! That's a lot of ways to build a model!
Note, we are still thinking in terms of normal error - and an additive model
There are plenty of extensions
But, additive Gaussian models are a safe starting point
AND extensions... really all just have linear models at their core
It's the BIOLOGY and your intuition of what model to build that should always be the core of your modeling process