This tutorial covers how the multilevel model can be used to examine within-person associations and how those associations are moderted by between-person differences. Our example is developed using experience sampling data (repeated occasions nested within persons), but also applies to other kinds of nested data.
In this session we cover …
A. Preparation and description of variables for use in Multilevel Model
B. Setting up the Multilevel Model
C. Using the Mutlilevel Model to Examine Between-Person Differences in Within-Person Associations
D. Plotting and Probing Interactions
The overall set-up of the models follows Bolger & Laurenceau (2013) Chapters 4 and 5.
if (!require(pacman)) { install.packages("pacman"); library(pacman) }
p_load(
backports, # to revive the isFALSE() function for sim_slopes()
effects, # for probing interactions
ggplot2, # for data visualization
interactions, # for probing/plotting interactions
lme4, # for multilevel models
lmerTest, # for p-values
psych, # for describing the data
plyr # for data manipulation
)
Our example makes use of the person-level and day-level (EMA-type) AMIB data files, publically available data from an experience sampling study. Information about the data can be found on the QuantDev website.
We make use of three variables:
Outcome: daily negative affect (continuous), the negaff
variable in the day-level data file;
Time-varying Predictor: daily stress, a stress
variable obtained through reverse coding of pss
in the day-level data file;
Person-level Predictor/Moderator: trait neuroticism, the bfi_n
variable in the person-level data file.
Loading person-level file (N = 190) and subsetting to variables of interest
#set filepath for data file
filepath <- "https://quantdev.ssri.psu.edu/sites/qdev/files/AMIBshare_persons_2019_0501.csv"
#read in the .csv file using the url() function
AMIB_persons <- read.csv(file=url(filepath),header=TRUE)
#subsetting to variables of interest
AMIB_persons <- AMIB_persons[ ,c("id","bfi_n")]
Loading day-level file (T = 8) and subsetting to variables of interest.
#set filepath for data file
filepath <- "https://quantdev.ssri.psu.edu/sites/qdev/files/AMIBshare_daily_2019_0501.csv"
#read in the .csv file using the url() function
AMIB_daily <- read.csv(file=url(filepath),header=TRUE)
#subsetting to variables of interest
AMIB_daily <- AMIB_daily[ ,c("id","day","negaff","pss")]
In this example data are prepared through some recoding (idiosyncratic for this data set), and separation of time-varying predictor variables into between-person and within-person components (typical for all multilevel analysis)
Data Recoding
Reverse code pss
into a new stress
variable where higher values indicate higher perceived stress.
#reverse coding the pss variable into a new stress variable
AMIB_daily$stress <- 4 - AMIB_daily$pss
#describing new variable
describe(AMIB_daily$stress)
## vars n mean sd median trimmed mad min max range skew kurtosis
## X1 1 1445 1.39 0.68 1.25 1.36 0.74 0 4 4 0.35 0.13
## se
## X1 0.02
#histogram
ggplot(data=AMIB_daily, aes(x=stress)) +
geom_histogram(fill="white", color="black",bins=20) +
labs(x = "Stress (high = more stressed)")
Data Preparation (between- and within- components)
We now split the time-varying predictor into “trait” (between-person differences) and “state” (within-person deviations) components. Specifically, the daily variable stress
is split into two varaibles: stress_trait
is the sample-mean centered between-person component, and stress_state
is the person-centered within-person component.
Although not formally needed, we do this for all the time-varying variables (predictors, outcomes) for easier plotting later.
Making trait variables.
#calculating intraindividual means
#Alternative approach using dplyr
#AMIB_imeans <- AMIB_daily %>%
# group_by(id) %>%
# summarize(stress_trait=mean(stress, na.rm=TRUE))
AMIB_imeans <- ddply(AMIB_daily, "id", summarize,
stress_trait = mean(stress, na.rm=TRUE),
negaff_trait = mean(negaff, na.rm=TRUE))
describe(AMIB_imeans)
## vars n mean sd median trimmed mad min max
## id 1 190 318.29 130.44 321.50 318.99 151.23 101.00 532.00
## stress_trait 2 190 1.40 0.48 1.41 1.39 0.51 0.19 2.56
## negaff_trait 3 190 2.48 0.73 2.41 2.43 0.72 1.11 5.09
## range skew kurtosis se
## id 431.00 -0.04 -1.09 9.46
## stress_trait 2.38 -0.04 -0.23 0.03
## negaff_trait 3.98 0.68 0.45 0.05
#merging into person-level file
AMIB_persons <- merge(AMIB_persons, AMIB_imeans, by="id")
#make centered versions of the person-level scores
AMIB_persons$bfi_n_c <- scale(AMIB_persons$bfi_n,center=TRUE,scale=FALSE)
AMIB_persons$stress_trait_c <- scale(AMIB_persons$stress_trait,center=TRUE,scale=FALSE)
#describe person-level data
describe(AMIB_persons)
## vars n mean sd median trimmed mad min max
## id 1 190 318.29 130.44 321.50 318.99 151.23 101.00 532.00
## bfi_n 2 190 2.98 0.96 3.00 3.00 1.48 1.00 5.00
## stress_trait 3 190 1.40 0.48 1.41 1.39 0.51 0.19 2.56
## negaff_trait 4 190 2.48 0.73 2.41 2.43 0.72 1.11 5.09
## bfi_n_c 5 190 0.00 0.96 0.02 0.02 1.48 -1.98 2.02
## stress_trait_c 6 190 0.00 0.48 0.01 0.00 0.51 -1.21 1.17
## range skew kurtosis se
## id 431.00 -0.04 -1.09 9.46
## bfi_n 4.00 -0.09 -0.82 0.07
## stress_trait 2.38 -0.04 -0.23 0.03
## negaff_trait 3.98 0.68 0.45 0.05
## bfi_n_c 4.00 -0.09 -0.82 0.07
## stress_trait_c 2.38 -0.04 -0.23 0.03
Making state variables in long data (as deviations from uncentered trait variable).
#merging person-level data into daily data
daily_long <- merge(AMIB_daily,AMIB_persons,by="id")
#calculating state variables
daily_long$stress_state <- daily_long$stress - daily_long$stress_trait
daily_long$negaff_state <- daily_long$negaff - daily_long$negaff_trait
#describing data
describe(daily_long)
## vars n mean sd median trimmed mad min max
## id 1 1458 322.53 129.08 324.00 324.20 151.23 101.00 532.00
## day 2 1458 3.48 2.30 3.00 3.47 2.97 0.00 7.00
## negaff 3 1441 2.45 1.04 2.20 2.34 1.04 1.00 6.90
## pss 4 1445 2.61 0.68 2.75 2.64 0.74 0.00 4.00
## stress 5 1445 1.39 0.68 1.25 1.36 0.74 0.00 4.00
## bfi_n 6 1458 2.97 0.96 3.00 2.98 1.48 1.00 5.00
## stress_trait 7 1458 1.39 0.47 1.41 1.39 0.51 0.19 2.56
## negaff_trait 8 1458 2.45 0.71 2.38 2.41 0.67 1.11 5.09
## bfi_n_c 9 1458 -0.01 0.96 0.02 0.00 1.48 -1.98 2.02
## stress_trait_c 10 1458 -0.01 0.47 0.01 -0.01 0.51 -1.21 1.17
## stress_state 11 1445 0.00 0.49 -0.03 -0.02 0.46 -1.75 2.12
## negaff_state 12 1441 0.00 0.76 -0.10 -0.04 0.63 -3.62 3.16
## range skew kurtosis se
## id 431.00 -0.07 -1.06 3.38
## day 7.00 0.01 -1.24 0.06
## negaff 5.90 0.96 0.77 0.03
## pss 4.00 -0.35 0.13 0.02
## stress 4.00 0.35 0.13 0.02
## bfi_n 4.00 -0.08 -0.79 0.03
## stress_trait 2.38 -0.08 -0.21 0.01
## negaff_trait 3.98 0.66 0.52 0.02
## bfi_n_c 4.00 -0.08 -0.79 0.03
## stress_trait_c 2.38 -0.08 -0.21 0.01
## stress_state 3.88 0.36 0.79 0.01
## negaff_state 6.78 0.50 1.70 0.02
Great! Now the data are all prepared as needed.
Set-up for basic multilevel model with continuous outcome
Outlining the substantive inquiry.
We define stress reactivty (a person-level dynamic characteristic; Ram & Gerstorf, 2009) as the extent to which an individual’s daily negative affect is related to daily stress. That is, stress reactivity is the within-person association between daily negative affect and daily stress. Lets examine some of our prepared data to see what stress reactivity looks like.
Plotting within-person associations for a subset of individuals.
#faceted plot
ggplot(data=daily_long[which(daily_long$id <= 125),], aes(x=stress_state,y=negaff)) +
geom_point() +
stat_smooth(method="lm", fullrange=TRUE) +
xlab("Stress State") + ylab("Negative Affect (Continuous)") +
facet_wrap( ~ id) +
theme(axis.title=element_text(size=16),
axis.text=element_text(size=14),
strip.text=element_text(size=14))
To constrain the regression line within the range of stress scores for each person (instead of extrapolating to the full range of stress scores), specify fullrange=FALSE
in the stat_smooth
layer above.
From the panel of plots we get a sense that individuals’ negative affect is higher on days where their stress is higher, but to a different extent for each person. These differences in stress reactivity are indicated by differences in the slope of the regression lines.
Our substantive interest is in describing individual differences in stress reactivity and determining if those differences are related to differences in neuroticism.
The basic linear multilevel model is written as
\[y_{it} = \beta_{0i} + \beta_{1i}x_{it} + e_{it}\] where the time-varying outcome variable, \(y_{it}\) (negative affect) is modeled as a function of a person-specific intercept, \(\beta_{0i}\), a person-specific slope, \(\beta_{1i}\), that captures the within-person association of interest (in our case stress reactivity), and residual error, \(e_{it}\). The person-specific intercepts and slopes are modeled in turn as
\[\beta_{0i} = \gamma_{00} + \gamma_{01}z_{i} + u_{0i}\] \[\beta_{1i} = \gamma_{10} + \gamma_{11}z_{i} + u_{1i}\]
where the fixed effects \(\gamma_{00}\) and \(\gamma_{01}\) describe the prototypical individual’s intercept and slope; \(\gamma_{10}\) and \(\gamma_{11}\) indicate how individual differences in the person-specific intercepts and slopes are related to a between-person differences variable, \(z_{i}\); and the random effects \(u_{0i}\) and \(u_{1i}\) are residual unexplained differences in intercepts and slopes.
Importantly, \[e_{it} \tilde{} N(0,\mathbf{R})\], and \[\mathbf{U}_{i} \tilde{} MVN(0,\mathbf{G})\]
\[\mathbf{R} = \mathbf{I}\sigma^{2}_{e}\], where where \(I\) is the identity matrix (diagonal matrix of 1s) and \(\sigma^{2}_{e}\) is the residual (within-person) variance.
\[\mathbf{G} = \left[\begin{array} {rr} \sigma^{2}_{u0} & \sigma_{u0u1} \\ \sigma_{u1u0} & \sigma^{2}_{u1} \end{array}\right]\]
Together, the full set of parameters provide a parsimonious description of the data that allows for inference about within-person associations and between-person differeneces in those associations.
Now that we have defined the dynamic characteristic of interest - stress reactivity - we can operationalize it using the multilevel model.
We make use of the lme4
package for fitting mixed effects models, and some supplmentary packages: lmerTest
provides tools for obtaining p-values; effects
provides tools for calculating and plotting model-based predictions; and interactions
provides tools for plotting and probing interactions.
The lmer()
function fits the MLMs The data
argument specifies the data sources The na.action
argument specifies what to do with missing data
To start, we often fit an unconditional means model that provides us information about how much of the total variance in the outcome varaible is within-person variance and how much is between-person variance.
Fitting the unconditional means model.
#unconditional means model
model0_fit <- lmer(formula = negaff ~ 1 + (1|id),
data=daily_long,
na.action=na.exclude)
summary(model0_fit)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: negaff ~ 1 + (1 | id)
## Data: daily_long
##
## REML criterion at convergence: 3833.5
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.8739 -0.6123 -0.1608 0.4658 3.9394
##
## Random effects:
## Groups Name Variance Std.Dev.
## id (Intercept) 0.4270 0.6535
## Residual 0.6627 0.8141
## Number of obs: 1441, groups: id, 190
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 2.46368 0.05229 185.80793 47.12 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We extract the random effects with the VarCorr() function:
VarCorr(model0_fit)
## Groups Name Std.Dev.
## id (Intercept) 0.65347
## Residual 0.81408
And then compute the intra-class correlation (ICC) as the ratio of the random intercept variance (between-person) to the total variance, defined as the sum of the random intercept variance and residual variance (between + within). Specifically, \[ICC_{between} = \frac{\sigma^{2}_{u0}}{\sigma^{2}_{u0} + \sigma^{2}_{e}}\]
Calculating the ICC.
Store the random effect variances, which will be the first column of the VarCorr object (see above).
RandomEffects <- as.data.frame(VarCorr(model0_fit))
RandomEffects
## grp var1 var2 vcov sdcor
## 1 id (Intercept) <NA> 0.4270294 0.6534749
## 2 Residual <NA> <NA> 0.6627260 0.8140798
Next, compute the ICC. It is the ratio of the random intercept variance (between-person var) over the total variance (between + within var):
ICC_between <- RandomEffects[1,4]/(RandomEffects[1,4]+RandomEffects[2,4])
ICC_between
## [1] 0.391858
From the unconditional means model, the ICC was calculated, which indicated that of the total variance in negative affect, 39.19% is attributable to between-person variation whereas 60.81% is attributatable to within-person variation.
This means there is a good portion of within-person variance to model using time-varying predictor.
OK - let’s add the predictor, which was split into two components stress_trait
(between-person component) and stress_state
(within-person component), that latter gives us possibility to quantify each individual’s stress reactivity. Note that we also include day
as a time-varying predictor, but simply as a control variable to account for any systematic time trends in the data.
# fit model
model1_fit <- lmer(formula = negaff ~ 1 + day + stress_trait_c +
stress_state + stress_state:stress_trait_c +
(1 + stress_state|id),
data=daily_long,
na.action=na.exclude)
summary(model1_fit)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula:
## negaff ~ 1 + day + stress_trait_c + stress_state + stress_state:stress_trait_c +
## (1 + stress_state | id)
## Data: daily_long
##
## REML criterion at convergence: 3162.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.5368 -0.6127 -0.0729 0.5093 4.4164
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.2135 0.4621
## stress_state 0.1257 0.3546 0.53
## Residual 0.4038 0.6355
## Number of obs: 1438, groups: id, 190
##
## Fixed effects:
## Estimate Std. Error df t value
## (Intercept) 2.695e+00 4.583e-02 3.922e+02 58.806
## day -6.580e-02 7.552e-03 1.250e+03 -8.713
## stress_trait_c 1.038e+00 7.946e-02 1.859e+02 13.067
## stress_state 7.647e-01 4.561e-02 1.664e+02 16.765
## stress_trait_c:stress_state 1.550e-01 9.780e-02 1.584e+02 1.585
## Pr(>|t|)
## (Intercept) <2e-16 ***
## day <2e-16 ***
## stress_trait_c <2e-16 ***
## stress_state <2e-16 ***
## stress_trait_c:stress_state 0.115
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) day strs__ strss_
## day -0.569
## strss_trt_c 0.004 0.007
## stress_stat 0.216 0.012 0.002
## strss_tr_:_ 0.034 -0.057 0.268 -0.118
# save predicted scores
daily_long$pred_m1 <- predict(model1_fit)
Interpreting and Plotting the Results
The results indicate that there is an association between perceived stress and negative affect, both across persons (between-person association) and within persons (average within-person association). These are described by the Fized Effects parameters.
We can also get confidence intervals for the fixed and random effects. Depending on model complexity, the confint()
can sometimes take a while to run.
# Get confidence intervals for both fixed and random effects
confint(model1_fit)
## Computing profile confidence intervals ...
## 2.5 % 97.5 %
## .sig01 0.40389453 0.52247647
## .sig02 0.27635941 0.77129774
## .sig03 0.25230528 0.45069365
## .sigma 0.60962953 0.66244318
## (Intercept) 2.60530434 2.78468855
## day -0.08058845 -0.05099125
## stress_trait_c 0.88249222 1.19390741
## stress_state 0.67339999 0.85449743
## stress_trait_c:stress_state -0.03788636 0.34703328
The labels for the random effects are not entirely intuitive (e.g., sig01
, sig02
, sig03
, sigma
) - especially for models with several random effects. So, need to check carefully how to interpret the output.
In the above output:
In general, the output given from the confint()
will give you the confidence intervals (given in SD units - will need to square these values if you want the variances) for the first random effect (e.g., the random intercept). Then it will give you the correlations between that random effect and all other random effects it is correlated with. The next row of CI values will correspond to the second random effect (e.g., the random slope for stress_state), with the following rows pertaining to all random effects it is correlated with (but not include any correlations already given - because for example, the correlation for the random intercept and random slope for stress_state was already given). The last row pertaining to random effects is always called sigma
, and pertains to the residual error (given in SD units - will need to square these values if you want the variances).
Next, we can plot between-person associations.
ggplot(data=AMIB_imeans, aes(x=stress_trait, y=negaff_trait, group=factor(id)), legend=FALSE) +
geom_point(colour="gray40") +
geom_smooth(aes(group=1), method=lm, se=FALSE, fullrange=FALSE, lty=1, size=2, color="blue") +
xlab("Trait Stress") + ylab("Trait Negative Affect") +
theme_classic() +
theme(axis.title=element_text(size=16),
axis.text=element_text(size=12),
plot.title=element_text(size=16, hjust=.5)) +
ggtitle("Between-Person Association Plot\nTrait Stress & Negative Affect")
Note that the plot is not actually using the model output - so it is just an approximation of the exact model (using geom_smooth embedded within ggplot). However, it is a very useful display.
Plotting predicted within-person associations.
ggplot(data=daily_long, aes(x=stress_state, y=negaff_state, group=factor(id), colour="gray"), legend=FALSE) +
geom_smooth(method=lm, se=FALSE, fullrange=FALSE, lty=1, size=.5, color="gray40") +
geom_smooth(aes(group=1), method=lm, se=FALSE, fullrange=FALSE, lty=1, size=2, color="blue") +
xlab("Stress State") + ylab("Predicted State Negative Affect") +
theme_classic() +
theme(axis.title=element_text(size=18),
axis.text=element_text(size=14),
plot.title=element_text(size=18, hjust=.5)) +
ggtitle("Within-Person Association Plot\nPerceived Stress & Negative Affect")
## Warning: Removed 20 rows containing non-finite values (stat_smooth).
## Warning: Removed 20 rows containing non-finite values (stat_smooth).
Note that the plot is not actually using the model output - so it is just an approximation of the exact model (using geom_smooth with state scores embedded within ggplot). However, it is a very useful display.
Adding another Person-level Predictor
We now add trait neuroticism to the model, to see if and how between-person differences in neuroticism are related to stress reactivity.
Running the expanded model.
# fit model
model2_fit <- lmer(formula = negaff ~ 1 + day + stress_trait_c +
bfi_n_c + stress_trait_c:bfi_n_c +
stress_state + stress_state:stress_trait_c +
stress_state:bfi_n_c + stress_state:stress_trait_c:bfi_n_c +
(1 + stress_state|id),
data=daily_long,
na.action=na.exclude)
#Look at results
summary(model2_fit)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula:
## negaff ~ 1 + day + stress_trait_c + bfi_n_c + stress_trait_c:bfi_n_c +
## stress_state + stress_state:stress_trait_c + stress_state:bfi_n_c +
## stress_state:stress_trait_c:bfi_n_c + (1 + stress_state | id)
## Data: daily_long
##
## REML criterion at convergence: 3161.8
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.4271 -0.6011 -0.0749 0.5045 4.4732
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.1955 0.4422
## stress_state 0.1238 0.3518 0.51
## Residual 0.4040 0.6356
## Number of obs: 1438, groups: id, 190
##
## Fixed effects:
## Estimate Std. Error df
## (Intercept) 2.690e+00 4.556e-02 3.943e+02
## day -6.545e-02 7.572e-03 1.247e+03
## stress_trait_c 9.695e-01 7.878e-02 1.835e+02
## bfi_n_c 1.543e-01 3.917e-02 1.809e+02
## stress_state 7.687e-01 4.682e-02 1.673e+02
## stress_trait_c:bfi_n_c 3.715e-02 7.833e-02 1.824e+02
## stress_trait_c:stress_state 1.254e-01 1.015e-01 1.692e+02
## bfi_n_c:stress_state 7.595e-02 4.845e-02 1.549e+02
## stress_trait_c:bfi_n_c:stress_state -3.167e-02 1.024e-01 1.722e+02
## t value Pr(>|t|)
## (Intercept) 59.053 < 2e-16 ***
## day -8.644 < 2e-16 ***
## stress_trait_c 12.306 < 2e-16 ***
## bfi_n_c 3.938 0.000117 ***
## stress_state 16.418 < 2e-16 ***
## stress_trait_c:bfi_n_c 0.474 0.635849
## stress_trait_c:stress_state 1.235 0.218473
## bfi_n_c:stress_state 1.568 0.119006
## stress_trait_c:bfi_n_c:stress_state -0.309 0.757566
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) day strs__ bf_n_c strss_ st__:__ st__:_ bf__:_
## day -0.576
## strss_trt_c 0.003 0.005
## bfi_n_c -0.008 0.007 -0.224
## stress_stat 0.204 0.004 0.000 0.001
## strss_t_:__ -0.179 0.008 0.008 0.040 -0.054
## strss_tr_:_ 0.042 -0.073 0.248 -0.056 -0.087 0.003
## bf_n_c:str_ -0.033 0.058 -0.058 0.258 0.016 0.009 -0.244
## strs__:__:_ -0.061 0.032 0.004 0.008 -0.233 0.243 -0.103 -0.059
# save predicted scores
daily_long$pred_m2 <- predict(model2_fit)
Probing and Plotting the Interaction
The moderation is not significant. However, if it were, we would want to plot and probe the interaction term, specifically the stress_state:bfi_n_c
term.
We can use the effect
function.
We examine what the effect is as various levels of the predictors. Standard practice is to use +1 and -1 SD.
#xlevels = is the list of values that we want to evaluate at.
#obtaining the standard deviation of the between-person moderator
describe(daily_long$bfi_n_c)
## vars n mean sd median trimmed mad min max range skew
## X1 1 1458 -0.01 0.96 0.02 0 1.48 -1.98 2.02 4 -0.08
## kurtosis se
## X1 -0.79 0.03
#obtaining the standard deviation of the time-varying predictor
describe(daily_long$stress_state)
## vars n mean sd median trimmed mad min max range skew kurtosis
## X1 1 1445 0 0.49 -0.03 -0.02 0.46 -1.75 2.12 3.88 0.36 0.79
## se
## X1 0.01
#calculate effect
effects_model2 <- effect(term="bfi_n_c*stress_state", mod=model2_fit,
xlevels=list(bfi_n_c=c(-0.96, +0.96), stress_state=c(-0.49,+0.49)))
## NOTE: bfi_n_c:stress_state is not a high-order term in the model
summary(effects_model2)
##
## bfi_n_c*stress_state effect
## stress_state
## bfi_n_c -0.49 0.49
## -0.96 1.964451 2.644774
## 0.96 2.188158 3.011999
##
## Lower 95 Percent Confidence Limits
## stress_state
## bfi_n_c -0.49 0.49
## -0.96 1.858007 2.510682
## 0.96 2.080616 2.876434
##
## Upper 95 Percent Confidence Limits
## stress_state
## bfi_n_c -0.49 0.49
## -0.96 2.070894 2.778866
## 0.96 2.295700 3.147563
Everything we need is in there!
We can convert to a data frame and plot it!
#convert to dataframe
effectsdata <- as.data.frame(effects_model2)
#plotting the effect evaluation (with standard error ribbon)
ggplot(data=effectsdata, aes(x=stress_state, y=fit, group=bfi_n_c), legend=FALSE) +
geom_point() +
geom_line() +
#geom_ribbon(aes(ymin=lower, ymax=upper), alpha=.3) +
geom_errorbar(aes(ymin=lower, ymax=upper), width=.15) +
xlab("Stress State") + xlim(-2,2) +
ylab("Predicted Negative Affect") + ylim(1,7) +
ggtitle("Differences in Stress Reactivity across Neuroticism")
Note that the non-significant interaction appears as two practically parallel lines.
Johnson-Neyman Technique
Let’s go a bit further …
We would like to identify the range of values of the moderator at which the within-person slope is significant. Formally, the Johnson-Neyman technique is used to probe significant interactions in order to determine for what values of the moderator the focal predictor is significantly related to the outcome - i.e., to identify the region of significance. This is quickly becoming standard practice in reporting about interactions in multilevel models.
In our case, we are interested to know the range of neuroticism scores at which we would expect individuals to have significant stress reactivity (within-person association between daily stress and daily negative affect).
We make use of the johnson-neyman()
function in the interactions
package. Note that the backports
package may also be needed to get a deprecated isFALSE()
function that this package uses.
Also, when using the interactions
package we need to do a little data formatting (to make sure that all variables are vectors), and to run the model using lmer() without the lmerTest() overlay (to obtain a merMod object).
Fixing data.
#check for embedded matrices
str(daily_long)
## 'data.frame': 1458 obs. of 14 variables:
## $ id : int 101 101 101 101 101 101 101 101 102 102 ...
## $ day : int 0 1 2 3 4 5 6 7 0 1 ...
## $ negaff : num 3 2.3 1 1.3 1.1 1 1.2 1.1 1.4 1.6 ...
## $ pss : num 2.5 2.75 3.5 3 2.75 2.75 3.5 2.75 3.5 4 ...
## $ stress : num 1.5 1.25 0.5 1 1.25 1.25 0.5 1.25 0.5 0 ...
## $ bfi_n : num 2 2 2 2 2 2 2 2 2 2 ...
## $ stress_trait : num 1.06 1.06 1.06 1.06 1.06 ...
## $ negaff_trait : num 1.5 1.5 1.5 1.5 1.5 ...
## $ bfi_n_c : num [1:1458, 1] -0.982 -0.982 -0.982 -0.982 -0.982 ...
## $ stress_trait_c: num [1:1458, 1] -0.333 -0.333 -0.333 -0.333 -0.333 ...
## $ stress_state : num 0.4375 0.1875 -0.5625 -0.0625 0.1875 ...
## $ negaff_state : num 1.5 0.8 -0.5 -0.2 -0.4 ...
## $ pred_m1 : num 2.12 1.91 1.4 1.63 1.71 ...
## $ pred_m2 : num 2.1 1.89 1.39 1.61 1.69 ...
#fixing a few variables that were matrices to be vectors
daily_long$bfi_n_c <- as.vector(daily_long$bfi_n_c)
daily_long$stress_trait_c <- as.vector(daily_long$stress_trait_c)
Specify and fit model as specific lme4::lmer
to get merMod
object (rather than a lmerModLmerTest
object).
# fit model
model2a_fit <- lme4::lmer(formula = negaff ~ 1 + day + stress_trait_c +
bfi_n_c + stress_trait_c:bfi_n_c +
stress_state + stress_state:stress_trait_c +
stress_state:bfi_n_c + stress_state:stress_trait_c:bfi_n_c +
(1 + stress_state|id),
data=daily_long,
na.action=na.exclude)
# Look at results
summary(model2a_fit)
## Linear mixed model fit by REML ['lmerMod']
## Formula:
## negaff ~ 1 + day + stress_trait_c + bfi_n_c + stress_trait_c:bfi_n_c +
## stress_state + stress_state:stress_trait_c + stress_state:bfi_n_c +
## stress_state:stress_trait_c:bfi_n_c + (1 + stress_state | id)
## Data: daily_long
##
## REML criterion at convergence: 3161.8
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.4271 -0.6011 -0.0749 0.5045 4.4732
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.1955 0.4422
## stress_state 0.1238 0.3518 0.51
## Residual 0.4040 0.6356
## Number of obs: 1438, groups: id, 190
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.690416 0.045560 59.053
## day -0.065452 0.007572 -8.644
## stress_trait_c 0.969537 0.078785 12.306
## bfi_n_c 0.154266 0.039170 3.938
## stress_state 0.768705 0.046821 16.418
## stress_trait_c:bfi_n_c 0.037151 0.078327 0.474
## stress_trait_c:stress_state 0.125402 0.101524 1.235
## bfi_n_c:stress_state 0.075952 0.048450 1.568
## stress_trait_c:bfi_n_c:stress_state -0.031668 0.102428 -0.309
##
## Correlation of Fixed Effects:
## (Intr) day strs__ bf_n_c strss_ st__:__ st__:_ bf__:_
## day -0.576
## strss_trt_c 0.003 0.005
## bfi_n_c -0.008 0.007 -0.224
## stress_stat 0.204 0.004 0.000 0.001
## strss_t_:__ -0.179 0.008 0.008 0.040 -0.054
## strss_tr_:_ 0.042 -0.073 0.248 -0.056 -0.087 0.003
## bf_n_c:str_ -0.033 0.058 -0.058 0.258 0.016 0.009 -0.244
## strs__:__:_ -0.061 0.032 0.004 0.008 -0.233 0.243 -0.103 -0.059
# Get confidence intervals for both fixed and random effects
#confint(model2a_fit)
# Save predicted scores
daily_long$pred_m2a <- predict(model2a_fit)
# Fit statistics
AIC(logLik(model2a_fit))
## [1] 3187.776
BIC(logLik(model2a_fit))
## [1] 3256.299
logLik(logLik(model2a_fit))
## 'log Lik.' -1580.888 (df=13)
Plotting and probing the simple slopes (within-person association between stress_state
and negaff
) across the full range of the moderator (bfi_n_c
).
#probing 2-way interaction
johnson_neyman(model=model2a_fit, pred=stress_state, modx=bfi_n_c)
## JOHNSON-NEYMAN INTERVAL
##
## When bfi_n_c is INSIDE the interval [-4.45, 40.14], the slope of
## stress_state is p < .05.
##
## Note: The range of observed values of bfi_n_c is [-1.98, 2.02]
We see from the output and the plot that the within-person stress reactivity slopes are significant at all observed values of neuroticism - as we would expect given the non-significance of the interaction term.
We can also probe the 3-way interaction, provided we choose specific values for the 2nd moderator.
#obtaining the standard deviation of the 2nd moderator
describe(daily_long$stress_trait_c)
## vars n mean sd median trimmed mad min max range skew
## X1 1 1458 -0.01 0.47 0.01 -0.01 0.51 -1.21 1.17 2.38 -0.08
## kurtosis se
## X1 -0.21 0.01
#probing 3-way interaction
# simpleslopes_model2a <- sim_slopes(model=model2a_fit, pred=stress_state, modx=bfi_n_c,
# mod2=stress_trait_c,mod2.values = c(-0.47,+0.47))
# plot(simpleslopes_model2a)
#ran, but did not knit
From the output we see that the value of the within-person stress reactivity slopes (i.e., the simple slopes) are similar across all values of the moderators and have overlapping confidence intervals - as we would expect given the non-significance of the interaction term.
As you try these techniques out with the AMIB data, please let us know if you find a set of variables that has the significant interactions that make for interesting results and theory. =:-]
This tutorial illustrated application of the basic multilevel model (where the outcome variable is assumed normally distributed - and how interactions are probed and plotted.
Good luck! =:-]