Measures of global fit in SEM provide information about how well the model fits the data. Importantly, these statistics attempt to quantify how well the model reproduces the observed data as a whole, typically without considering fit or misfit in specific elements of the mean and covariance structure.
Global fit statistics can be divided into absolute and comparative fit indices. Absolute indices are often a function of the test statistic \(T\), which quantifies global fit to the population covariance structure (see model chi-square goodness of fit below). Absolute fit indices can also be a function of the model residuals.
Comparative fit indices compare a candidate model (specified by you) against a baseline model, which is a minimal model containing only variances for observed endogenous variables, but not covariances among them. Thus, the baseline model represents the view that there are no meaningful relationships among variables. Comparative fit indices describe how much better your model fits the data compared to this independence representation.
Global fit statistics can also be divided into tests of goodness versus badness of model fit, so it’s important to remember the acceptable range for each.
Reminders about SEM parameterization:
\[ \boldsymbol{\Sigma} = \boldsymbol{\Lambda}_y \boldsymbol{\Psi} \boldsymbol{\Lambda}_y' + \boldsymbol{\Theta}_\varepsilon \]
The model \(\chi^2\) test is the most common global fit index in SEM and is a component of several other fit indices. It is a test of the null hypothesis that the model-implied covariance matrix \(\hat{\boldsymbol{\Sigma}}({\boldsymbol{\theta}})\) equals the population covariance \(\boldsymbol{\Sigma}\). Therefore, we would prefer not to reject the null hypothesis since doing so would be indicative of misfit.
In standard ML, the model \(\chi^2\) is computed as the product of \(N-1\) and the minimum value of the fit function \(\hat{F}\) reached by the optimization algorithm. That is, the minimum value, denoted \(f\), represents the fit of the model to the data at the parameter vector with the greatest sample likelihood. The model \(\chi^2\) is then computed:
\[ T = (N-1) f \] which is (asymptotically) distributed as \(\chi^2\) with \(df = p - q\), where \(p\) is the number of unique elements in the observed covariance matrix and \(q\) is the number of free parameters. Thus, we can test the null hypothesis of perfect model fit in the population given value \(T\) and \(df\). People often test this hypothesis at \(\alpha = .05\), but as we will discuss below, the \(\chi^2\) test can be ‘significant’ in cases where there is only trivial misfit, especially in large samples.
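As a minimal sketch in lavaan (where fit is a placeholder for any converged lavaan model object), \(T\), its \(df\), and the corresponding \(p\) value can be pulled from fitMeasures():
# `fit` is a placeholder for any converged lavaan object (e.g., from sem())
T_stat <- fitMeasures(fit, "chisq")  # T = (N-1) * f under standard ML
df_m   <- fitMeasures(fit, "df")     # df = p - q
pchisq(T_stat, df = df_m, lower.tail = FALSE)  # p-value for H0: perfect fit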
The model \(\chi^2\) is a specific case of a likelihood ratio test (LRT) in which two models are compared according to differences in their log likelihoods (LL) and degrees of freedom. In the case of the model \(\chi^2\) test, we are comparing the fit of our proposed SEM against the perfect fit of a saturated model, which estimates all variances and covariances individually (i.e., using \(p\) parameters).
More generally, LRTs can be used to test fit differences in nested models, where model A is considered nested in model B if the free parameters of A are a subset of the parameters in B. In this situation, the difference in model fit can be computed as the difference in model \(\chi^2\) values, which is also distributed according to a \(\chi^2\) distribution. The degrees of freedom for a nested LRT is the difference in the individual degrees of freedom for each model:
\[ \begin{align*} \Delta \chi^2 &= \chi^2_\textrm{Fewer} - \chi^2_\textrm{More} \\ \Delta df &= df_\textrm{Fewer} - df_\textrm{More} \end{align*} \] where \(Fewer\) refers to the model with fewer parameters that is nested in the \(More\) model.
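A sketch of this computation by hand (fit_fewer and fit_more are placeholders for fitted lavaan models, with fit_fewer nested in fit_more); in practice, anova() does this for us, as shown below:
# nested LRT by hand
chisq_diff <- fitMeasures(fit_fewer, "chisq") - fitMeasures(fit_more, "chisq")
df_diff    <- fitMeasures(fit_fewer, "df")    - fitMeasures(fit_more, "df")
pchisq(chisq_diff, df = df_diff, lower.tail = FALSE)  # p-value for the LRT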
Let’s walk through an example. Say that we are interested in how well fear of global warming (FoGWar) and concern for animal welfare (CoAWe) predict donations to charities involved in environmental protection. We measure FoGWar and CoAWe using two self-report scales with four items each. We measure donations in terms of total dollars to five large charities most involved in environmental protection (i.e., observed variable). We predict that latent standing (factor scores) on both FoGWar and CoAWe will be associated with greater donations.
demo.model <- '
l_FoGWar =~ .8*FoGWar1 + .8*FoGWar2 + .8*FoGWar3 + .8*FoGWar4 #definition of factor FoGWar with loadings on 4 items
l_CoAWe =~ .8*CoAWe5 + .8*CoAWe6 + .8*CoAWe7 + .8*CoAWe8 #definition of factor CoAWe with loadings on 4 items
l_FoGWar =~ .2*CoAWe5 #small cross-loading
donations ~ 0.3*l_FoGWar + 0.3*l_CoAWe
l_FoGWar ~~ 0.4*l_CoAWe
#equal (unit) residual variance
FoGWar1 ~~ 1*FoGWar1
FoGWar2 ~~ 1*FoGWar2
FoGWar3 ~~ 1*FoGWar3
FoGWar4 ~~ 1*FoGWar4
CoAWe5 ~~ 1*CoAWe5
CoAWe6 ~~ 1*CoAWe6
CoAWe7 ~~ 1*CoAWe7
CoAWe8 ~~ 1*CoAWe8
'
# generate data; note, standardized lv is default
simData <- lavaan::simulateData(demo.model, sample.nobs=500)
round(cor(simData), 2)
## FoGWar1 FoGWar2 FoGWar3 FoGWar4 CoAWe5 CoAWe6 CoAWe7 CoAWe8
## FoGWar1 1.00 0.34 0.33 0.35 0.17 0.16 0.12 0.14
## FoGWar2 0.34 1.00 0.43 0.40 0.27 0.23 0.18 0.25
## FoGWar3 0.33 0.43 1.00 0.40 0.22 0.07 0.15 0.17
## FoGWar4 0.35 0.40 0.40 1.00 0.18 0.15 0.07 0.16
## CoAWe5 0.17 0.27 0.22 0.18 1.00 0.44 0.47 0.46
## CoAWe6 0.16 0.23 0.07 0.15 0.44 1.00 0.43 0.38
## CoAWe7 0.12 0.18 0.15 0.07 0.47 0.43 1.00 0.43
## CoAWe8 0.14 0.25 0.17 0.16 0.46 0.38 0.43 1.00
## donations 0.16 0.26 0.16 0.22 0.24 0.30 0.26 0.26
## donations
## FoGWar1 0.16
## FoGWar2 0.26
## FoGWar3 0.16
## FoGWar4 0.22
## CoAWe5 0.24
## CoAWe6 0.30
## CoAWe7 0.26
## CoAWe8 0.26
## donations 1.00
#pairwise combinations of observed variables
pairwise <- combn(x = names(simData), m = 2)
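#note: sem() adds the 9 observed-variable variances automatically (auto.var=TRUE)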
msat.syntax <- paste(apply(pairwise, 2, function(pair) { paste(pair, collapse="~~")}), collapse="\n")
msat <- sem(msat.syntax, simData)
First, let’s check the parameterization of the saturated model. Is it identified? How many degrees of freedom?
inspect(msat)
## $lambda
##
## FoGWar1
## FoGWar2
## FoGWar3
## FoGWar4
## CoAWe5
## CoAWe6
## CoAWe7
## CoAWe8
## donations
##
## $theta
## FoGWr1 FoGWr2 FoGWr3 FoGWr4 CoAWe5 CoAWe6 CoAWe7 CoAWe8 dontns
## FoGWar1 37
## FoGWar2 1 38
## FoGWar3 2 9 39
## FoGWar4 3 10 16 40
## CoAWe5 4 11 17 22 41
## CoAWe6 5 12 18 23 27 42
## CoAWe7 6 13 19 24 28 31 43
## CoAWe8 7 14 20 25 29 32 34 44
## donations 8 15 21 26 30 33 35 36 45
##
## $psi
## <0 x 0 matrix>
Next, let’s examine fit and parameter values:
summary(msat, fit.measures=TRUE, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 41 iterations
##
## Number of observations 500
##
## Estimator ML
## Minimum Function Test Statistic 0.000
## Degrees of freedom 0
## Minimum Function Value 0.0000000000000
##
## Model test baseline model:
##
## Minimum Function Test Statistic 945.926
## Degrees of freedom 36
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 1.000
## Tucker-Lewis Index (TLI) 1.000
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -7025.937
## Loglikelihood unrestricted model (H1) -7025.937
##
## Number of free parameters 45
## Akaike (AIC) 14141.875
## Bayesian (BIC) 14331.532
## Sample-size adjusted Bayesian (BIC) 14188.700
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.000
## 90 Percent Confidence Interval 0.000 0.000
## P-value RMSEA <= 0.05 NA
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.000
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## FoGWar1 ~~
## FoGWar2 0.581 0.080 7.291 0.000 0.581 0.345
## FoGWar3 0.559 0.079 7.079 0.000 0.559 0.334
## FoGWar4 0.588 0.079 7.419 0.000 0.588 0.352
## CoAWe5 0.299 0.079 3.765 0.000 0.299 0.171
## CoAWe6 0.263 0.073 3.580 0.000 0.263 0.162
## CoAWe7 0.212 0.080 2.646 0.008 0.212 0.119
## CoAWe8 0.240 0.075 3.195 0.001 0.240 0.144
## donations 0.231 0.064 3.613 0.000 0.231 0.164
## FoGWar2 ~~
## FoGWar3 0.727 0.082 8.835 0.000 0.727 0.430
## FoGWar4 0.668 0.081 8.233 0.000 0.668 0.396
## CoAWe5 0.468 0.082 5.732 0.000 0.468 0.265
## CoAWe6 0.380 0.075 5.065 0.000 0.380 0.233
## CoAWe7 0.319 0.082 3.905 0.000 0.319 0.177
## CoAWe8 0.416 0.077 5.393 0.000 0.416 0.249
## donations 0.368 0.066 5.596 0.000 0.368 0.258
## FoGWar3 ~~
## FoGWar4 0.669 0.081 8.283 0.000 0.669 0.399
## CoAWe5 0.386 0.080 4.816 0.000 0.386 0.221
## CoAWe6 0.118 0.073 1.613 0.107 0.118 0.072
## CoAWe7 0.269 0.081 3.323 0.001 0.269 0.150
## CoAWe8 0.289 0.075 3.825 0.000 0.289 0.174
## donations 0.224 0.064 3.499 0.000 0.224 0.158
## FoGWar4 ~~
## CoAWe5 0.308 0.079 3.871 0.000 0.308 0.176
## CoAWe6 0.236 0.073 3.212 0.001 0.236 0.145
## CoAWe7 0.129 0.080 1.614 0.106 0.129 0.072
## CoAWe8 0.274 0.075 3.634 0.000 0.274 0.165
## donations 0.312 0.065 4.822 0.000 0.312 0.221
## CoAWe5 ~~
## CoAWe6 0.748 0.083 9.025 0.000 0.748 0.441
## CoAWe7 0.873 0.092 9.475 0.000 0.873 0.468
## CoAWe8 0.795 0.085 9.308 0.000 0.795 0.458
## donations 0.349 0.068 5.144 0.000 0.349 0.236
## CoAWe6 ~~
## CoAWe7 0.749 0.084 8.885 0.000 0.749 0.433
## CoAWe8 0.616 0.077 7.987 0.000 0.616 0.382
## donations 0.405 0.064 6.339 0.000 0.405 0.296
## CoAWe7 ~~
## CoAWe8 0.760 0.086 8.826 0.000 0.760 0.430
## donations 0.389 0.070 5.583 0.000 0.389 0.258
## CoAWe8 ~~
## donations 0.365 0.065 5.632 0.000 0.365 0.260
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## FoGWar1 1.671 0.106 15.811 0.000 1.671 1.000
## FoGWar2 1.701 0.108 15.811 0.000 1.701 1.000
## FoGWar3 1.678 0.106 15.811 0.000 1.678 1.000
## FoGWar4 1.675 0.106 15.811 0.000 1.675 1.000
## CoAWe5 1.829 0.116 15.811 0.000 1.829 1.000
## CoAWe6 1.573 0.099 15.811 0.000 1.573 1.000
## CoAWe7 1.902 0.120 15.811 0.000 1.902 1.000
## CoAWe8 1.647 0.104 15.811 0.000 1.647 1.000
## donations 1.195 0.076 15.811 0.000 1.195 1.000
semPaths_default(msat, layout="circle")
Note that the \(LL\) for our model (notated H0) is identical to the saturated model implicitly fit by lavaan (notated H1). This is because we explicitly fit a saturated model, though any just-identified model would have equivalent fit. So, yes, the saturated model is just identified (\(df = 0\)), and it has 45 free parameters (\(9 \cdot 10 / 2 = 45\) unique variances and covariances).
Note that in real life, we never know the ‘true’ model. To make this example more like real data, I’ve simulated the data with a small cross-loading of CoAWe5 (the first item on the concern for animal welfare scale) onto the fear of global warming factor. Here, we include this in the SEM to verify that we have a good fit to the data when we use a good model (the ground truth in this case).
msem.syntax <- '
l_FoGWar =~ FoGWar1 + FoGWar2 + FoGWar3 + FoGWar4 #definition of factor FoGWar with loadings on 4 items
l_CoAWe =~ CoAWe5 + CoAWe6 + CoAWe7 + CoAWe8 #definition of factor CoAWe with loadings on 4 items
l_FoGWar =~ CoAWe5 #allow small cross-loading
donations ~ l_FoGWar + l_CoAWe #predict criterion based on SEM
l_FoGWar ~~ l_CoAWe
'
msem <- sem(msem.syntax, simData)
semPaths_default(msem)
What about identification and parameterization?
How many free parameters do we have?
inspect(msem)
## $lambda
## l_FGWr l_CoAW dontns
## FoGWar1 0 0 0
## FoGWar2 1 0 0
## FoGWar3 2 0 0
## FoGWar4 3 0 0
## CoAWe5 7 0 0
## CoAWe6 0 4 0
## CoAWe7 0 5 0
## CoAWe8 0 6 0
## donations 0 0 0
##
## $theta
## FoGWr1 FoGWr2 FoGWr3 FoGWr4 CoAWe5 CoAWe6 CoAWe7 CoAWe8 dontns
## FoGWar1 11
## FoGWar2 0 12
## FoGWar3 0 0 13
## FoGWar4 0 0 0 14
## CoAWe5 0 0 0 0 15
## CoAWe6 0 0 0 0 0 16
## CoAWe7 0 0 0 0 0 0 17
## CoAWe8 0 0 0 0 0 0 0 18
## donations 0 0 0 0 0 0 0 0 0
##
## $psi
## l_FGWr l_CoAW dontns
## l_FoGWar 20
## l_CoAWe 10 21
## donations 0 0 19
##
## $beta
## l_FGWr l_CoAW dontns
## l_FoGWar 0 0 0
## l_CoAWe 0 0 0
## donations 8 9 0
summary(msem, fit.measures=TRUE, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 35 iterations
##
## Number of observations 500
##
## Estimator ML
## Minimum Function Test Statistic 33.905
## Degrees of freedom 24
## P-value (Chi-square) 0.086
##
## Model test baseline model:
##
## Minimum Function Test Statistic 945.926
## Degrees of freedom 36
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.989
## Tucker-Lewis Index (TLI) 0.984
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -7042.890
## Loglikelihood unrestricted model (H1) -7025.937
##
## Number of free parameters 21
## Akaike (AIC) 14127.780
## Bayesian (BIC) 14216.287
## Sample-size adjusted Bayesian (BIC) 14149.632
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.029
## 90 Percent Confidence Interval 0.000 0.050
## P-value RMSEA <= 0.05 0.954
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.030
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar =~
## FoGWar1 1.000 0.688 0.532
## FoGWar2 1.288 0.142 9.069 0.000 0.886 0.679
## FoGWar3 1.183 0.134 8.814 0.000 0.814 0.628
## FoGWar4 1.152 0.132 8.711 0.000 0.792 0.612
## l_CoAWe =~
## CoAWe5 1.000 0.897 0.663
## CoAWe6 0.889 0.090 9.891 0.000 0.797 0.636
## CoAWe7 1.030 0.102 10.134 0.000 0.924 0.670
## CoAWe8 0.920 0.092 9.950 0.000 0.825 0.643
## l_FoGWar =~
## CoAWe5 0.151 0.103 1.463 0.144 0.104 0.077
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## donations ~
## l_FoGWar 0.328 0.094 3.491 0.000 0.226 0.206
## l_CoAWe 0.374 0.071 5.273 0.000 0.335 0.307
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar ~~
## l_CoAWe 0.245 0.050 4.884 0.000 0.397 0.397
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .FoGWar1 1.198 0.089 13.466 0.000 1.198 0.717
## .FoGWar2 0.916 0.086 10.641 0.000 0.916 0.539
## .FoGWar3 1.016 0.086 11.876 0.000 1.016 0.606
## .FoGWar4 1.047 0.086 12.203 0.000 1.047 0.625
## .CoAWe5 0.940 0.083 11.340 0.000 0.940 0.514
## .CoAWe6 0.938 0.075 12.466 0.000 0.938 0.596
## .CoAWe7 1.049 0.089 11.767 0.000 1.049 0.551
## .CoAWe8 0.966 0.078 12.327 0.000 0.966 0.587
## .donations 0.971 0.065 14.956 0.000 0.971 0.813
## l_FoGWar 0.473 0.087 5.423 0.000 1.000 1.000
## l_CoAWe 0.804 0.127 6.348 0.000 1.000 1.000
We see strong evidence of global fit: we do not reject the model \(\chi^2\), CFI is > .95, RMSEA is < .06 (and its confidence interval does not include .06), and SRMR is < .08. Thus, at a global level, this model characterizes the data well.
Let’s compare the saturated model against the specified two-factor SEM. Technically, all SEM models are nested in the saturated model.
anova(msat, msem)
|      | Df | AIC   | BIC   | Chisq | Chisq diff | Df diff | Pr(>Chisq) |
|------|----|-------|-------|-------|------------|---------|------------|
| msat | 0  | 14142 | 14332 | 0.0   | NA         | NA      | NA         |
| msem | 24 | 14128 | 14216 | 33.9  | 33.9       | 24      | 0.086      |
Notice how the LRT exactly equals the model chi-square test of the two-factor SEM specified above. This is because the model chi-square test is a nested model comparison against the saturated model. An important point is that our candidate model has 24 fewer parameters than the saturated model but does not fit significantly worse. That’s evidence of a good fit:parsimony tradeoff.
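As a quick check (a sketch using the objects above), the same statistic can be recovered from the model log-likelihoods:
# 2 * (LL_saturated - LL_candidate) reproduces the model chi-square (~33.9)
as.numeric(2 * (logLik(msat) - logLik(msem)))
fitMeasures(msem, "chisq")  # same value reported by lavaan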
What about a nested model in which we remove one of the measurement parameters? In particular, the cross-loading is an interpretive nuisance (not simple structure). Let’s remove the cross-loading. We are fixing this path to be exactly zero, which means that we are estimating one less parameter, but the rest of the model structure is the same. Thus, the simpler (reduced) model is nested in the more complex model.
mnested.syntax <- '
l_FoGWar =~ FoGWar1 + FoGWar2 + FoGWar3 + FoGWar4 #definition of factor FoGWar with loadings on 4 items
l_CoAWe =~ CoAWe5 + CoAWe6 + CoAWe7 + CoAWe8 #definition of factor CoAWe with loadings on 4 items
donations ~ l_FoGWar + l_CoAWe #predict criterion based on SEM
l_FoGWar ~~ l_CoAWe
'
mnested <- sem(mnested.syntax, simData)
summary(mnested, fit.measures=TRUE, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 34 iterations
##
## Number of observations 500
##
## Estimator ML
## Minimum Function Test Statistic 36.005
## Degrees of freedom 25
## P-value (Chi-square) 0.072
##
## Model test baseline model:
##
## Minimum Function Test Statistic 945.926
## Degrees of freedom 36
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.988
## Tucker-Lewis Index (TLI) 0.983
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -7043.940
## Loglikelihood unrestricted model (H1) -7025.937
##
## Number of free parameters 20
## Akaike (AIC) 14127.880
## Bayesian (BIC) 14212.172
## Sample-size adjusted Bayesian (BIC) 14148.690
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.030
## 90 Percent Confidence Interval 0.000 0.050
## P-value RMSEA <= 0.05 0.951
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.031
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar =~
## FoGWar1 1.000 0.688 0.533
## FoGWar2 1.289 0.142 9.073 0.000 0.887 0.680
## FoGWar3 1.177 0.134 8.798 0.000 0.810 0.625
## FoGWar4 1.150 0.132 8.710 0.000 0.792 0.612
## l_CoAWe =~
## CoAWe5 1.000 0.952 0.704
## CoAWe6 0.832 0.074 11.288 0.000 0.793 0.632
## CoAWe7 0.960 0.082 11.666 0.000 0.914 0.663
## CoAWe8 0.866 0.076 11.420 0.000 0.824 0.642
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## donations ~
## l_FoGWar 0.318 0.095 3.365 0.001 0.219 0.200
## l_CoAWe 0.353 0.067 5.294 0.000 0.336 0.307
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar ~~
## l_CoAWe 0.279 0.049 5.660 0.000 0.426 0.426
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .FoGWar1 1.197 0.089 13.452 0.000 1.197 0.716
## .FoGWar2 0.913 0.086 10.594 0.000 0.913 0.537
## .FoGWar3 1.022 0.086 11.921 0.000 1.022 0.609
## .FoGWar4 1.048 0.086 12.195 0.000 1.048 0.626
## .CoAWe5 0.922 0.083 11.084 0.000 0.922 0.504
## .CoAWe6 0.945 0.075 12.636 0.000 0.945 0.601
## .CoAWe7 1.067 0.088 12.054 0.000 1.067 0.561
## .CoAWe8 0.968 0.078 12.452 0.000 0.968 0.587
## .donations 0.971 0.065 14.957 0.000 0.971 0.813
## l_FoGWar 0.474 0.087 5.428 0.000 1.000 1.000
## l_CoAWe 0.907 0.116 7.807 0.000 1.000 1.000
Note that the model chi-square is still non-significant, suggesting good fit relative to the population covariance matrix. But is the fit worse than the model that includes the cross-loading?
anova(msem, mnested)
|         | Df | AIC   | BIC   | Chisq | Chisq diff | Df diff | Pr(>Chisq) |
|---------|----|-------|-------|-------|------------|---------|------------|
| msem    | 24 | 14128 | 14216 | 33.9  | NA         | NA      | NA         |
| mnested | 25 | 14128 | 14212 | 36.0  | 2.1        | 1       | 0.147      |
No, we do not see a significant degradation in fit when excluding the cross-loading parameter, suggesting we should retain the model with simple structure. That is, the LRT \(p\) value of .15 indicates that the model with a freed cross-loading parameter does not fit significantly better than the model with simple structure.
Although the model \(\chi^2\) is a common and useful test of global fit, it is not without flaws. Foremost among these is that the power to detect trivial misfit increases toward 1.0 as the sample size \(N\) increases. That is, in large samples, we will reject the null hypothesis of good fit even if the model is excellent. This problem applies to comparison among nested SEMs using the LRT (difference in \(\chi^2\)) approach, too.
A related problem is that overemphasis on model \(\chi^2\) may lead some researchers to prefer small samples in which the null hypothesis (of good fit) is not rejected. However, in small samples, parameters will be poorly estimated, and it is the parameters, not fit tests, that are usually of substantive interest. Thus, there is a risk of encouraging bad scientific practices (collecting small samples to avoid rejecting the model \(\chi^2\) test).
A second problem is that the model statistic \(T\) does not follow a \(\chi^2\) distribution if the data are not multivariate normal or when \(N\) is small (see assumptions above). There have been additional developments in the SEM literature for estimators that are robust to non-normality, especially the Satorra-Bentler correction to the \(\chi^2\) statistic.
A third problem is that in small samples, even when other assumptions are met, the model \(\chi^2\) test can reject the null hypothesis even when the model fits well… so yeah, you can’t win (reject with small or large \(N\))! On top of that, in small samples, the test can fail to detect serious misfit.
Altogether, many of the other fit indices that have been developed for SEM try to overcome dependence on \(N\). Here’s the tl;dr synopsis from the West et al. chapter.
The minimal set suggested by Kline (and I agree) is the model \(\chi^2\) with its \(df\) and \(p\) value, RMSEA with its 90% confidence interval, CFI, and SRMR.
The ratio of misfit, quantified by \(\chi^2\), to the model degrees of freedom, \(df\), is an absolute fit index. The intuition is that for every free parameter removed from the model, the \(\chi^2\) should increase by about 1.0. That is, the expected value of \(\chi^2\) for a correct model is \(df\). Thus, in larger, more complex models, this ratio may be more indicative of fit than the \(\chi^2\) itself. The \(\chi^2\), which is a function of the sample likelihood, is sensitive to the case where there are many trivial deviations between the sample covariance matrix and the model-implied covariance matrix. As with \(R^2\) in regression, the penalties for these (squared) deviations can never be negative, so the model \(\chi^2\) can become quite large in models with many unique covariance elements.
The \(\chi^2/df\) ratio is intended to help correct for such problems. Rules of thumb for this ratio in well-fitting models are values below 5, 4, or sometimes even 2. As the ratio grows well above these cutoffs, we believe the model is a poor description of the data. Importantly, \(\chi^2/df\) is a badness of fit index such that we become concerned at higher values, but there is no lower bound that is ideal. Also, because of the sample size dependence of \(\chi^2\), the \(\chi^2/df\) ratio is also dependent on \(N\), leading it to be more of an ad hoc heuristic than a go-to fit index.
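For the two-factor model above, a quick sketch:
# chi-square / df for msem: 33.905 / 24, about 1.4, well under common cutoffs
fitMeasures(msem, "chisq") / fitMeasures(msem, "df")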
The Goodness-of-Fit Index (GFI) and Adjusted Goodness-of-Fit Index (AGFI) provide an absolute metric for how well the model describes the data. The GFI is conceptually related to the proportion of variance explained by the model, as in regression. Thus, it ranges between 0 (bad) and 1 (perfect).
The AGFI extends this metric by adjusting for the \(df\) of the model, downweighting goodness in proportion to the number of free parameters. This is analogous to adjusted \(R^2\) in regression, which penalizes a model with many predictors because each added predictor can only increase the variance explained in-sample.
Although these are reasonable metrics, they are not often used in practice.
The root mean residual (RMR) and standardized root mean residual (SRMR) provide an estimate of the average misfit for each estimated versus observed variance/covariance parameter. That is, the RMR quantifies the square root of the average of the squared residuals comparing the model-implied covariance matrix and observed covariance matrix:
\[ \textrm{RMR} = \sqrt{\frac{1}{P}\sum_{p=1}^{P} \left(\hat{\boldsymbol{\Sigma}}({\boldsymbol{\theta}})_p - \boldsymbol{S}_p\right)^2 } \]
where \(p\) is my sloppy notation for the \(p\)th of the \(P\) unique elements of the covariance matrix. Note that this gives an intuitive estimate of the level of average misfit in the model-implied covariance matrix in the units of the original data. The RMR is most interpretable when all variables are on a similar scale. RMR is a badness of fit index in that large values are indicative of misfit.
One virtue of RMR is that it is not sensitive to variables with small unique (residual) variances. The weighting procedure of many other fit measures (e.g., CFI or GFI) can misstate the badness of fit in this case.
The SRMR solves the problem of scaling the metric of RMR. Because RMR is in the original units, it is not comparable across studies and is hard to interpret when the variances of variables are disparate. The SRMR standardizes the average residuals in terms of the magnitude of elements in \(\boldsymbol{S_{XX}}\). Thus, it ranges between 0 and 1, where 1 represents residuals that are as large, on average, as a given element of the sample covariance matrix.
The rule of thumb for SRMR is < .08.
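In lavaan, the residuals underlying (S)RMR can be inspected directly. A sketch using the two-factor model from above:
# raw covariance residuals (observed minus model-implied) and residual
# correlations; the latter are the basis of SRMR
residuals(msem, type = "raw")
residuals(msem, type = "cor")
fitMeasures(msem, "srmr")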
The root mean squared error of approximation (RMSEA) is a badness of fit index that quantifies the level of approximate fit (as opposed to the perfect fit tested by the model \(\chi^2\)). The RMSEA asks whether a model exceeds a reasonable level of close fit to the data, which would indicate misfit. It is based on the fact that under the null hypothesis, the model statistic \(T\) follows a (central) \(\chi^2\) distribution, whereas under the alternative it follows a non-central \(\chi^2\) distribution whose non-centrality is governed by a parameter \(\lambda\).
Conceptually, RMSEA penalizes more complex models by dividing the non-centrality term by the degrees of freedom. This is a form of parsimony-adjusted index such that simpler models are preferred. Furthermore, RMSEA is in the same units as SRMR, namely the size of the average element in \(\boldsymbol{S_{XX}}\). Thus, it ranges from 0 to 1. Values above .10 indicate serious misfit. Values below .06 are considered good.
Many people encourage the best practice of reporting a 90% CI on RMSEA to see how close the model is to the boundary of misfit. Some programs also report a test on the null hypothesis that RMSEA is less than or equal to .05. One hopes not to reject this hypothesis since low RMSEA is a good thing.
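In lavaan, these quantities can be requested by name (a sketch using msem from above):
# point estimate, 90% CI, and the test of H0: RMSEA <= .05
fitMeasures(msem, c("rmsea", "rmsea.ci.lower", "rmsea.ci.upper", "rmsea.pvalue"))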
Mechanics of RMSEA
The estimated departure from close fit is defined as:
\[ \hat{\Delta}_M = \textrm{max}(0, \chi^2_M - df_M) \]
When \(\chi^2_M\) is less than \(df_M\), this metric is zero, indicating no departure from close fit. The term \(\hat{\Delta}_M\) estimates the noncentrality parameter of the \(\chi^2\) distribution, which captures how far the model is from fitting the population covariance matrix. When \(\hat{\Delta}_M = 0\), the statistic reduces to a standard (central) \(\chi^2\), consistent with close fit. But in general, the model will not perfectly fit the population, and its deviation according to RMSEA is:
\[ \hat{\varepsilon} = \sqrt{\frac{\hat{\Delta}_M}{df_M (N-1) }} \] Thus, the intuition is that the deviation becomes smaller when a) \(df_M\) is larger (i.e., a more parsimonious model) and b) the sample size increases. In larger samples, the correction for \(df_M\) will be much smaller than the correction for \(N\), indicating a weaker penalty for complex models in large samples.
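A sketch of this computation for the two-factor model above, using values from the lavaan output:
# hand computation of RMSEA for msem: chisq = 33.905, df = 24, N = 500
Delta <- max(0, 33.905 - 24)    # estimated noncentrality
sqrt(Delta / (24 * (500 - 1)))  # ~ .029, matching lavaan's reported RMSEA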
The Tucker-Lewis index (TLI), which is also called the non-normed fit index (NNFI), ranges between 0 and 1, with larger values indicating better relative fit. The typical rule of thumb is to judge a model as good if TLI > .95. It is related to the \(\chi^2/df\) ratio described above, but instead of relying on rules of thumb, it compares the ratio of your model relative to a ‘true model’ (where the expected \(\chi^2/df_M\) ratio of such a model is 1) and the baseline (independence) model.
That is, the TLI represents the distance (in terms of fit) between your model and the independence model as a proportion of the distance between the independence model and the saturated model. Because the independence model is lousy by definition and the saturated model is perfect (fit-wise) by definition, the distance between these models (e.g., in terms of an LRT) represents a benchmark for how good (or bad) fit can be. Hence, if we can achieve 90 or 95% of this distance using a candidate model, but with many fewer parameters, we would believe our model fits well.
More formally, the TLI is defined as:
\[ TLI = 1 - \frac{(\chi^2_k - df_k) / (\chi^2_0 - df_0)}{df_k/df_0} \] where \(k\) is the candidate model and \(0\) is the independence model. Note that the TLI can technically go below 0 or above 1, which is confusing and unlikely, but nevertheless, know that this metric is not formally bounded. If you see a value below 0, your model fits very badly (worse than independence), or paradoxically, the independence model fits well. If the value is above 1, your model fits very well, or the independence model is especially bad.
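Plugging in the values from the two-factor example above (a sketch):
# TLI by hand for msem against the baseline model reported by lavaan
chisq_k <- 33.905; df_k <- 24   # candidate model (msem)
chisq_0 <- 945.926; df_0 <- 36  # baseline (independence) model
1 - ((chisq_k - df_k) / (chisq_0 - df_0)) / (df_k / df_0)  # ~ .984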
The normed fit index (NFI) is related to TLI in that it quantifies the distance between a candidate model and the baseline. It does not use the \(\chi^2/df\) ratios, but instead is based on the log-likelihood of the models or model \(\chi^2\). Similar to TLI, it represents the improvement in fit of a candidate model against the baseline (independence) as a proportion of the baseline fit.
NFI ranges between 0 and 1, and values > .95 are considered ‘good.’
\[ \textrm{NFI} = \frac{\chi^2_0 - \chi^2_k}{\chi^2_0} \]
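Again using the example values (a sketch):
# NFI by hand for msem: improvement over baseline as a proportion of baseline
(945.926 - 33.905) / 945.926  # ~ .964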
Note that NFI does not penalize for complexity, which may encourage a researcher to overfit the data. Likewise, it is affected by sample size, with small \(N\) underestimating NFI.
In an attempt to handle this underestimation problem, Bollen introduced the incremental fit index (IFI), but this metric introduced several new problems and is not recommended.
The comparative fit index (CFI) is a goodness of fit index that ranges between 0 and 1, with values above .95 indicating ‘good’ fit. It is conceptually related to the TLI in that it expresses the fit of a candidate model against an independence model. It is related to RMSEA in that both rely on the idea that the evidence for a candidate model follows a noncentral \(\chi^2\) distribution. More specifically, CFI considers how much the noncentrality parameter is reduced as one moves from a baseline model to a candidate model (in terms of fit).
The CFI is quantified in terms of the relative difference between fit (model \(\chi^2\)) and degrees of freedom for both candidate and baseline models:
\[ \textrm{CFI} = \frac{\textrm{max}(\chi^2_0 - df_0, 0) - \textrm{max}(\chi^2_k - df_k, 0)}{\textrm{max}(\chi^2_0 - df_0, 0)} \] By taking the max of each operand, CFI is formally bounded between 0 and 1. Importantly, CFI is not affected by \(N\) and has performed well in simulation studies relative to similar indices such as TLI. Finally, note that there is a very similar measure called relative noncentrality index (RNI) that differs from CFI primarily in that it can go above 1.0 in the same way as TLI.
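And the corresponding sketch for CFI, using the same example values:
# CFI by hand for msem
chisq_k <- 33.905; df_k <- 24   # candidate model (msem)
chisq_0 <- 945.926; df_0 <- 36  # baseline (independence) model
(max(chisq_0 - df_0, 0) - max(chisq_k - df_k, 0)) / max(chisq_0 - df_0, 0)  # ~ .989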
Many, if not most, of the practical indices described above cannot be interpreted blindly according to the rules of thumb above. Indeed, these rules of thumb were primarily developed for models containing only covariance structure, but not means. We’ll get to mean structure later in the class, but suffice it to say that most fit indices struggle to balance misfit in the mean structure and covariance structure. Is a model that fits one well, but not the other, considered ‘good?’ For example, SRMR typically ignores mean structure altogether (since it is based on comparisons of covariance elements), so a model with poor representation of means could have a low SRMR.
There is also a challenge of specifying an appropriate baseline model in more complex SEMs, especially ones that include mean structure. Likewise, some models do not have a proper saturated model, such as models with specific patterns of missing data. The short version is to interpret global fit indices with caution and always to examine the fit of each part of the model (residuals are useful).
One may be tempted to modify a model according to modification indices in order to improve global fit, but atheoretically (fit) motivated modifications are unlikely to replicate in an independent sample. As West notes, model modification can often shift the epistemological perspective from confirmatory (a priori model) to (partially) exploratory. Exercise caution!
This brings us to the topic of ‘local fit’, by which I mean how well each part of the model captures the data. For example, in a two-factor model, does each factor fit the data well in isolation? Things typically only get worse when we consider a model built from poorly fitting elements. That is, global fit statistics can be a) sensitive to misfit in a single bad component amidst many good components, b) insensitive to a bad component when many are good, and/or c) sensitive to mild misfit in many reasonable components when these are considered together.
Tomarken and Waller (2003, J Abnormal Psych) argue for the importance of testing meaningful partitions of the model in order to verify good fit in each. Most commonly, one can separately test the measurement and structural models. But in larger models (esp. multi-factor or hierarchical factor), testing individual components of measurement is often important. Likewise, testing a sub-part of the SEM (e.g., both measurement and structural components for self-report data, but not the informant data) is useful.
They note a number of problems with models that appear to fit well according to global statistics, and they offer advice for avoiding overreliance on global fit indices.
Let’s return to the example above about fear of global warming (FoGWar) and concern for animal welfare (CoAWe). How might we test local fit? First, let me mess up the model a bit to make it complex so we can encounter fit challenges. :-)
In addition, let’s make the structural model a bit more complex by testing the hypothesis that the effects of FoGWar and CoAWe on worldwide governmental spending on the environment (observed, in USD) are mediated by charitable donations. That is, we believe the environmental NGOs (charities) influence governments through lobbying and the development of legislation. Thus, if these NGOs have a larger budget, they can effect greater change in governmental spending.
demo.model <- '
l_FoGWar =~ .8*FoGWar1 + .8*FoGWar2 + .8*FoGWar3 + .6*FoGWar4 #definition of factor FoGWar with loadings on 4 items
l_CoAWe =~ .6*CoAWe5 + .8*CoAWe6 + .8*CoAWe7 + .8*CoAWe8 #definition of factor CoAWe with loadings on 4 items
l_FoGWar =~ .6*CoAWe5 #cross-loading
l_CoAWe =~ .6*FoGWar4 #cross-loading
donations ~ 0.3*l_FoGWar + 0.3*l_CoAWe
gspending ~ 0.5*donations + 0.3*l_FoGWar + 0.3*l_CoAWe
l_FoGWar ~~ 0.4*l_CoAWe
#equal (unit) residual variance
FoGWar1 ~~ 1*FoGWar1
FoGWar2 ~~ 1*FoGWar2
FoGWar3 ~~ 1*FoGWar3
FoGWar4 ~~ 1*FoGWar4
CoAWe5 ~~ 1*CoAWe5
CoAWe6 ~~ 1*CoAWe6
CoAWe7 ~~ 1*CoAWe7
CoAWe8 ~~ 1*CoAWe8
'
# generate data; note, standardized lv is default
simData <- lavaan::simulateData(demo.model, sample.nobs=500)
round(cor(simData), 2)
## FoGWar1 FoGWar2 FoGWar3 FoGWar4 CoAWe5 CoAWe6 CoAWe7 CoAWe8
## FoGWar1 1.00 0.38 0.36 0.43 0.40 0.20 0.25 0.18
## FoGWar2 0.38 1.00 0.35 0.35 0.43 0.23 0.21 0.13
## FoGWar3 0.36 0.35 1.00 0.38 0.36 0.22 0.24 0.16
## FoGWar4 0.43 0.35 0.38 1.00 0.50 0.38 0.44 0.33
## CoAWe5 0.40 0.43 0.36 0.50 1.00 0.42 0.45 0.38
## CoAWe6 0.20 0.23 0.22 0.38 0.42 1.00 0.39 0.37
## CoAWe7 0.25 0.21 0.24 0.44 0.45 0.39 1.00 0.41
## CoAWe8 0.18 0.13 0.16 0.33 0.38 0.37 0.41 1.00
## donations 0.23 0.24 0.21 0.36 0.34 0.26 0.28 0.25
## gspending 0.28 0.31 0.33 0.40 0.42 0.31 0.35 0.27
## donations gspending
## FoGWar1 0.23 0.28
## FoGWar2 0.24 0.31
## FoGWar3 0.21 0.33
## FoGWar4 0.36 0.40
## CoAWe5 0.34 0.42
## CoAWe6 0.26 0.31
## CoAWe7 0.28 0.35
## CoAWe8 0.25 0.27
## donations 1.00 0.61
## gspending 0.61 1.00
Let’s test the same model as above, but in the new dataset, and with the addition of gspending. We will start with the model having simple structure (no cross-loadings for FoGWar or CoAWe).
msem.syntax <- '
l_FoGWar =~ FoGWar1 + FoGWar2 + FoGWar3 + FoGWar4 #definition of factor FoGWar with loadings on 4 items
l_CoAWe =~ CoAWe5 + CoAWe6 + CoAWe7 + CoAWe8 #definition of factor CoAWe with loadings on 4 items
donations ~ l_FoGWar + l_CoAWe #predict criterion based on SEM
gspending ~ donations
l_FoGWar ~~ l_CoAWe
'
msem <- sem(msem.syntax, simData)
semPaths_default(msem)
inspect(msem)
## $lambda
## l_FGWr l_CoAW dontns gspndn
## FoGWar1 0 0 0 0
## FoGWar2 1 0 0 0
## FoGWar3 2 0 0 0
## FoGWar4 3 0 0 0
## CoAWe5 0 0 0 0
## CoAWe6 0 4 0 0
## CoAWe7 0 5 0 0
## CoAWe8 0 6 0 0
## donations 0 0 0 0
## gspending 0 0 0 0
##
## $theta
## FoGWr1 FoGWr2 FoGWr3 FoGWr4 CoAWe5 CoAWe6 CoAWe7 CoAWe8 dontns
## FoGWar1 11
## FoGWar2 0 12
## FoGWar3 0 0 13
## FoGWar4 0 0 0 14
## CoAWe5 0 0 0 0 15
## CoAWe6 0 0 0 0 0 16
## CoAWe7 0 0 0 0 0 0 17
## CoAWe8 0 0 0 0 0 0 0 18
## donations 0 0 0 0 0 0 0 0 0
## gspending 0 0 0 0 0 0 0 0 0
## gspndn
## FoGWar1
## FoGWar2
## FoGWar3
## FoGWar4
## CoAWe5
## CoAWe6
## CoAWe7
## CoAWe8
## donations
## gspending 0
##
## $psi
## l_FGWr l_CoAW dontns gspndn
## l_FoGWar 21
## l_CoAWe 10 22
## donations 0 0 19
## gspending 0 0 0 20
##
## $beta
## l_FGWr l_CoAW dontns gspndn
## l_FoGWar 0 0 0 0
## l_CoAWe 0 0 0 0
## donations 7 8 0 0
## gspending 0 0 9 0
Any comments or concerns so far? We see that the latent factors for fear of global warming and concern for animal welfare predict individual donations, which in turn predict governmental spending. This fits with our hypotheses.
summary(msem, fit.measures=TRUE, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 37 iterations
##
## Number of observations 500
##
## Estimator ML
## Minimum Function Test Statistic 153.328
## Degrees of freedom 33
## P-value (Chi-square) 0.000
##
## Model test baseline model:
##
## Minimum Function Test Statistic 1405.036
## Degrees of freedom 45
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.912
## Tucker-Lewis Index (TLI) 0.879
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -7899.614
## Loglikelihood unrestricted model (H1) -7822.950
##
## Number of free parameters 22
## Akaike (AIC) 15843.227
## Bayesian (BIC) 15935.949
## Sample-size adjusted Bayesian (BIC) 15866.119
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.085
## 90 Percent Confidence Interval 0.072 0.099
## P-value RMSEA <= 0.05 0.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.077
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar =~
## FoGWar1 1.000 0.774 0.584
## FoGWar2 0.948 0.103 9.252 0.000 0.734 0.542
## FoGWar3 0.903 0.099 9.141 0.000 0.699 0.533
## FoGWar4 1.451 0.130 11.199 0.000 1.123 0.745
## l_CoAWe =~
## CoAWe5 1.000 1.099 0.761
## CoAWe6 0.682 0.060 11.394 0.000 0.750 0.578
## CoAWe7 0.715 0.058 12.267 0.000 0.786 0.626
## CoAWe8 0.623 0.059 10.539 0.000 0.685 0.532
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## donations ~
## l_FoGWar 0.326 0.194 1.683 0.092 0.253 0.222
## l_CoAWe 0.272 0.136 2.005 0.045 0.299 0.263
## gspending ~
## donations 0.766 0.045 17.159 0.000 0.766 0.609
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar ~~
## l_CoAWe 0.712 0.081 8.837 0.000 0.836 0.836
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .FoGWar1 1.156 0.085 13.594 0.000 1.156 0.659
## .FoGWar2 1.299 0.092 14.047 0.000 1.299 0.707
## .FoGWar3 1.233 0.087 14.130 0.000 1.233 0.716
## .FoGWar4 1.010 0.098 10.296 0.000 1.010 0.445
## .CoAWe5 0.878 0.086 10.221 0.000 0.878 0.421
## .CoAWe6 1.124 0.081 13.832 0.000 1.124 0.666
## .CoAWe7 0.960 0.073 13.245 0.000 0.960 0.608
## .CoAWe8 1.187 0.083 14.259 0.000 1.187 0.717
## .donations 1.010 0.067 14.984 0.000 1.010 0.783
## .gspending 1.285 0.081 15.811 0.000 1.285 0.629
## l_FoGWar 0.599 0.094 6.365 0.000 1.000 1.000
## l_CoAWe 1.208 0.136 8.856 0.000 1.000 1.000
In terms of global fit, we see that only SRMR (barely) suggests good fit. RMSEA is in the meh range (~.08), though not terrible (i.e., not above .10). CFI is not acceptable at .91. Even if these statistics were acceptable, it is usually worthwhile to examine the fit of individual components, such as the measurement model or specific factors.
What could be wrong? Let’s examine the FoGWar factor in isolation.
m1.syntax <- '
l_FoGWar =~ FoGWar1 + FoGWar2 + FoGWar3 + FoGWar4 #definition of factor FoGWar with loadings on 4 items
'
m1 <- sem(m1.syntax, simData)
inspect(m1)
## $lambda
## l_FGWr
## FoGWar1 0
## FoGWar2 1
## FoGWar3 2
## FoGWar4 3
##
## $theta
## FoGWr1 FoGWr2 FoGWr3 FoGWr4
## FoGWar1 4
## FoGWar2 0 5
## FoGWar3 0 0 6
## FoGWar4 0 0 0 7
##
## $psi
## l_FGWr
## l_FoGWar 8
semPaths_default(m1)
summary(m1, fit.measures=TRUE, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 27 iterations
##
## Number of observations 500
##
## Estimator ML
## Minimum Function Test Statistic 1.751
## Degrees of freedom 2
## P-value (Chi-square) 0.417
##
## Model test baseline model:
##
## Minimum Function Test Statistic 330.007
## Degrees of freedom 6
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 1.000
## Tucker-Lewis Index (TLI) 1.002
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -3307.663
## Loglikelihood unrestricted model (H1) -3306.787
##
## Number of free parameters 8
## Akaike (AIC) 6631.325
## Bayesian (BIC) 6665.042
## Sample-size adjusted Bayesian (BIC) 6639.650
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.000
## 90 Percent Confidence Interval 0.000 0.085
## P-value RMSEA <= 0.05 0.740
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.012
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar =~
## FoGWar1 1.000 0.863 0.651
## FoGWar2 0.906 0.100 9.013 0.000 0.781 0.576
## FoGWar3 0.881 0.097 9.038 0.000 0.760 0.579
## FoGWar4 1.116 0.118 9.440 0.000 0.963 0.639
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .FoGWar1 1.011 0.094 10.753 0.000 1.011 0.576
## .FoGWar2 1.228 0.099 12.434 0.000 1.228 0.668
## .FoGWar3 1.144 0.092 12.382 0.000 1.144 0.665
## .FoGWar4 1.345 0.121 11.073 0.000 1.345 0.592
## l_FoGWar 0.745 0.114 6.532 0.000 1.000 1.000
No evidence of misfit. All looks good.
m2.syntax <- '
l_CoAWe =~ CoAWe5 + CoAWe6 + CoAWe7 + CoAWe8 #definition of factor CoAWe with loadings on 4 items
'
m2 <- sem(m2.syntax, simData)
inspect(m2)
## $lambda
## l_CoAW
## CoAWe5 0
## CoAWe6 1
## CoAWe7 2
## CoAWe8 3
##
## $theta
## CoAWe5 CoAWe6 CoAWe7 CoAWe8
## CoAWe5 4
## CoAWe6 0 5
## CoAWe7 0 0 6
## CoAWe8 0 0 0 7
##
## $psi
## l_CoAW
## l_CoAWe 8
semPaths_default(m2)
summary(m2, fit.measures=TRUE, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 25 iterations
##
## Number of observations 500
##
## Estimator ML
## Minimum Function Test Statistic 0.987
## Degrees of freedom 2
## P-value (Chi-square) 0.611
##
## Model test baseline model:
##
## Minimum Function Test Statistic 380.404
## Degrees of freedom 6
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 1.000
## Tucker-Lewis Index (TLI) 1.008
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -3202.985
## Loglikelihood unrestricted model (H1) -3202.491
##
## Number of free parameters 8
## Akaike (AIC) 6421.969
## Bayesian (BIC) 6455.686
## Sample-size adjusted Bayesian (BIC) 6430.294
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.000
## 90 Percent Confidence Interval 0.000 0.072
## P-value RMSEA <= 0.05 0.854
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.008
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_CoAWe =~
## CoAWe5 1.000 0.957 0.663
## CoAWe6 0.833 0.084 9.906 0.000 0.797 0.614
## CoAWe7 0.874 0.085 10.286 0.000 0.836 0.665
## CoAWe8 0.804 0.082 9.752 0.000 0.769 0.598
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .CoAWe5 1.170 0.106 11.050 0.000 1.170 0.561
## .CoAWe6 1.052 0.086 12.163 0.000 1.052 0.623
## .CoAWe7 0.880 0.080 10.979 0.000 0.880 0.557
## .CoAWe8 1.065 0.085 12.467 0.000 1.065 0.643
## l_CoAWe 0.916 0.133 6.889 0.000 1.000 1.000
Everything is excellent here. No cause for alarm.
Let’s see if part of the misfit reflects the difficulty of modeling FoGWar and CoAWe together. Note that the fits of each factor alone don’t consider the possibility that there is important variation among indicators across factors, such as cross-loading or important residual covariance terms. By zooming in on a single factor, we can’t know anything about covariances across factors, either at the indicator level (residual covariance) or factor level (factor correlation).
m3.syntax <- '
l_FoGWar =~ FoGWar1 + FoGWar2 + FoGWar3 + FoGWar4 #definition of factor FoGWar with loadings on 4 items
l_CoAWe =~ CoAWe5 + CoAWe6 + CoAWe7 + CoAWe8 #definition of factor CoAWe with loadings on 4 items
l_FoGWar ~~ l_CoAWe
'
m3 <- sem(m3.syntax, simData)
inspect(m3)
## $lambda
## l_FGWr l_CoAW
## FoGWar1 0 0
## FoGWar2 1 0
## FoGWar3 2 0
## FoGWar4 3 0
## CoAWe5 0 0
## CoAWe6 0 4
## CoAWe7 0 5
## CoAWe8 0 6
##
## $theta
## FoGWr1 FoGWr2 FoGWr3 FoGWr4 CoAWe5 CoAWe6 CoAWe7 CoAWe8
## FoGWar1 8
## FoGWar2 0 9
## FoGWar3 0 0 10
## FoGWar4 0 0 0 11
## CoAWe5 0 0 0 0 12
## CoAWe6 0 0 0 0 0 13
## CoAWe7 0 0 0 0 0 0 14
## CoAWe8 0 0 0 0 0 0 0 15
##
## $psi
## l_FGWr l_CoAW
## l_FoGWar 16
## l_CoAWe 7 17
semPaths_default(m3)
summary(m3, fit.measures=TRUE, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 32 iterations
##
## Number of observations 500
##
## Estimator ML
## Minimum Function Test Statistic 73.440
## Degrees of freedom 19
## P-value (Chi-square) 0.000
##
## Model test baseline model:
##
## Minimum Function Test Statistic 994.023
## Degrees of freedom 28
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.944
## Tucker-Lewis Index (TLI) 0.917
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -6404.193
## Loglikelihood unrestricted model (H1) -6367.472
##
## Number of free parameters 17
## Akaike (AIC) 12842.385
## Bayesian (BIC) 12914.033
## Sample-size adjusted Bayesian (BIC) 12860.074
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.076
## 90 Percent Confidence Interval 0.058 0.094
## P-value RMSEA <= 0.05 0.010
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.049
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar =~
## FoGWar1 1.000 0.781 0.589
## FoGWar2 0.944 0.102 9.251 0.000 0.737 0.544
## FoGWar3 0.904 0.098 9.182 0.000 0.706 0.538
## FoGWar4 1.425 0.128 11.101 0.000 1.113 0.738
## l_CoAWe =~
## CoAWe5 1.000 1.105 0.765
## CoAWe6 0.677 0.060 11.252 0.000 0.748 0.576
## CoAWe7 0.710 0.059 12.109 0.000 0.785 0.625
## CoAWe8 0.615 0.059 10.373 0.000 0.679 0.528
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar ~~
## l_CoAWe 0.721 0.081 8.860 0.000 0.836 0.836
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .FoGWar1 1.146 0.085 13.432 0.000 1.146 0.653
## .FoGWar2 1.295 0.093 13.949 0.000 1.295 0.705
## .FoGWar3 1.223 0.087 14.005 0.000 1.223 0.711
## .FoGWar4 1.034 0.100 10.330 0.000 1.034 0.455
## .CoAWe5 0.865 0.088 9.877 0.000 0.865 0.415
## .CoAWe6 1.128 0.082 13.768 0.000 1.128 0.669
## .CoAWe7 0.963 0.073 13.151 0.000 0.963 0.610
## .CoAWe8 1.195 0.084 14.230 0.000 1.195 0.721
## l_FoGWar 0.610 0.095 6.391 0.000 1.000 1.000
## l_CoAWe 1.221 0.138 8.838 0.000 1.000 1.000
Yep, something is not quite right in the measurement model. Note the CFI is slightly below .95 and RMSEA is greater than .06. The model \(\chi^2\) is also quite high relative to the \(df\).
Let’s examine the modification indices. Here, I will only consider \(\chi^2\) improvements of > 2 points, and I arrange the MIs in descending order (print first 8 only):
modificationindices(m3) %>% filter(mi > 2) %>% arrange(desc(mi)) %>% head(n=8)
| lhs      | op | rhs     | mi    | epc    | sepc.lv | sepc.all | sepc.nox |
|----------|----|---------|-------|--------|---------|----------|----------|
| l_FoGWar | =~ | CoAWe5  | 27.59 | 1.640  | 1.280   | 0.886    | 0.886    |
| l_CoAWe  | =~ | FoGWar4 | 20.29 | 1.084  | 1.198   | 0.795    | 0.795    |
| FoGWar2  | ~~ | CoAWe5  | 16.26 | 0.251  | 0.251   | 0.128    | 0.128    |
| l_FoGWar | =~ | CoAWe8  | 11.38 | -0.778 | -0.608  | -0.472   | -0.472   |
| FoGWar2  | ~~ | FoGWar4 | 10.34 | -0.249 | -0.249  | -0.122   | -0.122   |
| FoGWar2  | ~~ | CoAWe8  | 10.24 | -0.198 | -0.198  | -0.113   | -0.113   |
| CoAWe7   | ~~ | CoAWe8  | 9.89  | 0.182  | 0.182   | 0.112    | 0.112    |
| FoGWar4  | ~~ | CoAWe7  | 7.16  | 0.160  | 0.160   | 0.084    | 0.084    |
The cross-loading of CoAWe5 onto l_FoGWar is clearly the most troubling problem in our model. If we include this cross-loading, the model \(\chi^2\) would improve by 27.59 points. We could also consider residual correlations among some of the FoGWar and CoAWe items if these were substantively motivated (e.g., specific similarities in content or wording). Whether we should dump CoAWe5 altogether (bad item) or allow it to cross-load is a substantive consideration that would depend on the goal of the analysis. Dropping the item would not cause problems with identification and would ease interpretability by maintaining simple structure. Here are the fits for the alternatives.
m4.syntax <- '
l_FoGWar =~ FoGWar1 + FoGWar2 + FoGWar3 + FoGWar4 #definition of factor FoGWar with loadings on 4 items
l_CoAWe =~ CoAWe6 + CoAWe7 + CoAWe8 #definition of factor CoAWe with loadings on 3 items
l_FoGWar ~~ l_CoAWe
'
m4 <- sem(m4.syntax, simData)
inspect(m4)
## $lambda
## l_FGWr l_CoAW
## FoGWar1 0 0
## FoGWar2 1 0
## FoGWar3 2 0
## FoGWar4 3 0
## CoAWe6 0 0
## CoAWe7 0 4
## CoAWe8 0 5
##
## $theta
## FoGWr1 FoGWr2 FoGWr3 FoGWr4 CoAWe6 CoAWe7 CoAWe8
## FoGWar1 7
## FoGWar2 0 8
## FoGWar3 0 0 9
## FoGWar4 0 0 0 10
## CoAWe6 0 0 0 0 11
## CoAWe7 0 0 0 0 0 12
## CoAWe8 0 0 0 0 0 0 13
##
## $psi
## l_FGWr l_CoAW
## l_FoGWar 14
## l_CoAWe 6 15
semPaths_default(m4)
summary(m4, fit.measures=TRUE, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 36 iterations
##
## Number of observations 500
##
## Estimator ML
## Minimum Function Test Statistic 39.972
## Degrees of freedom 13
## P-value (Chi-square) 0.000
##
## Model test baseline model:
##
## Minimum Function Test Statistic 701.420
## Degrees of freedom 21
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.960
## Tucker-Lewis Index (TLI) 0.936
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -5640.465
## Loglikelihood unrestricted model (H1) -5620.479
##
## Number of free parameters 15
## Akaike (AIC) 11310.930
## Bayesian (BIC) 11374.150
## Sample-size adjusted Bayesian (BIC) 11326.539
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.064
## 90 Percent Confidence Interval 0.042 0.088
## P-value RMSEA <= 0.05 0.133
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.042
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar =~
## FoGWar1 1.000 0.784 0.592
## FoGWar2 0.905 0.104 8.721 0.000 0.709 0.523
## FoGWar3 0.906 0.101 8.934 0.000 0.710 0.541
## FoGWar4 1.436 0.137 10.449 0.000 1.126 0.747
## l_CoAWe =~
## CoAWe6 1.000 0.793 0.610
## CoAWe7 1.087 0.113 9.605 0.000 0.861 0.686
## CoAWe8 0.937 0.105 8.966 0.000 0.743 0.577
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar ~~
## l_CoAWe 0.440 0.061 7.194 0.000 0.708 0.708
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .FoGWar1 1.141 0.088 12.909 0.000 1.141 0.650
## .FoGWar2 1.336 0.097 13.817 0.000 1.336 0.727
## .FoGWar3 1.217 0.089 13.609 0.000 1.217 0.707
## .FoGWar4 1.005 0.110 9.096 0.000 1.005 0.442
## .CoAWe6 1.059 0.089 11.923 0.000 1.059 0.628
## .CoAWe7 0.837 0.084 9.998 0.000 0.837 0.530
## .CoAWe8 1.105 0.088 12.568 0.000 1.105 0.667
## l_FoGWar 0.614 0.099 6.230 0.000 1.000 1.000
## l_CoAWe 0.628 0.101 6.191 0.000 1.000 1.000
We now see reasonable, though not perfect, fit on all practical measures. Thus, it appears that dropping this item largely resolves woes in the measurement model component.
m5.syntax <- '
l_FoGWar =~ FoGWar1 + FoGWar2 + FoGWar3 + FoGWar4 #definition of factor FoGWar with loadings on 4 items
l_CoAWe =~ CoAWe5 + CoAWe6 + CoAWe7 + CoAWe8 #definition of factor CoAWe with loadings on 4 items
l_FoGWar =~ CoAWe5 #allow cross-loading
l_FoGWar ~~ l_CoAWe
'
m5 <- sem(m5.syntax, simData)
inspect(m5)
## $lambda
## l_FGWr l_CoAW
## FoGWar1 0 0
## FoGWar2 1 0
## FoGWar3 2 0
## FoGWar4 3 0
## CoAWe5 7 0
## CoAWe6 0 4
## CoAWe7 0 5
## CoAWe8 0 6
##
## $theta
## FoGWr1 FoGWr2 FoGWr3 FoGWr4 CoAWe5 CoAWe6 CoAWe7 CoAWe8
## FoGWar1 9
## FoGWar2 0 10
## FoGWar3 0 0 11
## FoGWar4 0 0 0 12
## CoAWe5 0 0 0 0 13
## CoAWe6 0 0 0 0 0 14
## CoAWe7 0 0 0 0 0 0 15
## CoAWe8 0 0 0 0 0 0 0 16
##
## $psi
## l_FGWr l_CoAW
## l_FoGWar 17
## l_CoAWe 8 18
semPaths_default(m5)
summary(m5, fit.measures=TRUE, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 44 iterations
##
## Number of observations 500
##
## Estimator ML
## Minimum Function Test Statistic 47.988
## Degrees of freedom 18
## P-value (Chi-square) 0.000
##
## Model test baseline model:
##
## Minimum Function Test Statistic 994.023
## Degrees of freedom 28
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.969
## Tucker-Lewis Index (TLI) 0.952
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -6391.466
## Loglikelihood unrestricted model (H1) -6367.472
##
## Number of free parameters 18
## Akaike (AIC) 12818.933
## Bayesian (BIC) 12894.796
## Sample-size adjusted Bayesian (BIC) 12837.663
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.058
## 90 Percent Confidence Interval 0.038 0.078
## P-value RMSEA <= 0.05 0.238
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.039
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar =~
## FoGWar1 1.000 0.792 0.598
## FoGWar2 0.962 0.100 9.585 0.000 0.762 0.562
## FoGWar3 0.900 0.096 9.348 0.000 0.713 0.543
## FoGWar4 1.366 0.122 11.156 0.000 1.082 0.718
## l_CoAWe =~
## CoAWe5 1.000 0.487 0.337
## CoAWe6 1.642 0.384 4.271 0.000 0.800 0.616
## CoAWe7 1.742 0.407 4.284 0.000 0.849 0.675
## CoAWe8 1.539 0.362 4.249 0.000 0.750 0.583
## l_FoGWar =~
## CoAWe5 0.850 0.146 5.818 0.000 0.673 0.466
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar ~~
## l_CoAWe 0.269 0.067 4.045 0.000 0.697 0.697
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .FoGWar1 1.128 0.084 13.351 0.000 1.128 0.642
## .FoGWar2 1.257 0.091 13.779 0.000 1.257 0.684
## .FoGWar3 1.213 0.087 13.976 0.000 1.213 0.705
## .FoGWar4 1.101 0.100 11.002 0.000 1.101 0.485
## .CoAWe5 0.938 0.079 11.869 0.000 0.938 0.450
## .CoAWe6 1.047 0.084 12.387 0.000 1.047 0.620
## .CoAWe7 0.858 0.078 11.043 0.000 0.858 0.544
## .CoAWe8 1.093 0.084 12.952 0.000 1.093 0.660
## l_FoGWar 0.628 0.096 6.513 0.000 1.000 1.000
## l_CoAWe 0.237 0.105 2.264 0.024 1.000 1.000
Allowing for this cross-loading also solves the fit problems with our measurement model. What to do?
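Since m3 is nested in m5 (the cross-loading is the only difference), we can also check how the realized improvement compares with the MI’s one-step approximation:
# LRT: chi-square drops from 73.440 (m3) to 47.988 (m5), ~25.5 points for 1 df;
# close to (though not identical to) the MI estimate of 27.59
anova(m5, m3)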
For now, let’s enforce simple structure and drop CoAWe5 from further consideration.
Perhaps we also have problems in the structural model relating FoGWar and CoAWe to governmental spending and charitable donations. The informal way to handle this might be to retain factor scores from a well-fitting measurement model and conduct a simple path analysis to identify misfit. This somewhat flawed approach is still conceptually revealing since it reminds us that we are trying to assess structural relationships among variables, both latent and observed.
Let’s extract the factor scores from the measurement model in which we dropped CoAWe5 (the problematic item).
m6.syntax <- '
l_FoGWar =~ FoGWar1 + FoGWar2 + FoGWar3 + FoGWar4 #definition of factor FoGWar with loadings on 4 items
l_CoAWe =~ CoAWe6 + CoAWe7 + CoAWe8 #definition of factor CoAWe with loadings on 3 items
l_FoGWar ~~ l_CoAWe
'
m6 <- sem(m6.syntax, simData)
#summary(m6, fit.measures=TRUE)
fscores <- lavPredict(m6)
simData_augment <- bind_cols(simData, data.frame(fscores))
mstruct.syntax <- '
donations ~ l_FoGWar + l_CoAWe #predict criterion based on saved factors scores
gspending ~ donations #donations to NGOs predict governmental spending
'
mstruct <- sem(mstruct.syntax, simData_augment)
semPaths_default(mstruct)
summary(mstruct, fit.measures=TRUE, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 16 iterations
##
## Number of observations 500
##
## Estimator ML
## Minimum Function Test Statistic 66.921
## Degrees of freedom 2
## P-value (Chi-square) 0.000
##
## Model test baseline model:
##
## Minimum Function Test Statistic 393.685
## Degrees of freedom 5
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.833
## Tucker-Lewis Index (TLI) 0.582
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2216.116
## Loglikelihood unrestricted model (H1) -2182.655
##
## Number of free parameters 5
## Akaike (AIC) 4442.231
## Bayesian (BIC) 4463.304
## Sample-size adjusted Bayesian (BIC) 4447.434
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.255
## 90 Percent Confidence Interval 0.205 0.309
## P-value RMSEA <= 0.05 0.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.109
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## donations ~
## l_FoGWar 0.421 0.126 3.349 0.001 0.421 0.253
## l_CoAWe 0.305 0.128 2.391 0.017 0.305 0.181
## gspending ~
## donations 0.766 0.045 17.159 0.000 0.766 0.609
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .donations 1.066 0.067 15.811 0.000 1.066 0.827
## .gspending 1.285 0.081 15.811 0.000 1.285 0.629
We see from this analysis that the structural model (using saved factor scores) fits quite badly. These global fit statistics would never be considered acceptable, and they suggest that we cannot trust the structural parameter estimates. This is problematic since our hypotheses focused primarily on how fear of global warming and concern for animal welfare predict individual charitable contributions, which in turn influence governmental spending. We only have 2 \(df\) to play with, so what can we do?!
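(Where do those 2 \(df\) come from? With four observed variables, there are \(4(4+1)/2 = 10\) unique elements in the covariance matrix. Under lavaan's default of treating exogenous predictors as fixed (`fixed.x=TRUE`), the 3 variance/covariance terms among the factor scores are fixed at their sample values, and we estimate 5 free parameters: 3 regressions and 2 residual variances. That leaves \(10 - 3 - 5 = 2\).)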
modificationIndices(mstruct)
|    | lhs       | op | rhs       | mi    | epc    | sepc.lv | sepc.all | sepc.nox |
|----|-----------|----|-----------|-------|--------|---------|----------|----------|
| 9  | donations | ~~ | gspending | 62.46 | -0.993 | -0.993  | -0.612   | -0.612   |
| 10 | donations | ~  | gspending | 62.46 | -0.773 | -0.773  | -0.973   | -0.973   |
| 11 | gspending | ~  | l_FoGWar  | 60.05 | 0.631  | 0.631   | 0.301    | 0.441    |
| 12 | gspending | ~  | l_CoAWe   | 52.36 | 0.594  | 0.594   | 0.279    | 0.416    |
| 14 | l_FoGWar  | ~  | gspending | 7.70  | 0.041  | 0.041   | 0.085    | 0.085    |
| 17 | l_CoAWe   | ~  | gspending | 1.62  | 0.018  | 0.018   | 0.039    | 0.039    |
Watch out for those first two… Freeing either of them would yield a non-recursive model (bidirectional relationships between `donations` and `gspending`), which is rarely a good approach in SEM. But we do see a couple of more plausible candidates. As our path diagram reveals, we have assumed that the effects of FoGWar and CoAWe on governmental spending are entirely mediated by donations to charitable organizations. What if there are other reasons for the charity-government spending association? There could be unmeasured mediators, but our model enforces zero residual covariance of `l_FoGWar` and `l_CoAWe` with `gspending` after accounting for effects on charitable donations (`donations`).
Let’s consider the possibility that there is partial, but not full, mediation. To keep the structural model in overidentified territory, we can free just one of these directed paths (here, the direct effect of CoAWe on governmental spending) and examine global fit.
mstruct2.syntax <- '
donations ~ l_FoGWar + l_CoAWe #predict criterion based on saved factor scores
gspending ~ donations #donations to NGOs predict governmental spending
gspending ~ l_CoAWe #allow direct effect of CoAWe on governmental spending
'
mstruct2 <- sem(mstruct2.syntax, simData_augment)
semPaths_default(mstruct2)
summary(mstruct2, fit.measures=TRUE, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 19 iterations
##
## Number of observations 500
##
## Estimator ML
## Minimum Function Test Statistic 11.616
## Degrees of freedom 1
## P-value (Chi-square) 0.001
##
## Model test baseline model:
##
## Minimum Function Test Statistic 393.685
## Degrees of freedom 5
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.973
## Tucker-Lewis Index (TLI) 0.863
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2188.463
## Loglikelihood unrestricted model (H1) -2182.655
##
## Number of free parameters 6
## Akaike (AIC) 4388.926
## Bayesian (BIC) 4414.214
## Sample-size adjusted Bayesian (BIC) 4395.169
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.146
## 90 Percent Confidence Interval 0.079 0.226
## P-value RMSEA <= 0.05 0.011
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.019
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## donations ~
## l_FoGWar 0.421 0.126 3.349 0.001 0.421 0.253
## l_CoAWe 0.305 0.128 2.391 0.017 0.305 0.181
## gspending ~
## donations 0.628 0.046 13.661 0.000 0.628 0.499
## l_CoAWe 0.594 0.078 7.647 0.000 0.594 0.279
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .donations 1.066 0.067 15.811 0.000 1.066 0.827
## .gspending 1.151 0.073 15.811 0.000 1.151 0.563
We have moved in the right direction, with acceptable values of SRMR and CFI, but not TLI or RMSEA. Let’s examine the just-identified model in which both predictors have direct effects on governmental spending:
mstruct3.syntax <- '
donations ~ l_FoGWar + l_CoAWe #predict criterion based on saved factor scores
gspending ~ donations #donations to NGOs predict governmental spending
gspending ~ l_FoGWar + l_CoAWe #allow direct effects of both FoGWar and CoAWe on governmental spending
'
mstruct3 <- sem(mstruct3.syntax, simData_augment)
semPaths_default(mstruct3)
summary(mstruct3, fit.measures=TRUE, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 22 iterations
##
## Number of observations 500
##
## Estimator ML
## Minimum Function Test Statistic 0.000
## Degrees of freedom 0
## Minimum Function Value 0.0000000000000
##
## Model test baseline model:
##
## Minimum Function Test Statistic 393.685
## Degrees of freedom 5
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 1.000
## Tucker-Lewis Index (TLI) 1.000
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2182.655
## Loglikelihood unrestricted model (H1) -2182.655
##
## Number of free parameters 7
## Akaike (AIC) 4379.310
## Bayesian (BIC) 4408.813
## Sample-size adjusted Bayesian (BIC) 4386.594
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.000
## 90 Percent Confidence Interval 0.000 0.000
## P-value RMSEA <= 0.05 NA
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.000
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## donations ~
## l_FoGWar 0.421 0.126 3.349 0.001 0.421 0.253
## l_CoAWe 0.305 0.128 2.391 0.017 0.305 0.181
## gspending ~
## donations 0.605 0.046 13.161 0.000 0.605 0.480
## l_FoGWar 0.448 0.131 3.428 0.001 0.448 0.214
## l_CoAWe 0.227 0.132 1.719 0.086 0.227 0.107
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .donations 1.066 0.067 15.811 0.000 1.066 0.827
## .gspending 1.124 0.071 15.811 0.000 1.124 0.551
By definition, we have achieved perfect fit. Notice, though, that FoGWar retains a sizeable direct relationship with governmental spending after accounting for charitable donations (the direct effect of CoAWe is weaker and only marginally significant). We should not ignore these findings if we want the overall SEM to fit well and to trust the mediated effects of interest. Thus, we should strongly consider the saturated structural model.
As I mentioned above, extracting the factor scores is intuitive, and in some ways sensible, but it does not follow the SEM principle of trying to estimate all important effects, both measurement and structural, in a single model. In particular, we lose any information about covariances of the indicators of CoAWe and FoGWar with the outcomes of charitable donations and governmental spending. If there were strong residual associations after accounting for the factor model, we would miss them entirely by extracting factor scores.
Thus, to formally test the local fit of our candidate structural model (full mediation), we can leave the measurement model alone (drop CoAWe5, as above) but test an alternative in which we saturate the structural model only. That is, we can consider how well a model fits when `l_FoGWar`, `l_CoAWe`, `donations`, and `gspending` are allowed to correlate freely with each other.
This leads to a tricky situation, however, because in SEM (at least in conventional LISREL all-y notation), there is no matrix that represents covariances between observed and latent variables. Rather, \(\boldsymbol{\Theta}_\varepsilon\) contains the (residual) covariance structure for the observed variables and \(\boldsymbol{\Psi}\) contains the covariance structure of the latent variables.
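For reference, when structural regressions are included, the model-implied covariance matrix generalizes to
\[ \boldsymbol{\Sigma} = \boldsymbol{\Lambda}_y (\boldsymbol{I} - \boldsymbol{B})^{-1} \boldsymbol{\Psi} \left[ (\boldsymbol{I} - \boldsymbol{B})^{-1} \right]' \boldsymbol{\Lambda}_y' + \boldsymbol{\Theta}_\varepsilon \]
where \(\boldsymbol{B}\) holds the regressions among latent variables. Anything we want to appear in \(\boldsymbol{\Psi}\) or \(\boldsymbol{B}\) must therefore be represented as a latent variable.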
When we regressed observed variables (e.g., `gspending`) on latent variables (e.g., `l_FoGWar`) above, `lavaan` converted `gspending` into a single-indicator latent variable behind the scenes. This allows the residual variance of `gspending` to be incorporated into \(\boldsymbol{\Psi}\) while the regression coefficient goes into \(\boldsymbol{B}\). Here, we need to create the single-indicator latent variables ourselves, which is done by defining a latent variable with a fixed 1.0 loading on the observed variable. Note that this does not increase our number of free parameters because the loading is fixed. It is more of a trick to get the matrix math to work properly.
mstruct_saturated.syntax <- '
l_FoGWar =~ FoGWar1 + FoGWar2 + FoGWar3 + FoGWar4 #definition of factor FoGWar with loadings on 4 items
l_CoAWe =~ CoAWe6 + CoAWe7 + CoAWe8 #definition of factor CoAWe with loadings on 3 items
l_gspending =~ 1*gspending #could omit 1* since that is the default; left here for clarity
l_donations =~ 1*donations
#saturate structural model with undirected relationships
#4 choose 2 = 6 parameters
l_donations ~~ l_FoGWar
l_donations ~~ l_CoAWe
l_donations ~~ l_gspending
l_gspending ~~ l_FoGWar
l_gspending ~~ l_CoAWe
l_FoGWar ~~ l_CoAWe
'
mstruct_saturated <- sem(mstruct_saturated.syntax, simData)
semPaths_default(mstruct_saturated, sizeMan=10)
inspect(mstruct_saturated)
## $lambda
## l_FGWr l_CoAW l_gspn l_dntn
## FoGWar1 0 0 0 0
## FoGWar2 1 0 0 0
## FoGWar3 2 0 0 0
## FoGWar4 3 0 0 0
## CoAWe6 0 0 0 0
## CoAWe7 0 4 0 0
## CoAWe8 0 5 0 0
## gspending 0 0 0 0
## donations 0 0 0 0
##
## $theta
## FoGWr1 FoGWr2 FoGWr3 FoGWr4 CoAWe6 CoAWe7 CoAWe8 gspndn dontns
## FoGWar1 12
## FoGWar2 0 13
## FoGWar3 0 0 14
## FoGWar4 0 0 0 15
## CoAWe6 0 0 0 0 16
## CoAWe7 0 0 0 0 0 17
## CoAWe8 0 0 0 0 0 0 18
## gspending 0 0 0 0 0 0 0 0
## donations 0 0 0 0 0 0 0 0 0
##
## $psi
## l_FGWr l_CoAW l_gspn l_dntn
## l_FoGWar 19
## l_CoAWe 11 20
## l_gspending 9 10 21
## l_donations 6 7 8 22
Notice how the `$psi` matrix is constructed: the six covariances among the four latent variables are all free parameters (6-11), as are their four variances (19-22).
What about fit?
summary(mstruct_saturated, fit.measures=TRUE, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 38 iterations
##
## Number of observations 500
##
## Estimator ML
## Minimum Function Test Statistic 48.342
## Degrees of freedom 23
## P-value (Chi-square) 0.002
##
## Model test baseline model:
##
## Minimum Function Test Statistic 1103.651
## Degrees of freedom 36
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.976
## Tucker-Lewis Index (TLI) 0.963
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -7104.518
## Loglikelihood unrestricted model (H1) -7080.347
##
## Number of free parameters 22
## Akaike (AIC) 14253.036
## Bayesian (BIC) 14345.758
## Sample-size adjusted Bayesian (BIC) 14275.928
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.047
## 90 Percent Confidence Interval 0.028 0.065
## P-value RMSEA <= 0.05 0.580
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.035
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar =~
## FoGWar1 1.000 0.774 0.584
## FoGWar2 0.932 0.104 8.952 0.000 0.722 0.532
## FoGWar3 0.930 0.102 9.148 0.000 0.720 0.549
## FoGWar4 1.444 0.134 10.787 0.000 1.118 0.742
## l_CoAWe =~
## CoAWe6 1.000 0.794 0.611
## CoAWe7 1.083 0.109 9.977 0.000 0.860 0.684
## CoAWe8 0.936 0.102 9.197 0.000 0.743 0.578
## l_gspending =~
## gspending 1.000 1.429 1.000
## l_donations =~
## donations 1.000 1.136 1.000
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar ~~
## l_donations 0.388 0.056 6.935 0.000 0.442 0.442
## l_CoAWe ~~
## l_donations 0.378 0.057 6.573 0.000 0.419 0.419
## l_gspending ~~
## l_donations 0.988 0.085 11.628 0.000 0.609 0.609
## l_FoGWar ~~
## l_gspending 0.606 0.076 7.956 0.000 0.548 0.548
## l_CoAWe ~~
## l_gspending 0.568 0.076 7.454 0.000 0.501 0.501
## l_FoGWar ~~
## l_CoAWe 0.434 0.060 7.267 0.000 0.706 0.706
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .FoGWar1 1.157 0.087 13.325 0.000 1.157 0.659
## .FoGWar2 1.317 0.095 13.930 0.000 1.317 0.717
## .FoGWar3 1.203 0.087 13.755 0.000 1.203 0.699
## .FoGWar4 1.023 0.103 9.882 0.000 1.023 0.450
## .CoAWe6 1.056 0.087 12.194 0.000 1.056 0.626
## .CoAWe7 0.840 0.081 10.432 0.000 0.840 0.532
## .CoAWe8 1.103 0.086 12.798 0.000 1.103 0.666
## .gspending 0.000 0.000 0.000
## .donations 0.000 0.000 0.000
## l_FoGWar 0.599 0.096 6.259 0.000 1.000 1.000
## l_CoAWe 0.630 0.100 6.317 0.000 1.000 1.000
## l_gspending 2.042 0.129 15.811 0.000 1.000 1.000
## l_donations 1.289 0.082 15.811 0.000 1.000 1.000
It’s not perfect, but looks pretty good – all practical fit indices are in acceptable ranges. Thus, in the full SEM, saturating the structural model and fixing up the measurement model yields an acceptable solution.
The crucial test is how this model compares with our a priori hypothesis of mediation. We can use an LRT to test this because the restricted model is nested within the saturated (structural) model. To get the LRT right (i.e., exactly comparable models), we should use the single-indicator approach explicitly in the original model, rather than allowing lavaan to change the parameterization behind the scenes.
msem.syntax <- '
l_FoGWar =~ FoGWar1 + FoGWar2 + FoGWar3 + FoGWar4 #definition of factor FoGWar with loadings on 4 items
l_CoAWe =~ CoAWe6 + CoAWe7 + CoAWe8 #definition of factor CoAWe with loadings on 3 items
l_gspending =~ 1*gspending #could omit 1* since that is the default; left here for clarity
l_donations =~ 1*donations
l_donations ~ l_FoGWar + l_CoAWe #predict criterion based on SEM
l_gspending ~ l_donations #donations predict governmental spending
l_FoGWar ~~ l_CoAWe #correlated factors
'
msem <- sem(msem.syntax, simData)
semPaths_default(msem)
inspect(msem)
## $lambda
## l_FGWr l_CoAW l_gspn l_dntn
## FoGWar1 0 0 0 0
## FoGWar2 1 0 0 0
## FoGWar3 2 0 0 0
## FoGWar4 3 0 0 0
## CoAWe6 0 0 0 0
## CoAWe7 0 4 0 0
## CoAWe8 0 5 0 0
## gspending 0 0 0 0
## donations 0 0 0 0
##
## $theta
## FoGWr1 FoGWr2 FoGWr3 FoGWr4 CoAWe6 CoAWe7 CoAWe8 gspndn dontns
## FoGWar1 10
## FoGWar2 0 11
## FoGWar3 0 0 12
## FoGWar4 0 0 0 13
## CoAWe6 0 0 0 0 14
## CoAWe7 0 0 0 0 0 15
## CoAWe8 0 0 0 0 0 0 16
## gspending 0 0 0 0 0 0 0 0
## donations 0 0 0 0 0 0 0 0 0
##
## $psi
## l_FGWr l_CoAW l_gspn l_dntn
## l_FoGWar 17
## l_CoAWe 9 18
## l_gspending 0 0 19
## l_donations 0 0 0 20
##
## $beta
## l_FGWr l_CoAW l_gspn l_dntn
## l_FoGWar 0 0 0 0
## l_CoAWe 0 0 0 0
## l_gspending 0 0 0 8
## l_donations 6 7 0 0
anova(mstruct_saturated, msem)
|                   | Df | AIC   | BIC   | Chisq | Chisq diff | Df diff | Pr(>Chisq) |
|-------------------|----|-------|-------|-------|------------|---------|------------|
| mstruct_saturated | 23 | 14253 | 14346 | 48.3  | NA         | NA      | NA         |
| msem              | 25 | 14316 | 14400 | 115.3 | 67         | 2       | 0          |
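Plugging the values from the table into the nested LRT formula:
\[ \Delta \chi^2 = 115.3 - 48.3 = 67.0, \qquad \Delta df = 25 - 23 = 2 \]
which is highly significant for a \(\chi^2\) distribution with 2 \(df\).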
We see that the LRT is highly significant, and the AIC and BIC for the saturated model are much better than for the restricted mediation model. Thus, the hypothesis that charitable donations fully mediate the relationship of FoGWar and CoAWe with governmental spending is not supported.
What about the alternative that the mediation can be partial, and that FoGWar and CoAWe also have direct effects?
msem_partial.syntax <- '
l_FoGWar =~ FoGWar1 + FoGWar2 + FoGWar3 + FoGWar4 #definition of factor FoGWar with loadings on 4 items
l_CoAWe =~ CoAWe6 + CoAWe7 + CoAWe8 #definition of factor CoAWe with loadings on 3 items
l_gspending =~ 1*gspending #could omit 1* since that is the default; left here for clarity
l_donations =~ 1*donations
l_donations ~ l_FoGWar + l_CoAWe #predict criterion based on SEM
l_gspending ~ l_donations + l_FoGWar + l_CoAWe #all three predict spending
l_FoGWar ~~ l_CoAWe #correlated factors
'
msem_partial <- sem(msem_partial.syntax, simData)
semPaths_default(msem_partial)
inspect(msem_partial)
## $lambda
## l_FGWr l_CoAW l_gspn l_dntn
## FoGWar1 0 0 0 0
## FoGWar2 1 0 0 0
## FoGWar3 2 0 0 0
## FoGWar4 3 0 0 0
## CoAWe6 0 0 0 0
## CoAWe7 0 4 0 0
## CoAWe8 0 5 0 0
## gspending 0 0 0 0
## donations 0 0 0 0
##
## $theta
## FoGWr1 FoGWr2 FoGWr3 FoGWr4 CoAWe6 CoAWe7 CoAWe8 gspndn dontns
## FoGWar1 12
## FoGWar2 0 13
## FoGWar3 0 0 14
## FoGWar4 0 0 0 15
## CoAWe6 0 0 0 0 16
## CoAWe7 0 0 0 0 0 17
## CoAWe8 0 0 0 0 0 0 18
## gspending 0 0 0 0 0 0 0 0
## donations 0 0 0 0 0 0 0 0 0
##
## $psi
## l_FGWr l_CoAW l_gspn l_dntn
## l_FoGWar 19
## l_CoAWe 11 20
## l_gspending 0 0 21
## l_donations 0 0 0 22
##
## $beta
## l_FGWr l_CoAW l_gspn l_dntn
## l_FoGWar 0 0 0 0
## l_CoAWe 0 0 0 0
## l_gspending 9 10 0 8
## l_donations 6 7 0 0
summary(msem_partial, fit.measures=TRUE, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 49 iterations
##
## Number of observations 500
##
## Estimator ML
## Minimum Function Test Statistic 48.342
## Degrees of freedom 23
## P-value (Chi-square) 0.002
##
## Model test baseline model:
##
## Minimum Function Test Statistic 1103.651
## Degrees of freedom 36
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.976
## Tucker-Lewis Index (TLI) 0.963
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -7104.518
## Loglikelihood unrestricted model (H1) -7080.347
##
## Number of free parameters 22
## Akaike (AIC) 14253.036
## Bayesian (BIC) 14345.758
## Sample-size adjusted Bayesian (BIC) 14275.928
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.047
## 90 Percent Confidence Interval 0.028 0.065
## P-value RMSEA <= 0.05 0.580
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.035
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar =~
## FoGWar1 1.000 0.774 0.584
## FoGWar2 0.932 0.104 8.952 0.000 0.722 0.532
## FoGWar3 0.930 0.102 9.148 0.000 0.720 0.549
## FoGWar4 1.444 0.134 10.787 0.000 1.118 0.742
## l_CoAWe =~
## CoAWe6 1.000 0.794 0.611
## CoAWe7 1.083 0.109 9.977 0.000 0.860 0.684
## CoAWe8 0.936 0.102 9.197 0.000 0.743 0.578
## l_gspending =~
## gspending 1.000 1.429 1.000
## l_donations =~
## donations 1.000 1.136 1.000
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_donations ~
## l_FoGWar 0.427 0.132 3.223 0.001 0.291 0.291
## l_CoAWe 0.305 0.131 2.336 0.020 0.214 0.214
## l_gspending ~
## l_donations 0.551 0.049 11.142 0.000 0.438 0.438
## l_FoGWar 0.481 0.139 3.452 0.001 0.261 0.261
## l_CoAWe 0.240 0.135 1.781 0.075 0.133 0.133
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## l_FoGWar ~~
## l_CoAWe 0.434 0.060 7.267 0.000 0.706 0.706
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .FoGWar1 1.157 0.087 13.325 0.000 1.157 0.659
## .FoGWar2 1.317 0.095 13.930 0.000 1.317 0.717
## .FoGWar3 1.203 0.087 13.755 0.000 1.203 0.699
## .FoGWar4 1.023 0.103 9.882 0.000 1.023 0.450
## .CoAWe6 1.056 0.087 12.194 0.000 1.056 0.626
## .CoAWe7 0.840 0.081 10.432 0.000 0.840 0.532
## .CoAWe8 1.103 0.086 12.798 0.000 1.103 0.666
## .gspending 0.000 0.000 0.000
## .donations 0.000 0.000 0.000
## l_FoGWar 0.599 0.096 6.259 0.000 1.000 1.000
## l_CoAWe 0.630 0.100 6.317 0.000 1.000 1.000
## .l_gspending 1.070 0.072 14.797 0.000 0.524 0.524
## .l_donations 1.008 0.068 14.744 0.000 0.782 0.782
Hmm, looks pretty gosh-darned similar to the model with a saturated structural component.
anova(msem_partial, mstruct_saturated)
|                   | Df | AIC   | BIC   | Chisq | Chisq diff | Df diff | Pr(>Chisq) |
|-------------------|----|-------|-------|-------|------------|---------|------------|
| msem_partial      | 23 | 14253 | 14346 | 48.3  | NA         | NA      | NA         |
| mstruct_saturated | 23 | 14253 | 14346 | 48.3  | 0          | 0       | 1          |
Ignore the p-value for a second and just notice the identical `df` and `AIC`. Likewise, here are the log-likelihood estimates for each:
cat("Partial mediation model:\n")
## Partial mediation model:
logLik(msem_partial)
## 'log Lik.' -7105 (df=22)
cat("\nSaturated structural model:\n")
##
## Saturated structural model:
logLik(mstruct_saturated)
## 'log Lik.' -7105 (df=22)
These are equivalent models, a topic we will spend more time on next week. For now, know that replacing a covariance term with a regression term yields identical fit, and recall that a covariance is undirected. Thus, in our candidate model that allows for partial mediation, we believe the effects are directed, but because there are strong associations among all four variables (FoGWar, CoAWe, gspending, and donations), we cannot constrain the model to be more specific. Therefore, there is no statistical evidence for our view of partial mediation relative to a model in which all structural relationships are undirected.
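As a quick sanity check on their equivalence (a minimal sketch), the two models should imply identical covariance matrices:
all.equal(fitted(msem_partial)$cov, fitted(mstruct_saturated)$cov) #TRUE for equivalent models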
As noted above, modification indices provide direct feedback about potential omitted effects in the model. That is, they help to identify relationships (measurement or structural) that, if included, would improve model fit (in \(\chi^2\) terms).
However, examination of residuals and standardized residuals can provide additional, complementary, guidance about sources of misfit in the model. Recall that the goal of SEM is to provide a good representation of the covariance matrix with the fewest parameters possible. Thus, residuals are conceptualized in terms of observed vs. predicted values for specific covariance terms, as in RMR and SRMR.
We can obtain raw residuals using the `resid()` function. For our earlier SEM with a saturated structural model and no CoAWe5, we get:
resid(mstruct_saturated)
## $type
## [1] "raw"
##
## $cov
## FoGWr1 FoGWr2 FoGWr3 FoGWr4 CoAWe6 CoAWe7 CoAWe8 gspndn dontns
## FoGWar1 0.000
## FoGWar2 0.130 0.000
## FoGWar3 0.063 0.102 0.000
## FoGWar4 -0.015 -0.098 -0.057 0.000
## CoAWe6 -0.096 0.008 -0.037 0.124 0.000
## CoAWe7 -0.052 -0.083 -0.049 0.155 -0.040 0.000
## CoAWe8 -0.093 -0.158 -0.111 0.059 0.034 0.019 0.000
## gspending -0.068 0.035 0.052 -0.005 0.007 0.015 -0.031 0.000
## donations -0.041 0.013 -0.055 0.048 0.007 -0.015 0.015 0.000 0.000
##
## $mean
## FoGWar1 FoGWar2 FoGWar3 FoGWar4 CoAWe6 CoAWe7 CoAWe8
## 0 0 0 0 0 0 0
## gspending donations
## 0 0
Note that we get a vector of means (all zero), even though we didn’t fit a mean structure. Each of the covariance residuals reflects the difference between the sample covariance matrix, \(\boldsymbol{S_{XX}}\), and the model-implied covariance matrix, \(\hat{\boldsymbol{\Sigma}}({\boldsymbol{\theta}})\).
The model-implied covariance matrix is given by `fitted()`:
fitted(mstruct_saturated)
## $cov
## FoGWr1 FoGWr2 FoGWr3 FoGWr4 CoAWe6 CoAWe7 CoAWe8 gspndn dontns
## FoGWar1 1.756
## FoGWar2 0.559 1.838
## FoGWar3 0.557 0.520 1.721
## FoGWar4 0.865 0.807 0.805 2.272
## CoAWe6 0.434 0.405 0.404 0.627 1.687
## CoAWe7 0.470 0.438 0.437 0.679 0.683 1.579
## CoAWe8 0.407 0.379 0.378 0.587 0.590 0.639 1.656
## gspending 0.606 0.565 0.564 0.876 0.568 0.615 0.532 2.042
## donations 0.388 0.362 0.361 0.561 0.378 0.409 0.354 0.988 1.289
##
## $mean
## FoGWar1 FoGWar2 FoGWar3 FoGWar4 CoAWe6 CoAWe7 CoAWe8
## 0 0 0 0 0 0 0
## gspending donations
## 0 0
And the sample covariance is given by `cov()`:
cov(dplyr::select(simData, -CoAWe5))
## FoGWar1 FoGWar2 FoGWar3 FoGWar4 CoAWe6 CoAWe7 CoAWe8 donations
## FoGWar1 1.759 0.690 0.622 0.852 0.339 0.419 0.314 0.348
## FoGWar2 0.690 1.842 0.623 0.710 0.414 0.356 0.222 0.376
## FoGWar3 0.622 0.623 1.725 0.750 0.368 0.389 0.267 0.307
## FoGWar4 0.852 0.710 0.750 2.277 0.753 0.835 0.647 0.610
## CoAWe6 0.339 0.414 0.368 0.753 1.690 0.644 0.626 0.385
## CoAWe7 0.419 0.356 0.389 0.835 0.644 1.582 0.660 0.395
## CoAWe8 0.314 0.222 0.267 0.647 0.626 0.660 1.659 0.370
## donations 0.348 0.376 0.307 0.610 0.385 0.395 0.370 1.292
## gspending 0.540 0.602 0.618 0.873 0.577 0.632 0.502 0.990
## gspending
## FoGWar1 0.540
## FoGWar2 0.602
## FoGWar3 0.618
## FoGWar4 0.873
## CoAWe6 0.577
## CoAWe7 0.632
## CoAWe8 0.502
## donations 0.990
## gspending 2.046
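One subtlety if you try to verify the residuals by hand: lavaan's ML sample statistics divide by \(N\) rather than \(N-1\), so the output of `cov()` will not match exactly. A minimal sketch of the check, using `inspect()` to retrieve lavaan's own sample statistics:
S_ml <- inspect(mstruct_saturated, "sampstat")$cov #ML (divide-by-N) sample covariance
round(S_ml - fitted(mstruct_saturated)$cov, 3) #reproduces resid(mstruct_saturated)$cov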
The raw residuals are always worth examining, but when the variables differ in scale, or their variances are far from 1, raw covariance residuals can be hard to interpret. How much is too much?
This is where standardized residuals play a useful role in diagnosing misfit. Standardization is performed by dividing each raw residual by an estimate of its standard error. Long story short, they are in z-score terms; if you see `NA` values, you can safely ignore them. We simply modify our `resid()` call slightly to obtain these:
resid(mstruct_saturated, type="standardized")
## $type
## [1] "standardized"
##
## $cov
## FoGWr1 FoGWr2 FoGWr3 FoGWr4 CoAWe6 CoAWe7 CoAWe8 gspndn dontns
## FoGWar1 0.000
## FoGWar2 2.665 NA
## FoGWar3 1.417 2.052 NA
## FoGWar4 -0.524 -3.402 -1.993 0.000
## CoAWe6 -1.882 0.148 -0.692 2.450 NA
## CoAWe7 -1.133 -1.686 -1.041 3.407 -2.031 0.000
## CoAWe8 -1.776 -2.830 -2.081 1.149 1.004 0.743 0.000
## gspending -1.533 0.674 1.057 -0.144 0.179 0.457 -0.710 0.000
## donations -1.019 0.294 -1.297 1.497 0.180 -0.524 0.386 0.000 0.000
##
## $mean
## FoGWar1 FoGWar2 FoGWar3 FoGWar4 CoAWe6 CoAWe7 CoAWe8
## 0 0 0 0 0 0 0
## gspending donations
## 0 0
Large positive values (e.g., > 2) indicate that the model underestimates the covariance term, whereas large negative values (e.g., < -2) indicate overestimation. This follows from regression, where residuals are \(Y - \hat{Y}\). Thus, standardized residuals give us direct information about which covariance terms are most misestimated.
When examining standardized residuals, remember that, as in null hypothesis significance testing, there is a chance of a false positive. That is, if you have 20 standardized residuals, you expect about one of them to be ‘significant’ (\(|z| > 1.96\)) by chance even if the model is correct.
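To calibrate expectations, we can count how many unique standardized residuals exceed that threshold (a minimal sketch):
sres <- resid(mstruct_saturated, type="standardized")$cov
#count lower-triangle elements with |z| > 1.96, ignoring the NA entries
sum(abs(sres[lower.tri(sres, diag=TRUE)]) > 1.96, na.rm=TRUE)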
We can also convert the raw residuals into correlational units (standardizing the variances to 1) if that is more intuitive. These units give us a sense of misfit on a common scale, where we might consider values \(|r| > .10\) as troubling a priori (just a rule of thumb).
resid(mstruct_saturated, type="cor")
## $type
## [1] "cor.bollen"
##
## $cor
## FoGWr1 FoGWr2 FoGWr3 FoGWr4 CoAWe6 CoAWe7 CoAWe8 gspndn dontns
## FoGWar1 0.000
## FoGWar2 0.072 0.000
## FoGWar3 0.036 0.058 0.000
## FoGWar4 -0.007 -0.048 -0.029 0.000
## CoAWe6 -0.056 0.005 -0.022 0.064 0.000
## CoAWe7 -0.031 -0.049 -0.030 0.082 -0.024 0.000
## CoAWe8 -0.054 -0.091 -0.066 0.030 0.021 0.012 0.000
## gspending -0.036 0.018 0.028 -0.002 0.004 0.008 -0.017 0.000
## donations -0.027 0.009 -0.037 0.028 0.004 -0.010 0.010 0.000 0.000
##
## $mean
## FoGWar1 FoGWar2 FoGWar3 FoGWar4 CoAWe6 CoAWe7 CoAWe8
## 0 0 0 0 0 0 0
## gspending donations
## 0 0
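Rather than eyeballing the matrix, we can flag any elements that exceed the rule of thumb (a sketch; in the output above, none do):
rres <- resid(mstruct_saturated, type="cor")$cor
#flag unique (lower-triangle) elements with |r| > .10
which(abs(rres) > .10 & lower.tri(rres), arr.ind=TRUE)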
Finally, examining the residuals can help guide your search through the modification indices, pointing to modifications that are both justifiable and most related to the covariance misestimation central to your substantive question.
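For example, a minimal sketch using the full-mediation SEM `msem` from above:
mi <- modificationIndices(msem)
#sort by MI and view the top candidates alongside their expected parameter changes
head(mi[order(mi$mi, decreasing=TRUE), c("lhs", "op", "rhs", "mi", "epc", "sepc.all")])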
For more details on residuals, see: https://www.statmodel.com/download/StandardizedResiduals.pdf.