1 Natalia

1.1 Measurement invariance against a validated measure

Question: “Regarding measurement invariance, can you say more about how to conduct measurement invariance testing when the comparison group is the original validated measure? For example, if in my study I am trying to use a construct previously validated in a US sample, and using it for the first time in an Indian population, what are the guidelines for demonstrating measurement invariance here? We discussed model comparisons in the context of our own data (e.g., in a scenario where we have data from US and Indian populations) but not in the context of when we only have one of those (e.g., non-validated sample = Indian) and want to compare to previously validated samples that we did not collect, but that are available in the literature.”

Answer: If you have the original mean vector and covariance matrix, as well as the sample size, from the published paper, you may be able to hack together something workable. Remember that SEM models a mean vector and covariance matrix as its target, so raw data are not strictly required: you can provide a covariance matrix to lavaan as input via the sample.cov argument, as in the example below.

# Brown book, Chapter 5
library(lavaan)

# input: the lower triangle of a correlation matrix; see p. 169, Table 5.2 (the LISREL syntax)
cormat <- '
1.000
0.300 1.000
0.229 0.261 1.000
0.411 0.406 0.429 1.000
0.172 0.252 0.218 0.481 1.000
0.214 0.268 0.267 0.579 0.484 1.000
0.200 0.214 0.241 0.543 0.426 0.492 1.000
0.185 0.230 0.185 0.545 0.463 0.548 0.522 1.000
0.134 0.146 0.108 0.186 0.122 0.131 0.108 0.151 1.000
0.134 0.099 0.061 0.223 0.133 0.188 0.105 0.170 0.448 1.000
0.160 0.131 0.158 0.161 0.044 0.124 0.066 0.061 0.370 0.350 1.000
0.087 0.088 0.101 0.198 0.077 0.177 0.128 0.112 0.356 0.359 0.507 1.000'

#convert it to a square matrix
Cmat <- getCov(cormat)

# the standard deviations of the variables (Table 5.2)
neodat.sdev <- c(2.06, 1.52, 1.92, 1.41, 1.73, 1.77, 2.49, 2.27, 2.68, 1.75, 2.57, 2.66)

# convert the correlation matrix to a covariance matrix (lavaan needs a covariance matrix as input)
covmat <- lavaan::cor2cov(Cmat, sds=neodat.sdev)

# assign row and column names to the covariance matrix (they must match the model variables)
rownames(covmat) <- paste0("X", 1:12)
colnames(covmat) <- rownames(covmat)

# model setup: this is the model presented in Figure 1, p. 168
# you can play around with this model following the text of Ch. 5
new.model <- '# X4 loads on both the coping and social factors
              coping  =~ X1 + X2 + X3 + X4
              social  =~ X5 + X6 + X7 + X8 + X4
              enhance =~ X9 + X10 + X11 + X12
              # residual correlation
              X11 ~~ X12'

# fit the model, supplying the published covariance matrix and sample size
fit <- cfa(new.model, sample.cov=covmat, sample.nobs=500)
summary(fit)
## lavaan 0.6-3 ended normally after 60 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                         29
## 
##   Number of observations                           500
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                      44.955
##   Degrees of freedom                                49
##   P-value (Chi-square)                           0.638
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard Errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   coping =~                                           
##     X1                1.000                           
##     X2                0.740    0.094    7.909    0.000
##     X3                0.933    0.118    7.903    0.000
##     X4                0.719    0.118    6.070    0.000
##   social =~                                           
##     X5                1.000                           
##     X6                1.209    0.093   13.049    0.000
##     X7                1.572    0.127   12.354    0.000
##     X8                1.518    0.118   12.870    0.000
##     X4                0.565    0.087    6.485    0.000
##   enhance =~                                          
##     X9                1.000                           
##     X10               0.648    0.070    9.293    0.000
##     X11               0.776    0.093    8.340    0.000
##     X12               0.802    0.096    8.327    0.000
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##  .X11 ~~                                              
##    .X12               1.460    0.300    4.873    0.000
##   coping ~~                                           
##     social            0.704    0.108    6.531    0.000
##     enhance           0.669    0.145    4.613    0.000
##   social ~~                                           
##     enhance           0.566    0.128    4.412    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .X1                3.117    0.230   13.546    0.000
##    .X2                1.694    0.125   13.527    0.000
##    .X3                2.705    0.200   13.536    0.000
##    .X4                0.454    0.070    6.502    0.000
##    .X5                1.794    0.130   13.835    0.000
##    .X6                1.384    0.115   12.015    0.000
##    .X7                3.240    0.248   13.089    0.000
##    .X8                2.393    0.194   12.352    0.000
##    .X9                3.958    0.400    9.895    0.000
##    .X10               1.710    0.170   10.063    0.000
##    .X11               4.657    0.371   12.545    0.000
##    .X12               4.997    0.398   12.561    0.000
##     coping            1.118    0.217    5.158    0.000
##     social            1.193    0.164    7.293    0.000
##     enhance           3.210    0.490    6.550    0.000
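
To test invariance directly, you can extend this trick to a multi-group model: lavaan accepts a list of covariance matrices and a list of sample sizes, one per group. A minimal sketch, assuming your own raw Indian data live in a data frame my.india.data whose columns are named X1-X12 to match the model (all object names here are hypothetical):

# summarize your own (Indian) sample the same way as the published one
covmat.india <- cov(my.india.data)
n.india <- nrow(my.india.data)

# configural model: same factor structure, all parameters free across groups
fit.config <- cfa(new.model,
                  sample.cov  = list(US = covmat, India = covmat.india),
                  sample.nobs = list(500, n.india))

# metric (weak) invariance: factor loadings constrained equal across groups
fit.metric <- cfa(new.model,
                  sample.cov  = list(US = covmat, India = covmat.india),
                  sample.nobs = list(500, n.india),
                  group.equal = "loadings")

# likelihood ratio test of the loading constraints
anova(fit.config, fit.metric)

Note that testing scalar (intercept) invariance would additionally require the observed means from both samples, supplied via sample.mean with meanstructure = TRUE; with covariance matrices alone you can only go as far as metric invariance.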

The downside of this approach is that you cannot look at higher moments of the data that can affect fit. In particular, information about skewness (third moment) and kurtosis (fourth moment) is lost when we fit a model to a mean (first moment) and covariance (second moment) structure alone. Thus, robust estimators such as MLR, which adjust test statistics for non-normality, cannot help us here because they require the raw data.

An alternative, and probably the better approach in general, is to contact the authors of the published paper and ask if they would share the data for this purpose. They may wish to be a co-author on the resulting paper, which is up for discussion. But in the era of open science, people are increasingly willing to share their data without insisting on authorship.

1.2 Longitudinal and multilevel analysis

Question: “I am interested in learning more about how to apply SEM in a longitudinal multilevel context. For example, in a daily diary cross-lagged panel design. What are the advantages of using SEM for these types of studies as compared to regular multilevel modeling?”

Answer: The main advantage of SEM for multilevel questions is that it allows you to model latent variables in nested data (e.g., longitudinal designs, where observations are nested within people). Just as in the GLM, if we have nested observations and do not account for the dependence, our test statistics will be invalid.

Many multilevel models can be fit as-is within an SEM framework; that is, SEM software can estimate models that are truly equivalent to those fit in the multilevel tradition using the lme4 or nlme packages. This means that if you are more comfortable thinking in SEM terms, it may be to your advantage to stay within lavaan or Mplus simply for consistency of representation.
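
For example, the classic linear latent growth curve model is statistically the same model as a two-level MLM with a random intercept and a random slope for time. A minimal lavaan sketch, assuming wide-format data with four equally spaced occasions t1-t4 (hypothetical variable and object names):

library(lavaan)

growth.model <- '
  # random intercept: loads 1 on every occasion
  i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
  # random linear slope: loadings fixed to the time codes
  s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4'

# growth() fixes the indicator intercepts to 0 and frees the latent means
fit.growth <- growth(growth.model, data = dat.wide)
summary(fit.growth)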

The other main advantage of multilevel SEM (MSEM) is that it uses a latent decomposition to dissect a so-called ‘level 1’ (L1, lower-level) variable into between- and within-cluster components. In conventional MLM, this dissection is on your shoulders, usually requiring you to within-cluster center an L1 variable and also enter the cluster means as an L2 predictor, as sketched below. Without partitioning the L1 variable, your coefficients will represent the total effect, which conflates between- and within-cluster effects. This is an often overlooked step in MLM that can drastically alter interpretations: people often assume that a significant effect of an L1 variable represents a within-cluster (e.g., within-person) effect, but this need not be the case. See Curran & Bauer (2011), Annual Review of Psychology, for details on disaggregating variance across levels.
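
A minimal sketch of this manual disaggregation in lme4, assuming a long-format data frame dat with cluster identifier id, outcome y, and an L1 predictor x (all names hypothetical):

library(lme4)

# split x into its between- and within-cluster parts by hand
dat$x.between <- ave(dat$x, dat$id)       # observed cluster means of x (an L2 predictor)
dat$x.within  <- dat$x - dat$x.between    # within-cluster (group-mean-centered) deviations

# each component now gets its own coefficient
m <- lmer(y ~ x.within + x.between + (1 | id), data = dat)
summary(m)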

For reasons that go beyond this class, MSEM provides a more accurate partitioning of variance between the levels. At the between-person level, L1 (within) variables are treated as latent factors that capture the person/cluster tendencies in the L1 process.
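
In lavaan (version 0.6 and later), this latent decomposition happens automatically for any variable that appears at both levels. A minimal two-level sketch using the same hypothetical dat as above:

ml.model <- '
  level: 1
    y ~ x    # within-cluster effect of x
  level: 2
    y ~ x    # between-cluster effect, via the latent cluster means of x'

fit.ml <- sem(ml.model, data = dat, cluster = "id")
summary(fit.ml)

Unlike the observed cluster means in the lme4 sketch above, these latent cluster means are corrected for sampling error in each cluster’s mean, which is roughly why the partitioning is more accurate.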