Tomarken and Waller (2003) note the following problems with models that appear to fit well according to global statistics:
They also offer the following advice for avoiding overreliance on global fit statistics:
Equivalent models have the same degrees of freedom but specify different relationships among variables. They fit the data equally well, yielding identical fit statistics and an identical sample likelihood. Equivalent models pose a problem in SEM when our core conclusions rest on the directionality of a relationship, because reversing cause and effect can produce a model with exactly the same fit.
In general, directed (causal) relations cannot be inferred from covariance data without strong assumptions about the processes that generated the data. For example, if one variable causes another, the cause must precede the effect in time. Likewise, causal relationships are more plausible if one variable is manipulated experimentally and its effects on dependent variables are measured. Thus, the problem of equivalent models largely points to a limitation of relying on observational techniques and correlational measures.
Within a block of variables at the beginning of a structural model that is just identified and with unidirectional relations to subsequent variables, direct effects, correlated disturbances, and equality-constrained reciprocal effects are interchangeable. This means we can replace \(Y1 \rightarrow Y2\) with \(Y1 \leftarrow Y2\), \(Y2 \leftrightharpoons Y1\), or \(D1 \leftrightarrow D2\).
At subsequent places in the model where two endogenous variables have the same causes and their relations are unidirectional, all of the following may be substituted for one another: \(Y1 \rightarrow Y2\), \(Y1 \leftarrow Y2\), \(Y2 \leftrightharpoons Y1\) (equality-constrained), or \(D1 \leftrightarrow D2\).
Limitations of replacing rules:
From Mulaik (2009)
Model A: \(Y1\) causes \(Y2\)
Model B: \(Y2\) causes \(Y1\)
Model C: no causal path between \(Y1\) and \(Y2\). Note that this seems like a conceptually distinct position from Models A and B.
From Hershberger (2006)
Use an index for judging model quality that is not based on global goodness-of-fit or likelihood. For example, the information complexity criterion (ICOMP) approach quantifies model misfit (residual covariances) relative to model complexity (correlations among parameter estimates, where greater correlation indicates greater complexity). See the semGOF package. Among equivalent models, the model with the lower ICOMP would be preferred.
Eliminate equivalent models that have problems during estimation, such as many iterations, negative variance estimates, or parameters that have to be constrained to avoid improper solutions. This is a heuristic approach.
Combine \(R^2\) measures for endogenous variables into a single measure (e.g., mean) and prefer the model with the highest value. This is also heuristic and relatively untested.
Examine extended individual case residuals (EICR). Although the residual covariance matrix is identical among equivalent models, the residuals in terms of predicted versus observed scores on manifest variables are not. Thus, according to Raykov and Penev (2001), a model that minimizes individual case residuals should be preferred over equivalents.
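Here is a rough sketch of the EICR idea (not the exact Raykov and Penev formula), assuming a hypothetical fitted lavaan object called fit in which every observed variable is an indicator of a latent factor and the model was estimated on raw data:
#rough per-case residual summary in the spirit of EICR (illustrative only)
yhat <- lavPredict(fit, type="ov")                #model-predicted values of the observed indicators
obs <- lavInspect(fit, "data")[, colnames(yhat)]  #raw data, columns aligned with the predictions
eicr <- rowSums((obs - yhat)^2)                   #crude per-case residual magnitude
mean(eicr)                                        #compare across equivalent models; smaller is preferred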
Returning to the replacing rules, consider this example:
m <- '
Happiness ~ 0.3*Money + 0.3*Love
Travel ~ 0.4*Happiness + 0.3*Money
'
d <- simulateData(m, sample.nobs=500)
#groundtruth
f1.syn <- '
Happiness ~ Money + Love
Travel ~ Happiness + Money
'
f1 <- sem(f1.syn, d)
semPaths_default(f1)
fitmeasures(f1)["chisq"]
## chisq
## 0.87
is not equivalent to this:
#not equivalent
f2.syn <- '
Happiness ~ Money + Love + Travel
Travel ~ Money
'
f2 <- sem(f2.syn, d)
semPaths_default(f2)
fitmeasures(f2)["chisq"]
## chisq
## 4.89
Why not? Love is not a cause of Travel in either model, so the causes of the endogenous variables are not the same (does not satisfy rule 2).
This, on the other hand, would be equivalent according to rule 1. The relations among exogenous variables are just identified and have unidirectional effects on other variables. Thus, we can shift the \(Money \leftrightarrow Love\) relationship from an unanalyzed association to a directed effect.
#equivalent
f3.syn <- '
Love ~ Money
Happiness ~ Money + Love
Travel ~ Happiness + Money
'
f3 <- sem(f3.syn, d, conditional.x=FALSE)
semPaths_default(f3)
fitmeasures(f3)["chisq"]
## chisq
## 0.87
#intended as equivalent, but note that the chi-square below differs from f1: reversing Money -> Happiness makes Money endogenous, which falls outside the replacing rules
f4.syn <- '
Happiness ~ Love
Travel ~ Happiness + Money
Money ~ Happiness
'
f4 <- sem(f4.syn, d, conditional.x=FALSE)
semPaths_default(f4)
fitmeasures(f4)["chisq"]
## chisq
## 3.53
The closer a model is to saturated, the larger the number of equivalent models. When possible, test a relatively sparse structural model: in saturated blocks, there is little information with which to adjudicate directionality. It can therefore be useful to include structural variables that are thought to be unrelated to the others.
Models with many paths suffer from many equivalents. Can you test your key hypotheses in models with fewer variables?
Consider including ancillary variables that are not related to the original set in the model. In particular, consider adding antecedents (predictors) of variables in a source block such that they do not have any relationship to the focal (destination) block.
Bentler and Satorra (2010) defined a useful nested model test:
Conduct typical SEM on Model \(M1\). Save the \(df_{M1}\) and model-implied covariance \(\hat{\boldsymbol{\Sigma}}(\Theta)\).
Conduct a SEM on an alternative model \(M2\) using \(\hat{\boldsymbol{\Sigma}}(\Theta)\) as the input data (rather than the raw data). Save the \(\chi^2_{M2}\) and \(df_{M2}\).
Compute \(d = df_{M1} - df_{M2}\) and set a small criterion, \(\epsilon\), on the \(\chi^2\) for declaring equivalence (e.g., .001).
The primary idea is that when there are more degrees of freedom in \(M1\) than in \(M2\) and \(M2\) can perfectly reproduce the model-implied covariance matrix of \(M1\) (i.e., \(\chi^2_{M2} < \epsilon\)), then \(M1\) is nested in \(M2\). If the degrees of freedom are equal and the \(\chi^2\) is still essentially zero, the models are equivalent.
“All models are wrong but some are useful” - George Box
The goal of most statistical modeling is to approximate the key factors that generated our empirical data. The data generation processes are usually complex and largely unknown, so our models will almost always be deficient in many ways. Nevertheless, we may wish to know how close one model is to the unknown truth relative to a comparator model. This is the substance of model selection theory – i.e., how one selects among models based on relative evidence.
Kullback-Leibler (K-L) information quantifies the information lost when a simpler model, g, is used to approximate the true model, f:
\[ I(f,g) = \int{f(x) \textrm{log} \left( \frac{f(x)}{g(x|\theta)} \right) dx} \] This can also be thought of as the distance between our model, g, and the unknown population model, f.
In the simplest case, we can think of trying to identify a probability distribution (e.g., Gaussian) that best describes a given empirical sample. Recall from the maximum likelihood lecture our attempt to fit Gaussian-distributed data using a Gamma distribution. The sample likelihood under the Gamma, even at its optimized (fitted) parameters, can never be as high as that under a Gaussian fit.
ML Gamma fit
X <- rnorm(5000, mean=50, sd=sqrt(50)) #simulate data from the 'true' Gaussian model
gfit <- suppressWarnings(fitdistr(X, "gamma")) #fitdistr() is from the MASS package
gfit$loglik
## [1] -17011
datafromgam <- rgamma(n=5000, shape=gfit$estimate["shape"], rate=gfit$estimate["rate"])
ML Gaussian fit
gfit <- suppressWarnings(fitdistr(X, "normal"))
gfit$loglik
## [1] -16973
datafromgauss <- rnorm(n=5000, mean=gfit$estimate["mean"], sd=gfit$estimate["sd"])
df <- data.frame(model=rep(c("groundtruth", "gamma", "gauss"), each=5000),
v=c(X, datafromgam, datafromgauss))
ggplot(df, aes(x=v, color=model)) + geom_density()
The sample LL is an approximation of a K-L distance between a model and ‘truth’, but read on to AIC below…
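To make that link concrete, here is a minimal Monte Carlo sketch of \(I(f,g)\) for the example above. In this simulation we know the truth, so we can evaluate it directly (this assumes X is still in the workspace; the Gamma is refit here because gfit was overwritten by the Gaussian fit):
#Monte Carlo approximation of the K-L integral: mean log density ratio of the true
#Normal(50, sqrt(50)) over the fitted Gamma, evaluated at draws from the truth
gamfit <- suppressWarnings(fitdistr(X, "gamma"))
kl_hat <- mean(dnorm(X, mean=50, sd=sqrt(50), log=TRUE) -
               dgamma(X, shape=gamfit$estimate["shape"], rate=gamfit$estimate["rate"], log=TRUE))
kl_hat #small but positive: information is lost by approximating the Normal with a Gamma
#note: this equals (LL under the truth - LL under the Gamma)/n, which is why the sample LL
#tracks relative K-L distance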
In general, we do not know the true model, which means we cannot compute the exact K-L distance. Fortunately, however, we are often interested in relative distance – that is, how much better is model A than model B? In this circumstance, the K-L distance derivation includes two terms: one is based on the expectation under the ‘true’ model (can’t compute that!), the other is based on the relative distance between the model and the truth.
Since we can’t compute the first term, it means we do not have a measure of absolute fit using likelihood-based and K-L methods. But, because we can compute the relative distance, we can also compute the relative distance between two or more models. As a result, we can compare relative evidence across a set of candidate models. All may be bad, or all may be good, by some absolute criterion, but we can ascertain which is relatively best. Global fit criteria in SEM often give some information about the absolute fit (esp. residual-based methods). Likelihood methods (K-L distance) are more about relative evidence.
K-L distances are on a true ratio scale (i.e., a meaningful zero point and equal interval spacing). As a result, differences in distances can be compared, and a zero K-L difference implies identical relative fit. Moreover, estimated K-L discrepancies, and by analogy AIC differences among models, tend to become larger as the sample size increases. This does not mean that the scaling or interpretation changes! Rather, it underscores that discriminating among models becomes easier in large samples.
The derivation of the AIC was a major breakthrough in information theory and statistics because it provides a statistical link between K-L distance (a more theoretical notion) and maximum likelihood-based methods.
The AIC provides an approximation of the relative distance of the fitted model and the unknown (true) mechanisms that generated the data. Importantly, it is a large sample criterion (based on asymptotic properties, so 1000s, ideally) and is also not especially meaningful if the models in question are simply inappropriate or poor (e.g., fitting a binary outcome with a continuous model).
\[ AIC = -2 \textrm{log} (\mathcal{L}(\hat{\theta}|y)) + 2k \] where \(k\) is the number of free parameters, and \(\mathcal{L}(\hat{\theta}|y)\) is the sample likelihood.
Note that by multiplying the LL by -2, the criterion is now scaled such that lower values are better. By contrast, for LL, higher values indicate better fit (greater likelihood of the data given the model).
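As a quick check on the formula, here is a minimal sketch using the Gaussian fitdistr object from the example above (MASS supplies a logLik method for fitdistr objects, so AIC() should agree):
#AIC by hand for the Gaussian fit (gfit) from the fitdistr example above
k <- length(gfit$estimate)   #two free parameters: mean and sd
-2*gfit$loglik + 2*k         #AIC from the formula
AIC(gfit)                    #should match, via the logLik method from MASS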
Because AIC is based on a large sample derivation (i.e., its statistical properties are ideal as the sample size goes to infinity), it may select overly complex models in smaller samples (think hundreds, not thousands). This led to the derivation of a variant of AIC, usually denoted \(AIC_C\) (Sugiura, 1978), that includes an additional penalty for sample size:
\[ AIC_C = -2 \textrm{log} (\mathcal{L}(\hat{\theta}|y)) + 2k + \frac{2k(k+1)}{n - k -1} \] When your cases:parameters ratio is below 40 (as it usually is!), Burnham and Anderson recommend \(AIC_C\) over AIC. Note, however, that this correction was derived for univariate (single-DV) regression, not SEM. There is no exact SEM equivalent (that I know of), but there is also no specific reason to think the correction above is not approximately correct.
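If you nonetheless want to apply this correction to a lavaan model, here is an illustrative sketch (aicc_lavaan is a hypothetical helper, not a lavaan function); semTools::moreFitIndices(), shown near the end of this section, reports a similar quantity as aic.smallN:
#small-sample corrected AIC for a lavaan fit, applying the univariate-regression
#correction to SEM as discussed above (an assumption, not an established result)
aicc_lavaan <- function(fit) {
  fm <- fitmeasures(fit, c("aic", "npar", "ntotal"))
  unname(fm["aic"] + (2*fm["npar"]*(fm["npar"] + 1))/(fm["ntotal"] - fm["npar"] - 1))
}
aicc_lavaan(f1) #e.g., for the path model fit earlier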
One of the most important considerations when comparing non-nested models is that the set of endogenous variables must be the same and on the same scale! This is often misunderstood in applied papers and can lead to bizarre situations where AICs differ by thousands of points. That is not impossible, but it more likely reflects differences in the likelihood function that render the models non-comparable using LL-based methods (including AIC/BIC).
Recall that the sample likelihood is computed on the endogenous variables. Thus, if the endogenous set changes (e.g., by changing the direction of an arrow), LL-based methods cannot necessarily be compared. In lavaan, this relates to the conditional.x option (see ?lavOptions). If it is FALSE, then model-implied statistics, including the LL, reflect both exogenous and endogenous variables. This is likely what you'd want when testing alternative models with path direction changes that would make an endogenous variable exogenous, or vice versa.
If you're not sure, check the unrestricted (saturated) LL for the two candidate models. If they are not identical, you cannot use AIC/BIC to compare them. You might also check your setting of conditional.x in lavaan. Fortunately, lavaan will usually provide a warning: “WARNING: some models are based on a different set of observed variables.” If you see that, watch out! Your models are almost certainly not comparable, either in terms of the LRT (nested) or AIC/BIC (non-nested) comparisons, since both are based on the sample likelihood.
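For example, a quick check using two of the path models fit earlier:
#comparability check: the unrestricted (saturated) log-likelihoods should be identical
#if two models are fit to the same set of variables under the same likelihood
fitmeasures(f1)["unrestricted.logl"]
fitmeasures(f4)["unrestricted.logl"] #if these differ, AIC/BIC comparisons are not meaningful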
See this paper from Ed Merkle: https://arxiv.org/pdf/1402.6720.pdf, and the corresponding nonnest2 package in R.
Briefly, the Vuong test is often a useful way to compare evidence among models, including non-nested models. The package plays well with lavaan.
(Some quotes from Hallquist & Pilkonis, 2010)
AIC is an asymptotically efficient criterion for model selection, which means that it tends to select the model that minimizes prediction error as sample size increases. More generally, AIC is an information criterion primarily concerned with minimizing the relative Kullback-Leibler (K-L) distance between a statistical model and the unknown mechanisms giving rise to the observed data (Claeskens & Hjort, 2008).
An underlying assumption of AIC is that the “truth” of the data is very complex (i.e., many causal factors of varying strengths) and that no one “true model” exists (Burnham & Anderson, 2002). Thus, AIC is useful for selecting the best approximate model using a finite number of parameters when “truth” is countably infinite.
By contrast, the BIC originates from the Bayesian tradition in statistics and is concerned with the statistical property of consistency, which refers to the one “true model” being selected with increasing probability as sample size increases (Yang, 2005). Underlying the property of consistency are several assumptions worth examining: (1) a single “true” model exists that best represents the mechanisms that gave rise to the observed data, (2) one of the models in the set of models tested is the “true” model, and (3) the focus of model selection is to identify the “true” model rather than to provide a close approximation to the truth. Given these assumptions, Burnham and Anderson (2002) have argued against the use of consistent model selection criteria in the biological, medical, and social sciences because such data are highly complex, there is a low likelihood of including the “true model” in any set of models considered, and it is unlikely that a single “true model” exists.
Summarizing extensive simulation work, Burnham and Anderson (2002) offer useful rules-of-thumb for inferring the degree of certainty afforded to a particular model relative to the best fitting model in the set:
Moreover, AIC values for a set of models can be subtracted from the model with the lowest AIC value and these difference scores can be transformed into a set of Akaike weights that represent the evidential weight of a particular model relative to all other models in the set (Burnham & Anderson, 2002). The important general point is that conclusions drawn from latent variable models would benefit from being contextualized in terms of model selection uncertainty and the degree of support for one model over alternatives.
For details on Akaike weights or evidence ratios, I would recommend checking out the AICcmodavg package. See demo below.
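For intuition, Akaike weights can also be computed by hand from a set of AIC values. A minimal sketch using the two CFA models fit in the demo below:
#Akaike weights: delta-AIC relative to the best model, then normalized evidence weights
aics <- c(l1fac=AIC(l1fac_m), hcfa=AIC(hcfa_m)) #these models are fit in the demo below
delta <- aics - min(aics)
exp(-delta/2)/sum(exp(-delta/2)) #weights sum to 1; larger = more relative support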
From Brown Ch. 8, Fig 8.1. This is a measure of coping styles administered to 275 college students. There are four first-order latent variables, which reflect different ways of coping with stress: Problem Solving, Cognitive Restructuring, Express Emotions, and Social Support. The authors also anticipated that there are two higher-order factors that account for correlations among the first-order factors: Problem-Focused Coping and Emotion-Focused Coping.
rawdat <- '
1.00
0.78 1.00
0.80 0.77 1.00
0.56 0.51 0.48 1.00
0.52 0.51 0.46 0.78 1.00
0.59 0.51 0.51 0.80 0.79 1.00
0.16 0.15 0.17 0.14 0.18 0.16 1.00
0.19 0.13 0.18 0.14 0.16 0.16 0.81 1.00
0.12 0.17 0.17 0.17 0.20 0.16 0.75 0.80 1.00
0.16 0.13 0.17 0.15 0.16 0.18 0.56 0.52 0.50 1.00
0.16 0.14 0.18 0.15 0.16 0.18 0.51 0.58 0.51 0.81 1.00
0.16 0.15 0.14 0.16 0.16 0.14 0.52 0.57 0.52 0.80 0.79 1.00
'
#standard deviations of variables
sds <- as.numeric(strsplit('1.40 2.10 1.30 2.30 2.40 1.90 2.00 2.90 2.30 3.10 2.20 1.20', '\\s+', perl=TRUE)[[1]])
cormat <- getCov(rawdat)
#this is the covariance matrix to be analyzed
covmat <- lavaan::cor2cov(cormat, sds)
rownames(covmat) <- colnames(covmat) <- c("P1", "P2", "P3", "C1", "C2", "C3", "E1", "E2", "E3", "S1", "S2", "S3")
#to get Vuong test to work as expected, need raw data, not just a covmat
df <- mvrnorm(n=275, mu=rep(0, 12), Sigma=covmat, empirical = TRUE)
bmodel <- '
ProbSol =~ P1 + P2 + P3
CogRest =~ C1 + C2 + C3
ExpEmo =~ E1 + E2 + E3
SocSupp =~ S1 + S2 + S3
#explicitly name factor correlations
ProbSol ~~ CogRest
ProbSol ~~ ExpEmo
ProbSol ~~ SocSupp
CogRest ~~ ExpEmo
CogRest ~~ SocSupp
ExpEmo ~~ SocSupp
'
#l1fac_m <- sem(bmodel, sample.cov=covmat, sample.nobs = 275)
l1fac_m <- sem(bmodel, df)
summary(l1fac_m, fit.measures=TRUE, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 65 iterations
##
## Number of observations 275
##
## Estimator ML
## Minimum Function Test Statistic 82.970
## Degrees of freedom 48
## P-value (Chi-square) 0.001
##
## Model test baseline model:
##
## Minimum Function Test Statistic 2761.514
## Degrees of freedom 66
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.987
## Tucker-Lewis Index (TLI) 0.982
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -5642.801
## Loglikelihood unrestricted model (H1) -5601.316
##
## Number of free parameters 30
## Akaike (AIC) 11345.602
## Bayesian (BIC) 11454.106
## Sample-size adjusted Bayesian (BIC) 11358.981
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.051
## 90 Percent Confidence Interval 0.032 0.070
## P-value RMSEA <= 0.05 0.426
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.017
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## ProbSol =~
## P1 1.000 1.273 0.911
## P2 1.424 0.071 20.173 0.000 1.812 0.864
## P3 0.896 0.043 20.778 0.000 1.140 0.879
## CogRest =~
## C1 1.000 2.038 0.888
## C2 1.027 0.051 19.964 0.000 2.094 0.874
## C3 0.842 0.040 21.117 0.000 1.715 0.904
## ExpEmo =~
## E1 1.000 1.743 0.873
## E2 1.542 0.072 21.395 0.000 2.688 0.929
## E3 1.133 0.059 19.043 0.000 1.974 0.860
## SocSupp =~
## S1 1.000 2.787 0.901
## S2 0.706 0.033 21.679 0.000 1.969 0.896
## S3 0.381 0.018 21.230 0.000 1.062 0.886
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## ProbSol ~~
## CogRest 1.718 0.209 8.217 0.000 0.662 0.662
## ExpEmo 0.453 0.149 3.042 0.002 0.204 0.204
## SocSupp 0.695 0.237 2.928 0.003 0.196 0.196
## CogRest ~~
## ExpEmo 0.712 0.238 2.991 0.003 0.201 0.201
## SocSupp 1.146 0.381 3.011 0.003 0.202 0.202
## ExpEmo ~~
## SocSupp 3.255 0.396 8.218 0.000 0.670 0.670
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .P1 0.333 0.051 6.526 0.000 0.333 0.170
## .P2 1.111 0.130 8.560 0.000 1.111 0.253
## .P3 0.384 0.048 8.043 0.000 0.384 0.228
## .C1 1.118 0.143 7.812 0.000 1.118 0.212
## .C2 1.356 0.162 8.351 0.000 1.356 0.236
## .C3 0.654 0.093 7.010 0.000 0.654 0.182
## .E1 0.946 0.112 8.425 0.000 0.946 0.237
## .E2 1.155 0.205 5.638 0.000 1.155 0.138
## .E3 1.372 0.155 8.862 0.000 1.372 0.260
## .S1 1.805 0.242 7.448 0.000 1.805 0.189
## .S2 0.947 0.124 7.655 0.000 0.947 0.196
## .S3 0.308 0.038 8.103 0.000 0.308 0.215
## ProbSol 1.620 0.169 9.558 0.000 1.000 1.000
## CogRest 4.152 0.452 9.186 0.000 1.000 1.000
## ExpEmo 3.039 0.339 8.958 0.000 1.000 1.000
## SocSupp 7.770 0.823 9.435 0.000 1.000 1.000
semPaths_default(l1fac_m, sizeMan=7)
What about a model with a higher-order factor structure: Problem-Focused versus Emotion-Focused?
hcfa <- '
ProbSol =~ P1 + P2 + P3
CogRest =~ C1 + C2 + C3
ExpEmo =~ E1 + E2 + E3
SocSupp =~ S1 + S2 + S3
PFCope =~ ProbSol + CogRest
EFCope =~ ExpEmo + SocSupp
'
#hcfa_m <- sem(hcfa, sample.cov=covmat, sample.nobs = 275)
hcfa_m <- sem(hcfa, df)
summary(hcfa_m, fit.measures=TRUE, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 63 iterations
##
## Number of observations 275
##
## Estimator ML
## Minimum Function Test Statistic 83.005
## Degrees of freedom 49
## P-value (Chi-square) 0.002
##
## Model test baseline model:
##
## Minimum Function Test Statistic 2761.514
## Degrees of freedom 66
## P-value 0.000
##
## User model versus baseline model:
##
## Comparative Fit Index (CFI) 0.987
## Tucker-Lewis Index (TLI) 0.983
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -5642.819
## Loglikelihood unrestricted model (H1) -5601.316
##
## Number of free parameters 29
## Akaike (AIC) 11343.637
## Bayesian (BIC) 11448.524
## Sample-size adjusted Bayesian (BIC) 11356.570
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.050
## 90 Percent Confidence Interval 0.031 0.068
## P-value RMSEA <= 0.05 0.469
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.017
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## ProbSol =~
## P1 1.000 1.273 0.911
## P2 1.424 0.071 20.174 0.000 1.812 0.864
## P3 0.896 0.043 20.777 0.000 1.140 0.879
## CogRest =~
## C1 1.000 2.038 0.888
## C2 1.028 0.051 19.966 0.000 2.094 0.874
## C3 0.842 0.040 21.115 0.000 1.715 0.904
## ExpEmo =~
## E1 1.000 1.743 0.873
## E2 1.542 0.072 21.393 0.000 2.687 0.928
## E3 1.133 0.059 19.047 0.000 1.975 0.860
## SocSupp =~
## S1 1.000 2.787 0.901
## S2 0.706 0.033 21.678 0.000 1.969 0.896
## S3 0.381 0.018 21.230 0.000 1.062 0.886
## PFCope =~
## ProbSol 1.000 0.812 0.812
## CogRest 1.609 0.422 3.813 0.000 0.816 0.816
## EFCope =~
## ExpEmo 1.000 0.826 0.826
## SocSupp 1.572 0.407 3.865 0.000 0.812 0.812
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## PFCope ~~
## EFCope 0.448 0.146 3.062 0.002 0.301 0.301
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .P1 0.333 0.051 6.525 0.000 0.333 0.170
## .P2 1.111 0.130 8.560 0.000 1.111 0.253
## .P3 0.384 0.048 8.043 0.000 0.384 0.228
## .C1 1.119 0.143 7.812 0.000 1.119 0.212
## .C2 1.355 0.162 8.350 0.000 1.355 0.236
## .C3 0.655 0.093 7.012 0.000 0.655 0.182
## .E1 0.946 0.112 8.423 0.000 0.946 0.237
## .E2 1.157 0.205 5.644 0.000 1.157 0.138
## .E3 1.372 0.155 8.859 0.000 1.372 0.260
## .S1 1.806 0.242 7.448 0.000 1.806 0.189
## .S2 0.947 0.124 7.655 0.000 0.947 0.196
## .S3 0.308 0.038 8.102 0.000 0.308 0.215
## ProbSol 0.552 0.279 1.982 0.047 0.341 0.341
## CogRest 1.388 0.721 1.925 0.054 0.334 0.334
## ExpEmo 0.968 0.530 1.826 0.068 0.319 0.319
## SocSupp 2.652 1.314 2.018 0.044 0.341 0.341
## PFCope 1.068 0.308 3.467 0.001 1.000 1.000
## EFCope 2.071 0.593 3.491 0.000 1.000 1.000
semPaths_default(hcfa_m, sizeMan=7)
Are these models nested?
anova(l1fac_m, hcfa_m)
| Model   | Df | AIC   | BIC   | Chisq | Chisq diff | Df diff | Pr(>Chisq) |
|---------|----|-------|-------|-------|------------|---------|------------|
| l1fac_m | 48 | 11346 | 11454 | 83    | NA         | NA      | NA         |
| hcfa_m  | 49 | 11344 | 11449 | 83    | 0.035      | 1       | 0.852      |
nonnest2::vuongtest(l1fac_m, hcfa_m)
##
## Model 1
## Class: lavaan
## Call: lavaan::lavaan(model = bmodel, data = df, model.type = "sem", ...
##
## Model 2
## Class: lavaan
## Call: lavaan::lavaan(model = hcfa, data = df, model.type = "sem", int.ov.free = TRUE, ...
##
## Variance test
## H0: Model 1 and Model 2 are indistinguishable
## H1: Model 1 and Model 2 are distinguishable
## w2 = 0.000, p = 0.854
##
## Non-nested likelihood ratio test
## H0: Model fits are equal for the focal population
## H1A: Model 1 fits better than Model 2
## z = 0.094, p = 0.462
## H1B: Model 2 fits better than Model 1
## z = 0.094, p = 0.5375
aa <- aictab(list(l1fac_m, hcfa_m)) #aictab() is from the AICcmodavg package
## Warning in aictab.AIClavaan(list(l1fac_m, hcfa_m)):
## Model names have been supplied automatically in the table
We can test whether models are nested, equivalent, or non-nested using Bentler and Satorra's nesting and equivalence testing (NET) procedure. As mentioned above, this involves fitting model 1 to the data, saving its model-implied covariance matrix, and then fitting model 2 using that matrix as the input. We then examine the difference in degrees of freedom. If \(df_1 - df_2 > 0\) and the \(\chi^2\) of model 2 is essentially zero (usually < .001), then we say model 1 is nested in model 2. Note that this is an asymmetric test: model 1 may not be nested in model 2, yet the converse (model 2 nested in model 1) could still hold.
Following Bentler steps:
df1 <- fitmeasures(l1fac_m)["df"] #degrees of freedom for saturated factor model (no higher-order structure)
mimplied_l1fac <- fitted(l1fac_m)$cov #model implied cov
#now use the model-implied covariance as the data to analyze for hcfa_m
hcfa_m_fromcov <- sem(hcfa, sample.cov=mimplied_l1fac, sample.nobs = 275)
df2 <- fitmeasures(hcfa_m_fromcov)["df"]
d <- df1 - df2
cat("df for NET: ", d, "\n")
## df for NET: -1
chi2 <- fitmeasures(hcfa_m_fromcov)["chisq"]
cat("Chi-square for NET test: ", chi2, "\n")
## Chi-square for NET test: 0.0349
We see that the degrees of freedom difference is negative. Thus, the saturated (first-order) model is not nested in the hierarchical model. This makes sense intuitively: the hierarchical model imposes structure on the factor covariances, whereas the saturated model estimates unique covariances among the factors.
We can reverse the NET test (in terms of what is \(M1\) versus \(M2\)). To test whether the hierarchical CFA is nested in the saturated model, we first fit the saturated CFA, then test the hierarchical model using the model-implied covariance matrix from the saturated model.
#is the H CFA nested in the saturated (first-order)
df1 <- fitmeasures(hcfa_m)["df"]
mimplied_hcfa <- fitted(hcfa_m)$cov
#now use the model-implied covariance from the hierarchical model as the data for the first-order model
bmodel_fromcov <- sem(bmodel, sample.cov=mimplied_hcfa, sample.nobs = 275)
df2 <- fitmeasures(bmodel_fromcov)["df"]
d <- df1 - df2
cat("df for NET: ", d, "\n")
## df for NET: 1
chi2 <- fitmeasures(bmodel_fromcov)["chisq"]
cat("Chi-square for NET test: ", chi2, "\n")
## Chi-square for NET test: 0
Here, we see the expected result for a nested model: the df difference is positive (1) and the \(\chi^2\) of the second model (using the model-implied covariance of the first) is 0. Thus, the hierarchical CFA model is nested in the saturated first-order factor model.
The semTools package provides a net() function that will compute nesting/equivalence tests for all pairs of models passed in (as separate arguments). This package also provides the moreFitIndices function, which provides the \(AIC_C\) as aic.smallN.
#the easy way
semTools::net(l1fac_m, hcfa_m)
##
## If cell [R, C] is TRUE, the model in row R is nested within column C.
##
## If cell [R, C] is TRUE and the models have the same degrees of freedom,
## they are equivalent models. See Bentler & Satorra (2010) for details.
##
## If cell [R, C] is NA, then the model in column C did not converge when
## fit to the implied means and covariance matrix from the model in row R.
##
## The hidden diagonal is TRUE because any model is equivalent to itself.
## The upper triangle is hidden because for models with the same degrees
## of freedom, cell [C, R] == cell [R, C]. For all models with different
## degrees of freedom, the upper diagonal is all FALSE because models with
## fewer degrees of freedom (i.e., more parameters) cannot be nested
## within models with more degrees of freedom (i.e., fewer parameters).
##
## l1fac_m hcfa_m
## l1fac_m (df = 48)
## hcfa_m (df = 49) TRUE
semTools::moreFitIndices(l1fac_m)
## gammaHat adjGammaHat baseline.rmsea aic.smallN bic.priorN
## 9.79e-01 9.66e-01 3.85e-01 1.13e+04 1.15e+04
## hqc sic
## 1.14e+04 1.14e+04