This tutorial walks through how Ωnyx lets users build conceptual path models visually and convert them on the fly into lavaan code for analysis.

Download Instructions:

  • Ωnyx is freely available for download from the Ωnyx website.

  • You will need the Java Runtime Environment installed in order to run Ωnyx.

  • Ωnyx does not need to be installed; it simply runs through Java, so just double-click the .jar file to open the program.

The Basics:

  • Ωnyx is almost entirely graphically based and relies on point-and-click methods. Start by opening some data. Open the Ωnyx menu at the top left and navigate to the Simple Regression Data to begin:
  • You’ll see a hexagon appear. This represents a dataset and includes three vectors: ID, X_c, and Y_c, the latter two of which are centered continuous variables:
  • Double-click in the blank space or right(ctrl) click and select “Create Empty Model” to create a space for your path diagram:
  • By right(ctrl) clicking in the model space, you will open a window from which you can build your model. Try creating two observed variables for your X and Y vectors from the dataset you opened earlier:
  • Connect these variables with a regression line by right(ctrl) clicking and dragging the arrow that appears from one box to the other:
  • You’ve now created the visual representation of your first regression model! Now, add the data to the appropriate boxes by dragging the variables from the data hexagon to the boxes. You’ll notice that the variable names change in the boxes and actual variance estimates are now generated, based on the raw data:
  • Finally, allow the regression path to be freely estimated by right(ctrl) clicking on the regression line and selecting “Free Parameter”:
  • You’ll notice there is now an unstandardized b weight being estimated for the regression of Y_c onto X_c. You can also add the standardized weights by selecting “Show Standardized Estimates” under the “Customize Path” dialog:
  • We can also clean up the variable names in the boxes and for the paths/variances, change the colors, and so on, all from the right(ctrl)-click dialog box.
  • One final helpful trick is to pull the underlying code (e.g., lavaan code) and/or the covariance matrix from the model you just developed. Notice that Ωnyx generates lavaan code written for the lavaan() function (rather than the cfa() function), which requires specifying all relevant parameters (e.g., variances) and is therefore more useful for learning what’s going on in your model (a quick covariance check in R follows below):
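
A minimal sketch of that covariance check, assuming the same data file and variable names used in this tutorial (note that cov() divides by N − 1, while full-information ML variance estimates, as in the lavaan output later, divide by N, so the values will differ slightly in small samples):

simple.dat <- read.csv("simple_regression_data.csv", header = TRUE, sep = "\t")
# sample covariance matrix of the two observed variables (N - 1 denominator)
cov(simple.dat[, c("X_c", "Y_c")])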

We can run the exact same analysis using simple linear model syntax via the lm() function. Start by loading the data from the example:

simple.dat <- read.csv("simple_regression_data.csv", header = TRUE, sep = "\t")
str(simple.dat)
## 'data.frame':    10 obs. of  3 variables:
##  $ ID : num  1 2 3 4 5 6 7 8 9 10
##  $ X_c: num  2.461 -1.509 -2.699 -0.689 3.431 ...
##  $ Y_c: num  6.87 5.43 3.29 -3.92 -1.68 ...
simple.lm <- lm(Y_c ~ X_c, data = simple.dat)
library(jtools)  # summ() comes from the jtools package
summ(simple.lm)
Observations          10
Dependent variable    Y_c
Type                  OLS linear regression

F(1,8)                0.69
R²                    0.08
Adj. R²              -0.04

              Est.   S.E.   t val.      p
(Intercept)  -0.01   1.46    -0.01   1.00
X_c           0.54   0.64     0.83   0.43

Standard errors: OLS
summary(simple.lm)
## 
## Call:
## lm(formula = Y_c ~ X_c, data = simple.dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.0232 -3.5362 -0.3492  3.8309  6.2536 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -0.0092     1.4637  -0.006    0.995
## X_c           0.5364     0.6434   0.834    0.429
## 
## Residual standard error: 4.629 on 8 degrees of freedom
## Multiple R-squared:  0.07992,    Adjusted R-squared:  -0.03509 
## F-statistic: 0.6949 on 1 and 8 DF,  p-value: 0.4287

Now, let’s load the model we generated in Ωnyx (the file and variable names differ from the earlier iteration, but the idea is the same):

source("single_predictor_data.R")
## lavaan 0.6-3 ended normally after 16 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                          5
## 
##   Number of observations                            10
##   Number of missing patterns                         1
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                       0.000
##   Degrees of freedom                                 0
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic                0.833
##   Degrees of freedom                                 1
##   P-value                                        0.361
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.000
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)                -50.804
##   Loglikelihood unrestricted model (H1)        -50.804
## 
##   Number of free parameters                          5
##   Akaike (AIC)                                 111.609
##   Bayesian (BIC)                               113.122
##   Sample-size adjusted Bayesian (BIC)           98.143
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent Confidence Interval          0.000  0.000
##   P-value RMSEA <= 0.05                             NA
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.000
## 
## Parameter Estimates:
## 
##   Information                                 Observed
##   Observed information based on                Hessian
##   Standard Errors                             Standard
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   Child_IQ ~                                          
##     Prnt_IQ (P_IQ)    0.536    0.575    0.932    0.351
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     Parent_IQ         0.000    0.719    0.000    1.000
##    .Child_IQ         -0.009    1.309   -0.007    0.994
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     Prn_IQ (VAR_P)    5.175    2.314    2.236    0.025
##    .Chl_IQ (VAR_C)   17.139    7.665    2.236    0.025
# The sourced file contains the following lavaan script:
#  model<-"
# ! regressions 
#    Child_IQ ~ Parent_IQ__Child_IQ*Parent_IQ
# ! residuals, variances and covariances
#    Parent_IQ ~~ VAR_Parent_IQ*Parent_IQ
#    Child_IQ ~~ VAR_Child_IQ*Child_IQ
# ! observed means
#    Parent_IQ~1;
#    Child_IQ~1;
# ";

Now, let’s expand to multiple regression:

In Ωnyx:

data(mtcars)
# write.csv(mtcars, file = "mtcars.csv")

library(magrittr)  # provides the %>% pipe
lm(mpg ~ hp + wt + gear, data = mtcars) %>% summ()
Observations          32
Dependent variable    mpg
Type                  OLS linear regression

F(3,28)              47.31
R²                    0.84
Adj. R²               0.82

              Est.   S.E.   t val.      p
(Intercept)  32.01   4.63     6.91   0.00 ***
hp           -0.04   0.01    -3.72   0.00 ***
wt           -3.20   0.85    -3.78   0.00 ***
gear          1.02   0.85     1.20   0.24

Standard errors: OLS
lm(mpg~hp+wt+gear,mtcars) %>% vcov()
##             (Intercept)            hp           wt         gear
## (Intercept) 21.45787056  1.835745e-02 -3.195379735 -3.705293688
## hp           0.01835745  9.784101e-05 -0.006083359 -0.003562798
## wt          -3.19537974 -6.083359e-03  0.716639910  0.483287520
## gear        -3.70529369 -3.562798e-03  0.483287520  0.724896236
#
# This model specification was automatically generated by Onyx
#

 model<-"
! regressions 
   mpg ~ gear__mpg*gear
   mpg ~ wt__mpg*wt
   mpg ~ hp__mpg*hp
! residuals, variances and covariances
   mpg ~~ VAR_mpg*mpg
   hp ~~ VAR_hp*hp
   wt ~~ VAR_wt*wt
   gear ~~ VAR_gear*gear
   wt ~~ COV_wt_hp*hp
   gear ~~ COV_gear_wt*wt
   gear ~~ COV_gear_hp*hp
! observed means
   mpg~1;
   hp~1;
   wt~1;
   gear~1;
";


library(lavaan)  # lavaan(), cfa(), and inspect() all come from the lavaan package
result <- lavaan(model, data = mtcars, fixed.x = FALSE, missing = "FIML")
## Warning in lav_data_full(data = data, group = group, cluster = cluster, :
## lavaan WARNING: some observed variances are (at least) a factor 1000 times
## larger than others; use varTable(fit) to investigate
summary(result, fit.measures=TRUE);
## lavaan 0.6-3 ended normally after 68 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                         14
## 
##   Number of observations                            32
##   Number of missing patterns                         1
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                       0.000
##   Degrees of freedom                                 0
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic               95.531
##   Degrees of freedom                                 6
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.000
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)               -314.167
##   Loglikelihood unrestricted model (H1)       -314.167
## 
##   Number of free parameters                         14
##   Akaike (AIC)                                 656.335
##   Bayesian (BIC)                               676.855
##   Sample-size adjusted Bayesian (BIC)          633.211
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent Confidence Interval          0.000  0.000
##   P-value RMSEA <= 0.05                             NA
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.000
## 
## Parameter Estimates:
## 
##   Information                                 Observed
##   Observed information based on                Hessian
##   Standard Errors                             Standard
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   mpg ~                                               
##     gear    (gr__)    1.020    0.796    1.281    0.200
##     wt      (wt__)   -3.198    0.792   -4.038    0.000
##     hp      (hp__)   -0.037    0.009   -3.976    0.000
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   wt ~~                                               
##     hp    (COV_w_)   42.812   13.757    3.112    0.002
##   gear ~~                                             
##     wt  (COV_gr_w)   -0.408    0.143   -2.850    0.004
##     hp  (COV_gr_h)   -6.160    8.731   -0.706    0.480
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .mpg              32.014    4.333    7.388    0.000
##     hp              146.687   11.929   12.296    0.000
##     wt                3.217    0.170   18.898    0.000
##     gear              3.687    0.128   28.725    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .mpg    (VAR_m)    5.798    1.450    4.000    0.000
##     hp     (VAR_h) 4553.965 1138.491    4.000    0.000
##     wt     (VAR_w)    0.927    0.232    4.000    0.000
##     gear   (VAR_g)    0.527    0.132    4.000    0.000
inspect(result, "list")
id lhs op rhs user block group free ustart exo label plabel start est se
1 mpg ~ gear 1 1 1 1 NA 0 gear__mpg .p1. 0.0000000 1.0199809 0.7964196
2 mpg ~ wt 1 1 1 2 NA 0 wt__mpg .p2. 0.0000000 -3.1978106 0.7918712
3 mpg ~ hp 1 1 1 3 NA 0 hp__mpg .p3. 0.0000000 -0.0367861 0.0092526
4 mpg ~~ mpg 1 1 1 4 NA 0 VAR_mpg .p4. 17.5944873 5.7980536 1.4495134
5 hp ~~ hp 1 1 1 5 NA 0 VAR_hp .p5. 2276.9824219 4553.9648427 1138.4909314
6 wt ~~ wt 1 1 1 6 NA 0 VAR_wt .p6. 0.4637304 0.9274609 0.2318652
7 gear ~~ gear 1 1 1 7 NA 0 VAR_gear .p7. 0.2636719 0.5273438 0.1318359
8 wt ~~ hp 1 1 1 8 NA 0 COV_wt_hp .p8. 0.0000000 42.8116405 13.7573398
9 gear ~~ wt 1 1 1 9 NA 0 COV_gear_wt .p9. 0.0000000 -0.4079219 0.1431226
10 gear ~~ hp 1 1 1 10 NA 0 COV_gear_hp .p10. 0.0000000 -6.1601563 8.7311447
11 mpg ~1 1 1 1 11 NA 0 .p11. 20.0906250 32.0136568 4.3330863
12 hp ~1 1 1 1 12 NA 0 .p12. 146.6875000 146.6875000 11.9294343
13 wt ~1 1 1 1 13 NA 0 .p13. 3.2172500 3.2172500 0.1702444
14 gear ~1 1 1 1 14 NA 0 .p14. 3.6875000 3.6875000 0.1283725
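
The same parameter table can be pulled as a tidy data frame with lavaan's parameterEstimates(), which is often easier to work with than the raw inspect() listing (a short sketch using the result object fit above):

# tidy parameter table with confidence intervals and standardized estimates
parameterEstimates(result, standardized = TRUE)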
  • Running the same model with much less input, using the cfa() function, gives us the same result:
model2 <- '
mpg ~ hp + wt + gear
'
result2 <- cfa(model2, data = mtcars, fixed.x = FALSE, missing = "FIML")
## Warning in lav_data_full(data = data, group = group, cluster = cluster, :
## lavaan WARNING: some observed variances are (at least) a factor 1000 times
## larger than others; use varTable(fit) to investigate
summary(result2,fit.measures=T)
## lavaan 0.6-3 ended normally after 38 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                         14
## 
##   Number of observations                            32
##   Number of missing patterns                         1
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                       0.000
##   Degrees of freedom                                 0
##   Minimum Function Value               0.0000000000000
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic               57.703
##   Degrees of freedom                                 3
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.000
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)               -314.167
##   Loglikelihood unrestricted model (H1)       -314.167
## 
##   Number of free parameters                         14
##   Akaike (AIC)                                 656.335
##   Bayesian (BIC)                               676.855
##   Sample-size adjusted Bayesian (BIC)          633.211
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent Confidence Interval          0.000  0.000
##   P-value RMSEA <= 0.05                             NA
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.000
## 
## Parameter Estimates:
## 
##   Information                                 Observed
##   Observed information based on                Hessian
##   Standard Errors                             Standard
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   mpg ~                                               
##     hp               -0.037    0.009   -3.976    0.000
##     wt               -3.198    0.792   -4.038    0.000
##     gear              1.020    0.796    1.281    0.200
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   hp ~~                                               
##     wt               42.812   13.757    3.112    0.002
##     gear             -6.160    8.731   -0.706    0.480
##   wt ~~                                               
##     gear             -0.408    0.143   -2.850    0.004
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .mpg              32.014    4.333    7.388    0.000
##     hp              146.688   11.929   12.296    0.000
##     wt                3.217    0.170   18.898    0.000
##     gear              3.687    0.128   28.725    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .mpg               5.798    1.450    4.000    0.000
##     hp             4553.965 1138.491    4.000    0.000
##     wt                0.927    0.232    4.000    0.000
##     gear              0.527    0.132    4.000    0.000
inspect(result2,"list")
id lhs op rhs user block group free ustart exo label plabel start est se
1 mpg ~ hp 1 1 1 1 NA 0 .p1. 0.0000000 -0.0367861 0.0092526
2 mpg ~ wt 1 1 1 2 NA 0 .p2. 0.0000000 -3.1978106 0.7918711
3 mpg ~ gear 1 1 1 3 NA 0 .p3. 0.0000000 1.0199808 0.7964196
4 mpg ~~ mpg 0 1 1 4 NA 0 .p4. 17.5944873 5.7980533 1.4495133
5 hp ~~ hp 0 1 1 5 NA 0 .p5. 4411.6534424 4553.9648273 1138.4909237
6 hp ~~ wt 0 1 1 6 NA 0 .p6. 41.4737769 42.8116415 13.7573403
7 hp ~~ gear 0 1 1 7 NA 0 .p7. -5.9676514 -6.1601565 8.7311447
8 wt ~~ wt 0 1 1 8 NA 0 .p8. 0.8984777 0.9274609 0.2318652
9 wt ~~ gear 0 1 1 9 NA 0 .p9. -0.3951743 -0.4079219 0.1431227
10 gear ~~ gear 0 1 1 10 NA 0 .p10. 0.5108643 0.5273438 0.1318359
11 mpg ~1 0 1 1 11 NA 0 .p11. 20.0906250 32.0136568 4.3330862
12 hp ~1 0 1 1 12 NA 0 .p12. 146.6875000 146.6875007 11.9294342
13 wt ~1 0 1 1 13 NA 0 .p13. 3.2172500 3.2172500 0.1702444
14 gear ~1 0 1 1 14 NA 0 .p14. 3.6875000 3.6875000 0.1283725
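
As a quick sanity check (a sketch using the objects already defined above), the OLS coefficients and the lavaan regression estimates can be compared directly; the first three free parameters of result2 are the mpg ~ hp, mpg ~ wt, and mpg ~ gear slopes, as the listing above shows:

# compare OLS slopes with the lavaan regression estimates
coef(lm(mpg ~ hp + wt + gear, data = mtcars))
coef(result2)[1:3]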

One step more complicated is a simple single-factor CFA:

cfa_dat <- read.csv("CFA_data.csv", header = TRUE)
#
# This model specification was automatically generated by Onyx:
 model<-"
! regressions 
   x1=~1.0*X1   ##fixing loadings makes fit worse but more on this later
   x1=~1.0*X2
   x1=~1.0*X3
! residuals, variances and covariances
   X1 ~~ VAR_X1*X1
   X2 ~~ VAR_X2*X2
   X3 ~~ VAR_X3*X3
   x1 ~~ VAR_x1*x1
! observed means
   X1~1;
   X2~1;
   X3~1;
";
result<-lavaan(model, data=cfa_dat, fixed.x=FALSE, missing="FIML");
summary(result, fit.measures=TRUE);
## lavaan 0.6-3 ended normally after 19 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                          7
## 
##   Number of observations                           100
##   Number of missing patterns                         1
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                       4.651
##   Degrees of freedom                                 2
##   P-value (Chi-square)                           0.098
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic               83.880
##   Degrees of freedom                                 3
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.967
##   Tucker-Lewis Index (TLI)                       0.951
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)               -481.673
##   Loglikelihood unrestricted model (H1)       -479.348
## 
##   Number of free parameters                          7
##   Akaike (AIC)                                 977.346
##   Bayesian (BIC)                               995.582
##   Sample-size adjusted Bayesian (BIC)          973.475
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.115
##   90 Percent Confidence Interval          0.000  0.256
##   P-value RMSEA <= 0.05                          0.155
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.076
## 
## Parameter Estimates:
## 
##   Information                                 Observed
##   Observed information based on                Hessian
##   Standard Errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   x1 =~                                               
##     X1                1.000                           
##     X2                1.000                           
##     X3                1.000                           
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .X1               -0.067    0.153   -0.437    0.662
##    .X2                0.005    0.124    0.037    0.970
##    .X3               -0.178    0.139   -1.274    0.203
##     x1                0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .X1    (VAR_X1)    1.324    0.229    5.773    0.000
##    .X2    (VAR_X2)    0.518    0.129    4.012    0.000
##    .X3    (VAR_X3)    0.930    0.173    5.377    0.000
##     x1     (VAR_1)    1.011    0.185    5.460    0.000
 model_free<-"
! regressions 
   x1=~1.0*X1
   x1=~x1__X2*X2
   x1=~x1__X3*X3
! residuals, variances and covariances
   X1 ~~ VAR_X1*X1
   X2 ~~ VAR_X2*X2
   X3 ~~ VAR_X3*X3
   x1 ~~ VAR_x1*x1
! observed means
   X1~1;
   X2~1;
   X3~1;
";
result_free<-lavaan(model_free, data=cfa_dat, fixed.x=FALSE, missing="FIML");
summary(result, fit.measures=TRUE);
## lavaan 0.6-3 ended normally after 19 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                          7
## 
##   Number of observations                           100
##   Number of missing patterns                         1
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                       4.651
##   Degrees of freedom                                 2
##   P-value (Chi-square)                           0.098
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic               83.880
##   Degrees of freedom                                 3
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.967
##   Tucker-Lewis Index (TLI)                       0.951
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)               -481.673
##   Loglikelihood unrestricted model (H1)       -479.348
## 
##   Number of free parameters                          7
##   Akaike (AIC)                                 977.346
##   Bayesian (BIC)                               995.582
##   Sample-size adjusted Bayesian (BIC)          973.475
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.115
##   90 Percent Confidence Interval          0.000  0.256
##   P-value RMSEA <= 0.05                          0.155
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.076
## 
## Parameter Estimates:
## 
##   Information                                 Observed
##   Observed information based on                Hessian
##   Standard Errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   x1 =~                                               
##     X1                1.000                           
##     X2                1.000                           
##     X3                1.000                           
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .X1               -0.067    0.153   -0.437    0.662
##    .X2                0.005    0.124    0.037    0.970
##    .X3               -0.178    0.139   -1.274    0.203
##     x1                0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .X1    (VAR_X1)    1.324    0.229    5.773    0.000
##    .X2    (VAR_X2)    0.518    0.129    4.012    0.000
##    .X3    (VAR_X3)    0.930    0.173    5.377    0.000
##     x1     (VAR_1)    1.011    0.185    5.460    0.000
inspect(result_free, "std") #gives loadings (see standardized estimates on word doc)
## $lambda
##       x1
## X1 0.577
## X2 0.768
## X3 0.837
## 
## $theta
##    X1    X2    X3   
## X1 0.667            
## X2 0.000 0.411      
## X3 0.000 0.000 0.299
## 
## $psi
##    x1
## x1 1 
## 
## $nu
##    intrcp
## X1 -0.046
## X2  0.004
## X3 -0.120
## 
## $alpha
##    intrcp
## x1      0
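
Finally, to follow up on the comment in the first CFA specification (fixing all loadings to 1.0 makes fit worse), the fixed- and free-loading models are nested, so a likelihood-ratio test can compare them. A sketch using the two fit objects above:

# chi-square difference test between the fixed- and free-loading models,
# plus a few fit indices for each
anova(result_free, result)
fitMeasures(result, c("chisq", "df", "cfi", "rmsea", "srmr"))
fitMeasures(result_free, c("chisq", "df", "cfi", "rmsea", "srmr"))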