Structural equation model in R (lavaan, sem()): estimating the measurement model with all variables triggers the warning that the variance-covariance matrix of the estimated parameters is not positive definite (the smallest eigenvalue is negative, but extremely close to 0)
I would like to estimate a complex structural equation model with several mediation and moderation effects (1 independent variable, 6 moderators, 5 mediators, 1 dependent variable, and 4 control variables that affect the dependent variable).
However, my sample size is very small (N = 96).
I have already tried to estimate different models, e.g.:
• (1) a model with only a measurement model (without structural paths) and all items
• (2) a measurement model with item selection (my supervisor recommended excluding all items with loadings < .50)
• (3) a model that only contains direct structural paths in addition to the measurement models
• (4) a model in which I modeled mediator effects via three mediators and moderator effects via manifest interaction variables (due to high model complexity and low sample size, I decided against modeling with latent interactions)
However, I keep getting the following warning message:
"Warning message: lavaan->lav_model_vcov(): The variance-covariance matrix of the estimated parameters (vcov) does not appear to be positive definite! The smallest eigenvalue (= -2.622768e-14) is smaller than zero. This may be a symptom that the model is not identified."
This warning message appears even when I only include all variables in the measurement model (and do not model any structural paths), see model (1) above.
When I remove the target variables (mediators 1-3, each a one-dimensional latent construct) and the evaluation variables (2 of my control variables, modeled hierarchically as 2 second-order factors with 3 first-order factors each) from the model, the warning message disappears. However, this only worked after I followed my supervisor's recommendation to suppress all covariances between the exogenous latent variables with "auto.cov.lv.x = FALSE". That is why I initially suspected that these variables were the cause.
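For readers who have not used this option, here is a minimal sketch of what auto.cov.lv.x = FALSE changes. It uses lavaan's built-in HolzingerSwineford1939 example data and a three-factor model (hs_model, both stand-ins chosen for illustration, not my data or my model): with the option switched off, the covariances among the exogenous latent variables are left out of the model, so the factors are treated as uncorrelated.

```r
library(lavaan)

# Illustrative three-factor model on lavaan's built-in example data
# (a stand-in for illustration, not the thesis model)
hs_model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

# Default: covariances among the exogenous latent variables are estimated
fit_cov <- cfa(hs_model, data = HolzingerSwineford1939)

# Same model, but the latent covariances are not added to the model
fit_nocov <- cfa(hs_model, data = HolzingerSwineford1939,
                 auto.cov.lv.x = FALSE)

# Number of free parameters in each model; they differ by the
# three factor covariances
npar_cov   <- lavInspect(fit_cov, "npar")
npar_nocov <- lavInspect(fit_nocov, "npar")
print(c(with_cov = npar_cov, without_cov = npar_nocov))

# Model-implied covariance matrix of the latent variables:
# the off-diagonal entries are zero when the option is FALSE
cov_lv_nocov <- unclass(lavInspect(fit_nocov, "cov.lv"))
print(cov_lv_nocov)
```

If the factors are in fact strongly correlated, forcing them to be uncorrelated lowers the parameter count but will usually worsen the fit, so the two effects should be weighed against each other.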
However, when I remove the poorly loading items from the measurement model, the model fit improves significantly, but the warning message reappears. The warning message ALWAYS appears in all other, more complex models.
I have already consulted ChatGPT, which presented a wide range of possible causes: multicollinearity at the latent level (which I was able to rule out), overparameterization, forced orthogonality of factors that are actually strongly correlated, excessive model complexity, and much more. That didn't get me any further.
My supervisor says I should definitely work on defining the model without the warning message and she believes that this is feasible.
Here is my code (model (1)):
# Specify the model
messmodell_the <- '
#############################
## Measurement models
#############################
VM_Wahl_lat =~ AllgUFVS216 + AllgUFVS15 + AllgUFVS21
Recycling_lat =~ AllgUFVS23 + AllgUFVS11 + AllgUFVS12 + AllgUFVS13
Haushalt_lat =~ AllgUFVS14 + AllgUFVS16 + AllgUFVS24 + AllgUFVS25 + AllgUFVS26 + AllgUFVS28 + AllgUFVS29 + AllgUFVS211 + UFV_Lampen + UFV_Licht
Konsum_lat =~ AllgUFVS27 + AllgUFVS210 + AllgUFVS212 + AllgUFVS213 + AllgUFVS214
Ident_lat =~ Ident2 + Ident1 + Ident3
HedWerte_lat =~ Werte2 + Werte1 + Werte3
EgoWerte_lat =~ Werte5 + Werte4 + Werte6 + Werte7
AltWerte_lat =~ Werte10 + Werte8 + Werte9 + Werte11
BioWerte_lat =~ Werte15 + Werte12 + Werte13 + Werte14
Ziele1Hed_lat =~ Ziele1Hed12 + Ziele1Hed4 + Ziele1Hed7 + Ziele1Hed1 + Ziele1Hed2 + Ziele1Hed3 + Ziele1Hed5 + Ziele1Hed6 + Ziele1Hed8 + Ziele1Hed9 + Ziele1Hed10 + Ziele1Hed11 + Ziele1Hed13
Ziele1Gew_lat =~ Ziele1Gew6 + Ziele1Gew1 + Ziele1Gew2 + Ziele1Gew3 + Ziele1Gew4 + Ziele1Gew5 + Ziele1Gew7 + Ziele1Gew8 + Ziele1Gew9 + Ziele1Gew10
Ziele1Norm_lat =~ Ziele1Norm1 + Ziele1Norm9 + Ziele1Norm3 + Ziele1Norm2 + Ziele1Norm4 + Ziele1Norm5 + Ziele1Norm6 + Ziele1Norm7 + Ziele1Norm8
Ziele1MorLizens_lat =~ Ziele1MorLizens2 + Ziele1MorLizens1 + Ziele1MorLizens3
MorSKallgUFV_lat =~ MorSKallgUFV3 + MorSKallgUFV1 + MorSKallgUFV2 + MorSKallgUFV4
'
sgm_messmodell_the <- sem(
  messmodell_the,
  data = Datensatz_gesamt,
  estimator = "MLR",
  auto.cov.lv.x = FALSE
)
summary(sgm_messmodell_the, standardized = TRUE, fit.measures = TRUE)
Output:
Warning message:
lavaan->lav_model_vcov():
   The variance-covariance matrix of the estimated parameters (vcov) does not
   appear to be positive definite! The smallest eigenvalue (= -7.468159e-15)
   is smaller than zero. This may be a symptom that the model is not
   identified.
> summary(sgm_messmodell_the, standardized = TRUE, fit.measures = TRUE)
lavaan 0.6-20 ended normally after 75 iterations

  Estimator                                           ML
  Optimization method                             NLMINB
  Number of model parameters                         158

  Number of observations                              96

Model Test User Model:
                                                Standard      Scaled
  Test Statistic                                6092.972    6348.013
  Degrees of freedom                                3002        3002
  P-value (Chi-square)                             0.000       0.000
  Scaling correction factor                                    0.960
    Yuan-Bentler correction (Mplus variant)

Model Test Baseline Model:

  Test statistic                                8065.883    8292.565
  Degrees of freedom                                3081        3081
  P-value                                          0.000       0.000
  Scaling correction factor                                    0.973

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                      0.380       0.358
  Tucker-Lewis Index (TLI)                         0.364       0.341

  Robust Comparative Fit Index (CFI)                           0.366
  Robust Tucker-Lewis Index (TLI)                              0.350

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)               -12586.381  -12586.381
  Scaling correction factor                                    1.392
      for the MLR correction
  Loglikelihood unrestricted model (H1)        -9539.895   -9539.895
  Scaling correction factor                                    0.981
      for the MLR correction

  Akaike (AIC)                                 25488.762   25488.762
  Bayesian (BIC)                               25893.929   25893.929
  Sample-size adjusted Bayesian (SABIC)        25395.054   25395.054

Root Mean Square Error of Approximation:

  RMSEA                                            0.104       0.108
  90 Percent confidence interval - lower           0.100       0.104
  90 Percent confidence interval - upper           0.107       0.112
  P-value H_0: RMSEA <= 0.050                      0.000       0.000
  P-value H_0: RMSEA >= 0.080                      1.000       1.000

  Robust RMSEA                                                 0.106
  90 Percent confidence interval - lower                       0.102
  90 Percent confidence interval - upper                       0.109
  P-value H_0: Robust RMSEA <= 0.050                           0.000
  P-value H_0: Robust RMSEA >= 0.080                           1.000

Standardized Root Mean Square Residual:

  SRMR                                             0.167       0.167

Parameter Estimates:

  Standard errors                               Sandwich
  Information bread                             Observed
  Observed information based on                  Hessian

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  VM_Wahl_lat =~
    AllgUFVS216       1.000                               1.033    0.762
    AllgUFVS15        1.658    0.412    4.025    0.000    1.712    0.719
    AllgUFVS21        1.050    0.275    3.822    0.000    1.084    0.559
  Recycling_lat =~
    AllgUFVS23        1.000                               1.064    0.723
    AllgUFVS11        0.338    0.163    2.073    0.038    0.360    0.481
    AllgUFVS12        0.818    0.215    3.801    0.000    0.870    0.748
    AllgUFVS13       -0.251    0.134   -1.881    0.060   -0.267   -0.149
…
> sessionInfo()
R version 4.5.0 (2025-04-11 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26200)
Matrix products: default
LAPACK version 3.12.1
locale:
[1] LC_COLLATE=German_Germany.utf8 LC_CTYPE=German_Germany.utf8
[3] LC_MONETARY=German_Germany.utf8 LC_NUMERIC=C
[5] LC_TIME=German_Germany.utf8
time zone: Europe/Berlin
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] measureQ_1.6.0 openxlsx_4.2.8.1 dplyr_1.1.4 matrixcalc_1.0-6
[5] semTools_0.5-7 lavaan_0.6-20 MASS_7.3-65
loaded via a namespace (and not attached):
[1] zip_2.3.3 vctrs_0.6.5 cli_3.6.5 rlang_1.1.6
[5] estimability_1.5.1 stringi_1.8.7 generics_0.1.4 xtable_1.8-4
[9] glue_1.8.0 pbivnorm_0.6.0 stats4_4.5.0 quadprog_1.5-8
[13] grid_4.5.0 tibble_3.3.0 mvtnorm_1.3-3 lifecycle_1.0.4
[17] compiler_4.5.0 emmeans_2.0.0 coda_0.19-4.1 Rcpp_1.1.0
[21] pkgconfig_2.0.3 rstudioapi_0.17.1 lattice_0.22-6 R6_2.6.1
[25] tidyselect_1.2.1 parallel_4.5.0 pillar_1.11.0 mnormt_2.1.1
[29] magrittr_2.0.3 tools_4.5.0
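One check that might help narrow this down (a sketch under assumptions, not a confirmed diagnosis): the summary above reports 158 free parameters but only 96 observations, and the standard errors are of the sandwich type. If the parameter covariance matrix is built from casewise contributions, it may not be able to reach full rank when there are more parameters than cases, which would produce eigenvalues that are zero up to rounding error. The helper check_vcov() below is hypothetical (it is not part of lavaan); it only reports the parameter count, the sample size, and the eigenvalues of the matrix that lavInspect() returns. It is demonstrated on lavaan's built-in HolzingerSwineford1939 data, not on my data.

```r
library(lavaan)

# Hypothetical helper (not part of lavaan): compare the number of free
# parameters with the sample size and inspect the eigenvalues of the
# covariance matrix of the estimated parameters.
check_vcov <- function(fit) {
  V  <- unclass(lavInspect(fit, "vcov"))   # vcov of the free parameters
  ev <- eigen(V, symmetric = TRUE, only.values = TRUE)$values
  list(
    npar           = lavInspect(fit, "npar"),    # number of free parameters
    nobs           = lavInspect(fit, "ntotal"),  # total sample size
    min_eigenvalue = min(ev),
    # eigenvalues that are zero relative to the largest one
    n_near_zero    = sum(abs(ev) < max(abs(ev)) * 1e-10)
  )
}

# Illustration on lavaan's built-in example data, not on the thesis data
hs_model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'
hs_fit <- cfa(hs_model, data = HolzingerSwineford1939, estimator = "MLR")
res <- check_vcov(hs_fit)
print(res)
```

Applied to the model above, this would be called as check_vcov(sgm_messmodell_the). A large n_near_zero together with npar greater than nobs would point toward the ratio of parameters to observations rather than toward any single latent variable.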
I have been trying for two months to find out why this warning message appears, and I am getting desperate because I have to submit my master's thesis in two weeks.
I would be very grateful if someone could help me with this. Thank you very much!!!!