Is there a way to turn off variable name abbreviations when summarizing model output with lavaan?
In the reprex below, you can see that the long variable names are automatically shortened. In my real data, some datasets have long variable names that differ by only a single letter or number, so seeing the full variable names would help me. Unfortunately, I can't find an argument for this in the documentation.
library(lavaan)
# rename variables
names(mtcars)[names(mtcars) == "mpg"] <- "An_Incredibly_Long_Variable_Name"
names(mtcars)[names(mtcars) == "hp"] <- "An_Even_Longer_Longer_Longer_Name"
names(mtcars)[names(mtcars) == "gear"] <- "VeryVeryVeryLong"
names(mtcars)[names(mtcars) == "cyl"] <- "LongLongLongLong"
# model
model <- 'An_Incredibly_Long_Variable_Name ~ An_Even_Longer_Longer_Longer_Name + VeryVeryVeryLong + LongLongLongLong'
fit <- sem(model, "std", data = mtcars)
summary(fit)
#> lavaan 0.6.14 ended normally after 18 iterations
#>
#> Estimator DWLS
#> Optimization method NLMINB
#> Number of model parameters 5
#>
#> Number of observations 32
#>
#> Model Test User Model:
#> Standard Scaled
#> Test Statistic 0.000 0.000
#> Degrees of freedom 0 0
#>
#> Parameter Estimates:
#>
#> Standard errors Robust.sem
#> Information Expected
#> Information saturated (h1) model Unstructured
#>
#> Regressions:
#> Estimate Std.Err z-value P(>|z|)
#> An_Incredibly_Long_Variable_Name ~
#> An_Evn_L_L_L_N -0.039 0.012 -3.264 0.001
#> VeryVeryVryLng 2.023 0.693 2.920 0.003
#> LongLongLngLng -1.208 0.563 -2.146 0.032
#>
#> Intercepts:
#> Estimate Std.Err z-value P(>|z|)
#> .An_Incrd_L_V_N 25.869 4.310 6.001 0.000
#>
#> Variances:
#> Estimate Std.Err z-value P(>|z|)
#> .An_Incrd_L_V_N 8.318 1.929 4.312 0.000
It seems that the maximum length of a printed name is 14 characters, including the "." and any other characters that may be added. This is an adaptation of your example:
library(lavaan)
#> This is lavaan 0.6-15
#> lavaan is FREE software! Please report any bugs.
# rename variables
names(mtcars)[names(mtcars) == "mpg"] <- "a234567890123"
names(mtcars)[names(mtcars) == "hp"] <- "a2345678901234"
names(mtcars)[names(mtcars) == "gear"] <- "a23456789012345"
names(mtcars)[names(mtcars) == "cyl"] <- "a234567890123456"
# model
model <- 'a234567890123 ~ a2345678901234 + a23456789012345 + a234567890123456'
fit <- sem(model, data = mtcars)
summary(fit)
#> lavaan 0.6.15 ended normally after 1 iteration
#>
#> Estimator ML
#> Optimization method NLMINB
#> Number of model parameters 4
#>
#> Number of observations 32
#>
#> Model Test User Model:
#>
#> Test statistic 0.000
#> Degrees of freedom 0
#>
#> Parameter Estimates:
#>
#> Standard errors Standard
#> Information Expected
#> Information saturated (h1) model Structured
#>
#> Regressions:
#> Estimate Std.Err z-value P(>|z|)
#> a234567890123 ~
#> a2345678901234 -0.039 0.017 -2.364 0.018
#> a2345678901234 2.023 0.983 2.058 0.040
#> a2345678901234 -1.208 0.727 -1.661 0.097
#>
#> Variances:
#> Estimate Std.Err z-value P(>|z|)
#> .a234567890123 8.058 2.015 4.000 0.000
The best and most reliable solution is, as Axeman suggested, to shorten the names to 13 or fewer characters.
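A minimal sketch of that renaming step, assuming the 13-character limit observed above. The helper `make_short_names()` and the `v1`, `v2`, ... alias scheme are hypothetical and not part of lavaan; the lookup table is there so the aliases can be mapped back to the original names when reporting results.

```r
# Hypothetical helper (not part of lavaan): replace names longer than
# `max_len` characters with short aliases and return a lookup table.
make_short_names <- function(long_names, max_len = 13) {
  too_long <- nchar(long_names) > max_len
  short <- long_names
  # Aliases v1, v2, ... are assigned in order of appearance
  short[too_long] <- paste0("v", seq_len(sum(too_long)))
  # Stop if an alias collides with an existing short name
  stopifnot(!anyDuplicated(short))
  data.frame(original = long_names, short = short, stringsAsFactors = FALSE)
}

dat <- mtcars
names(dat)[names(dat) == "mpg"] <- "An_Incredibly_Long_Variable_Name"
names(dat)[names(dat) == "hp"]  <- "An_Even_Longer_Longer_Longer_Name"

lookup <- make_short_names(names(dat))
names(dat) <- lookup$short

# Rows of the lookup table for the variables that were renamed
lookup[lookup$original != lookup$short, ]
```

The model syntax would then use the short aliases, and `lookup` can be used to relabel the output afterwards.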
Nevertheless, if you really need the long names in the text output, there are two options.
The first option is to use parameterEstimates() and print the results as a table (the default). The names are not abbreviated when printed in this format.
# Continue from the previous example ...
parameterEstimates(fit)
#> lhs op rhs est se z pvalue ci.lower
#> 1 a234567890123 ~ a2345678901234 -0.039 0.017 -2.364 0.018 -0.072
#> 2 a234567890123 ~ a23456789012345 2.023 0.983 2.058 0.040 0.096
#> 3 a234567890123 ~ a234567890123456 -1.208 0.727 -1.661 0.097 -2.634
#> 4 a234567890123 ~~ a234567890123 8.058 2.015 4.000 0.000 4.110
#> 5 a2345678901234 ~~ a2345678901234 4553.965 0.000 NA NA 4553.965
#> 6 a2345678901234 ~~ a23456789012345 -6.160 0.000 NA NA -6.160
#> 7 a2345678901234 ~~ a234567890123456 98.746 0.000 NA NA 98.746
#> 8 a23456789012345 ~~ a23456789012345 0.527 0.000 NA NA 0.527
#> 9 a23456789012345 ~~ a234567890123456 -0.629 0.000 NA NA -0.629
#> 10 a234567890123456 ~~ a234567890123456 3.090 0.000 NA NA 3.090
#> ci.upper
#> 1 -0.007
#> 2 3.951
#> 3 0.217
#> 4 12.007
#> 5 4553.965
#> 6 -6.160
#> 7 98.746
#> 8 0.527
#> 9 -0.629
#> 10 3.090
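Because the result behaves like an ordinary data frame, it can be subset with base R to get closer to the layout of summary(). The following is only a sketch: the model is re-created so the block is self-contained, `dat` is a copy of mtcars introduced here for illustration, and the choice of rows and columns is just one possibility.

```r
library(lavaan)

# Work on a copy so that mtcars itself is left untouched
dat <- mtcars
names(dat)[names(dat) == "mpg"]  <- "a234567890123"
names(dat)[names(dat) == "hp"]   <- "a2345678901234"
names(dat)[names(dat) == "gear"] <- "a23456789012345"
names(dat)[names(dat) == "cyl"]  <- "a234567890123456"

model <- 'a234567890123 ~ a2345678901234 + a23456789012345 + a234567890123456'
fit <- sem(model, data = dat)

pe <- parameterEstimates(fit)

# Keep only the regression rows (op == "~") and the columns summary() shows
regressions <- pe[pe$op == "~", c("lhs", "op", "rhs", "est", "se", "z", "pvalue")]

# Printed as a plain data frame, so the full names are shown
print(as.data.frame(regressions), row.names = FALSE)
```

The same subsetting with `op == "~~"` would give the variance and covariance rows.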
The second option is not an officially supported solution, but it works in version 0.6-15. Include at least one character in one variable name that causes the name to be encoded as a multi-byte string. It seems that, when the parameter estimates are printed in the text format, none of the names are abbreviated if at least one of them is encoded as a multi-byte string:
library(lavaan)
#> This is lavaan 0.6-15
#> lavaan is FREE software! Please report any bugs.
# rename variables
names(mtcars)[names(mtcars) == "mpg"] <- "a234567890123"
names(mtcars)[names(mtcars) == "hp"] <- "a2345678901234"
names(mtcars)[names(mtcars) == "gear"] <- "a23456789012345"
# Note the last character
names(mtcars)[names(mtcars) == "cyl"] <- "a23456789012345678901234567890é"
# model
model <- 'a234567890123 ~ a2345678901234 + a23456789012345 + a23456789012345678901234567890é'
fit <- sem(model, data = mtcars)
summary(fit)
#> lavaan 0.6.15 ended normally after 1 iteration
#>
#> Estimator ML
#> Optimization method NLMINB
#> Number of model parameters 4
#>
#> Number of observations 32
#>
#> Model Test User Model:
#>
#> Test statistic 0.000
#> Degrees of freedom 0
#>
#> Parameter Estimates:
#>
#> Standard errors Standard
#> Information Expected
#> Information saturated (h1) model Structured
#>
#> Regressions:
#> Estimate Std.Err z-value P(>|z|)
#> a234567890123 ~
#> a2345678901234 -0.039 0.017 -2.364 0.018
#> a23456789012345 2.023 0.983 2.058 0.040
#> a23456789012345678901234567890é -1.208 0.727 -1.661 0.097
#>
#> Variances:
#> Estimate Std.Err z-value P(>|z|)
#> .a234567890123 8.058 2.015 4.000 0.000
However, you can see that the alignment of the columns is affected. This suggests, I believe, that the abbreviation is done for readability.
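To check which names in a dataset contain multi-byte characters, one simple approach is to compare character counts with byte counts. This is only a base R sketch: `has_multibyte()` is a hypothetical helper name, and whether lavaan uses the same test internally is an assumption.

```r
# Hypothetical helper: TRUE for strings in which at least one character
# occupies more than one byte (after conversion to UTF-8)
has_multibyte <- function(x) {
  x <- enc2utf8(x)
  nchar(x, type = "chars") != nchar(x, type = "bytes")
}

# "\u00e9" is the character é written as an escape
vars <- c("a234567890123", "a23456789012345678901234567890\u00e9")
has_multibyte(vars)
```

If no name in the model is flagged, the text output can be expected to abbreviate long names as in the earlier examples.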