rr-lavaan

Turn off variable abbreviations for lavaan summaries


Is there a way to turn off variable name abbreviations when summarizing model output with lavaan?

In the reprex below, you can see that the long variable names are automatically shortened. In my real data, some datasets have long variable names that differ only by a single letter or digit, so seeing the full variable names would help me. Unfortunately, I can't find an argument for this in the documentation.

library(lavaan)

# rename variables
names(mtcars)[names(mtcars) == "mpg"] <- "An_Incredibly_Long_Variable_Name"
names(mtcars)[names(mtcars) == "hp"] <- "An_Even_Longer_Longer_Longer_Name"
names(mtcars)[names(mtcars) == "gear"] <- "VeryVeryVeryLong"
names(mtcars)[names(mtcars) == "cyl"] <- "LongLongLongLong"

# model
model <- 'An_Incredibly_Long_Variable_Name ~ An_Even_Longer_Longer_Longer_Name + VeryVeryVeryLong + LongLongLongLong'
fit <- sem(model, "std", data = mtcars)
summary(fit)
#> lavaan 0.6.14 ended normally after 18 iterations
#> 
#>   Estimator                                       DWLS
#>   Optimization method                           NLMINB
#>   Number of model parameters                         5
#> 
#>   Number of observations                            32
#> 
#> Model Test User Model:
#>                                               Standard      Scaled
#>   Test Statistic                                 0.000       0.000
#>   Degrees of freedom                                 0           0
#> 
#> Parameter Estimates:
#> 
#>   Standard errors                           Robust.sem
#>   Information                                 Expected
#>   Information saturated (h1) model        Unstructured
#> 
#> Regressions:
#>                                      Estimate  Std.Err  z-value  P(>|z|)
#>   An_Incredibly_Long_Variable_Name ~                                    
#>     An_Evn_L_L_L_N                     -0.039    0.012   -3.264    0.001
#>     VeryVeryVryLng                      2.023    0.693    2.920    0.003
#>     LongLongLngLng                     -1.208    0.563   -2.146    0.032
#> 
#> Intercepts:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>    .An_Incrd_L_V_N   25.869    4.310    6.001    0.000
#> 
#> Variances:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>    .An_Incrd_L_V_N    8.318    1.929    4.312    0.000

Solution

  • It seems that the maximum length of a printed name is 14 characters, including the "." and any other characters that may be added. This is an adaptation of your example:

    library(lavaan)
    #> This is lavaan 0.6-15
    #> lavaan is FREE software! Please report any bugs.
    
    # rename variables
    names(mtcars)[names(mtcars) == "mpg"] <-  "a234567890123"
    names(mtcars)[names(mtcars) == "hp"] <-   "a2345678901234"
    names(mtcars)[names(mtcars) == "gear"] <- "a23456789012345"
    names(mtcars)[names(mtcars) == "cyl"] <-  "a234567890123456"
    
    # model
    model <- 'a234567890123 ~ a2345678901234 + a23456789012345 + a234567890123456'
    fit <- sem(model, data = mtcars)
    summary(fit)
    #> lavaan 0.6.15 ended normally after 1 iteration
    #> 
    #>   Estimator                                         ML
    #>   Optimization method                           NLMINB
    #>   Number of model parameters                         4
    #> 
    #>   Number of observations                            32
    #> 
    #> Model Test User Model:
    #>                                                       
    #>   Test statistic                                 0.000
    #>   Degrees of freedom                                 0
    #> 
    #> Parameter Estimates:
    #> 
    #>   Standard errors                             Standard
    #>   Information                                 Expected
    #>   Information saturated (h1) model          Structured
    #> 
    #> Regressions:
    #>                    Estimate  Std.Err  z-value  P(>|z|)
    #>   a234567890123 ~                                     
    #>     a2345678901234   -0.039    0.017   -2.364    0.018
    #>     a2345678901234    2.023    0.983    2.058    0.040
    #>     a2345678901234   -1.208    0.727   -1.661    0.097
    #> 
    #> Variances:
    #>                    Estimate  Std.Err  z-value  P(>|z|)
    #>    .a234567890123     8.058    2.015    4.000    0.000
    

    The best and most reliable solution is, as Axeman suggested, to shorten the names to 13 or fewer characters.
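
    If renaming by hand is impractical, the shortening can be scripted. The following is a minimal sketch in base R and is not part of lavaan: the names long_names, short_names, and lookup are made up for illustration, and the 10-character cut-off is an assumption chosen so that a numeric suffix still fits within 13 characters.

```r
# Hypothetical long names that differ only in their last character
long_names <- c("Long_Variable_Name_Item_1",
                "Long_Variable_Name_Item_2",
                "Another_Long_Variable_Name")

# Truncate to 10 characters, then let make.unique() append "_1", "_2", ...
# to any duplicates so the short names stay distinct
short_names <- make.unique(substr(long_names, 1, 10), sep = "_")

# Lookup table for mapping short names back to the originals
lookup <- setNames(long_names, short_names)

# Check the constraints before using the names in a model
stopifnot(all(nchar(short_names) <= 13), !anyDuplicated(short_names))

print(lookup)
```

    With a data frame dat (hypothetical), names(dat) <- short_names would rename the columns, and lookup[short_name] recovers the full name when reporting results.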

    Nevertheless, if you really need the long names in the text output, there are two options.

    The first option is to use parameterEstimates() and print the results as a table, which is the default. The names are not abbreviated when printed in this format.

    # Continue from the previous example ...
    parameterEstimates(fit)
    #>                 lhs op              rhs      est    se      z pvalue ci.lower
    #> 1     a234567890123  ~   a2345678901234   -0.039 0.017 -2.364  0.018   -0.072
    #> 2     a234567890123  ~  a23456789012345    2.023 0.983  2.058  0.040    0.096
    #> 3     a234567890123  ~ a234567890123456   -1.208 0.727 -1.661  0.097   -2.634
    #> 4     a234567890123 ~~    a234567890123    8.058 2.015  4.000  0.000    4.110
    #> 5    a2345678901234 ~~   a2345678901234 4553.965 0.000     NA     NA 4553.965
    #> 6    a2345678901234 ~~  a23456789012345   -6.160 0.000     NA     NA   -6.160
    #> 7    a2345678901234 ~~ a234567890123456   98.746 0.000     NA     NA   98.746
    #> 8   a23456789012345 ~~  a23456789012345    0.527 0.000     NA     NA    0.527
    #> 9   a23456789012345 ~~ a234567890123456   -0.629 0.000     NA     NA   -0.629
    #> 10 a234567890123456 ~~ a234567890123456    3.090 0.000     NA     NA    3.090
    #>    ci.upper
    #> 1    -0.007
    #> 2     3.951
    #> 3     0.217
    #> 4    12.007
    #> 5  4553.965
    #> 6    -6.160
    #> 7    98.746
    #> 8     0.527
    #> 9    -0.629
    #> 10    3.090
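
    Because parameterEstimates() returns a data frame, it can be subset to the rows of interest. The sketch below refits a model with the long names from the question and keeps only the regression rows. Working on a copy named dat and the particular column selection are choices made for this illustration.

```r
library(lavaan)

# Work on a copy so the built-in mtcars data set is left untouched
dat <- mtcars
names(dat)[names(dat) == "mpg"] <- "An_Incredibly_Long_Variable_Name"
names(dat)[names(dat) == "hp"]  <- "An_Even_Longer_Longer_Longer_Name"

model <- 'An_Incredibly_Long_Variable_Name ~ An_Even_Longer_Longer_Longer_Name + gear + cyl'
fit <- sem(model, data = dat)

# parameterEstimates() returns a data frame; op == "~" marks regressions
pe <- parameterEstimates(fit)
regs <- pe[pe$op == "~", c("lhs", "op", "rhs", "est", "se", "z", "pvalue")]
print(regs)
```

    The lhs and rhs columns hold the full variable names, so variables that differ only in their last character remain distinguishable.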
    

    The second option is not officially supported, but it works in version 0.6-15. Include at least one character in a variable name that causes the name to be encoded as a multi-byte string. It seems that, when the parameter estimates are printed in the text format, none of the names are abbreviated if at least one of them is encoded as a multi-byte string:

    library(lavaan)
    #> This is lavaan 0.6-15
    #> lavaan is FREE software! Please report any bugs.
    
    # rename variables
    names(mtcars)[names(mtcars) == "mpg"] <-  "a234567890123"
    names(mtcars)[names(mtcars) == "hp"] <-   "a2345678901234"
    names(mtcars)[names(mtcars) == "gear"] <- "a23456789012345"
    # Note the last character
    names(mtcars)[names(mtcars) == "cyl"] <-  "a23456789012345678901234567890é"
    
    # model
    model <- 'a234567890123 ~ a2345678901234 + a23456789012345 + a23456789012345678901234567890é'
    fit <- sem(model, data = mtcars)
    summary(fit)
    #> lavaan 0.6.15 ended normally after 1 iteration
    #> 
    #>   Estimator                                         ML
    #>   Optimization method                           NLMINB
    #>   Number of model parameters                         4
    #> 
    #>   Number of observations                            32
    #> 
    #> Model Test User Model:
    #>                                                       
    #>   Test statistic                                 0.000
    #>   Degrees of freedom                                 0
    #> 
    #> Parameter Estimates:
    #> 
    #>   Standard errors                             Standard
    #>   Information                                 Expected
    #>   Information saturated (h1) model          Structured
    #> 
    #> Regressions:
    #>                                     Estimate  Std.Err  z-value  P(>|z|)
    #>   a234567890123 ~                                                      
    #>     a2345678901234                    -0.039    0.017   -2.364    0.018
    #>     a23456789012345                    2.023    0.983    2.058    0.040
    #>     a23456789012345678901234567890é   -1.208    0.727   -1.661    0.097
    #> 
    #> Variances:
    #>                    Estimate  Std.Err  z-value  P(>|z|)
    #>    .a234567890123     8.058    2.015    4.000    0.000
    

    However, you can see that the alignment of the columns is affected: the Regressions table is now wider than the Variances table. I therefore believe the abbreviation is done for readability.
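
    The two ingredients discussed above can be illustrated with base R alone. This is only a sketch and does not reproduce lavaan's internal printing code. It is an assumption that something similar happens internally: abbreviate() shortens ASCII names to a fixed width, and comparing nchar() in characters and in bytes detects a multi-byte name.

```r
x <- c("An_Even_Longer_Longer_Longer_Name", "VeryVeryVeryLong")

# Base R abbreviation to at most 14 characters
# (strict = TRUE enforces the limit, possibly at the cost of uniqueness)
short <- abbreviate(x, minlength = 14, strict = TRUE)
print(unname(short))
stopifnot(all(nchar(short) <= 14))

# A name with a non-ASCII character occupies more bytes than characters
y <- "a23456789012345678901234567890\u00e9"
is_multibyte <- nchar(y, type = "bytes") > nchar(y, type = "chars")
print(is_multibyte)
```

    Since strict abbreviation can map different long names to the same short string, as in the first summary above, checking the abbreviated names for duplicates is worthwhile before relying on the text output.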