I intend to run a LightGBM model in R using tidymodels. I installed bonsai on my Mac as usual, but I'm having trouble doing the same on my Windows machine. It also works fine in Posit Cloud. I tried both install.packages() and the pak:: method, as instructed in the repo: https://github.com/tidymodels/bonsai
Here is what I have tried and the errors I'm getting:
> install.packages("bonsai")
WARNING: Rtools is required to build R packages but is not currently installed. Please download and install the appropriate version of Rtools before proceeding:
https://cran.rstudio.com/bin/windows/Rtools/
Installing package into ‘C:/Users/dani1/AppData/Local/R/win-library/4.2’
(as ‘lib’ is unspecified)
trying URL 'https://cran.rstudio.com/bin/windows/contrib/4.2/bonsai_0.2.1.zip'
Content type 'application/zip' length 158788 bytes (155 KB)
downloaded 155 KB
package ‘bonsai’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\dani1\AppData\Local\Temp\RtmpOmjyZF\downloaded_packages
> library(bonsai)
Error: package or namespace load failed for ‘bonsai’:
.onLoad failed in loadNamespace() for 'bonsai', details:
call: is_discordant_info(model, mode, eng, new_fit)
error: The combination of engine 'lightgbm' and mode 'regression' already has fit data for model 'boost_tree' and the new information being registered is different.
In addition: Warning message:
package ‘bonsai’ was built under R version 4.2.3
> pak::pak("tidymodels/bonsai")
→ Will update 1 package.
→ Will download 1 package with unknown size.
+ bonsai 0.2.1 → 0.2.1.9000 [bld][cmp][dl] (GitHub: aab79d5)
? Do you want to continue (Y/n) Y
ℹ Getting 1 pkg with unknown size
✔ Cached copy of bonsai 0.2.1.9000 (source) is the latest build
✔ No downloads needed, all packages are cached
ℹ Packaging bonsai 0.2.1.9000
✔ Packaged bonsai 0.2.1.9000 (1.1s)
ℹ Building bonsai 0.2.1.9000
✔ Built bonsai 0.2.1.9000 (5.4s)
✔ Installed bonsai 0.2.1.9000 (github::tidymodels/bonsai@aab79d5) (127ms)
✔ 1 pkg + 43 deps: kept 27, upd 1 [19.3s]
> library(bonsai)
Error: package or namespace load failed for ‘bonsai’ in get(method, envir = envir):
lazy-load database 'C:/Users/dani1/AppData/Local/R/win-library/4.2/bonsai/R/bonsai.rdb' is corrupt
In addition: Warning message:
In get(method, envir = envir) : internal error -3 in R_decompress1
> doParallel::registerDoParallel()
> light_grid <- tune_grid(
+ light_wf,
+ resamples = cv_folds,
+ grid = 10,
+ control = control_grid(save_pred = T)
+ )
i Creating pre-processing data to finalize unknown parameter: mtry
Error in `check_installs()`:
! Some package installs are required:
• 'bonsai', 'bonsai'
Run `rlang::last_error()` to see where the error occurred.
Warning messages:
1: In get(method, envir = envir) : internal error -3 in R_decompress1
2: In get(method, envir = envir) : internal error -3 in R_decompress1
>
Does anyone have a workaround for this?
Edit: I repeated the steps; here are the results, including sessionInfo():
> # Remove Package
> remove.packages("bonsai")
Removing package from ‘C:/Users/dani1/AppData/Local/R/win-library/4.2’
(as ‘lib’ is unspecified)
> # Reinstall CRAN
> install.packages("bonsai")
WARNING: Rtools is required to build R packages but is not currently installed. Please download and install the appropriate version of Rtools before proceeding:
https://cran.rstudio.com/bin/windows/Rtools/
Installing package into ‘C:/Users/dani1/AppData/Local/R/win-library/4.2’
(as ‘lib’ is unspecified)
trying URL 'https://cran.rstudio.com/bin/windows/contrib/4.2/bonsai_0.2.1.zip'
Content type 'application/zip' length 158788 bytes (155 KB)
downloaded 155 KB
package ‘bonsai’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\dani1\AppData\Local\Temp\RtmpOmjyZF\downloaded_packages
> #load
> library(bonsai)
Error: package or namespace load failed for ‘bonsai’:
.onLoad failed in loadNamespace() for 'bonsai', details:
call: is_discordant_info(model, mode, eng, new_fit)
error: The combination of engine 'lightgbm' and mode 'regression' already has fit data for model 'boost_tree' and the new information being registered is different.
In addition: Warning message:
package ‘bonsai’ was built under R version 4.2.3
> # pak
> pak::pak("tidymodels/bonsai")
→ Will update 1 package.
→ Will download 1 package with unknown size.
+ bonsai 0.2.1 → 0.2.1.9000 [bld][cmp][dl] (GitHub: aab79d5)
? Do you want to continue (Y/n) Y
ℹ Getting 1 pkg with unknown size
✔ Cached copy of bonsai 0.2.1.9000 (source) is the latest build
✔ No downloads needed, all packages are cached
ℹ Packaging bonsai 0.2.1.9000
✔ Packaged bonsai 0.2.1.9000 (1.2s)
ℹ Building bonsai 0.2.1.9000
✔ Built bonsai 0.2.1.9000 (7.7s)
✔ Installed bonsai 0.2.1.9000 (github::tidymodels/bonsai@aab79d5) (253ms)
✔ 1 pkg + 43 deps: kept 27, upd 1 [19.3s]
> #load
> library(bonsai)
Error: package or namespace load failed for ‘bonsai’ in get(method, envir = envir):
lazy-load database 'C:/Users/dani1/AppData/Local/R/win-library/4.2/bonsai/R/bonsai.rdb' is corrupt
In addition: Warning message:
In get(method, envir = envir) : internal error -3 in R_decompress1
> # session
> sessionInfo()
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22621)
Matrix products: default
locale:
[1] LC_COLLATE=Spanish_Spain.utf8 LC_CTYPE=Spanish_Spain.utf8
[3] LC_MONETARY=Spanish_Spain.utf8 LC_NUMERIC=C
[5] LC_TIME=Spanish_Spain.utf8
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[1] themis_1.0.0 forcats_0.5.1 stringr_1.5.0
[4] readr_2.1.2 tidyverse_1.3.1 yardstick_1.0.0
[7] workflowsets_1.0.0 workflows_1.0.0 tune_1.0.0
[10] tidyr_1.3.0 tibble_3.1.8 rsample_1.0.0
[13] recipes_1.0.1 purrr_1.0.1 parsnip_1.0.4
[16] modeldata_1.0.0 infer_1.0.2 ggplot2_3.3.6
[19] dplyr_1.1.0 dials_1.0.0 scales_1.2.0
[22] broom_1.0.0 tidymodels_1.0.0
loaded via a namespace (and not attached):
[1] colorspace_2.0-3 ellipsis_0.3.2 class_7.3-20
[4] fs_1.5.2 rstudioapi_0.14 listenv_0.8.0
[7] furrr_0.3.1 remotes_2.4.2 bit64_4.0.5
[10] prodlim_2019.11.13 fansi_1.0.4 lubridate_1.8.0
[13] xml2_1.3.3 codetools_0.2-18 splines_4.2.0
[16] doParallel_1.0.17 jsonlite_1.8.4 dbplyr_2.2.0
[19] compiler_4.2.0 httr_1.4.4 backports_1.4.1
[22] assertthat_0.2.1 Matrix_1.5-1 cli_3.6.0
[25] tools_4.2.0 gtable_0.3.0 glue_1.6.2
[28] Rcpp_1.0.10 cellranger_1.1.0 DiceDesign_1.9
[31] vctrs_0.5.2 tree_1.0-43 iterators_1.0.14
[34] timeDate_3043.102 gower_1.0.0 globals_0.16.2
[37] rvest_1.0.3 lifecycle_1.0.3 pak_0.4.0
[40] future_1.26.1 MASS_7.3-56 ipred_0.9-13
[43] vroom_1.5.7 hms_1.1.1 parallel_4.2.0
[46] lightgbm_3.3.5 rpart_4.1.16 stringi_1.7.12
[49] foreach_1.5.2 lhs_1.1.5 hardhat_1.2.0
[52] lava_1.6.10 rlang_1.0.6 pkgconfig_2.0.3
[55] lattice_0.20-45 bit_4.0.4 tidyselect_1.2.0
[58] parallelly_1.32.0 magrittr_2.0.3 R6_2.5.1
[61] generics_0.1.3 DBI_1.1.3 pillar_1.8.1
[64] haven_2.5.0 withr_2.5.0 survival_3.3-1
[67] nnet_7.3-17 future.apply_1.9.0 ROSE_0.0-4
[70] modelr_0.1.8 crayon_1.5.1 utf8_1.2.3
[73] tzdb_0.3.0 grid_4.2.0 readxl_1.4.1
[76] data.table_1.14.2 reprex_2.0.1 digest_0.6.29
[79] GPfit_1.0-8 munsell_0.5.0
Could you start off by restarting your R session before calling library(bonsai)? I would anticipate that this is the cause of the is_discordant_info() error.
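As a rough way to confirm the restart took effect, the sketch below checks which namespaces are already loaded before attaching bonsai. The helper name `is_loaded` is hypothetical and only for illustration; restarting itself is done from the IDE (in RStudio, Session > Restart R).

```r
# Run this right after restarting R.
# `is_loaded` is a hypothetical helper, not part of bonsai or parsnip.
is_loaded <- function(pkg) pkg %in% loadedNamespaces()

# In a clean session these would normally both be FALSE.
message("bonsai loaded:  ", is_loaded("bonsai"))
message("parsnip loaded: ", is_loaded("parsnip"))

# requireNamespace() returns FALSE instead of stopping if the load fails.
if (requireNamespace("bonsai", quietly = TRUE)) {
  library(bonsai)
} else {
  message("bonsai is not installed or could not be loaded in this session.")
}
```

If either package shows as loaded straight after a restart, something (for example a startup profile or a restored workspace) is probably loading it early, which would be worth ruling out first.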
If that doesn't do the trick, note that you're also seeing this warning:
Rtools is required to build R packages but is not currently installed. Please download and install the appropriate version of Rtools before proceeding.
Rtools is a set of tools needed on Windows to build R packages from source, which is what the pak::pak("tidymodels/bonsai") install does. The link in that warning (https://cran.rstudio.com/bin/windows/Rtools/) has the installation instructions. Work through those, restart R, and you should be good to go.
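Once Rtools is installed and R has been restarted, a quick check along these lines may help confirm that R can see the toolchain before retrying the source install. This is a sketch, not an official procedure: `has_toolchain` is a hypothetical helper, it uses `pkgbuild::has_build_tools()` when pkgbuild happens to be installed, and the fallback of looking for `make` on the PATH is only a rough proxy.

```r
# `has_toolchain` is a hypothetical helper for illustration only.
has_toolchain <- function() {
  if (requireNamespace("pkgbuild", quietly = TRUE)) {
    # pkgbuild::has_build_tools() reports whether a working compiler setup is found.
    return(isTRUE(pkgbuild::has_build_tools(debug = FALSE)))
  }
  # Fallback (assumption): a `make` executable on the PATH suggests build tools exist.
  isTRUE(unname(nzchar(Sys.which("make"))))
}

if (has_toolchain()) {
  message("Build tools found; source installs should be able to compile.")
} else {
  message("No build tools found; install the Rtools version matching your R, then restart R.")
}
```

If the check passes, removing the broken copy with remove.packages("bonsai") and reinstalling in the fresh session is the natural next step.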