Hello,
I'm having trouble using any kind of decision tree model from MLJ. I have tried three packages through MLJ: DecisionTree, Scikit, and now BetaML. The problem only happens when I try to train some kind of decision tree; it works fine with the MLJLinearModels models and with XGBoost. I always get the same error, which comes from the following function:
function machine_train_predict(df::DataFrame, df_train::DataFrame, model_name::String; args...)
    models = Dict(
        "xgb_reg"   => ["XGBoost" => "XGBoostRegressor"],
        "ridge_reg" => ["MLJLinearModels" => "RidgeRegressor"],
        "lasso_reg" => ["MLJLinearModels" => "LassoRegressor"],
        "rf_reg"    => ["BetaML" => "RandomForestRegressor"],
        "lin_reg"   => ["MLJLinearModels" => "LinearRegressor"],
        "log_class" => ["MLJLinearModels" => "LogisticClassifier"],
        "rf_class"  => ["DecisionTree" => "RandomForestClassifier"],
        "xgb_class" => ["XGBoost" => "XGBoostClassifier"]
    )
    y, X = machine_input(df_train; rng=123)
    y = coerce(y, Continuous)
    mod = models[model_name][1]
    p = mod[1]
    m = mod[2]
    mname = model_name
    Model = @eval @load $(m) pkg=$(p) verbosity=0
    model = Model()
    # train machine and get parameters
    m1 = machine(model, X, y) |> fit!
    # prepare test set for machine predictions
    y, X = machine_input(df)
    y = coerce(y, Continuous)
    # predict
    yhat = MLJ.predict_mode(m1, X)
    return yhat
end
And the error:
Training. Dataset: global. Iteration N: 1
ERROR: LoadError: MethodError: no method matching BetaML.Bmlj.RandomForestRegressor()
The applicable method may be too new: running in world age 33750, while current world is 33793.
Closest candidates are:
BetaML.Bmlj.RandomForestRegressor(; n_trees, max_depth, min_gain, min_records, max_features, splitting_criterion, β, rng) (method too new to be called from this world context.)
@ BetaML ~/.julia/packages/BetaML/8WVUG/src/Bmlj/Trees_mlj.jl:219
BetaML.Bmlj.RandomForestRegressor(::Int64, ::Int64, ::Float64, ::Int64, ::Int64, ::Function, ::Float64, ::Random.AbstractRNG) (method too new to be called from this world context.)
@ BetaML ~/.julia/packages/BetaML/8WVUG/src/Bmlj/Trees_mlj.jl:193
BetaML.Bmlj.RandomForestRegressor(::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any) (method too new to be called from this world context.)
@ BetaML ~/.julia/packages/BetaML/8WVUG/src/Bmlj/Trees_mlj.jl:193
Stacktrace:
[1] (::var"#machine_train_predict#38"{var"#machine_train_predict#11#39"})(df::DataFrame, df_train::DataFrame, model_name::String; args::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ Main ~/exports/10fold_ml_model-nocache-by-iter.jl:250
[2] (::var"#machine_train_predict#38"{var"#machine_train_predict#11#39"})(df::DataFrame, df_train::DataFrame, model_name::String)
@ Main ~/exports/10fold_ml_model-nocache-by-iter.jl:230
[3] (::var"#train_rescore#36"{var"#train_rescore#10#37"})(df::DataFrame, df_train::DataFrame, model_name::String; args::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ Main ~/exports/10fold_ml_model-nocache-by-iter.jl:223
[4] (::var"#train_rescore#36"{var"#train_rescore#10#37"})(df::DataFrame, df_train::DataFrame, model_name::String)
@ Main ~/exports/10fold_ml_model-nocache-by-iter.jl:219
[5] (::var"#proto_train#32"{var"#proto_train#7#33"})(df::DataFrame, df_t::DataFrame, model_name::String; nflds::Int64, args::Base.Pairs{Symbol, Int64, Tuple{Symbol}, NamedTuple{(:nfolds,), Tuple{Int64}}})
@ Main ~/exports/10fold_ml_model-nocache-by-iter.jl:211
[6] (::var"#evaluate_model#42"{var"#evaluate_model#13#43"})(paths::String, output::String, dss::String, niter::Int64, model_name::String; nfolds::Int64, args::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ Main ~/exports/10fold_ml_model-nocache-by-iter.jl:284
[7] (::var"#global_evaluate#40"{var"#global_evaluate#12#41"})(paths::String, output::String, ds::Vector{String}, itern::Int64, model_name::String; nfolds::Int64, args::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ Main ~/exports/10fold_ml_model-nocache-by-iter.jl:270
[8] global_evaluate
@ ~/exports/10fold_ml_model-nocache-by-iter.jl:267 [inlined]
[9] main(args::Vector{String})
@ Main ~/exports/10fold_ml_model-nocache-by-iter.jl:475
[10] top-level scope
@ ~/exports/10fold_ml_model-nocache-by-iter.jl:479
The error is always related to world age and the corresponding MLJ interface of the model in use.
Please help; I have been trying to find a solution for days.
I'm trying to make predictions. The function above corresponds to the training and prediction step of my script. I wasn't expecting an error, because the previous regressor models (linear, lasso, ridge, XGBoost) under the MLJ framework worked fine.
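From what I could gather, the world-age restriction itself is easy to reproduce outside MLJ; here is a minimal sketch of the general mechanism (none of this is from my actual script):

```julia
# Minimal reproduction of a world-age error, unrelated to MLJ:
# a method defined via eval while a function is already running is
# "too new" to be called from that function's (older) world.
function tricky()
    @eval newfun() = 1     # newfun lands in a newer world
    return newfun()        # MethodError: too new to be called from this world context
end

try
    tricky()
catch err
    @show err              # shows the same kind of MethodError as in my stack trace
end
```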
I have just tried, and with a recent version of MLJ (v0.20.5) it works with the trick below.
Put this in a script Foo.jl
in an empty directory:
using Pkg
Pkg.activate(@__DIR__) # Activate an environment in this script's directory. The first time, the environment is created by adding two files here: Project.toml and Manifest.toml
Pkg.add("MLJ")
Pkg.add("BetaML")
using MLJ
import BetaML # <--- trick here
X = rand(100,5)
y = [r[2]+r[3]^2-r[5] for r in eachrow(X)]
model_name = "rf_reg"
function predict_y(model_name, X, y)
    models = Dict(
        "xgb_reg"   => ["XGBoost" => "XGBoostRegressor"],
        "ridge_reg" => ["MLJLinearModels" => "RidgeRegressor"],
        "lasso_reg" => ["MLJLinearModels" => "LassoRegressor"],
        "rf_reg"    => ["BetaML" => "RandomForestRegressor"],
        "lin_reg"   => ["MLJLinearModels" => "LinearRegressor"],
        "log_class" => ["MLJLinearModels" => "LogisticClassifier"],
        "rf_class"  => ["DecisionTree" => "RandomForestClassifier"],
        "xgb_class" => ["XGBoost" => "XGBoostClassifier"]
    )
    mod = models[model_name][1]
    p = mod[1]
    m = mod[2]
    Model = @eval @load $(m) pkg=$(p) verbosity=0
    model = Model()
    # train machine and get parameters
    m1 = machine(model, X, y) |> fit!
    ŷ = predict_mode(m1, X)
    return ŷ
end
ŷ = predict_y(model_name,X,y)
hcat(y,ŷ)
As I wrote in the comment, it is always best to start a project in a dedicated environment. Also, BetaML works on standard arrays, not DataFrames.
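If your data lives in a DataFrame, one way to hand the model plain arrays is an explicit conversion; a quick sketch (the column names here are made up for illustration):

```julia
using DataFrames

df = DataFrame(a = rand(10), b = rand(10), target = rand(10))

X = Matrix(df[:, [:a, :b]])  # feature matrix as a plain Matrix{Float64}
y = Vector(df.target)        # target as a plain Vector{Float64}
```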
[EDIT] Indeed, if @eval @load is done within a function, you end up with the issue you discovered. I don't know exactly what MLJ.@load does and the reason behind it, but the trick is just to import the package providing the model (here BetaML) before calling that function.
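[EDIT 2] A more general escape hatch, when importing the package first is not convenient, is Base.invokelatest, which calls a function using the newest world. Applied to your pattern, you would wrap the call that fails, e.g. model = Base.invokelatest(Model) (and possibly later calls on the result too; I have not tested this with every model). A self-contained sketch of just the mechanism:

```julia
function make_and_use()
    # Define a method at runtime; this is roughly what happens when
    # @eval @load brings in a model interface mid-function: the new
    # method lands in a newer "world" than the running function.
    @eval newmodel() = :fitted
    # A direct call `newmodel()` here would raise the world-age
    # MethodError; invokelatest looks the method up in the newest
    # world instead, so it succeeds.
    return Base.invokelatest(newmodel)
end

make_and_use()  # returns :fitted
```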