I was trying to use logistic regression in OCaml. I need to use it as a black box for another problem I'm solving. I found the following site:
http://math.umons.ac.be/anum/en/software/OCaml/Logistic_Regression/
I pasted the following code (with a few modifications - I defined my own iris_features and iris_label) from this site into a file named logistic_regression.ml:
open Scanf
open Format
open Bigarray
open Lacaml.D
let log_reg ?(lambda=0.1) x y =
  (* [f_df] returns the value of the function to maximize and store
     its gradient in [g]. *)
  let f_df w g =
    let s = ref 0. in
    ignore(copy ~y:g w); (* g ← w *)
    scal (-. lambda) g; (* g = -λ w *)
    for i = 0 to Array.length x - 1 do
      let yi = float y.(i) in
      let e = exp(-. yi *. dot w x.(i)) in
      s := !s +. log1p e;
      axpy g ~alpha:(yi *. e /. (1. +. e)) ~x:x.(i);
    done;
    -. !s -. 0.5 *. lambda *. dot w w
  in
  let w = Vec.make0 (Vec.dim x.(0)) in
  ignore(Lbfgs.F.max f_df w);
  w
let iris_features = [1 ; 2 ; 3] ;;
let iris_labels = 2 ;;
let proba w x y = 1. /. (1. +. exp(-. float y *. dot w x))
let () =
  let sol = log_reg iris_features iris_labels in
  printf "w = %a\n" Lacaml.Io.pp_fvec sol;
  let nwrongs = ref 0 in
  for i = 0 to Array.length iris_features - 1 do
    let p = proba sol iris_features.(i) iris_labels.(i) in
    printf "Label = %i prob = %g => %s\n" iris_labels.(i) p
      (if p > 0.5 then "correct" else (incr nwrongs; "wrong"))
  done;
  printf "Number of wrong labels: %i\n" !nwrongs
I have the following question. I get the error "Error: Unbound module Lacaml". I've installed Lacaml, run opam init several times, and tried to provide a flag -package = Lacaml; I don't know how to solve this.

Both iris_features and iris_labels are arrays, and array literals in OCaml are delimited with [| and |], e.g.,
let iris_features = [|(* I don't know what to put here*)|]
let iris_labels = [|2|]
The iris_features array has type vec array, i.e., an array of vectors, not an array of integers. I didn't dig deep enough to know what to put there, but the syntax is the following:
let iris_features =[|
Vec.of_list [1.; 2.; 3.;];
Vec.of_list [4.; 5.; 6.;];
|]
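To make the literal syntax concrete, here is a minimal sketch in plain OCaml using only the standard library. It assumes float array as a stand-in for Lacaml's vec, and the names as_list, labels, and features (with their values) are made up for illustration.

```ocaml
(* A list literal uses [ ... ]; it has type int list and cannot be indexed with .(i). *)
let as_list = [1; 2; 3]

(* An array literal uses [| ... |]; elements are accessed with .(i). *)
let labels = [| 1; -1 |]

(* An array of "vectors"; float array stands in for Lacaml's vec in this sketch. *)
let features = [|
  [| 1.; 2.; 3. |];
  [| 4.; 5.; 6. |]
|]

let () =
  Printf.printf "list length = %d\n" (List.length as_list);
  Printf.printf "samples = %d, labels = %d\n"
    (Array.length features) (Array.length labels);
  Printf.printf "features.(1).(0) = %g\n" features.(1).(0)
```

With Lacaml loaded, each inner array would instead be built with Vec.of_list, as in the snippet above.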
The Lacaml interface has changed a bit since the code was written, and axpy no longer accepts a labeled ~x argument (both the x and y vectors are positional now), so you need to remove ~x and fix the order. I presume that x.(i) is x in the a*x + y expression and g corresponds to y, e.g.,
axpy ~alpha:(yi *. e /. (1. +. e)) x.(i) g;
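For reference, axpy is the BLAS update y ← alpha*x + y, performed in place on y. The sketch below mimics it in plain OCaml over float array; axpy_sketch is a hypothetical helper written for illustration, not Lacaml's implementation. It may help show why g belongs in the y position: it is the vector that accumulates the result.

```ocaml
(* Hypothetical stand-in for BLAS axpy: y <- alpha * x + y, updating y in place. *)
let axpy_sketch ?(alpha = 1.0) (x : float array) (y : float array) =
  if Array.length x <> Array.length y then
    invalid_arg "axpy_sketch: dimension mismatch";
  Array.iteri (fun i xi -> y.(i) <- alpha *. xi +. y.(i)) x

let () =
  let x = [| 1.; 2.; 3. |] in
  let g = [| 10.; 10.; 10. |] in
  (* x is only read; g is modified and ends up holding 2*x + g. *)
  axpy_sketch ~alpha:2. x g;
  Array.iter (Printf.printf "%g ") g;
  print_newline ()
```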
This code also depends on lbfgs, so you need to install it as well,
opam depext --install lbfgs
I would suggest using dune as your default build system, but for fast prototyping you can use ocamlbuild. Put your code into an empty folder in a file named regress.ml (you can pick another name, just update the build instructions correspondingly). Now you can build it to a native executable with
ocamlbuild -pkg lacaml -pkg lbfgs regress.native
run it as
./regress.native
If you're playing in the OCaml toplevel (aka the interpreter, i.e., running your code in the ocaml interpreter), you can load lacaml and lbfgs using the following directives:
#use "topfind";;
#require "lacaml.top";;
#require "lbfgs";;
(The # is not a prompt but a part of the directive syntax, so don't forget to type it as well).
Now you can copy-paste your code into the interpreter and play with it.
To build with dune instead, create a new project with
dune init exe regress --libs lacaml,lbfgs
Put your code into regress.ml there. Remove open Bigarray and open Scanf, as dune is very strict on warnings and turns them into errors (and it will warn you about those lines as they are, in fact, unused). Then build and run it with
dune exec ./regress.exe