ocaml ocamlbuild ocamlfind

Logistic Regression in OCaml


I was trying to use Logistic regression in OCaml. I need to use it as a blackbox for another problem I'm solving. I found the following site:

http://math.umons.ac.be/anum/en/software/OCaml/Logistic_Regression/

I pasted the following code from this site (with a few modifications: I defined my own iris_features and iris_labels) into a file named logistic_regression.ml:

open Scanf
open Format
open Bigarray
open Lacaml.D

let log_reg ?(lambda=0.1) x y =
  (* [f_df] returns the value of the function to maximize and store
     its gradient in [g]. *)
  let f_df w g =
    let s = ref 0. in
    ignore(copy ~y:g w); (* g ← w *)
    scal (-. lambda) g;  (* g = -λ w *)
    for i = 0 to Array.length x - 1 do
      let yi = float y.(i) in
      let e = exp(-. yi *. dot w x.(i)) in
      s := !s +. log1p e;
      axpy g ~alpha:(yi *. e /. (1. +. e)) ~x:x.(i);
    done;
    -. !s -. 0.5 *. lambda *. dot w w
  in
  let w = Vec.make0 (Vec.dim x.(0)) in
  ignore(Lbfgs.F.max f_df w);
  w


let iris_features = [1 ; 2 ; 3] ;;
let iris_labels = 2 ;;

let proba w x y = 1. /. (1. +. exp(-. float y *. dot w x))
let () =
  let sol = log_reg iris_features iris_labels in
  printf "w = %a\n" Lacaml.Io.pp_fvec sol;
  let nwrongs = ref 0 in
  for i = 0 to Array.length iris_features - 1 do
    let p = proba sol iris_features.(i) iris_labels.(i) in
    printf "Label = %i prob = %g => %s\n" iris_labels.(i) p
      (if p > 0.5 then "correct" else (incr nwrongs; "wrong"))
  done;
  printf "Number of wrong labels: %i\n" !nwrongs

I have the following questions:

  1. When I try to compile the code, I get the error message "Error: Unbound module Lacaml". I've installed Lacaml, run opam init several times, and tried passing the flag -package = Lacaml. How do I solve this?
  2. As you can see, I've defined my own versions of iris_features and iris_labels. Are the types correct, i.e., in the function log_reg, is x an int list and y an int?

Solution

  • Both iris_features and iris_labels are arrays, and array literals in OCaml are delimited with [| and |], e.g.,

    
    let iris_features = [|(* I don't know what to put here*)|]
    let iris_labels = [|2|]
    

    The iris_features array has type vec array, i.e., an array of vectors, not an array of integers. I didn't dig deep enough to know what to put there, but the syntax is the following:

    let iris_features =[|
      Vec.of_list [1.; 2.; 3.;];
      Vec.of_list [4.; 5.; 6.;];
    |]
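
    To make the list versus array distinction concrete, here is a minimal sketch that uses only the standard library, so it runs without Lacaml. The names as_list, as_array, samples and labels are made up for illustration, and float array is only a stand-in for Lacaml's vec type. The +1/-1 label values are likewise an assumption, based on how the code in the question uses float y.(i).

    ```ocaml
    (* Stdlib-only sketch: lists use [ ... ], arrays use [| ... |]. *)
    let as_list : int list = [1; 2; 3]
    let as_array : int array = [|1; 2; 3|]

    (* Arrays support indexing with .(i) and Array.length, which is what
       the loops in the question rely on (x.(i), y.(i), Array.length x). *)
    let second = as_array.(1)
    let n = Array.length as_array

    (* Hypothetical stand-in for a vec array: one float array per sample. *)
    let samples : float array array = [|
      [|1.; 2.; 3.|];
      [|4.; 5.; 6.|];
    |]

    (* One integer label per sample (assumed values, for illustration). *)
    let labels : int array = [|1; -1|]

    let () =
      Printf.printf "second = %d, n = %d, samples = %d, labels = %d\n"
        second n (Array.length samples) (Array.length labels)
    ```

    The point is only the shape of the data: log_reg indexes both of its arguments, so both must be arrays, with one entry per sample.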
    

    The Lacaml interface has changed a bit since the code was written, and axpy no longer accepts a labeled ~x argument (both the x and y vectors are positional now), so you need to remove ~x and fix the order. I presume that x.(i) is x in the a*x + y expression and g corresponds to y, e.g.,

     axpy ~alpha:(yi *. e /. (1. +. e)) x.(i) g;
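
    If the argument order is unclear, the following stdlib-only sketch shows the operation that axpy stands for, y ← alpha * x + y, updating y in place. axpy_sketch is a hypothetical helper on plain float arrays, written only to illustrate the roles of the two positional arguments; it is not Lacaml's implementation.

    ```ocaml
    (* Stdlib-only sketch of the axpy operation: y <- alpha * x + y, in place.
       [axpy_sketch] is a hypothetical helper, not part of Lacaml. *)
    let axpy_sketch ?(alpha = 1.0) (x : float array) (y : float array) : unit =
      if Array.length x <> Array.length y then
        invalid_arg "axpy_sketch: dimension mismatch";
      Array.iteri (fun i xi -> y.(i) <- alpha *. xi +. y.(i)) x

    let () =
      let x = [|1.; 2.; 3.|] in
      let g = [|10.; 20.; 30.|] in
      axpy_sketch ~alpha:2. x g;  (* g becomes 2 * x + g; x is unchanged *)
      Array.iter (Printf.printf "%g ") g;
      print_newline ()
    ```

    In the regression code, g plays the role of y: it accumulates the gradient, so it has to be the second positional argument.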
    

    This code also depends on lbfgs, so you need to install it as well,

    opam depext --install lbfgs
    

    I would suggest using dune as your default build system, but for fast prototyping you can use ocamlbuild. Put your code into an empty folder in a file named regress.ml (you can pick another name, just update the build instructions accordingly). Now you can build it to a native executable with

    ocamlbuild -pkg lacaml -pkg lbfgs regress.native
    

    run it as

    ./regress.native
    

    If you're playing in the OCaml toplevel (aka the interpreter), you can load lacaml and lbfgs using the following directives:

    #use "topfind";;
    #require "lacaml.top";;
    #require "lbfgs";;
    

    (The # is not a prompt but part of the directive syntax, so don't forget to type it as well.)

    Now you can copy-paste your code into the interpreter and play with it.

    Bonus Track - building with dune

    1. create an empty folder and put regress.ml there.
    2. remove open Bigarray and open Scanf, as dune is very strict about warnings and turns them into errors (and it will warn about those lines since they are, in fact, unused)
    3. create the dune project
    dune init exe regress --libs lacaml,lbfgs
    
    4. build and run
    dune exec ./regress.exe