Tags: scala, apache-spark, apache-spark-mllib, evaluator

Adding params to Apache Spark's abstract Evaluator class


I'm building a Spark application and am using the Evaluator class within some custom Estimators/Transformers. I've run into an issue where the Evaluator base class does not implement the metricName param that most (all?) of its descendants (RegressionEvaluator, BinaryClassificationEvaluator, etc.) implement.

Specifically, when writing a Validator that takes an Estimator and an Evaluator as parameters (to .fit and then .evaluate a trained model), I would like to store or log both the resulting metricName and metricValue produced by the Evaluator. That would be straightforward if I explicitly typed the evaluator param as either a RegressionEvaluator or a BinaryClassificationEvaluator, but if I type it simply as the base Evaluator, I get a value getMetricName is not a member of ... error at compile time.

I can potentially extend the Evaluator class and use that (as below), but that creates type mismatch headaches in other areas (found Evaluator, required MyEvaluator) that are not worth the benefit of just making metricName available.

import org.apache.spark.ml.evaluation.Evaluator
import org.apache.spark.ml.param.{Param, Params}

// Shared param, mirroring the metricName param that each concrete
// evaluator currently declares independently.
trait HasMetricName extends Params {
  val metricName: Param[String]

  def getMetricName: String = $(metricName)
}

abstract class MyEvaluator extends Evaluator with HasMetricName
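
For concreteness, a rough sketch of where the mismatch bites (the MyValidator class here is made up for illustration):

import org.apache.spark.ml.evaluation.{Evaluator, RegressionEvaluator}

// Hypothetical validator typed against the narrowed MyEvaluator.
class MyValidator(val evaluator: MyEvaluator) {
  def logMetric(metricValue: Double): Unit =
    println(s"${evaluator.getMetricName}: $metricValue")
}

val evaluator: Evaluator = new RegressionEvaluator()
// Does not compile: type mismatch; found: Evaluator, required: MyEvaluator
// new MyValidator(evaluator)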

So, my question: is there a simple way to mix in or tell the compiler that my Evaluator class has a metricName value, or else to insert the MyEvaluator class in a way that won't propagate type mismatch errors to other areas? The ideal solution would be to simply edit the Evaluator class to include this param, but that is buried in a top-level Apache project.


Solution

  • Seems like a perfect case for a structural type:

    def myMethod(e: Evaluator { def getMetricName: String }): ...
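
    A refinement like this compiles against any Evaluator subclass that happens to declare getMetricName, without introducing a new named type, so no type mismatch propagates elsewhere. Note that calls through a structural type are dispatched via reflection at runtime and need the scala.language.reflectiveCalls import. As a rough sketch (the validate method and its parameters are made up for illustration, not your Validator's actual API):

    import scala.language.reflectiveCalls

    import org.apache.spark.ml.Estimator
    import org.apache.spark.ml.evaluation.Evaluator
    import org.apache.spark.sql.DataFrame

    // Accepts any Evaluator whose concrete class declares getMetricName
    // (RegressionEvaluator, BinaryClassificationEvaluator, ...).
    def validate(
        estimator: Estimator[_],
        evaluator: Evaluator { def getMetricName: String },
        train: DataFrame,
        test: DataFrame): (String, Double) = {
      val model = estimator.fit(train)
      val metricValue = evaluator.evaluate(model.transform(test))
      (evaluator.getMetricName, metricValue) // both name and value available
    }

    Calling validate with a plain RegressionEvaluator or BinaryClassificationEvaluator then compiles as-is, with no wrapper type needed.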