scala, apache-spark, nlp, sbt, johnsnowlabs-spark-nlp

libraryDependencies for `TFNerDLGraphBuilder()` for Spark with Scala


Can anyone tell me which libraryDependencies are needed for TFNerDLGraphBuilder() in Spark with Scala? I get the error: Cannot resolve symbol TFNerDLGraphBuilder

I see that it works in the notebook linked below:

https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/Certification_Trainings/Public/4.NERDL_Training.ipynb


Solution

  • TensorFlow graphs in Spark NLP are built with the TensorFlow Python API. As far as I know, a Java/Scala version for creating the Conv1D/BiLSTM/CRF graph is not included, so no libraryDependencies entry will make TFNerDLGraphBuilder resolve in Scala.

    So you need to create the graph first, following the instructions at:

    https://nlp.johnsnowlabs.com/docs/en/training#tensorflow-graphs

    That will create a TensorFlow .pb graph file. Put it in a folder and pass that folder to the NerDLApproach annotator. For example:

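    import com.johnsnowlabs.nlp.annotators.ner.Verbose
    import com.johnsnowlabs.nlp.annotators.ner.dl.NerDLApproach

    // TfGraphPath is your own value: the folder that holds the generated .pb graph.
    val nerTagger = new NerDLApproach()
      .setInputCols("sentence", "token", "embeddings")
      .setOutputCol("ner")
      .setLabelColumn("label")
      .setMaxEpochs(100)
      .setRandomSeed(0)
      .setPo(0.03f)
      .setLr(0.2f)
      .setDropout(0.5f)
      .setBatchSize(100)
      .setVerbose(Verbose.Epochs)
      .setGraphFolder(TfGraphPath)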
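    Before training, it can help to confirm that the graph folder really contains a .pb file, since a missing graph is an easy mistake to make after generating it in Python. The following is a minimal sketch using only the JDK; GraphFolderCheck and findGraphFiles are hypothetical helper names, not part of Spark NLP, and the check assumes the folder is on the local filesystem (not HDFS or DBFS).

```scala
import java.io.File

// Hypothetical helper, not part of Spark NLP.
object GraphFolderCheck {
  /** Returns the names of the .pb files directly inside `folder`, sorted.
    * Returns an empty sequence if the folder does not exist or is not a directory. */
  def findGraphFiles(folder: String): Seq[String] = {
    // listFiles() returns null for a missing path or a non-directory.
    val entries = Option(new File(folder).listFiles()).getOrElse(Array.empty[File])
    entries
      .filter(f => f.isFile && f.getName.endsWith(".pb"))
      .map(_.getName)
      .sorted
      .toSeq
  }
}
```

    A call such as require(GraphFolderCheck.findGraphFiles(TfGraphPath).nonEmpty, "no .pb graph found") placed before building the annotator would then fail early with a readable message.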
    

    Note that the embeddings annotator has to come before the NER stage in the pipeline, because NerDLApproach reads the "embeddings" column. Also, the training process runs on the driver; it is not distributed as it could be with BigDL.
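    On the libraryDependencies side, the Scala code above only needs the regular Spark NLP artifact; as said, this does not bring in TFNerDLGraphBuilder. A minimal build.sbt sketch follows. The version numbers are illustrative assumptions, so pick the ones that match your Spark and Scala setup.

```scala
// build.sbt (sketch). Versions below are assumptions for illustration only.
scalaVersion := "2.12.15"

libraryDependencies ++= Seq(
  // Spark itself is usually supplied by the cluster, hence Provided.
  "org.apache.spark"     %% "spark-mllib" % "3.3.0" % Provided,
  // Supplies NerDLApproach and Verbose used in the example above.
  "com.johnsnowlabs.nlp" %% "spark-nlp"   % "4.2.0"
)
```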