ibm-cloud ibm-watson watson-discovery

IBM Cloud Watson Discovery: Relevancy training never runs successfully


I uploaded a CSV file containing 9 documents to a collection in Watson Discovery. When I query this collection, the correct document is returned, but the confidence scores are very low (0.01 to 0.02). That led me to relevancy training. I entered around 60 questions and rated the returned results (in the Improvement tools panel). However, the training never seems to start: the panel keeps showing "IBM will begin learning soon". I checked the project status through the Python SDK, and it has been like this for a couple of days.
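For anyone reading a similar status response, here is a minimal sketch of how the training flags could be interpreted. It assumes the status is a dict shaped like the `relevancy_training_status` object of the Discovery v2 project details; the field names are taken from memory of the v2 API reference and should be checked against your own response. The helper name `unmet_training_conditions` is hypothetical.

```python
def unmet_training_conditions(status):
    """List likely reasons why relevancy training has not completed.

    `status` is assumed to look like the `relevancy_training_status`
    object from the v2 project details (field names are an assumption).
    """
    reasons = []
    if not status.get("minimum_queries_added", False):
        reasons.append("not enough training queries")
    if not status.get("minimum_examples_added", False):
        reasons.append("not enough rated examples")
    if not status.get("sufficient_label_diversity", False):
        reasons.append("ratings lack label diversity")
    if status.get("processing", False):
        reasons.append("training is still processing")
    # Notices, if present, are passed through as text.
    for notice in status.get("notices", []):
        if isinstance(notice, dict):
            reasons.append("notice: " + str(notice.get("description", "")))
        else:
            reasons.append("notice: " + str(notice))
    return reasons


# Hypothetical status for illustration only, not real output.
example = {
    "minimum_queries_added": True,
    "minimum_examples_added": True,
    "sufficient_label_diversity": False,
    "processing": False,
    "notices": [],
}
print(unmet_training_conditions(example))
```

With the Python SDK, the dict would typically come from something like `discovery.get_project(project_id=...).get_result()["relevancy_training_status"]`, assuming a `DiscoveryV2` client.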

My questions are:

  1. What could be wrong with the relevancy training that prevents the training process from running?
  2. Is a confidence of 0.01 to 0.02 normal for an untrained collection (untrained strategy)?

Thank you in advance.


Solution

  • It turns out that the format of the documents was the problem. My coworker had uploaded a CSV file containing HTML code, and Watson Discovery doesn't seem to handle it well.

    I converted the documents to a set of PDF files, and it works now.
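An alternative to PDF conversion, which is not the approach described above and has not been tested against Discovery, would be to strip the HTML markup from the CSV cells before uploading. Below is a minimal sketch using only the Python standard library; the function names `strip_html` and `clean_csv` are hypothetical.

```python
import csv
import io
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(cell):
    """Return the cell text with tags removed and whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(cell)
    parser.close()  # flushes any text still buffered by the parser
    # Text nodes are joined with spaces, so inline tags inside a word
    # would split it; that is acceptable for a rough cleanup.
    return " ".join(" ".join(parser.parts).split())


def clean_csv(text):
    """Rewrite CSV text so every cell has its HTML markup stripped."""
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow([strip_html(cell) for cell in row])
    return out.getvalue()


sample = 'id,body\n1,"<p>Hi, <i>there</i> &amp; welcome</p>"\n'
print(clean_csv(sample))
```

Whether the cleaned CSV is then accepted for relevancy training would still need to be verified in your own project.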