We are using IBM Discovery in a bot. We have trained the collection with relevancy and non-relevancy scores, and we get a confidence score for each document. We use this confidence score as a threshold to decide how the bot handles different user queries.
Over the past week we have observed that Discovery, at random times, stops sending the confidence score in the API response JSON. Because we apply a confidence score threshold, our bot then cannot answer even simple questions. Then, all of a sudden and on its own, it starts sending the confidence score again. This has happened 2-3 times in the past week. The Discovery console says that the collection is trained. What triggers this behavior, and is there a bug fix for it?
A note was recently added to the documentation that addresses this issue: "Note: The confidence field is only returned when relevancy training has been successfully completed. There may also be cases where the trained model is not available and the confidence field will not be returned. Applications using confidence as a threshold should ensure they can handle these scenarios. Since score is relative to the query, it is not recommended for use as a fixed threshold. Instead, we recommend that applications always perform the same behavior for all results that do not include the confidence field. For example, an application may show all results without the confidence field or hide all results without the confidence field, but should not use the value of score to show some and hide others." (emphasis mine)
The note doesn't address the underlying causes, but as I understand it, it is mostly about speed. Sometimes, due to some combination of server load, query complexity, and document complexity, computing the confidence takes too long, so in order to return results to the calling application fast enough, Discovery sends them back without the confidence.
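The behavior the note recommends could look roughly like the sketch below. It works on the parsed list of results only and makes no API calls. The names `get_confidence`, `select_results`, `show_unscored`, and the `0.3` cutoff are hypothetical, and the lookup assumes the field sits under `result_metadata` in each result; adjust it to match wherever `confidence` appears in your response JSON.

```python
from typing import Any, Dict, List, Optional

# Hypothetical cutoff; tune it for your own collection.
CONFIDENCE_THRESHOLD = 0.3


def get_confidence(result: Dict[str, Any]) -> Optional[float]:
    """Return the result's confidence, or None when it was not sent.

    Assumption: the field sits under "result_metadata". Change this
    lookup if your response JSON places it elsewhere.
    """
    metadata = result.get("result_metadata") or {}
    return metadata.get("confidence")


def select_results(
    results: List[Dict[str, Any]],
    threshold: float = CONFIDENCE_THRESHOLD,
    show_unscored: bool = True,
) -> List[Dict[str, Any]]:
    """Apply the threshold, treating all results without confidence alike."""
    selected = []
    for result in results:
        confidence = get_confidence(result)
        if confidence is None:
            # One policy for every result lacking confidence: show all of
            # them or hide all of them. "score" is deliberately not used.
            if show_unscored:
                selected.append(result)
        elif confidence >= threshold:
            selected.append(result)
    return selected
```

With `show_unscored=True` the bot still answers while the trained model is unavailable, at the cost of showing unfiltered results. With `show_unscored=False` it stays silent for that period, which is what a plain threshold check already does today.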