Tags: machine-learning, nlp, chatbot, azure-language-understanding, disambiguation

LUIS - Similar training utterances for two chatbot intents


PREMISE: I am using the LUIS machine-learning framework to develop a chatbot. It is basically a black-box framework, and I don't know how to tune it properly for this problem.

I have two intents/classes for my chatbot. For simplicity, say "like" and "don't like" are the two intents. In my training set I have the following utterances for the two classes:

like class:

I like it

I like cats

I really like mouses

don't like class:

I don't like it

I don't like dolphins

I really don't like dogs

The training phrases for the two classes are very similar, and when I run a prediction on a phrase belonging to one of the two classes, the scores are very close. For example:

 I like armadillos -> 0.86 like | 0.8 don't like

Basically, the two domains/classes overlap heavily and differ by only one word (don't, as shown in the examples). Is there a way to train the model efficiently (using LUIS) so that the score difference between such similar utterances increases?


Solution

  • LUIS mainly uses a conditional random field (CRF) to extract entities (see here). Because a CRF computes probabilities based on the sequence of words, and the word sequences in your two cases are very similar, there is no factor you can change inside LUIS to separate them.

    To solve this, you can do some processing outside LUIS, or provide many more utterances so that LUIS can learn the difference. However, the latter might not help much, for the reason given in the first paragraph.
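As a rough illustration of the "processing outside LUIS" option, here is a minimal sketch in Python. It assumes you have already parsed the LUIS response into a plain dict of intent name to score; the function name resolve_intent, the margin threshold, and the list of negation cues are all hypothetical choices for this example, not part of LUIS. When the two intents from the question score within the margin of each other, a simple negation check on the raw utterance breaks the tie.

```python
import re

# Intent names taken from the question's example.
POSITIVE_INTENT = "like"
NEGATIVE_INTENT = "don't like"

# Simple negation cues (an assumption; extend this for your own data).
NEGATION_PATTERN = re.compile(r"\b(?:don't|dont|do not|never|not)\b", re.IGNORECASE)


def resolve_intent(utterance, scores, margin=0.1):
    """Pick an intent from a non-empty {intent: score} dict.

    If the top two intents are the like / don't like pair and their
    scores differ by less than `margin`, decide by looking for a
    negation cue in the utterance. Otherwise keep the top-scoring intent.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    top_intent, top_score = ranked[0]
    if len(ranked) < 2:
        return top_intent

    runner_up, runner_up_score = ranked[1]
    is_like_pair = {top_intent, runner_up} == {POSITIVE_INTENT, NEGATIVE_INTENT}
    if not is_like_pair or top_score - runner_up_score >= margin:
        return top_intent

    # Normalize the typographic apostrophe so "don’t" matches "don't".
    normalized = utterance.replace("\u2019", "'")
    if NEGATION_PATTERN.search(normalized):
        return NEGATIVE_INTENT
    return POSITIVE_INTENT


if __name__ == "__main__":
    close_scores = {"like": 0.86, "don't like": 0.8}
    print(resolve_intent("I like armadillos", close_scores))
    print(resolve_intent("I don't like armadillos", close_scores))
```

The rule only overrides the model when the scores are too close to trust, so utterances that LUIS already separates clearly are left alone. A keyword check like this is crude (it will misread something like "I do not dislike cats"), so treat it as a starting point rather than a full fix.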