[SOLVED] Extract entities without specifying during intent specification

I am using Rasa 2.0 to build an FAQ chatbot. I have a large dataset, and specifying entities while defining intents does not seem efficient to me.

I have the intents and examples defined in nlu.yml and would like to extract entities.

Here is an example of what I want to achieve:

User message -> I want a hospital in Delhi.
Entity -> Delhi, hospital

Is it possible to do so?

Solution

Entity detection is not a solved problem. Pre-trained models such as Duckling and spaCy integrate with Rasa, and while these tools contribute a lot of knowledge, they still make errors. If you're interested in the background on why these models can fail, you can watch this YouTube video that explains human name detection.

That's why a popular alternative is to use name lists. You can download lists of cities around the world, as well as lists of baby names, and use them as a rule-based alternative. You can configure this in Rasa via the RegexEntityExtractor, but if your name lists have 1000+ items then the FlashTextEntityExtractor (from the rasa-nlu-examples project) might be preferable.
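To make the name-list idea concrete, here is a minimal sketch in plain Python of rule-based lookup against lists. It is not Rasa's implementation; it only illustrates the matching idea. The lists, the entity names `city` and `facility`, and the function names are assumptions made up for this example.

```python
import re

# Hypothetical name lists; in practice these would be loaded from downloaded files.
NAME_LISTS = {
    "city": ["Delhi", "Mumbai", "New Delhi"],
    "facility": ["hospital", "clinic", "pharmacy"],
}


def build_patterns(name_lists):
    """Compile one case-insensitive, word-bounded regex per entity type."""
    patterns = {}
    for entity, names in name_lists.items():
        # Longest names first so "New Delhi" wins over "Delhi".
        ordered = sorted(names, key=len, reverse=True)
        alternation = "|".join(re.escape(name) for name in ordered)
        patterns[entity] = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return patterns


def extract_entities(text, patterns):
    """Return every list match as a dict, ordered by position in the text."""
    found = []
    for entity, pattern in patterns.items():
        for match in pattern.finditer(text):
            found.append({
                "entity": entity,
                "value": match.group(0),
                "start": match.start(),
                "end": match.end(),
            })
    return sorted(found, key=lambda item: item["start"])


PATTERNS = build_patterns(NAME_LISTS)
entities = extract_entities("I want a hospital in Delhi", PATTERNS)
```

For the message from the question, `entities` holds one `facility` match for "hospital" and one `city` match for "Delhi". A lookup like this only finds names that are on the list, so it will miss misspellings and unseen names.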

If you've got labelled examples you can also train Rasa itself to recognise the entities. But in order to do this you will need to have labels available.
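For reference, Rasa training data marks entities inline as `[value](entity_name)`. The sketch below, in plain Python, shows what such a label encodes by turning an annotated example into plain text plus character spans. It handles only this simple inline form; the entity names `facility` and `city` and the helper `parse_annotated` are hypothetical, and this is not Rasa's own parser.

```python
import re

# Matches the simple inline form [value](entity_name).
ANNOTATION = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def parse_annotated(example):
    """Split an annotated example into plain text and entity spans."""
    plain_parts = []
    entities = []
    cursor = 0  # position in the annotated string
    offset = 0  # length of the plain text built so far
    for match in ANNOTATION.finditer(example):
        before = example[cursor:match.start()]
        plain_parts.append(before)
        offset += len(before)
        value, entity = match.group(1), match.group(2)
        entities.append({
            "entity": entity,
            "value": value,
            "start": offset,
            "end": offset + len(value),
        })
        plain_parts.append(value)
        offset += len(value)
        cursor = match.end()
    # Keep whatever text follows the last annotation.
    plain_parts.append(example[cursor:])
    return "".join(plain_parts), entities


text, labels = parse_annotated("I want a [hospital](facility) in [Delhi](city)")
```

Here `text` is the original user message and `labels` lists the two annotated spans, which is the information a trainable extractor needs for each example.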

"specifying entities while defining intents does not seem efficient to me"

Labelling might not be super fun, but it is super effective. Without labelling your received utterances you won't know what intents your users are interested in.