python nlp spacy azure-language-understanding named-entity-extraction

Entity detection - entities clashing with English words


I have a few sentences like the ones below.

In the above sentences, the entities I'm looking for are IS, IS and ME respectively. The entity values include IS, ME, AN and AM, all of which are also common words when constructing a sentence in English. I'm using LUIS for entity detection and maintaining the values as a list entity. The issue is that, although LUIS is able to detect the entities (IS, AN, AM), it also detects them in normal sentences like

In the above sentence, we do not have any entity but the entity IS is picked up.

How do we detect the entities only when they are actually being referred to, and not when they are just part of the sentence construction?

A few points to note:


Solution

  • You've probably figured out that non-machine-learned entities are not ideal in your case because they don't take context into consideration. I think you have a few options.

    Option 1: Simple Entities

    I just tested this by adding your three utterances to an intent named "Sales org" and then creating a simple entity named "Scope." I labeled IS, IS, and ME at the ends of those utterances as the Scope entity. When I then tested "give me sales org for fpc 12234 for is?", LUIS correctly identified "is" as the entity and did not pick up "me".

    After making a call to LUIS, your bot code can then validate the recognized entity to make sure it's within the list of acceptable values.
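    The validation step might look something like the sketch below. It assumes your bot code has already pulled the recognized entity text out of the LUIS response; the names `ALLOWED_SCOPES` and `validate_scope` are made up for illustration, and the value set is just the four values from the question.

```python
# Hypothetical allow-list: the four entity values mentioned in the question.
ALLOWED_SCOPES = {"IS", "ME", "AN", "AM"}


def validate_scope(entity_text):
    """Return the normalized scope code, or None if the text is not an accepted value."""
    if not entity_text:
        return None
    # A machine-learned entity can return arbitrary text, so normalize before comparing.
    candidate = entity_text.strip().upper()
    return candidate if candidate in ALLOWED_SCOPES else None


print(validate_scope("is"))
print(validate_scope("org"))
```

    A `None` result means the simple entity matched something outside your list, so the bot can ignore it or ask the user to clarify.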

    Option 2: Roles

    If you would rather keep the list entity, you can still have LUIS give you contextual information about the entity by using roles.

    I just tested by creating an entity named "ScopeName" with your four values IS, ME, AN, and AM. I then created two roles for that entity: "scope" and "falsePositive." Then I labeled the entities in the "Sales org" utterances like this:

    [Image: the "Sales org" utterances with entity role labels]

    If you do this, LUIS will still recognize IS, ME, AN, and AM when they're in the parts of the sentence where you don't want them to be recognized, but you'll know to ignore them in your bot code because they'll be assigned the "falsePositive" role.
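    A minimal sketch of that filtering step follows. It assumes you have already flattened the LUIS response into a list of dicts with `text` and `role` keys; that shape is an assumption for illustration only, since the real response layout depends on the LUIS API version you call. The helper name `split_by_role` and the sample data are hypothetical.

```python
def split_by_role(entities, wanted_role="scope"):
    """Partition entity matches into (accepted values, ignored matches)."""
    accepted, ignored = [], []
    for entity in entities:
        if entity.get("role") == wanted_role:
            accepted.append(entity["text"].upper())
        else:
            # Matches labeled "falsePositive" (or with no role at all) are set aside.
            ignored.append(entity)
    return accepted, ignored


# Illustrative matches for "give me sales org for fpc 12234 for is?"
sample = [
    {"text": "me", "role": "falsePositive"},
    {"text": "is", "role": "scope"},
]
accepted, ignored = split_by_role(sample)
print(accepted)  # ['IS']
```

    Treating a missing role the same as "falsePositive" is a conservative choice here: only matches that LUIS explicitly assigned the "scope" role are passed on to the rest of the bot.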