Tags: tensorflow, lookup-tables, rasa-nlu, rasa, named-entity-extraction

Rasa lookup table throws error in training data - "Not a valid NLU data"


I'm building a weather bot using Rasa and want my model to extract locations as entities. My model did not recognize locations that were outside of the training data, so I decided to use lookup tables for additional entities.

I followed the blog article "Entity extraction with lookup tables" and used a lookup table to recognize location entities in my training data, by creating a lookup '.txt' file under the data folder.

This is what my training data looks like:

{
  "rasa_nlu_data": {
    "lookup_tables": [
    {
        "name": "location",
        "elements": "data/location.txt"
    }
]
    "common_examples": [
      {
        "text": "Hello",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "goodbye",
        "intent": "goodbye",
        "entities": []
      },
      {
        "text": "What's the weather in Berlin at the moment?",
        "intent": "inform",
        "entities": [
          {
            "start": 22,
            "end": 28,
            "value": "Berlin",
            "entity": "location"
          }
        ]
      },
      {
        "text": "hey",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "hello",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "hi",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "heya",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "howdy",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "hey there",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "bye",
        "intent": "goodbye",
        "entities": []
      },
      {
        "text": "goodbye",
        "intent": "goodbye",
        "entities": []
      },
      {
        "text": "bye bye",
        "intent": "goodbye",
        "entities": []
      },
      {
        "text": "see ya",
        "intent": "goodbye",
        "entities": []
      },
      {
        "text": "see you later",
        "intent": "goodbye",
        "entities": []
      },
      {
        "text": "What's the weather today?",
        "intent": "inform",
        "entities": []
      },
      {
        "text": "What's the weather in London today?",
        "intent": "inform",
        "entities": [
          {
            "start": 22,
            "end": 28,
            "value": "London",
            "entity": "location"
          }
        ]
      },
      {
        "text": "Show me what's the weather in Paris",
        "intent": "inform",
        "entities": [
          {
            "start": 30,
            "end": 35,
            "value": "Paris",
            "entity": "location"
          }
        ]
      },
      {
        "text": "I wonder what is the weather in Vilnius right now?",
        "intent": "inform",
        "entities": [
          {
            "start": 32,
            "end": 39,
            "value": "Vilnius",
            "entity": "location"
          }
        ]
      },
      {
        "text": "what is the weather?",
        "intent": "inform",
        "entities": []
      },
      {
        "text": "Tell me the weather",
        "intent": "inform",
        "entities": []
      },
      {
        "text": "Is the weather nice in Barcelona today?",
        "intent": "inform",
        "entities": [
          {
            "start": 23,
            "end": 32,
            "value": "Barcelona",
            "entity": "location"
          }
        ]
      },
      {
        "text": "I am going to London today and I wonder what is the weather out there?",
        "intent": "inform",
        "entities": [
          {
            "start": 14,
            "end": 20,
            "value": "London",
            "entity": "location"
          }
        ]
      },
      {
        "text": "I am planning my trip to Amsterdam. What is the weather out there?",
        "intent": "inform",
        "entities": [
          {
            "start": 25,
            "end": 34,
            "value": "Amsterdam",
            "entity": "location"
          }
        ]
      },
      {
        "text": "Show me the weather in Dublin, please",
        "intent": "inform",
        "entities": [
          {
            "start": 23,
            "end": 29,
            "value": "Dublin",
            "entity": "location"
          }
        ]
      },
      {
        "text": "in London",
        "intent": "inform",
        "entities": [
          {
            "start": 3,
            "end": 9,
            "value": "London",
            "entity": "location"
          }
        ]
      },
      {
        "text": "Lithuania",
        "intent": "inform",
        "entities": [
          {
            "start": 0,
            "end": 9,
            "value": "Lithuania",
            "entity": "location"
          }
        ]
      },
      {
        "text": "Oh, sorry, in Italy",
        "intent": "inform",
        "entities": [
          {
            "start": 14,
            "end": 19,
            "value": "Italy",
            "entity": "location"
          }
        ]
      },
      {
        "text": "Tell me the weather in Vilnius",
        "intent": "inform",
        "entities": [
          {
            "start": 23,
            "end": 30,
            "value": "Vilnius",
            "entity": "location"
          }
        ]
      },
      {
        "text": "The weather condition in Italy",
        "intent": "inform",
        "entities": [
          {
            "start": 25,
            "end": 30,
            "value": "Italy",
            "entity": "location"
          }
        ]
      },
      {
        "text": "I am going to Barcelona",
        "intent": "inform",
        "entities": [
          {
            "start": 14,
            "end": 23,
            "value": "Barcelona",
            "entity": "location"
          }
        ]
      },
      {
        "text": "I'm planning a trip to Barcelona",
        "intent": "inform",
        "entities": [
          {
            "start": 22,
            "end": 32,
            "value": " Barcelona",
            "entity": "location"
          }
        ]
      },
      {
        "text": "I am going to Barcelona. I wonder what is the weather out there?",
        "intent": "inform",
        "entities": [
          {
            "start": 13,
            "end": 23,
            "value": " Barcelona",
            "entity": "location"
          }
        ]
      },
      {
        "text": "What is the weather in Argentina?",
        "intent": "inform",
        "entities": [
          {
            "start": 23,
            "end": 32,
            "value": "Argentina",
            "entity": "location"
          }
        ]
      },
      {
        "text": "I am planning a trip to Paris and I wonder what is the weather out there?",
        "intent": "inform",
        "entities": [
          {
            "start": 24,
            "end": 29,
            "value": "Paris",
            "entity": "location"
          }
        ]
      }
    ]
  }
}

My lookup table file lists location names, one per line.

But when I run 'rasa train nlu', I get the following error: "Path ‘data’ doesn’t contain valid NLU data in it. Please verify the data format. The NLU model training will be skipped now." What could be the reason? I only did what the blog described. Please tell me where I'm going wrong. Thanks in advance.


Solution

  • This is actually unrelated to the lookup tables. When I loaded your data, I could see that a comma is missing between the "lookup_tables" array and "common_examples", so the file is not valid JSON. It should look like this:

    {
      "rasa_nlu_data": {
        "lookup_tables": [
          {
            "name": "location",
            "elements": "data/location.txt"
          }
        ],
        "common_examples"
    ...
    

    This suggests you may be writing or editing the JSON by hand; consider writing the training data in Markdown instead and using rasa data convert to generate JSON if you need it.
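
If you do keep editing the JSON by hand, one way to catch this kind of mistake before training is to parse the file with Python's standard json module, which reports where parsing failed. The following is only a minimal sketch: the helper name find_json_error and the shortened inline sample data are made up for illustration and are not part of Rasa.

```python
import json


def find_json_error(text):
    """Return None if text is valid JSON, else a (line, column, message) tuple."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return (err.lineno, err.colno, err.msg)
    return None


# Shortened, hypothetical stand-in for the training file in the question:
# the comma after the closing "]" of "lookup_tables" is missing.
broken = '''{
  "rasa_nlu_data": {
    "lookup_tables": [
      {"name": "location", "elements": "data/location.txt"}
    ]
    "common_examples": []
  }
}'''

# Same data with the missing comma added.
fixed = broken.replace(']\n    "common_examples"', '],\n    "common_examples"')

print(find_json_error(broken))  # a tuple pointing at the "common_examples" line
print(find_json_error(fixed))   # None, the data parses cleanly
```

For a real file, you could read it with open(path, encoding="utf-8").read() and pass the contents to the same helper. This only checks JSON syntax; it does not check that the content follows the Rasa NLU schema.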