I was trying the entity linking example in spaCy.
This is the information about spaCy on my system.
============================== Info about spaCy ==============================
spaCy version 2.2.2
Location C:\Users\manimaran.p\AppData\Local\Continuum\anaconda3\envs\spacy\lib\site-packages\spacy
Platform Windows-8.1-6.3.9600-SP0
Python version 3.7.3
Models
I am using this example to train the entity linker, and this example to generate the knowledge base for it.
I can create a knowledge base with the available en_core_web_md model. This is the output:
# python "create kb.py" -m en_core_web_md -o pret_kb
Loaded model 'en_core_web_md'
2 kb entities: ['Q2146908', 'Q7381115']
1 kb aliases: ['Russ Cochran']
Saved KB to pret_kb\kb
Saved vocab to pret_kb\vocab
Loading vocab from pret_kb\vocab
Loading KB from pret_kb\kb
2 kb entities: ['Q2146908', 'Q7381115']
1 kb aliases: ['Russ Cochran']
When I try to train the entity linker with the knowledge base from above, I get this error.
# python "entity linker.py" ./pret_kb/kb ./pret_kb/vocab
Created blank 'en' model with vocab from 'pret_kb\vocab'
Loaded Knowledge Base from 'pret_kb\kb'
Traceback (most recent call last):
File "entity linker.py", line 156, in <module>
plac.call(main)
File "C:\Users\manimaran.p\AppData\Local\Continuum\anaconda3\envs\spacy\lib\site-packages\plac_core.py", line 328, in call
cmd, result = parser.consume(arglist)
File "C:\Users\manimaran.p\AppData\Local\Continuum\anaconda3\envs\spacy\lib\site-packages\plac_core.py", line 207, in consume
return cmd, self.func(*(args + varargs + extraopts), **kwargs)
File "entity linker.py", line 113, in main
sgd=optimizer,
File "C:\Users\manimaran.p\AppData\Local\Continuum\anaconda3\envs\spacy\lib\site-packages\spacy\language.py", line 515, in update
proc.update(docs, golds, sgd=get_grads, losses=losses, **kwargs)
File "pipes.pyx", line 1219, in spacy.pipeline.pipes.EntityLinker.update
KeyError: (0, 12)
I followed the instructions specified here. I used en_core_web_md to create the knowledge base since I do not have a pre-trained model.
I did not write any custom code; I am just trying to run this example. Can someone point me in the right direction?
This was asked and answered in the following issue on spaCy's GitHub.
It looks like the script stopped working after a refactor of the entity linking pipeline, which now expects either a statistical or a rule-based NER component in the pipeline.
The new script adds such an EntityRuler to the pipeline as an example, i.e.:
from spacy.pipeline import EntityRuler  # spaCy 2.x import path

# Add a custom component to recognize "Russ Cochran" as an entity for the example training data.
# Note that in a realistic application, an actual NER algorithm should be used instead.
# `nlp` is the blank 'en' pipeline the training script has already created.
ruler = EntityRuler(nlp)
patterns = [{"label": "PERSON", "pattern": [{"LOWER": "russ"}, {"LOWER": "cochran"}]}]
ruler.add_patterns(patterns)
nlp.add_pipe(ruler)
However, this can be replaced with your own statistical NER model.
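To make the failure mode concrete, here is a minimal plain-Python sketch of one plausible reading of the traceback. It assumes the linker looks up each gold link's (start_char, end_char) offsets among the entities that an earlier pipeline component has marked. It does not use spaCy and is not spaCy's implementation: the helper names find_entities and align_links, the sample sentence, and the link probabilities are all hypothetical. Only the two KB ids, the "Russ Cochran" alias, and the (0, 12) key come from the output above.

```python
import re


def find_entities(text, patterns):
    """Toy rule-based NER: return {(start_char, end_char): label} for each match."""
    found = {}
    for label, phrase in patterns:
        for match in re.finditer(re.escape(phrase), text, flags=re.IGNORECASE):
            found[(match.start(), match.end())] = label
    return found


def align_links(text, gold_links, patterns):
    """Pair every gold link with the entity found at the same character offsets."""
    ents_by_offset = find_entities(text, patterns)
    aligned = []
    for (start, end), kb_ids in gold_links.items():
        # Raises KeyError if no component marked this exact span as an entity.
        label = ents_by_offset[(start, end)]
        aligned.append((text[start:end], label, kb_ids))
    return aligned


# Hypothetical training sentence; "Russ Cochran" spans characters 0 to 12.
text = "Russ Cochran reprinted classic comics."
# Illustrative probabilities for the two KB ids from the output above.
gold_links = {(0, 12): {"Q7381115": 1.0, "Q2146908": 0.0}}

# With a rule for "Russ Cochran", the gold link finds its entity.
print(align_links(text, gold_links, [("PERSON", "Russ Cochran")]))

# With no entity rules at all (the situation before the fix), the lookup fails.
try:
    align_links(text, gold_links, [])
except KeyError as err:
    print("KeyError:", err)
```

Under that assumption, any component that marks the gold spans as entities, whether rule-based or statistical, would avoid the error, which matches the fix described above.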