I trained a custom NER model for specific entity types (say, DRUGS) that are different from those that come out of the box in the standard spaCy models (ORG, PERSON, etc.). Is it possible to add the NER pipe from this custom model to another model that already contains the standard spaCy NER pipe? I tried the following:
import spacy
custom_nlp = spacy.load('my_trained_model/model-best/') #trained with the GPU option from the spaCy Quickstart page
doc = custom_nlp('Chantix is a drug')
print(doc.ents) # Prints Chantix as a DRUG, as expected
main_nlp = spacy.load('en_core_web_trf')
doc = main_nlp('Chantix is a drug')
print(doc.ents) # Prints Chantix as a PRODUCT, as expected
main_nlp.add_pipe('ner', source=custom_nlp, name='custom_ner', before='ner')
print(main_nlp.pipe_names) # Both custom_ner and ner are there in the expected order
doc = main_nlp('Chantix is a drug')
print(doc.ents) # Here hell breaks loose, basically any token becomes an entity
The problem is that your added custom_ner is listening to the transformer component from en_core_web_trf rather than the one from the custom_nlp pipeline, so it's not getting the right input and is producing nonsense.
You need to "replace the listeners" before you add the component to en_core_web_trf:
custom_nlp.replace_listeners("transformer", "ner", ["model.tok2vec"])
main_nlp.add_pipe('ner', source=custom_nlp, name='custom_ner', before='ner')
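The two calls above can be wrapped into a small helper so the order (replace the listeners first, then source the component) is hard to get wrong. This is only a sketch: the function name add_custom_ner and its parameters are made up for illustration, and it assumes the custom pipeline's components are named "transformer" and "ner", as in the question. After replace_listeners, the custom ner carries its own copy of the transformer instead of listening to a shared one, which is why it no longer picks up the output of the transformer in en_core_web_trf.

```python
def add_custom_ner(main_nlp, custom_nlp, name="custom_ner",
                   tok2vec_name="transformer", pipe_name="ner"):
    """Copy the NER component of custom_nlp into main_nlp (hypothetical helper).

    Assumes custom_nlp has components called tok2vec_name and pipe_name,
    and that main_nlp already has a component called "ner".
    """
    # Step 1: give the custom NER its own copy of the transformer,
    # so it stops listening to a shared upstream component.
    custom_nlp.replace_listeners(tok2vec_name, pipe_name, ["model.tok2vec"])

    # Step 2: source the now self-contained component into the main pipeline,
    # placing it before the stock "ner".
    main_nlp.add_pipe(pipe_name, source=custom_nlp, name=name, before="ner")

    return list(main_nlp.pipe_names)


# Usage with the pipelines from the question (requires both models installed):
#   import spacy
#   custom_nlp = spacy.load('my_trained_model/model-best/')
#   main_nlp = spacy.load('en_core_web_trf')
#   print(add_custom_ner(main_nlp, custom_nlp))
```

Placing custom_ner before ner matches the question's setup; since the entity recognizer does not overwrite entities that are already set, the component that runs first takes priority on overlapping spans.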