spacy named-entity-recognition

How to use multiple NER pipes with the same spaCy nlp object?


I trained a custom NER model for specific entity types (say, DRUGS) that are different from those that come out of the box in the standard spaCy models (ORG, PERSON, etc.). Is it possible to add the NER pipe from this custom model to another model that already contains the standard spaCy NER pipe? I tried the following:

import spacy

custom_nlp = spacy.load('my_trained_model/model-best/')  # trained with the GPU option from the spaCy Quickstart page
doc = custom_nlp('Chantix is a drug')
print(doc.ents) # Prints Chantix as a DRUG, as expected

main_nlp = spacy.load('en_core_web_trf')
doc = main_nlp('Chantix is a drug')
print(doc.ents) # Prints Chantix as a PRODUCT, as expected

main_nlp.add_pipe('ner', source=custom_nlp, name='custom_ner', before='ner')
print(main_nlp.pipe_names) # Both custom_ner and ner are there in the expected order
doc = main_nlp('Chantix is a drug')
print(doc.ents) # Here hell breaks loose, basically any token becomes an entity

Solution

  • The problem is that your added custom_ner is listening to the transformer component from en_core_web_trf rather than the one from the custom_nlp pipeline, so it's not getting the right input and is producing nonsense.

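    You need to "replace the listeners" before you add the component to en_core_web_trf. This swaps the listener layer inside custom_nlp's ner for a standalone copy of custom_nlp's own transformer, so the component no longer depends on whichever transformer happens to be in the pipeline it is added to: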

    custom_nlp.replace_listeners("transformer", "ner", ["model.tok2vec"])
    main_nlp.add_pipe('ner', source=custom_nlp, name='custom_ner', before='ner')
    

    Docs: https://spacy.io/api/language/#replace_listeners
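
The two steps above can be wrapped in a small helper. This is only a sketch: the function name add_custom_ner and its name parameter are made up for illustration, and it assumes (as in the question) that the custom pipeline has components named "transformer" and "ner", that its ner listens to that transformer, and that the main pipeline already has a component called "ner".

```python
def add_custom_ner(main_nlp, custom_nlp, name="custom_ner"):
    """Copy the NER component of custom_nlp into main_nlp.

    Hypothetical helper; assumes custom_nlp has "transformer" and "ner"
    components and that its "ner" listens to that transformer.
    """
    # Step 1: give the custom ner its own copy of the transformer it was
    # trained with, so it stops listening to a shared component.
    custom_nlp.replace_listeners("transformer", "ner", ["model.tok2vec"])

    # Step 2: source the now self-contained component into the main
    # pipeline, placing it before the stock ner as in the question.
    main_nlp.add_pipe("ner", source=custom_nlp, name=name, before="ner")
    return main_nlp
```

With the pipelines loaded in the question, this would be called as add_custom_ner(main_nlp, custom_nlp) before running main_nlp on any text. The order matters: replace_listeners has to run on custom_nlp first, because once the component has been sourced into main_nlp it is already wired to the wrong transformer.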