nlp spacy spacy-3

Not sure why my Python code that uses spaCy to add a PHONE_NUMBER entity is not working


The pattern works with Matcher, but not as an entity. Here is my code:

import spacy
from spacy.pipeline import EntityRuler

nlp = spacy.load("en_core_web_sm")

patterns = [
    {
        "label": "PHONE_NUMBER",
        "pattern": [
            {"ORTH": "("},
            {"SHAPE": "ddd"},
            {"ORTH": ")"},
            {"IS_SPACE": True, "OP": "?"},
            {"SHAPE": "ddd"},
            {"ORTH": "-"},
            {"SHAPE": "dddd"},
        ],
    }
]

entity_ruler = EntityRuler(nlp, patterns=patterns, overwrite_ents=True)
nlp.add_pipe("entity_ruler", before="ner")

doc = nlp("You can reach me at (111) 111-1111.")

for ent in doc.ents:
    print(ent.text, ent.label_)

This returns:

111 CARDINAL
111 CARDINAL

Advice/help needed and appreciated. Thank you.


Solution

  • The problem was that there were two separate components: one constructed with the EntityRuler class and one created by nlp.add_pipe. The component created by add_pipe wasn't aware of your patterns, and the EntityRuler instance that held them was never added to the pipeline. Using just one method and then adding the patterns to that component did the trick.

    import spacy
    
    nlp = spacy.load("en_core_web_sm")
    patterns = [
        {
            "label": "PHONE_NUMBER",
            "pattern": [
                {"ORTH": "("},
                {"SHAPE": "ddd"},
                {"ORTH": ")"},
                {"IS_SPACE": True, "OP": "?"},
                {"SHAPE": "ddd"},
                {"ORTH": "-"},
                {"SHAPE": "dddd"},
            ],
        }
    ]
    
    ruler = nlp.add_pipe("entity_ruler", before="ner")
    ruler.add_patterns(patterns)
    
    doc = nlp("You can reach me at (111) 111-1111.")
    
    for ent in doc.ents:
        print(ent.text, ent.label_)
    
    This returns:
    
    (111) 111-1111 PHONE_NUMBER
    

    I read about the different ways to initialize the component here: https://spacy.io/api/entityruler
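
If you still want the overwrite_ents=True behaviour from the original attempt, one option in spaCy 3 is to pass it through the config argument of nlp.add_pipe. The following is only a minimal sketch: it assumes spaCy 3.x and uses spacy.blank("en") instead of en_core_web_sm (my substitution, so it runs without a model download and has no "ner" component to order against). The names standalone and ents are just illustrative.

```python
import spacy
from spacy.pipeline import EntityRuler

# Assumption: a blank English pipeline (tokenizer only) stands in for
# en_core_web_sm so the sketch needs no model download.
nlp = spacy.blank("en")

patterns = [
    {
        "label": "PHONE_NUMBER",
        "pattern": [
            {"ORTH": "("},
            {"SHAPE": "ddd"},
            {"ORTH": ")"},
            {"IS_SPACE": True, "OP": "?"},
            {"SHAPE": "ddd"},
            {"ORTH": "-"},
            {"SHAPE": "dddd"},
        ],
    }
]

# Building the class directly gives an object that holds the patterns
# but is NOT part of the pipeline (this was the original mistake).
standalone = EntityRuler(nlp, patterns=patterns, overwrite_ents=True)

# add_pipe creates and returns its own, separate component.
# Settings such as overwrite_ents go in through `config`.
ruler = nlp.add_pipe("entity_ruler", config={"overwrite_ents": True})
ruler.add_patterns(patterns)

doc = nlp("You can reach me at (111) 111-1111.")
ents = [(ent.text, ent.label_) for ent in doc.ents]
print(nlp.pipe_names)
print(ents)
```

With a trained pipeline you would keep before="ner" in the add_pipe call, as in the answer above, so the ruler sets its entities before the statistical NER runs.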