python nlp spacy

spaCy incorrectly recognizing finger as verb


I am looking for a way to fix (or alter) how spaCy tags verbs and nouns. In the following example I would like finger to be tagged as a NOUN, not a VERB.

import spacy
nlp = spacy.load("en_core_web_lg")
doc = nlp('over exertion to finger from pulling open a stuck door left middle finger strain')
for w in doc:
    print(w.text, w.lemma_, w.pos_)

which returns

over over ADP
exertion exertion NOUN
to to PART
finger finger VERB  <-- finger should be NOUN
from from ADP
pulling pull VERB
open open ADJ
a a DET
stuck stuck ADJ
door door NOUN
left leave VERB
middle middle ADJ
finger finger NOUN
strain strain NOUN

What changes could I make to solve this issue?


Solution

  • Use the transformer-based model, en_core_web_trf, which tags this sentence more accurately:

    >>> import spacy
    >>> nlp = spacy.load("en_core_web_trf")
    >>> doc = nlp('over exertion to finger from pulling open a stuck door left middle finger strain')
    >>> for w in doc:
    ...     print(w.text, w.lemma_, w.pos_)
    ...
    over over ADP
    exertion exertion NOUN
    to to ADP
    finger finger NOUN
    from from ADP
    pulling pull VERB
    open open ADP
    a a DET
    stuck stick VERB
    door door NOUN
    left leave VERB
    middle middle ADJ
    finger finger NOUN
    strain strain NOUN
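
To see exactly what switching models changed, the two outputs above can be compared token by token. The following is only a small sketch: the tags are copied by hand from the en_core_web_lg and en_core_web_trf listings in this post, and `rows` and `changed_tags` are hypothetical names introduced for illustration, not part of spaCy.

```python
# (token, en_core_web_lg POS, en_core_web_trf POS), copied from the outputs above.
rows = [
    ("over", "ADP", "ADP"),
    ("exertion", "NOUN", "NOUN"),
    ("to", "PART", "ADP"),
    ("finger", "VERB", "NOUN"),
    ("from", "ADP", "ADP"),
    ("pulling", "VERB", "VERB"),
    ("open", "ADJ", "ADP"),
    ("a", "DET", "DET"),
    ("stuck", "ADJ", "VERB"),
    ("door", "NOUN", "NOUN"),
    ("left", "VERB", "VERB"),
    ("middle", "ADJ", "ADJ"),
    ("finger", "NOUN", "NOUN"),
    ("strain", "NOUN", "NOUN"),
]


def changed_tags(rows):
    """Return (index, token, lg_pos, trf_pos) for tokens the two models tag differently."""
    return [(i, tok, lg, trf) for i, (tok, lg, trf) in enumerate(rows) if lg != trf]


for i, tok, lg, trf in changed_tags(rows):
    print(f"{i:>2} {tok:<8} {lg} -> {trf}")
```

Besides the first finger (VERB to NOUN), the tags for to, open and stuck also differ between the two models, and the lemma of stuck changes from stuck to stick, so it is worth checking the whole output rather than only the token you care about.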