I want to tag the parts of speech of a sentence. For this task I am using the pos-english-fast model. With a single sentence, the model identified the POS tags correctly. I then created a data file, 'data1.txt', that holds all my sentences. When I try to tag the sentences from that file, it does not work.
My code:
from flair.models import SequenceTagger
model = SequenceTagger.load("flair/pos-english")
# Read the data from data1.txt
with open('data1.txt') as f:
    data = f.read().splitlines()
# Create a list of sentences from the data
sentences = [sentence.split() for sentence in data]
# Tag each sentence using the model
tagged_sentences = []
for sentence in sentences:
    tagged_sentences.append(model.predict(sentence))
for sentence in tagged_sentences:
    print(sentence)
The error I received:
AttributeError                            Traceback (most recent call last)
<ipython-input-16-03268ee0d9c9> in <cell line: 10>()
      9 tagged_sentences = []
     10 for sentence in sentences:
---> 11     tagged_sentences.append(model.predict(sentence))
     12 for sentence in tagged_sentences:
     13     print(sentence)
1 frames
/usr/local/lib/python3.10/dist-packages/flair/data.py in set_context_for_sentences(cls, sentences)
   1116         previous_sentence = None
   1117         for sentence in sentences:
-> 1118             if sentence.is_context_set():
   1119                 continue
   1120             sentence._previous_sentence = previous_sentence
AttributeError: 'str' object has no attribute 'is_context_set'
How could I resolve it?
Let's say this is your data:
data = ['Not My Responsibility is a 2020 American short film written and produced by singer-songwriter Billie Eilish.',
"A commentary on body shaming and double standards placed upon young women's appearances, it features a monologue from Eilish about the media scrutiny surrounding her body.",
'The film is spoken-word and stars Eilish in a dark room, where she gradually undresses before submerging herself in a black substance.']
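The error occurs because model.predict expects Flair Sentence objects, while sentence.split() hands it lists of plain strings. Wrap each line in a Sentence and pass the whole list to predict:
from flair.data import Sentence
from flair.models import SequenceTagger
# Load the tagger (same model as in the question)
model = SequenceTagger.load("flair/pos-english")
# Wrap each raw string in a Sentence; Flair tokenizes it for you
sentences = list(map(Sentence, data))
# predict() annotates the Sentence objects in place
_ = model.predict(sentences)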
Now all sentences are correctly tagged. To see the tags for, say, the first sentence, just use print(sentences[0]). This is the output:
Sentence[17]: "Not My Responsibility is a 2020 American short film written and produced by singer-songwriter Billie Eilish." →
["Not"/RB, "My"/PRP$, "Responsibility"/NN, "is"/VBZ, "a"/DT, "2020"/CD, "American"/JJ, "short"/JJ, "film"/NN, "written"/VBN, "and"/CC, "produced"/VBN, "by"/IN, "singer-songwriter"/NN, "Billie"/NNP, "Eilish"/NNP, "."/.]
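To apply the same approach to the file from the question, something like the following sketch might work. The helper names read_lines and tag_file are hypothetical (they are not part of Flair), and the sketch assumes data1.txt holds one sentence per line in UTF-8. Blank lines are skipped so that they do not turn into empty Sentence objects.

```python
from pathlib import Path


def read_lines(path):
    """Return the non-empty lines of a text file, stripped of whitespace."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def tag_file(path, model_name="flair/pos-english"):
    """Tag every sentence in `path` and return the annotated Sentence objects.

    Hypothetical helper, not part of Flair. Flair is imported inside the
    function so that read_lines() stays usable without Flair installed.
    """
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(model_name)
    # Wrap each raw line in a Sentence; do not pre-split it into words
    sentences = [Sentence(line) for line in read_lines(path)]
    tagger.predict(sentences)  # annotates the Sentence objects in place
    return sentences
```

Calling tag_file('data1.txt') and printing each returned sentence should then give one tagged line per sentence, in the same format as the output above.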