My requirement is this: given a sentence (sequence), I would like to extract only the entities present in it, without classifying them by type as the NER task normally does. I see that BertForTokenClassification for NER does the classification. Can this be adapted for just the extraction?
Can BERT just be used to do entity extraction/identification?
Regardless of BERT, NER tagging is usually done with the IOB format (inside, outside, beginning) or something similar (often the end is also explicitly tagged). The inside and beginning tags contain the entity type. Something like this:
Alex B-PER
is O
going O
to O
Los B-LOC
Angeles I-LOC
If you modify your training data so that there is only one entity type, the model will only learn to detect the entities, without knowing what type each entity is:
Alex B
is O
going O
to O
Los B
Angeles I
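As a rough illustration of that relabeling, here is a minimal Python sketch. It assumes the tags look like the ones above (`B-PER`, `I-LOC`, `O`), with the type after a hyphen. The helper names `strip_entity_types` and `extract_entities` are made up for this example and are not part of any library. The second helper shows how the entity strings could be read back from untyped B/I/O tags, for instance from a model's predictions.

```python
from typing import List


def strip_entity_types(tags: List[str]) -> List[str]:
    """Collapse typed IOB tags (e.g. 'B-PER') to untyped ones ('B')."""
    # Keep only the part before the first hyphen; 'O' has no hyphen and is unchanged.
    return [tag.split("-", 1)[0] for tag in tags]


def extract_entities(tokens: List[str], tags: List[str]) -> List[str]:
    """Join tokens into entity strings based on untyped B/I/O tags."""
    entities: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new entity starts; close the previous one if it is still open.
            if current:
                entities.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            # Continue the entity that is currently open.
            current.append(token)
        else:
            # 'O' (or an 'I' with no open entity, treated like 'O') ends any open entity.
            if current:
                entities.append(" ".join(current))
            current = []
    if current:
        entities.append(" ".join(current))
    return entities


tokens = ["Alex", "is", "going", "to", "Los", "Angeles"]
typed_tags = ["B-PER", "O", "O", "O", "B-LOC", "I-LOC"]

untyped_tags = strip_entity_types(typed_tags)
print(untyped_tags)                            # ['B', 'O', 'O', 'O', 'B', 'I']
print(extract_entities(tokens, untyped_tags))  # ['Alex', 'Los Angeles']
```

With the relabeled data, the token classifier only has three labels (B, I, O) to predict. With BertForTokenClassification that would presumably mean setting the number of labels to 3, though the exact training setup depends on your code.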