I am performing NER using sklearn-crfsuite. I am trying to report results on an entity-mention-by-entity-mention basis: a true positive (prediction and expected label agree, even when there is no entity), a false positive (the prediction says there is an entity, the expected label says there is not), or a false negative (the prediction says there is no entity, the expected label says there is).
I cannot see how to get anything other than tag/token-based summary statistics for NER performance.
I would also be OK with a different way of grouping entity mentions, such as: correct, incorrect, partial, missing, spurious. I could write a whole bunch of code around this myself (and might have to), but surely there is a single call that returns this information?
Here are some of the calls I am making to get the summary statistics:
from sklearn import metrics

# Token-level summary statistics (targets and predictions are flat tag lists)
report = metrics.classification_report(targets, predictions,
                                        output_dict=output_dict)
precision = metrics.precision_score(targets, predictions,
                                    average='weighted')
f1 = metrics.f1_score(targets, predictions, average='weighted')
accuracy = metrics.accuracy_score(targets, predictions)
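To make the goal concrete, here is a rough sketch of the kind of mention-level counting I am after, assuming well-formed BIO tags; extract_spans, mention_confusion, and the exact-match rule are all my own hypothetical choices, not an existing API:

def extract_spans(tags):
    """Collect (entity_type, start, end) spans from one BIO-tagged sentence."""
    spans, start, ent_type = [], None, None
    for i, tag in enumerate(tags + ['O']):  # trailing 'O' flushes the last span
        if tag == 'O' or tag.startswith('B-'):
            if start is not None:
                spans.append((ent_type, start, i))
                start, ent_type = None, None
        if tag.startswith('B-'):
            start, ent_type = i, tag[2:]
    return set(spans)

def mention_confusion(target_tags, predicted_tags):
    """Exact-match mention counts for one sentence: (TP, FP, FN)."""
    gold = extract_spans(target_tags)
    pred = extract_spans(predicted_tags)
    tp = len(gold & pred)   # predicted span matches a gold span exactly
    fp = len(pred - gold)   # predicted, but not in the gold standard
    fn = len(gold - pred)   # in the gold standard, but not predicted
    return tp, fp, fn

Summing these counts over all sentences would give me the mention-level view I want, but I would much rather call something that already exists.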
It's not so straightforward to get the metrics you mention (i.e., correct, incorrect, partial, missing, spurious), which I believe are the same ones introduced in the SemEval'13 challenge.
I also needed to report results based on these metrics and ended up coding it myself.
I'm working together with someone else, and we are planning to release it as a package that can be easily integrated with open-source NER systems and/or read standard formats like CoNLL. Feel free to join and help us out :)
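To give a flavour of the logic, here is a much simplified sketch of the span comparison behind those five categories; the (type, start, end) span tuples, the overlaps helper, and semeval_categories are illustrative names for this answer, not the actual API of the package:

def overlaps(a, b):
    """True if two (type, start, end) spans share at least one token."""
    return a[1] < b[2] and b[1] < a[2]

def semeval_categories(gold_spans, pred_spans):
    """Bucket predicted spans against gold spans, SemEval'13 style (simplified)."""
    counts = {'correct': 0, 'incorrect': 0, 'partial': 0,
              'missing': 0, 'spurious': 0}
    matched = set()
    for p in pred_spans:
        candidates = [g for g in gold_spans if overlaps(p, g)]
        if not candidates:
            counts['spurious'] += 1      # predicted, but nothing overlapping in gold
            continue
        g = candidates[0]                # naive: take the first overlapping gold span
        matched.add(g)
        if p == g:
            counts['correct'] += 1       # exact boundaries and entity type
        elif (p[1], p[2]) == (g[1], g[2]):
            counts['incorrect'] += 1     # right boundaries, wrong entity type
        else:
            counts['partial'] += 1       # boundaries only partially overlap
    counts['missing'] = sum(1 for g in gold_spans if g not in matched)
    return counts

The real implementation has to handle details this sketch ignores (multiple predictions overlapping one gold span, per-type breakdowns, the different evaluation schemas), which is exactly why a shared package seems worth having.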