machine-learning nlp data-annotations spacy named-entity-recognition

Data Annotation for Machine Learning


I am going to develop a machine learning model and have a large text data set. I want the best overall accuracy, F1 score, and so on. I am annotating the data with a data annotation tool (Dataturks). Which approach is better: labeling each entity once, or labeling every occurrence of it? For example, if "GUI" appears 5 times in the text, should I label it 1 time or all 5 times to get a better overall score? Any help would be highly appreciated.


Solution

  • If you have any duplicate examples where all the features are identical, you need to remove them.
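
The duplicate removal described above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes each example is a (text, entities) pair where entities is a list of (start, end, label) character spans, and it treats "all features identical" as meaning the same text with the same spans. The `deduplicate` helper, the sample sentences, and the `TECH` label are all hypothetical.

```python
def deduplicate(examples):
    """Drop examples whose text and entity annotations are both identical."""
    seen = set()
    unique = []
    for text, entities in examples:
        # Sort the spans so that annotation order does not affect the comparison.
        key = (text, tuple(sorted(entities)))
        if key not in seen:
            seen.add(key)
            unique.append((text, entities))
    return unique


# Hypothetical annotated examples: (text, [(start, end, label), ...])
examples = [
    ("Open the GUI settings.", [(9, 12, "TECH")]),
    ("Open the GUI settings.", [(9, 12, "TECH")]),  # exact duplicate
    ("The GUI crashed.", [(4, 7, "TECH")]),
]
unique_examples = deduplicate(examples)
```

Only examples that match in both text and annotations are dropped here, so two different sentences that each mention "GUI" are both kept.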