
What do the parameters of DBpedia Spotlight mean?


I am interested in using DBpedia Spotlight. However, it requires values for two parameters, confidence and support. What do these two parameters actually mean?

I want to identify the significant, prominent n-grams in the text. In that case, what values are usually recommended for the confidence and support parameters (as a rule of thumb)?


Solution

  • When you ask DBpedia Spotlight to annotate text (finding entities/topics), it searches for n-grams that have URIs in DBpedia (n-grams that are Wikipedia titles). Those n-grams are called DBpedia resources.

    Support: this is the Resource Prominence parameter. It helps you ignore unimportant or uninformative resources. When you set it to a value X, resources with fewer than X Wikipedia in-links are ignored and not returned to you.

    Confidence: this is the Disambiguation Confidence parameter, a threshold that takes a value between 0 and 1. When you set it to a high value, you get more trustworthy annotations, but you risk losing some correct ones.

    Choosing values for these (or any other) parameters depends on your use case.
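To make the two thresholds concrete, here is a minimal Python sketch of the filtering behaviour described above. It is a toy model, not Spotlight's actual implementation: the `Candidate` class, the `filter_candidates` and `build_query` helpers, the in-link counts and the scores are all made up for illustration, and treating confidence as a simple cut-off on a per-candidate score is a simplifying assumption. `build_query` only encodes the parameters as a query string and sends nothing; the names `text`, `confidence` and `support` are assumed to match the request parameters of your Spotlight deployment.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass
class Candidate:
    surface_form: str  # n-gram found in the text
    uri: str           # DBpedia resource the n-gram may refer to
    inlinks: int       # number of Wikipedia in-links (illustrative value)
    score: float       # disambiguation score in [0, 1] (illustrative value)


def filter_candidates(candidates, confidence, support):
    """Keep only candidates that pass both thresholds (toy model)."""
    return [
        c for c in candidates
        if c.inlinks >= support and c.score >= confidence
    ]


def build_query(text, confidence, support):
    """Encode the text and both thresholds as a URL query string."""
    return urlencode({"text": text, "confidence": confidence, "support": support})


# Hypothetical candidates for the text "Berlin wall".
candidates = [
    Candidate("Berlin", "http://dbpedia.org/resource/Berlin", 50000, 0.95),
    Candidate("Berlin", "http://dbpedia.org/resource/Berlin_(band)", 300, 0.40),
    Candidate("wall", "http://dbpedia.org/resource/Wall", 15, 0.80),
]

# Strict settings keep only the prominent, unambiguous resource.
strict = filter_candidates(candidates, confidence=0.5, support=20)
# Permissive settings keep everything, including doubtful annotations.
loose = filter_candidates(candidates, confidence=0.0, support=0)

print([c.uri for c in strict])
print(len(loose))
print(build_query("Berlin wall", 0.5, 20))
```

Raising `support` drops the rarely linked "Wall" resource, while raising `confidence` drops the low-scoring band reading of "Berlin". That is the trade-off described above: fewer but more trustworthy annotations.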

    Examples: