Tags: python, google-cloud-nl, named-entity-extraction, google-natural-language

begin_offset is set to -1 in the Google Cloud Natural Language API (entity extraction)


The Google Cloud Natural Language API (entity extraction) returns -1 for begin_offset, both in Node.js and in Python. Am I missing any parameters?

from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types

client = language.LanguageServiceClient()

text = u'Dr. James went to NYU yesterday'
document = types.Document(
    content=text,
    type=enums.Document.Type.PLAIN_TEXT)

results = client.analyze_entities(document=document).entities
print(results[0].mentions[0].text.begin_offset)

Solution

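  • Pass in an EncodingType. When no encoding type is specified, the API sets encoding-dependent fields such as begin_offset to -1. Here is an example: https://github.com/GoogleCloudPlatform/python-docs-samples/blob/c359be8e635806f4c4986e6c643c67bac5e857da/language/cloud-client/v1/snippets.py#L208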
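
Applied to the code from the question, the fix might look like the sketch below. It assumes the same pre-2.0 google-cloud-language client the question uses (the one that still ships the enums and types modules) and valid application credentials. The helper names native_encoding_name and entity_offsets are made up for illustration and are not part of the library.

```python
import sys


def native_encoding_name():
    """Name of the EncodingType that matches this interpreter's str indexing.

    Hypothetical helper: narrow Python 2 builds index strings by UTF-16
    code unit, while Python 3.3+ indexes by Unicode code point (UTF32).
    """
    return "UTF16" if sys.maxunicode == 65535 else "UTF32"


def entity_offsets(text):
    """Return (mention text, begin_offset) pairs for every entity mention.

    Hypothetical helper; needs google-cloud-language < 2.0 and credentials.
    """
    # Imported lazily so native_encoding_name() works without the library.
    from google.cloud import language
    from google.cloud.language import enums, types

    client = language.LanguageServiceClient()
    document = types.Document(
        content=text,
        type=enums.Document.Type.PLAIN_TEXT)

    # Passing an encoding type is what makes the API fill in begin_offset.
    encoding = getattr(enums.EncodingType, native_encoding_name())
    response = client.analyze_entities(
        document=document, encoding_type=encoding)

    return [
        (mention.text.content, mention.text.begin_offset)
        for entity in response.entities
        for mention in entity.mentions
    ]


if __name__ == "__main__":
    for content, offset in entity_offsets(u'Dr. James went to NYU yesterday'):
        print(content, offset)
```

Choosing the encoding that matches the interpreter means the returned offsets can be used directly as indexes into the original Python string. Releases from 2.0 onward dropped the enums and types modules, so the imports would need adjusting there; the encoding_type argument itself is still what controls the offsets.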