Tags: python, watson-nlu

What are the speed benchmarks in Watson-NLU?


I am trying to process tweets stored in a text file. My code reads the tweets one by one, processes them, and then saves Watson's results to a CSV file. The speed is only around 28 tweets per minute. Is the file handling causing this delay?

import csv
import json
import re
import time

while True:
    where = file.tell()
    line = file.readline()
    if not line:
        # No new line yet; wait and re-read from the same position.
        print("no line found, waiting for 1 second")
        time.sleep(1)
        file.seek(where)
    elif re.search('[a-zA-Z]', line):
        print("-----------------------------")
        print("the line is:")
        print(line)
        print("-----------------------------")
        response = natural_language_understanding.analyze(
            text=line,
            features=Features(
                entities=EntitiesOptions(
                    emotion=True,
                    sentiment=True,
                    limit=2),
                keywords=KeywordsOptions(
                    emotion=True,
                    sentiment=True,
                    limit=2)),
            language='en')
        response["tweet"] = line
        print(json.dumps(response, indent=2))
        # Append the result as key/value rows to the CSV file.
        with open('#DDvKXIP.csv', 'a') as csv_file:
            writer = csv.writer(csv_file)
            for key, value in response.items():
                writer.writerow([key, value])
    else:
        print("--------------------------------")
        print("found a line without any alphabet in it, hence not considering.")
        print(line)
        print("--------------------------------")

Solution

  • The short answer is that you should put timing markers between the main parts of your code to determine which step is the slowest.

    Other options to improve the speed:

    1. You can create a threaded application that sends, say, 10-20 calls at a time. That should increase your rate to 280-560 tweets per minute.

       If you are using the Lite plan, be careful not to run into its rate limits.

    2. You can batch up tweets from the same user and send them as one large block instead of making individual calls. If you are just trying to capture overall sentiment, for example, this might not help.
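The timing-marker idea can be sketched with `time.perf_counter()`. Here `analyze_tweet` is a hypothetical stand-in for the `natural_language_understanding.analyze(...)` call, so you can see the shape of the measurement without a live Watson credential:

```python
import time

def analyze_tweet(line):
    # Hypothetical stand-in for the Watson NLU request; the sleep
    # simulates network latency. Replace with the real analyze() call.
    time.sleep(0.01)
    return {"tweet": line}

# Marker 1: time the API call.
start = time.perf_counter()
response = analyze_tweet("example tweet")
api_seconds = time.perf_counter() - start

# Marker 2: time the local processing / file-writing section.
start = time.perf_counter()
# ... CSV-writing code would go here ...
io_seconds = time.perf_counter() - start

print("API call: %.3f s, file I/O: %.3f s" % (api_seconds, io_seconds))
```

If the API-call marker dominates (it usually does at ~2 seconds per tweet), file handling is not your bottleneck and concurrency is the better fix.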
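The threaded approach from option 1 can be sketched with the standard-library `concurrent.futures` module. The `analyze` function below is a hypothetical wrapper around the Watson call; `max_workers` controls how many requests are in flight at once:

```python
from concurrent.futures import ThreadPoolExecutor

def analyze(line):
    # Hypothetical wrapper around natural_language_understanding.analyze(...);
    # here it just returns a dummy result so the sketch is runnable.
    return {"tweet": line, "length": len(line)}

tweets = ["first tweet", "second tweet", "third tweet"]

# Up to 10 concurrent requests; executor.map preserves input order,
# so results line up with the tweets they came from.
with ThreadPoolExecutor(max_workers=10) as executor:
    results = list(executor.map(analyze, tweets))

for r in results:
    print(r["tweet"], r["length"])
```

Because each call spends most of its time waiting on the network, threads overlap that wait time; with 10 workers the throughput scales roughly 10x until you hit the service's rate limit.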