I am trying to process tweets stored in a text file. My code reads the tweets one by one, processes each with Watson Natural Language Understanding, and then saves the results to a CSV file. The speed is only around 28 tweets per minute. Is the data file handling causing this delay?
import csv
import json
import re
import time

from watson_developer_cloud import NaturalLanguageUnderstandingV1
from watson_developer_cloud.natural_language_understanding_v1 import (
    Features, EntitiesOptions, KeywordsOptions)

# natural_language_understanding is an authenticated
# NaturalLanguageUnderstandingV1 client; tweet_file is the open text
# file of tweets ("file" shadowed a builtin, so it is renamed here).

while True:
    # Remember the current position so we can re-poll from it if no
    # new line has been appended yet.
    where = tweet_file.tell()
    line = tweet_file.readline()
    if not line:
        print("no line found, waiting for 1 second")
        time.sleep(1)
        tweet_file.seek(where)
    elif re.search('[a-zA-Z]', line):
        print("-----------------------------")
        print("the line is:")
        print(line)
        print("-----------------------------")
        response = natural_language_understanding.analyze(
            text=line,
            features=Features(
                entities=EntitiesOptions(
                    emotion=True,
                    sentiment=True,
                    limit=2),
                keywords=KeywordsOptions(
                    emotion=True,
                    sentiment=True,
                    limit=2)),
            language='en')
        response["tweet"] = line
        print(json.dumps(response, indent=2))
        # Note: the CSV file is reopened for every tweet processed.
        with open('#DDvKXIP.csv', 'a') as csv_file:
            writer = csv.writer(csv_file)
            for key, value in response.items():
                writer.writerow([key, value])
    else:
        print("--------------------------------")
        print("found a line without any alphabetic characters, skipping it.")
        print(line)
        print("--------------------------------")
The short answer is that you should put timing markers between the main parts of your code to determine which part is slowest. At 28 tweets per minute you are spending roughly 2 seconds per tweet, so time the Watson call and the CSV write separately to see which one accounts for it.
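As a minimal sketch of what those markers could look like (analyze_tweet and write_row are hypothetical stand-ins for the Watson call and the CSV block in your loop):

import time

def analyze_tweet(line):
    # stand-in for natural_language_understanding.analyze(...)
    time.sleep(0.5)
    return {"tweet": line}

def write_row(response):
    # stand-in for the open(...)/csv.writer block
    time.sleep(0.01)

line = "an example tweet"
t0 = time.time()
response = analyze_tweet(line)
t1 = time.time()
write_row(response)
t2 = time.time()
print("analyze: %.3fs, csv write: %.3fs" % (t1 - t0, t2 - t1))

Whichever interval dominates is where to focus your optimisation.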
Other options to improve the speed:

If you are using the Lite plan, you want to be sure you don't rate-limit yourself by exceeding its request quota; one way to pace requests is sketched below.
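This is a rough sketch of a throttle wrapper; the 30-calls-per-minute figure is a made-up placeholder, so check your plan's actual quota:

import time

MIN_INTERVAL = 60.0 / 30  # assumed quota of 30 calls per minute

def throttle(func):
    """Delay calls so consecutive invocations are at least MIN_INTERVAL apart."""
    last = [0.0]
    def wrapper(*args, **kwargs):
        wait = MIN_INTERVAL - (time.time() - last[0])
        if wait > 0:
            time.sleep(wait)
        last[0] = time.time()
        return func(*args, **kwargs)
    return wrapper

@throttle
def analyze(line):
    # stand-in for natural_language_understanding.analyze(...)
    return {"tweet": line}

for tweet in ["first tweet", "second tweet", "third tweet"]:
    print(analyze(tweet))

Staying just under the quota avoids rejected calls and retries, which can cost more time than the pacing itself.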