I am supposed to do some exercises with GloVe in Python. Most of it doesn't give me any problems, but now I am supposed to find the 5 most similar words to "norway - war + peace" using the "glove-wiki-gigaword-100" model. When I run my code, it just says that the 'word' is not in the vocabulary. I'm guessing this is some kind of formatting issue, but I don't know how to use it correctly.
import gensim.downloader as api
model = api.load("glove-wiki-gigaword-100") # download the model and return as object ready for use
bests = model.most_similar("norway - war + peace", topn= 5)
print("5 most similar words to 'norway - war + peace':")
for best in bests:
    print(best)
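Gensim's word-vector models only deal with words that are in their vocabulary. Here you pass the entire expression "norway - war + peace" as if it were a single word, so the lookup fails. What you want to do is: (1) look up the vectors for "norway", "war" and "peace", (2) compute norway - war + peace on those vectors, and (3) find the words whose vectors are closest to the result.
To do so, you will need a vector lookup such as model.wv["norway"] and model.wv.similar_by_vector(). If your model is already a KeyedVectors object, which is what api.load() returns for this dataset, call these directly on model instead of model.wv. Note that model.wv.most_similar() does something similar to these three steps, but in a more complicated way, using a set of positive words and a set of negative words. See the documentation for details.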
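Both routes can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the object returned by api.load() is a KeyedVectors instance, so the methods are called on it directly rather than through .wv. The helper names load_glove, analogy_by_vector and analogy_most_similar are made up for illustration and are not part of gensim.

```python
def load_glove(name="glove-wiki-gigaword-100"):
    """Download (on first use) and return the pretrained vectors."""
    # Imported lazily so the helpers below can be used without gensim
    # triggering a download as a side effect of importing this file.
    import gensim.downloader as api
    return api.load(name)


def analogy_by_vector(kv, base, minus, plus, topn=5):
    """Do the three steps by hand: look up, combine, search."""
    # Steps 1 and 2: look up each word's vector and do the arithmetic.
    target = kv[base] - kv[minus] + kv[plus]
    # Step 3: nearest words to the combined vector. Ask for a few extra
    # hits, because the query words themselves tend to rank near the top.
    hits = kv.similar_by_vector(target, topn=topn + 3)
    query = {base, minus, plus}
    return [(word, score) for word, score in hits if word not in query][:topn]


def analogy_most_similar(kv, base, minus, plus, topn=5):
    """Same idea, using most_similar's positive/negative word lists."""
    return kv.most_similar(positive=[base, plus], negative=[minus], topn=topn)


# Hypothetical usage (downloads the model the first time it runs):
#   model = load_glove()
#   for word, score in analogy_most_similar(model, "norway", "war", "peace"):
#       print(word, score)
```

The two helpers need not return identical rankings, since most_similar() normalizes the word vectors before combining them, while the manual version adds and subtracts the raw vectors.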