[SOLVED] How do I get word indexes for GloVe embeddings in PyTorch?

I am trying to use GloVe embeddings in a PyTorch model. I have the following code:

from torchtext.vocab import GloVe
import torch.nn

# GloVe() with no arguments loads the default 840B, 300-dimensional vectors
glove = GloVe()
my_embeddings = torch.nn.Embedding.from_pretrained(glove.vectors, freeze=True)

However, I don't understand how to get the embedding for a specific word from this. my_embeddings only accepts a tensor of integer indices, not text. I could instead use:

from torchtext.data import get_tokenizer
tokenizer = get_tokenizer("basic_english")
glove.get_vecs_by_tokens(tokenizer("Hello, How are you?"))

But then I am confused about why I need torch.nn.Embedding at all, as most tutorials suggest.

Solution

This is done using glove.stoi, which maps each token (a string) to its row index in glove.vectors:

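# Reuses glove, tokenizer and my_embeddings from the question above
sentence = "Hello, How are you?"
tokenized_sentence = tokenizer(sentence)  # basic_english lowercases, so the first token is "hello"
# glove.stoi raises KeyError for a token that is not in the vocabulary
torch_tensor_first_word = torch.tensor(glove.stoi[tokenized_sentence[0]], dtype=torch.long)
embeddings_for_first_word = my_embeddings(torch_tensor_first_word)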
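
The same idea extends to a whole sentence. Below is a minimal sketch, not the torchtext implementation: to stay runnable without downloading GloVe, it uses a toy `stoi` dict and a toy `vectors` tensor as stand-ins for glove.stoi and glove.vectors (the values are made up). The extra all-zero row and the `unk_index` and `tokens_to_indices` names are my own assumptions, added so that out-of-vocabulary tokens do not raise a KeyError.

```python
import torch
import torch.nn as nn

# Toy stand-ins for glove.stoi and glove.vectors (hypothetical values, not real GloVe data)
stoi = {"hello": 0, ",": 1, "how": 2, "are": 3, "you": 4, "?": 5}
vectors = torch.arange(24, dtype=torch.float32).reshape(6, 4)

# Assumption: append one all-zero row to act as the out-of-vocabulary (OOV) vector
unk_index = vectors.shape[0]
table = torch.cat([vectors, torch.zeros(1, vectors.shape[1])])
embedding = nn.Embedding.from_pretrained(table, freeze=True)


def tokens_to_indices(tokens):
    """Map tokens to row indices, sending unknown tokens to unk_index."""
    return torch.tensor([stoi.get(tok, unk_index) for tok in tokens], dtype=torch.long)


tokens = ["hello", ",", "how", "are", "you", "?", "zzzz"]  # "zzzz" is not in stoi
indices = tokens_to_indices(tokens)
sentence_vectors = embedding(indices)  # shape: (len(tokens), embedding_dim)
```

With the real vectors you would swap in glove.stoi and glove.vectors. As for why tutorials use torch.nn.Embedding at all: it is a module that sits inside the model, takes batched integer index tensors, moves to the GPU along with the rest of the model, and can be fine-tuned by passing freeze=False. glove.get_vecs_by_tokens is convenient for one-off lookups outside a model.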