I am trying to use GloVe embeddings in a PyTorch model. I have the following code:
from torchtext.vocab import GloVe
import torch.nn
glove = GloVe()
my_embeddings = torch.nn.Embedding.from_pretrained(glove.vectors, freeze=True)
However, I don't understand how to get the embedding for a specific word from this: my_embeddings only takes a tensor of indices, not text. I could just use:
from torchtext.data import get_tokenizer
tokenizer = get_tokenizer("basic_english")
glove.get_vecs_by_tokens(tokenizer("Hello, How are you?"))
But then I am confused about why I need torch.nn.Embedding at all, as most tutorials suggest.
I believe the lookup is done using glove.stoi, which maps each token to its row index in glove.vectors:
sentence = "Hello, How are you?"
tokenized_sentence = tokenizer(sentence)
torch_tensor_first_word = torch.tensor(glove.stoi[tokenized_sentence[0]], dtype=torch.long)
embeddings_for_first_word = my_embeddings(torch_tensor_first_word)
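The same idea should extend from the first word to the whole sentence by building one index tensor for all tokens. Below is a minimal sketch of that, assuming a toy stand-in for GloVe so that it runs without the large download: the names `itos`, `stoi`, `vectors`, and `sentence_to_indices` are hypothetical, and `vectors` plays the role of `glove.vectors`.

```python
import torch
import torch.nn as nn

# Toy stand-in for GloVe (assumption): 4 tokens with 3-dimensional vectors.
itos = ["hello", "how", "are", "you"]
stoi = {token: index for index, token in enumerate(itos)}
vectors = torch.arange(12, dtype=torch.float32).reshape(4, 3)

# Frozen embedding layer whose rows are the pretrained vectors.
embedding = nn.Embedding.from_pretrained(vectors, freeze=True)


def sentence_to_indices(tokens, stoi):
    """Map tokens to a LongTensor of row indices (hypothetical helper).

    Raises KeyError for tokens that are not in the vocabulary.
    """
    return torch.tensor([stoi[token] for token in tokens], dtype=torch.long)


tokens = ["hello", "how", "are", "you"]
indices = sentence_to_indices(tokens, stoi)

# One lookup for the whole sentence: result has shape (len(tokens), 3).
sentence_embeddings = embedding(indices)

# The layer returns exactly the rows of the underlying matrix.
same_as_direct_lookup = torch.equal(sentence_embeddings, vectors[indices])
```

With the real vectors, `glove.stoi` and `glove.vectors` would take the place of `stoi` and `vectors`. One difference to watch for: indexing `glove.stoi` directly raises a `KeyError` for out-of-vocabulary tokens, whereas `get_vecs_by_tokens` falls back to a zero vector by default, so unknown tokens need explicit handling on the index-based path.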