I want to use word embeddings for the embedding layer in my neural network, using pre-trained vectors from GloVe. Do I need to restrict the vocabulary to the training set when constructing the word2index dictionary? Wouldn't that lead to a limited, non-generalizable model? Is considering the entire GloVe vocabulary a recommended practice?
Yes, it is better to restrict your vocab size. Pre-trained embeddings such as GloVe (and Word2Vec as well) contain many words that are not useful for your task, and the larger the vocabulary, the more RAM the embedding matrix consumes and the larger and slower to train the embedding layer becomes.
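For example, here is a minimal sketch of what that looks like in practice: build the vocabulary from your own corpus, then load only the GloVe vectors for those words into the embedding matrix. The file name `glove.6B.100d.txt` and the toy `train_texts` corpus are placeholders, not something from your setup.

```python
import numpy as np

EMBEDDING_DIM = 100

# 1) Build the vocabulary from your own data only (toy corpus for illustration).
train_texts = [["the", "movie", "was", "great"],
               ["the", "plot", "was", "boring"]]
vocab = sorted({token for text in train_texts for token in text})
word2index = {"<pad>": 0, "<unk>": 1}
for token in vocab:
    word2index[token] = len(word2index)

# 2) Load only the GloVe vectors you actually need.
embedding_matrix = np.random.normal(scale=0.1,
                                    size=(len(word2index), EMBEDDING_DIM))
embedding_matrix[0] = 0.0                                # padding row stays zero
with open("glove.6B.100d.txt", encoding="utf-8") as f:   # assumed GloVe file path
    for line in f:
        word, *values = line.rstrip().split(" ")
        idx = word2index.get(word)
        if idx is not None:
            embedding_matrix[idx] = np.asarray(values, dtype="float32")
```

`embedding_matrix` can then initialize an embedding layer of shape `(len(word2index), EMBEDDING_DIM)`; words missing from GloVe keep their random initialization, and unseen words at inference time map to `<unk>`.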
Select your tokens from all of your own data. It won't lead to a limited, non-generalizable model if your data is big enough. If you think your data does not contain as many tokens as you need, then you should know two things:
I have an answer here that shows how you can select a small subset of word vectors from a pre-trained model.
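As a rough illustration of the same idea (my own sketch, not the linked answer), you can pre-filter the GloVe text file so that only the vectors for words you actually use are kept on disk; the file paths and the `keep_words` set below are assumptions.

```python
def filter_glove(src_path, dst_path, keep_words):
    """Copy only the lines of a GloVe text file whose word is in keep_words."""
    kept = 0
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            word = line.split(" ", 1)[0]
            if word in keep_words:
                dst.write(line)
                kept += 1
    return kept

# Example usage: keep only the tokens that appear in your data.
# keep_words = set(word2index)
# n = filter_glove("glove.840B.300d.txt", "glove.filtered.txt", keep_words)
# print(f"kept {n} vectors out of the original file")
```

The reduced file is much smaller, loads quickly, and fits comfortably in RAM.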