Tags: python, nlp, word-embedding

How to load pre-trained word embeddings stored in .npy files


I'm trying to use the pre-trained word embeddings from the HistWords project by the Stanford NLP group. But when I run example.py from the GitHub repository, I get this error: ModuleNotFoundError: No module named 'representations.sequentialembedding'. How can I solve this?

I've tried installing a "representations" package with pip, but that doesn't fix it. The pre-trained embeddings are in ".npy" format; is there a Python-based way to load them?


Solution

  • The error occurs because representations is a package that lives inside the histwords repository itself, not something installable from PyPI, so the script has to be run from the repository root. The repository also targets Python 2.7. To set it up you can do the following in your shell:

    cd <path you want to store your project>
    git clone https://github.com/williamleif/histwords.git
    cd histwords
    
    # ----- Set up Python2.7 -----
    ## Python2.7 via conda is quite easy
    conda create -y -n <env_name> python=2.7
    conda activate <env_name>
    
    ## Otherwise install/locate a Python 2 interpreter
    ### Check python -V (it may already be 2.7)
    ### You may have python2 or python2.7 executables
    ### On Linux you could look with: ls /usr/bin/python*
    pip2 install virtualenv          # Python 2 has no built-in venv module
    python2 -m virtualenv <env_name>
    source <env_name>/bin/activate   # Linux/macOS
    <env_name>\Scripts\activate      # Windows
    
    # ---- Now with Python2.7 ----
    pip install -r requirements.txt
    
    # ---- Download & Move your embedding from https://nlp.stanford.edu/projects/histwords/
    # to embeddings/<category> subfolders
    
    python example.py
    # Prints the example word similarities
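If you only need the raw vectors and would rather stay in Python 3, you can also read the downloaded files directly, without the repository code: each HistWords decade ships as a `<year>-w.npy` embedding matrix plus a `<year>-vocab.pkl` word list (that naming is based on the archives I have seen; check yours). A minimal sketch, demonstrated here on a tiny synthetic stand-in for a real download:

```python
import os
import pickle
import tempfile

import numpy as np


def load_histwords(prefix):
    """Load one HistWords decade given a path prefix like '.../sgns/1990'.

    Expects <prefix>-w.npy (rows = word vectors) and <prefix>-vocab.pkl
    (list of words, row-aligned with the matrix).
    """
    mat = np.load(prefix + "-w.npy")
    with open(prefix + "-vocab.pkl", "rb") as f:
        # Files pickled under Python 2 may need:
        #   vocab = pickle.load(f, encoding="latin-1")
        vocab = pickle.load(f)
    return {word: mat[i] for i, word in enumerate(vocab)}


# --- Demo with a tiny fake embedding (stand-in for a real download) ---
tmp = tempfile.mkdtemp()
prefix = os.path.join(tmp, "1990")
np.save(prefix + "-w.npy", np.array([[1.0, 0.0], [0.0, 1.0]]))
with open(prefix + "-vocab.pkl", "wb") as f:
    pickle.dump(["cat", "dog"], f)

emb = load_histwords(prefix)
print(emb["cat"])  # -> [1. 0.]
```

From there, cosine similarity between two words is a one-liner with `np.dot` on the normalized vectors, so you can reproduce the similarity output without any of the Python 2 tooling.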