jupyter-notebook, google-colaboratory, gensim, doc2vec

'ConcatenatedDoc2Vec' object has no attribute 'docvecs'


I am a beginner in machine learning and am trying document embedding for a university project. I work with Google Colab and Jupyter Notebook (via Anaconda). My code runs perfectly in Google Colab, but when I execute the same code in Jupyter Notebook (via Anaconda) I get an error with the ConcatenatedDoc2Vec object.

With this function I build the feature vectors for a classifier (e.g. Logistic Regression).

def build_vectors(model, length, vector_size):
    vector = np.zeros((length, vector_size))
    for i in range(0, length):
        prefix = 'tag' + '_' + str(i)
        vector[i] = model.docvecs[prefix]
    return vector

I concatenate two Doc2Vec models (d2v_dm, d2v_dbow). Both work perfectly throughout the whole code and have no problems with the function build_vectors():

d2v_combined = ConcatenatedDoc2Vec([d2v_dm, d2v_dbow])

But if I run the function build_vectors() with the concatenated model:

#Compute combined Vector size
d2v_combined_vector_size = d2v_dm.vector_size + d2v_dbow.vector_size

d2v_combined_vec= build_vectors(d2v_combined, len(X_tagged), d2v_combined_vector_size)

I receive this error, but only when I run it in Jupyter Notebook (via Anaconda). The same code runs without problems in the notebook in Google Colab:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [20], in <cell line: 4>()
      1 #Compute combined Vector size
      2 d2v_combined_vector_size = d2v_dm.vector_size + d2v_dbow.vector_size
----> 4 d2v_combined_vec= build_vectors(d2v_combined, len(X_tagged), d2v_combined_vector_size)

Input In [11], in build_vectors(model, length, vector_size)
      3 for i in range(0, length):
      4     prefix = 'tag' + '_' + str(i)
----> 5     vector[i] = model.docvecs[prefix]
      6 return vector

AttributeError: 'ConcatenatedDoc2Vec' object has no attribute 'docvecs'

This is mysterious to me: the code works in Google Colab but not in Jupyter Notebook via Anaconda, and I did not find anything on the web that solves my problem.


Solution

  • If it works in one place but not the other, you're probably using different versions of the relevant libraries (in this case, gensim).

    Does the following show exactly the same version in both places?

    import gensim
    print(gensim.__version__)
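
A slightly fuller check can also show which Python interpreter the notebook kernel is using, which helps when several environments are installed side by side. This is only a sketch: the helper name `describe_environment` is hypothetical, and it relies solely on the standard library.

```python
import sys
from importlib import metadata


def describe_environment(package="gensim"):
    """Return the interpreter path, Python version, and installed package version."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed = None  # the package is not installed for this interpreter
    return {
        "interpreter": sys.executable,
        "python": sys.version.split()[0],
        package: installed,
    }


# Run this in both notebooks and compare the two results.
print(describe_environment())
```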
    

    If not, the most immediate workaround would be to make the place where it doesn't work match the place where it does, by force-installing the same explicit version with pip install gensim==VERSION (where VERSION is the target version), then restarting your notebook so that it picks up the change.

    Beware, though, that unless starting from a fresh environment, this could introduce other library-version mismatches!
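
If pinning the version is not practical, one possible alternative is to make the lookup tolerant of both attribute names. The sketch below assumes the mismatch comes from the rename of `docvecs` to `dv` in Gensim 4.x. Whether `ConcatenatedDoc2Vec` exposes `dv` in your installed version is an assumption you should verify. The helper `get_doc_vectors` is hypothetical and not part of gensim.

```python
import numpy as np


def get_doc_vectors(model):
    """Return the per-document vector lookup under whichever name the model exposes."""
    # Assumed attribute names: 'dv' (Gensim 4.x) first, then 'docvecs' (Gensim 3.x).
    for name in ("dv", "docvecs"):
        lookup = getattr(model, name, None)
        if lookup is not None:
            return lookup
    raise AttributeError(
        f"{type(model).__name__!r} object has neither 'dv' nor 'docvecs'"
    )


def build_vectors(model, length, vector_size):
    """Collect the vectors for tags 'tag_0' .. 'tag_<length-1>' into one array."""
    doc_vectors = get_doc_vectors(model)
    vectors = np.zeros((length, vector_size))
    for i in range(length):
        vectors[i] = doc_vectors[f"tag_{i}"]
    return vectors
```

This keeps the signature of the original build_vectors(), so the existing call with d2v_combined would stay unchanged.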

    Other things to note: