gensim fasttext

Cannot load fine-tuned fastText wiki model after retraining and saving


I am fine-tuning a fastText wiki model as follows. This works fine. After fine-tuning, I save the retrained model.

from gensim.models import fasttext
from gensim.test.utils import datapath
model = fasttext.load_facebook_model(datapath("wiki/wiki.en.bin"))
model.build_vocab(sentences, update=True)
model.train(sentences, total_examples=len(sentences), epochs=5)
model.save("wiki/wiki.en.updated.bin")

Later on, when I try to load the model

model = fasttext.load_facebook_model(datapath("wiki/wiki.en.updated.bin"))

I get an error

NotImplementedError: Supervised fastText models are not supported

This is strange, since I am not doing supervised training. Any idea why this is happening, and how do I correctly load the fine-tuned model?

Is it possible I am saving or loading the file wrong? I did notice that more than one file is created upon save, but that is normal given the underlying representation, isn't it?

I am running macOS 14.3 on an M2 Mac.

-rw-r--r--@ 1 sail0r  staff  8493673445 Oct 19  2017 wiki.en.bin
-rw-r--r--  1 sail0r  staff    84876251 Apr 14 02:12 wiki.en.updated.bin
-rw-r--r--  1 sail0r  staff  3023571728 Apr 14 02:12 wiki.en.updated.bin.syn1neg.npy
-rw-r--r--  1 sail0r  staff  2400000128 Apr 14 02:11 wiki.en.updated.bin.wv.vectors_ngrams.npy
-rw-r--r--  1 sail0r  staff  3023571728 Apr 14 02:11 wiki.en.updated.bin.wv.vectors_vocab.npy
-rw-rw-r--@ 1 sail0r  staff  6597238061 Sep 19  2016 wiki.en.vec

This is gensim 4.3.2.


Solution

  • This change to the code above fixes the problem.

    from gensim.models import FastText
    from gensim.test.utils import datapath
    model = FastText.load(datapath("wiki/wiki.en.updated.bin"))
    

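    It seems that model.save() does not write the retrained model in Facebook's fastText binary format but in gensim's own native format (a main file plus the .npy array files listed in the question), so it has to be loaded with FastText.load(). load_facebook_model() apparently tries to parse the file as a Facebook binary, misreads the header, and reports the misleading "supervised" error.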