Tags: python, machine-learning, optimization, nlp, fasttext

Reduce fastText memory usage for big models


I trained a machine learning sentence classification model that uses, among other features, the vectors obtained from a pretrained fastText model (like these), which is 7 GB. I use the pretrained fastText Italian model: I am using this word embedding only to get some semantic features to feed into the actual ML model.

I built a simple API based on fastText that, at prediction time, computes the vectors needed by the actual ML model. Under the hood, this API receives a string as input and calls get_sentence_vector. When the API starts, it loads the fastText model into memory.
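
Roughly, the relevant part looks like the following minimal sketch (the model file name is just a placeholder for the Italian pretrained model):

```python
import fasttext

# Loaded once at API startup; this is what occupies several GB of RAM
model = fasttext.load_model("cc.it.300.bin")

def sentence_features(text: str):
    # Called for every request to build features for the downstream ML model
    return model.get_sentence_vector(text)
```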

How can I reduce the memory footprint of fastText, which is loaded into RAM?

Constraints:

At the moment, I'm starting to experiment with compress-fasttext...
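
For reference, this is the kind of usage I am experimenting with, assuming the API shown in the compress-fasttext README (paths are placeholders):

```python
import compress_fasttext
from gensim.models.fasttext import load_facebook_model

# Load the original .bin model through gensim, then prune and quantize it
big_model = load_facebook_model("cc.it.300.bin").wv
small_model = compress_fasttext.prune_ft_freq(big_model, pq=True)
small_model.save("cc.it.300.small")

# At serving time, only the compressed model would be loaded
small_model = compress_fasttext.models.CompressedFastTextKeyedVectors.load("cc.it.300.small")
```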

Please share your suggestions and thoughts even if they do not represent full-fledged solutions.


Solution

  • There is no easy solution for my specific problem: if you use a fastText embedding as a feature extractor and then want to switch to a compressed version of that embedding, you have to retrain the final classifier, since the vectors it produces are somewhat different.

    Anyway, I want to give a general answer about

    fastText model reduction

    Unsupervised models (=embeddings)

    You are using pretrained embeddings provided by Facebook, or you trained your own embeddings in an unsupervised fashion; the model is a .bin file. Now you want to reduce its size/memory consumption.

    Straightforward solutions:

    If you have the training data and can retrain, you can use floret, a fastText fork by Explosion (the company behind spaCy), which uses a more compact representation for the vectors.
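
    As a minimal sketch, assuming the floret Python bindings' fastText-style API (corpus path and hyperparameters below are purely illustrative):

    ```python
    import floret

    # Retrain on your own corpus with floret's compact hashed vector table;
    # hashCount and bucket trade memory for quality
    model = floret.train_unsupervised(
        "corpus.it.txt",
        mode="floret",
        hashCount=2,
        bucket=50000,
        minn=3,
        maxn=5,
        dim=300,
    )
    model.save_model("vectors.floret.bin")
    ```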

    If you are not interested in fastText's ability to represent out-of-vocabulary words (words not seen during training), you can use the .vec file (containing only the vectors and not the model weights) and keep only a portion of the most common vectors (e.g. the first 200k words/vectors). If you need a way to convert .bin to .vec, read this answer. Note: the gensim package fully supports fastText embeddings (unsupervised mode), so these operations can be done through that library (more details in this answer).
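
    For example, gensim can load just the first N entries of a .vec file (the file name and N = 200k below are only an illustration):

    ```python
    from gensim.models import KeyedVectors

    # Keep only the 200k most frequent vectors from the text-format .vec file;
    # out-of-vocabulary words will no longer get a vector
    vectors = KeyedVectors.load_word2vec_format("cc.it.300.vec", limit=200000)
    vectors.save("cc.it.300.top200k.kv")
    ```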

    Supervised models

    You used fastText to train a classifier, producing a .bin model. Now you want to reduce classifier size/memory consumption.
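
    One standard option here is fastText's built-in quantization, which produces a much smaller .ftz model; a minimal sketch (file names and cutoff are placeholders):

    ```python
    import fasttext

    model = fasttext.train_supervised("train.txt")
    # Prune the vocabulary to `cutoff` entries and compress the weights with
    # product quantization; retrain=True fine-tunes after pruning
    model.quantize(input="train.txt", qnorm=True, retrain=True, cutoff=100000)
    model.save_model("model.ftz")
    ```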