Cannot import name 'ESMForMaskedLM' from 'transformers' on Google Colab


I am fine-tuning Facebook's ESM transformer on a FASTA file of sequences. However, when I run the cell I get ImportError: cannot import name 'ESMForMaskedLM' from 'transformers'. I have been following the Hugging Face model page, but I haven't managed to make the import work. I am using Google Colab. Any help is much appreciated.

The code:

!pip install transformers
from transformers import ESMForMaskedLM, ESMTokenizer, pipeline
tokenizer = ESMTokenizer.from_pretrained("facebook/esm-1b", do_lower_case=False)
model = ESMForMaskedLM.from_pretrained("facebook/esm-1b")
unmasker = pipeline('fill-mask', model=model, tokenizer=tokenizer)
unmasker('QERLKSIVRILE<mask>SLGYNIVAT')

Solution
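
Before changing the import, it can help to confirm which spelling of the class name the installed transformers package actually exports. The following is a minimal sketch, not part of the original question: exports_name is a hypothetical helper name, and it only checks whether a module lists a given attribute.

```python
import importlib


def exports_name(module_name: str, attr: str) -> bool:
    """Return True if `module_name` can be imported and lists `attr`.

    `exports_name` is an illustrative helper, not a transformers API.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package itself is missing from this environment.
        return False
    # dir() lists the public names without forcing the attribute to load.
    return attr in dir(module)


# Compare the spelling from the question with the one used in the answer.
for name in ("ESMForMaskedLM", "EsmForMaskedLM"):
    print(name, exports_name("transformers", name))
```

If both names come back as unavailable, the installed transformers release may predate ESM support, in which case upgrading the package (and restarting the Colab runtime) is worth trying.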

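  • ESM has since been added to Hugging Face transformers. The class is named EsmForMaskedLM (note the capitalization: Esm, not ESM), and the tokenizer can be loaded with AutoTokenizer. The example below uses the facebook/esm2_t6_8M_UR50D checkpoint: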

    from transformers import AutoTokenizer, EsmForMaskedLM, pipeline
    
    # Load the tokenizer and the model from the same checkpoint.
    tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t6_8M_UR50D")
    model = EsmForMaskedLM.from_pretrained("facebook/esm2_t6_8M_UR50D")
    
    # Build a fill-mask pipeline and predict the residue at the <mask> position.
    unmasker = pipeline('fill-mask', model=model, tokenizer=tokenizer)
    g = unmasker('MQIFVKTLTGKTITLEVEPS<mask>TIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG')
    print(g)
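
For a single <mask>, the fill-mask pipeline returns a list of dictionaries with the keys score, token, token_str and sequence. A small sketch of how the result g could be summarized follows; top_predictions is a hypothetical helper name, and the example list is hand-written stand-in data with made-up scores, not real model output.

```python
from typing import Dict, List, Tuple


def top_predictions(results: List[Dict], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (token_str, score) pairs.

    `results` is expected to look like fill-mask pipeline output for one mask:
    a list of dicts that each carry "token_str" and "score".
    """
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [(r["token_str"], round(r["score"], 4)) for r in ranked[:k]]


# Stand-in data for illustration only; token ids and scores are invented.
example = [
    {"score": 0.12, "token": 5, "token_str": "A", "sequence": "QERAK"},
    {"score": 0.41, "token": 4, "token_str": "L", "sequence": "QERLK"},
    {"score": 0.07, "token": 9, "token_str": "E", "sequence": "QEREK"},
]

for residue, score in top_predictions(example, k=2):
    print(residue, score)
```

With the real pipeline, the same call would be top_predictions(g), which gives the most likely residues for the masked position.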