Tags: python, tensorflow, huggingface-transformers, bert-language-model, onnx

ValueError: Unrecognized model in ./MRPC/. Should have a `model_type` key in its config.json, or contain one of the following strings in its name


Goal: Amend this Notebook to work with ALBERT and DistilBERT models.

Kernel: conda_pytorch_p36. I did Restart & Run All, and refreshed file view in working directory.

The error occurs in Section 1.2, and only for these two new models.

For filenames etc., I've created a variable used everywhere:

MODEL_NAME = 'albert-base-v2'  # 'distilbert-base-uncased', 'bert-base-uncased'

I replaced imports with:

from transformers import (AutoConfig, AutoModel, AutoTokenizer)
#from transformers import (BertConfig, BertForSequenceClassification, BertTokenizer,)

This follows the Transformers documentation on Auto Classes, which states:

Instantiating one of AutoConfig, AutoModel, and AutoTokenizer will directly create a class of the relevant architecture.


Section 1.2:

# load model
model = AutoModel.from_pretrained(configs.output_dir)  # BertForSequenceClassification
model.to(configs.device)


# quantize model
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

#print(quantized_model)

def print_size_of_model(model):
    torch.save(model.state_dict(), "temp.p")
    print('Size (MB):', os.path.getsize("temp.p")/(1024*1024))
    os.remove('temp.p')

print_size_of_model(model)
print_size_of_model(quantized_model)

Traceback:

ValueError: Unrecognized model in ./MRPC/. Should have a `model_type` key in its config.json, or contain one of the following strings in its name: imagegpt, qdqbert, vision-encoder-decoder, trocr, fnet, segformer, vision-text-dual-encoder, perceiver, gptj, layoutlmv2, beit, rembert, visual_bert, canine, roformer, clip, bigbird_pegasus, deit, luke, detr, gpt_neo, big_bird, speech_to_text_2, speech_to_text, vit, wav2vec2, m2m_100, convbert, led, blenderbot-small, retribert, ibert, mt5, t5, mobilebert, distilbert, albert, bert-generation, camembert, xlm-roberta, pegasus, marian, mbart, megatron-bert, mpnet, bart, blenderbot, reformer, longformer, roberta, deberta-v2, deberta, flaubert, fsmt, squeezebert, hubert, bert, openai-gpt, gpt2, transfo-xl, xlnet, xlm-prophetnet, prophetnet, xlm, ctrl, electra, speech-encoder-decoder, encoder-decoder, funnel, lxmert, dpr, layoutlm, rag, tapas, splinter, sew-d, sew, unispeech-sat, unispeech, wavlm
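
The message names two places the model type can come from: a `model_type` key in config.json, or a known type string in the model name or path. The following is a minimal sketch of that two-step lookup, not the Transformers implementation. `guess_model_type` is a hypothetical helper name and `KNOWN_TYPES` is an abbreviated subset of the names in the message, listed with longer names first because 'albert' and 'distilbert' both contain 'bert'.

```python
import json
import os

# Abbreviated, illustrative subset of the names listed in the error message.
# Longer names come first so that 'albert' is not matched as 'bert'.
KNOWN_TYPES = ["distilbert", "albert", "bert"]


def guess_model_type(model_dir):
    """Return the model type for a local checkpoint directory, or None.

    Illustrative only: mirrors the two checks named in the error message.
    """
    config_path = os.path.join(model_dir, "config.json")
    with open(config_path) as config_file:
        config = json.load(config_file)

    # Check 1: an explicit `model_type` key in config.json.
    if "model_type" in config:
        return config["model_type"]

    # Check 2: a known type name appearing in the path string.
    for name in KNOWN_TYPES:
        if name in str(model_dir):
            return name

    # Neither check succeeded, which is the situation the ValueError reports.
    return None
```

Under these assumptions, a directory named './MRPC/' whose config.json has no `model_type` key fails both checks, whereas a name such as 'albert-base-v2' would pass the second one.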

Please let me know if there's anything else I can add to the post.


Solution

  • Explanation:

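    AutoModel.from_pretrained reads ./MRPC/config.json (downloaded during Notebook runtime) to decide which architecture to build, so that file must contain a model_type key. Without it, the path './MRPC/' gives no hint of the architecture either, and the ValueError is raised.

    A list of valid model_type values can be found here; the same names are listed in the error message above.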


    Solution:

    Code that appends model_type to config.json, in the same format:

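    import json

    json_filename = './MRPC/config.json'

    # Read the existing config.
    with open(json_filename) as json_file:
        json_decoded = json.load(json_file)

    # Must name the architecture of the checkpoint saved in ./MRPC/,
    # e.g. 'albert', 'distilbert' or 'bert'.
    json_decoded['model_type'] = 'albert'

    # Write the config back in the same format.
    with open(json_filename, 'w') as json_file:
        json.dump(json_decoded, json_file, indent=2, separators=(',', ': '))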
    

    config.json:

    {
      "attention_probs_dropout_prob": 0.1,
      "finetuning_task": "mrpc",
      "hidden_act": "gelu",
      "hidden_dropout_prob": 0.1,
      "hidden_size": 768,
      "initializer_range": 0.02,
      "intermediate_size": 3072,
      "layer_norm_eps": 1e-12,
      "max_position_embeddings": 512,
      "num_attention_heads": 12,
      "num_hidden_layers": 12,
      "num_labels": 2,
      "output_attentions": false,
      "output_hidden_states": false,
      "pruned_heads": {},
      "torchscript": false,
      "type_vocab_size": 2,
      "vocab_size": 30522,
      "model_type": "albert"
    }
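
Because the Notebook is re-run with Restart & Run All, the patch above may be applied to a config.json that already has a `model_type`. One possible variant, sketched here under that assumption, only adds the key when it is missing. `ensure_model_type` is a hypothetical helper name, not part of Transformers, and the value passed in should match the architecture actually saved in the directory.

```python
import json
from pathlib import Path


def ensure_model_type(config_path, model_type):
    """Add `model_type` to a config.json if absent; return the final value."""
    path = Path(config_path)
    config = json.loads(path.read_text())

    # Leave an existing value alone so a correct config is never overwritten.
    if "model_type" not in config:
        config["model_type"] = model_type
        path.write_text(json.dumps(config, indent=2) + "\n")

    return config["model_type"]
```

It could be called as ensure_model_type('./MRPC/config.json', 'albert') just before the AutoModel.from_pretrained call in Section 1.2.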