pytorch, nlp, huggingface-transformers, large-language-model, peft

How to fix the error `OSError: <model> does not appear to have a file named config.json.` when loading a custom fine-tuned model?


Preface

I am new to implementing NLP models. I have successfully fine-tuned LLaMA 3-8B variants with QLoRA and uploaded them to Hugging Face.

Each directory contains the following files (a quick programmatic check is shown after the list):

- .gitattributes
- adapter_config.json
- adapter_model.safetensors
- special_tokens_map.json
- tokenizer.json
- tokenizer_config.json
- training_args.bin
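
For reference, the same listing can be reproduced programmatically (a quick sketch using huggingface_hub; not part of my training code):

from huggingface_hub import list_repo_files

# List all files currently present in the uploaded Hub repo
for filename in list_repo_files("ferguso/llama-8b-pcl-v3"):
    print(filename)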

Implementation

  1. I tried to load the model like this:
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id_1 = "ferguso/llama-8b-pcl-v3"

tokenizer_1 = AutoTokenizer.from_pretrained(model_id_1)

quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
)

model_1 = AutoModelForCausalLM.from_pretrained(
    model_id_1,
    quantization_config=quantization_config,
)

But it shows the error `OSError: ferguso/llama-8b-pcl-v3 does not appear to have a file named config.json. Checkout 'https://huggingface.co/ferguso/llama-8b-pcl-v3/tree/main' for available files.`

  2. So then I tried loading the config.json from the original model, meta-llama/Meta-Llama-3-8B:
from transformers import (
    AutoConfig,
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
)

original_model = "meta-llama/Meta-Llama-3-8B"
model_id_1 = "ferguso/llama-8b-pcl-v3"

tokenizer_1 = AutoTokenizer.from_pretrained(model_id_1)

quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
)

# download the base model's config and save it into a local
# directory named after the fine-tuned repo id
original_config = AutoConfig.from_pretrained(original_model)
original_config.save_pretrained(model_id_1)

model_1 = AutoModelForCausalLM.from_pretrained(
    model_id_1,
    quantization_config=quantization_config,
    config=original_config,
)

But it still fails, this time with the error `OSError: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory ferguso/llama-8b-pcl-v3.`

Questions

How do I load the fine-tuned model properly?


Solution

  • Your directory contains only the PEFT adapter files and the files required to load the tokenizer; the base model weights are missing. I assume you used peft's save_pretrained method, which saves only the adapter weights and the adapter config (I use a smaller model and a different task type for my answer!):

    from peft import LoraConfig, TaskType, get_peft_model, PeftModel
    from transformers import AutoModelForTokenClassification
    from pathlib import Path
    
    # ferguso/llama-8b-pcl-v3 in your case 
    adapter_path = 'bla'
    # meta-llama/Meta-Llama-3-8B in your case
    base_model_id = "distilbert/distilbert-base-uncased"
    
    peft_config = LoraConfig(task_type=TaskType.TOKEN_CLS, target_modules="all-linear")
    
    # AutoModelForCausalLM in your case
    model = AutoModelForTokenClassification.from_pretrained(base_model_id)
    model = get_peft_model(model, peft_config)
    
    model.save_pretrained(adapter_path)
    
    print(*list(Path(adapter_path).iterdir()), sep='\n')
    

    Output:

    bla/adapter_config.json
    bla/README.md
    bla/adapter_model.safetensors
    

    To load your fine-tuned model successfully, you need to load the base model weights as well and use the PeftModel class to load the adapter on top:

    model = AutoModelForTokenClassification.from_pretrained(base_model_id)
    model = PeftModel.from_pretrained(model, adapter_path)
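
    Translated to your case, that would look roughly like this (an untested sketch using the repo ids and the 8-bit quantization config from your question):

    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    base_model_id = "meta-llama/Meta-Llama-3-8B"
    adapter_id = "ferguso/llama-8b-pcl-v3"

    # the tokenizer files live in your adapter repo, so this part already worked
    tokenizer = AutoTokenizer.from_pretrained(adapter_id)

    quantization_config = BitsAndBytesConfig(load_in_8bit=True)

    # load the quantized base model weights ...
    base_model = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        quantization_config=quantization_config,
    )
    # ... and attach your fine-tuned adapter on top
    model = PeftModel.from_pretrained(base_model, adapter_id)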
    

    You can also merge the adapter weights back into the base model with merge_and_unload and save the result:

    model.merge_and_unload().save_pretrained('bla2')
    print(*list(Path('bla2').iterdir()), sep='\n')
    

    Output:

    bla2/config.json
    bla2/model.safetensors
    

    This way you will be able to load the model with transformers alone, without peft, as you tried in the example code in your question.
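
    For illustration, loading the merged checkpoint from the directory above needs nothing beyond transformers (a minimal sketch continuing the small example; for your model you would use AutoModelForCausalLM and, if needed, your quantization_config):

    from transformers import AutoModelForTokenClassification

    # 'bla2' now contains config.json and model.safetensors,
    # so plain transformers loading works and peft is no longer needed
    merged_model = AutoModelForTokenClassification.from_pretrained('bla2')

    If you push such a merged checkpoint to the Hub (for example with push_to_hub), your original loading snippet will work against it directly.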