machine-learning · deep-learning · nlp · huggingface-transformers

Alternative to device_map = "auto" in Huggingface Pretrained


I have a model that I load from Hugging Face using the following code:

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", trust_remote_code=True)

After loading the model, I modified some of its internal layers and added new ones. When I started training/fine-tuning, I got an error that not everything is on the same device.

After further investigation, I found that my custom layers aren't distributed across multiple GPUs the way the original model's layers are. So I need something like device_map="auto", but applied after loading the model.

So, put simply, something like:

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", trust_remote_code=True)

model.device_map = "auto"

Solution

  • I found out that there are actually two methods in accelerate for this. The first one analyzes your model and, given the memory available on each device, computes a device map describing where each submodule should live:

    https://huggingface.co/docs/accelerate/en/package_reference/big_modeling#accelerate.infer_auto_device_map
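
    For illustration only: the returned device map is a plain dict mapping submodule names to devices, so printing it is a quick way to check where each layer ended up. The keys below are hypothetical and depend on your architecture:

    from accelerate import infer_auto_device_map

    device_map_dict = infer_auto_device_map(model)
    print(device_map_dict)
    # Hypothetical output; actual keys depend on the model:
    # {'model.embed_tokens': 0, 'model.layers.0': 0, ..., 'lm_head': 1}
    # Values are GPU indices, or "cpu"/"disk" when offloading is needed.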

    The second one dispatches the model's submodules onto the devices according to that map:

    https://huggingface.co/docs/accelerate/en/package_reference/big_modeling#accelerate.dispatch_model

    So basically, in your case, you can use the following code:

    from accelerate import dispatch_model, infer_auto_device_map
    from transformers import AutoModelForCausalLM

    # Load the base model as before.
    model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", trust_remote_code=True)

    # ... your modifications: change internal layers, add new ones, etc. ...
    new_model = CustomModel(model)
    # ...

    # Recompute the device map for the modified model and redistribute it.
    device_map_dict = infer_auto_device_map(new_model)
    dispatch_model(new_model, device_map=device_map_dict)
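
    If the automatically inferred split is unsatisfactory (for example, a residual block gets cut across devices), infer_auto_device_map also accepts max_memory and no_split_module_classes. A minimal sketch, where the memory budgets and the class name are placeholders you'd adapt to your setup:

    # Sketch: constrain the inferred map. The budgets and class name
    # below are placeholders, not values from the question.
    device_map_dict = infer_auto_device_map(
        new_model,
        max_memory={0: "20GiB", 1: "20GiB", "cpu": "64GiB"},
        no_split_module_classes=["DecoderLayer"],
    )
    dispatch_model(new_model, device_map=device_map_dict)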
    

    P.S. This code still needs to be tested in an actual fine-tuning run.
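
    A variant that may be worth trying (equally untested): skip device_map="auto" at load time so the whole model starts on the CPU, make your modifications there, and dispatch everything in a single pass, instead of re-dispatching an already-sharded model:

    from accelerate import dispatch_model, infer_auto_device_map
    from transformers import AutoModelForCausalLM

    # Load on CPU (no device_map), modify, then place everything at once.
    model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
    new_model = CustomModel(model)

    device_map_dict = infer_auto_device_map(new_model)
    new_model = dispatch_model(new_model, device_map=device_map_dict)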