I want to use mlflow.transformers.log_model() to log a fine-tuned Hugging Face model. However, the mlflow.transformers.log_model call simply never finishes: it runs forever and throws no errors. I suspect my configuration is not right, or maybe the model is too big? The output says "Skipping saving pretrained model weights to disk", so that should not be the problem. Any ideas how to do this properly?
This is more or less what my setup looks like. You cannot run it as-is, since it includes some pseudocode. I am on Python 3.11.9 with transformers = "^4.41.2" and mlflow = "^2.15.1".
import mlflow
import torch
from peft import LoraConfig
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import SFTTrainer, setup_chat_format
train_dataset = ...
eval_dataset = ...
model_id = "LeoLM/leo-hessianai-7b-chat-bilingual"
# Quantization config (details omitted in this pseudocode)
bnb_config = BitsAndBytesConfig(...)
# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    quantization_config=bnb_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer_no_pad = AutoTokenizer.from_pretrained(model_id, add_bos_token=True)
model, tokenizer = setup_chat_format(model, tokenizer)
peft_config = LoraConfig(...)
args = TrainingArguments(...)
# Define Trainer
trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    peft_config=peft_config,
    tokenizer=tokenizer,
    packing=True,
)
# mlflow
mlflow.set_experiment("my_experiment")
with mlflow.start_run() as run:
    mlflow.transformers.autolog()
    trainer.train()
    components = {
        "model": trainer.model,
        "tokenizer": tokenizer_no_pad,
    }
    # !!! This function call does not finish... !!!
    mlflow.transformers.log_model(
        transformers_model=components,
        artifact_path="model",
    )
The last output I get in the console is:
INFO mlflow.transformers: Overriding save_pretrained to False for PEFT models, following the Transformers behavior. The PEFT adaptor and config will be saved, but the base model weights will not and reference to the HuggingFace Hub repository will be logged instead.
Unrecognized keys in `rope_scaling` for 'rope_type'='linear': {'type'}
/mypath/llm4pa-open-source/.venv/lib/python3.11/site-packages/peft/utils/save_and_load.py:209: UserWarning: Setting `save_embedding_layers` to `True` as the embedding layer has been resized during finetuning.
warnings.warn(
2024/08/12 18:21:14 INFO mlflow.transformers: Skipping saving pretrained model weights to disk as the save_pretrained is set to False. The reference to HuggingFace Hub repository LeoLM/leo-hessianai-7b-chat-bilingual will be logged instead.
/mypath/llm4pa-open-source/.venv/lib/python3.11/site-packages/_distutils_hack/__init__.py:26: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
Before defining the trainer, the model has to be turned into a PEFT model object via get_peft_model; then mlflow.transformers.log_model works:
from peft import LoraConfig, get_peft_model

model = ...  # same model/tokenizer setup as in the question
peft_config = LoraConfig(...)
args = TrainingArguments(...)

# Wrap the base model in a PeftModel before handing it to the trainer
peft_model = get_peft_model(model, peft_config)

trainer = SFTTrainer(
    model=peft_model,
    args=args,
    ...
)
# mlflow
mlflow.set_experiment("my_experiment")
with mlflow.start_run() as run:
    mlflow.transformers.autolog()
    trainer.train()
    components = {
        "model": trainer.model,
        "tokenizer": tokenizer_no_pad,
    }
    # !!! Now the logging of the model works, we can find it in the artifacts !!!
    mlflow.transformers.log_model(
        transformers_model=components,
        artifact_path="model",
    )
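To double-check that the logged artifact is usable, it can be loaded back from the run. This is a minimal sketch, assuming the run object and the artifact_path="model" from the block above; since the base model weights were not saved, loading will fetch them from the Hugging Face Hub again:
# Load the logged model back (as a transformers pipeline by default) to confirm the artifact works
model_uri = f"runs:/{run.info.run_id}/model"
loaded = mlflow.transformers.load_model(model_uri)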