When training the Llama 3.1-8B-Instruct model on Amazon SageMaker, the training job fails with the following output:
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1150: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Traceback (most recent call last):
  File "/workspace/train.py", line 85, in <module>
    main()
  File "/workspace/train.py", line 48, in main
    config = AutoConfig.from_pretrained(model_name, token=use_auth_token)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 1124, in from_pretrained
    return config_class.from_dict(config_dict, **unused_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 764, in from_dict
    config = cls(**config_dict)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/configuration_llama.py", line 160, in __init__
    self._rope_scaling_validation()
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/configuration_llama.py", line 180, in _rope_scaling_validation
    raise ValueError(
ValueError: `rope_scaling` must be a dictionary with with two fields, `type` and `factor`, got {'factor': 8.0, 'low_freq_factor': 1.0, 'high_freq_factor': 4.0, 'original_max_position_embeddings': 8192, 'rope_type': 'llama3'}
I've tried changing config.rope_scaling and passing the modified config to the model, but it doesn't work. This is the snippet where I change the config:
from transformers import AutoConfig, LlamaForCausalLM

# Load model configuration
config = AutoConfig.from_pretrained(model_name, token=use_auth_token)

# Modify the rope_scaling config
config.rope_scaling = {
    "type": "llama3",
    "factor": 8.0
}

# Initialize the model with the modified config
model = LlamaForCausalLM.from_pretrained(model_name, token=use_auth_token, config=config)
I just faced the same error using Llama 3.1 with transformers 4.41.0. Note that the ValueError is raised inside AutoConfig.from_pretrained itself (during _rope_scaling_validation, as your traceback shows), so modifying config.rope_scaling afterwards never gets a chance to run. It was resolved by upgrading:

pip install --upgrade transformers

With transformers 4.44.2 everything runs perfectly, since releases from 4.43.0 onward understand the llama3 rope_scaling format that Llama 3.1 ships with. See the related GitHub issue for more details.
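If the SageMaker training container ships an older transformers, you can make the upgrade part of the job itself: pin transformers>=4.43.0 in a requirements.txt next to train.py (the SageMaker training toolkit typically installs it automatically when you pass a source_dir). As a minimal sketch, you can also fail fast at the top of train.py before the model config is loaded; the 4.43.0 minimum and the check below are my assumption about what a safe guard looks like, not something from the original post:

# train.py (top of file): guard against a transformers build that is too old
# for the Llama 3.1 rope_scaling format. MIN_TRANSFORMERS = 4.43.0 is assumed
# to be the first release with Llama 3.1 support.
import transformers
from packaging import version

MIN_TRANSFORMERS = "4.43.0"

if version.parse(transformers.__version__) < version.parse(MIN_TRANSFORMERS):
    raise RuntimeError(
        f"transformers {transformers.__version__} is too old for Llama 3.1; "
        f"pin transformers>={MIN_TRANSFORMERS} in requirements.txt or run "
        "`pip install --upgrade transformers` in the training image."
    )

Failing early like this gives a clear message in the CloudWatch logs instead of the confusing rope_scaling ValueError deep inside from_pretrained.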