I learned how to use AMP (automatic mixed precision) and gradient accumulation for training from https://medium.com/ai2-blog/tutorial-training-on-larger-batches-with-less-memory-in-allennlp-1cd2047d92ad, but it seems these options are not supported in 2.4.0?
File "/root/anaconda3/envs/allennlp/lib/python3.6/site-packages/allennlp/training/util.py", line 217, in create_serialization_dir f"Value for '{key}' in training configuration does not match that the value in "
Thank you for your answer, @Dirk Groeneveld! Finally, the correct way to use AMP with allennlp 2.4.0 is:
"trainer": {
"type":"gradient_descent",
"use_amp": true,
"num_gradient_accumulation_steps": 4,
"distributed": true,
...
}
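For anyone who finds this later, here is roughly how that trainer block fits into a full config. The dataset reader, model type, data paths, optimizer, and batch size below are placeholders I made up for illustration (they are not from this thread), and I have left out the distributed settings and used a plain single-GPU cuda_device, since AMP needs a GPU in any case:

    {
        "dataset_reader": {"type": "my_reader"},
        "train_data_path": "data/train.json",
        "validation_data_path": "data/dev.json",
        "model": {"type": "my_model"},
        "data_loader": {"batch_size": 8, "shuffle": true},
        "trainer": {
            "type": "gradient_descent",
            "optimizer": {"type": "adam", "lr": 1e-3},
            "num_epochs": 10,
            "cuda_device": 0,
            // with 4 accumulation steps the effective batch size is 8 * 4 = 32
            "use_amp": true,
            "num_gradient_accumulation_steps": 4
        }
    }

Then launch it with allennlp train config.jsonnet -s /path/to/output (adding --include-package your_package if the reader and model are registered in your own code).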