I want to use the Llama-3.2-1B-Instruct model, and although I have set "temperature": 0.0, "top_p": 0.0 and "top_k": 0, it still generates inconsistent output. This is what my pipeline looks like:
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="mps",
    model_kwargs={
        "temperature": 0.0,
        "do_sample": True,
        "top_p": 0.0,
        "top_k": 0,
    },
)
Any idea how to solve this issue?
The model's inconsistent output can be due to two main factors:
1. Temperature:
Setting the temperature to exactly zero can give more inconsistent results; you can refer to the OpenAI community discussion on this topic for details.
So the better option is to set the temperature to a very small value such as 0.00001 instead of zero.
2. do_sample
In your pipeline you have actually set do_sample=True; for consistent output it should be False, so the model uses greedy decoding instead of sampling. A corrected configuration is sketched after this list.
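
For reference, here is a minimal sketch of a deterministic setup. It assumes the public repo id meta-llama/Llama-3.2-1B-Instruct and an illustrative prompt, and passes the generation options at call time rather than through model_kwargs:

import torch
from transformers import pipeline

# Assumed Hugging Face repo id for the model mentioned above.
model_id = "meta-llama/Llama-3.2-1B-Instruct"

pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="mps",
)

# do_sample=False switches to greedy decoding, which is deterministic,
# so temperature, top_p and top_k are not needed at all.
# If you prefer to keep sampling, use do_sample=True with a very small
# positive temperature (e.g. 0.00001) rather than exactly 0.0.
result = pipe(
    "Explain greedy decoding in one sentence.",  # hypothetical prompt
    do_sample=False,
    max_new_tokens=64,
)
print(result[0]["generated_text"])

With greedy decoding the temperature, top_p and top_k values are ignored, so repeated runs on the same prompt should produce the same text.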