I am using the Hugging Face Trainer to train a custom model subclassing a Llama LLM. After tokenization, my dataset has the fields 'input_ids', 'labels', and so on, and I additionally add two custom columns, 'interact_ids' and 'candidate_ids'. But I can't get these custom fields in the forward() function of my model 'class LLMWithCustomLayer(LlamaForCausalLM)'.
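For context, the columns are added during tokenization, roughly like this (the "text", "history", and "candidates" field names below are stand-ins for my real ones):

def tokenize_and_extend(example):
    # Tokenize as usual, then attach the two custom columns.
    out = tokenizer(example["text"], truncation=True)
    out["labels"] = out["input_ids"].copy()
    out["interact_ids"] = example["history"]      # per-example item IDs
    out["candidate_ids"] = example["candidates"]  # per-example candidate IDs
    return out

train_dataset = raw_dataset.map(tokenize_and_extend)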
import torch
from typing import List, Optional

def forward(
    self,
    input_ids: torch.LongTensor = None,
    attention_mask: Optional[torch.Tensor] = None,
    position_ids: Optional[torch.LongTensor] = None,
    past_key_values: Optional[List[torch.FloatTensor]] = None,
    inputs_embeds: Optional[torch.FloatTensor] = None,
    labels: Optional[torch.LongTensor] = None,
    use_cache: Optional[bool] = None,
    output_attentions: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
    return_dict: Optional[bool] = None,
    interact_ids=None,
    candidate_ids=None,
):
    print('interact_ids, candidate_ids', interact_ids, candidate_ids)  # they are None
    interact_embs = []
    candidate_embs = []
    for i in range(interact_ids.shape[0]):
        # O_i = F_i(e_i): project the i-th sample's item embeddings
        interact_embs.append(self.item_emb_proj(self.get_item_emb(interact_ids[i])))
        candidate_embs.append(self.item_emb_proj(self.get_item_emb(candidate_ids[i])))
    # replace the [CandidateEmb] and [HistoryEmb] placeholder tokens
    inputs_embeds = self.replace_hist_candi_token(input_ids, inputs_embeds, interact_embs, candidate_embs)
    return super().forward(
        input_ids=None,  # Llama's forward raises if both input_ids and inputs_embeds are given
        attention_mask=attention_mask,
        position_ids=position_ids,
        past_key_values=past_key_values,
        inputs_embeds=inputs_embeds,
        use_cache=use_cache,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
        return_dict=return_dict,
        labels=labels,
    )
I am new to LLM fine-tuning. Can anyone help me? I would be very grateful.
You need to modify the data collator to pass interact_ids and candidate_ids to your model, as Trainer ignores extra columns by default.
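Even though your forward() accepts these arguments, it is safest to also set remove_unused_columns=False in your TrainingArguments, so the Trainer passes every dataset column through to the collator untouched (by default it strips columns that don't match the model's forward() signature). A minimal sketch; output_dir and the batch size are placeholders:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",                # placeholder
    remove_unused_columns=False,     # keep interact_ids / candidate_ids in the dataset
    per_device_train_batch_size=4,   # illustrative value
)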
To modify the data collator:
import torch
from transformers import DataCollatorWithPadding

class CustomDataCollator(DataCollatorWithPadding):
    def __call__(self, features):
        # Pull the custom fields out first so the base collator only pads keys the tokenizer knows
        interact_ids = [f.pop("interact_ids") for f in features]
        candidate_ids = [f.pop("candidate_ids") for f in features]
        batch = super().__call__(features)
        batch["interact_ids"] = torch.tensor(interact_ids)
        batch["candidate_ids"] = torch.tensor(candidate_ids)
        return batch
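You can sanity-check the collator on a couple of tokenized examples before wiring it into the Trainer:

collator = CustomDataCollator(tokenizer)
batch = collator([train_dataset[0], train_dataset[1]])
print(batch.keys())  # should include 'interact_ids' and 'candidate_ids'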
Then pass it to the Trainer:
trainer = Trainer(
model=LLMWithCustomLayer.from_pretrained("your-llama-model"),
args=training_args,
train_dataset=train_dataset,
eval_dataset=eval_dataset,
tokenizer=tokenizer,
data_collator=CustomDataCollator(tokenizer)
)
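As a final check, pull one batch from the trainer's dataloader and confirm the custom keys survived, before calling trainer.train():

batch = next(iter(trainer.get_train_dataloader()))
print(batch["interact_ids"].shape, batch["candidate_ids"].shape)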
Now your forward() method will receive interact_ids and candidate_ids.
Hope it works!