Not able to calculate "eval_loss" when passing a validation set to the Trainer class

#7 opened by dutta18

I suspect the model doesn't return the loss in its forward pass, which is why "eval_loss" can't be calculated with a validation set during fine-tuning. I used the official fine-tuning notebook provided for this model.
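For context, my collate_fn roughly follows the official notebook. Below is only a simplified sketch of what I understand it needs to do (the field names "question"/"image" and the processor call are illustrative, not my exact code): if the batch carries no "labels" key, the forward pass returns no loss at all, and the Trainer then has nothing to report as "eval_loss".

def collate_fn(examples):
    # Illustrative sketch only -- field names and the processor call are
    # assumptions, not the exact code from the notebook.
    texts = [ex["question"] for ex in examples]
    images = [ex["image"] for ex in examples]
    batch = processor(text=texts, images=images, return_tensors="pt", padding=True)
    # Without a "labels" key, the model's forward pass returns loss=None,
    # so the evaluation loop cannot produce an eval_loss metric.
    batch["labels"] = batch["input_ids"].clone()
    return batch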

Here is my implementation, which I tried with a toy dataset:

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    num_train_epochs=4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=2,
    warmup_steps=50,
    learning_rate=1e-4,
    weight_decay=0.01,
    logging_steps=2,
    save_strategy="steps",
    save_steps=2,
    save_total_limit=1,
    optim="paged_adamw_8bit",  # for 8-bit, keep this, else adamw_hf
    bf16=True,  # underlying precision for 8-bit
    output_dir=f"./{model_name}-vqav2",
    report_to="tensorboard",
    remove_unused_columns=False,
    eval_strategy="steps",  # enables evaluation
    eval_steps=2,  # frequency of evaluation
    load_best_model_at_end=True,  # load the best model at the end
    metric_for_best_model="eval_loss",  # monitor this metric
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=collate_fn,
    train_dataset=train_set,
    eval_dataset=val_set,
)

trainer.train()

When the eval step is reached and the Trainer goes to calculate the loss, I get the error below:
KeyError: "The metric_for_best_model training argument is set to 'eval_loss', which is not found in the evaluation metrics. The available evaluation metrics are: []. Please ensure that the compute_metrics function returns a dictionary that includes 'eval_loss' or consider changing the metric_for_best_model via the TrainingArguments."

The message suggests the evaluation loop produced no metrics at all, not even "eval_loss". Please help me fix this, or is it something to do with the model itself?
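To check whether it is the model, the sanity test I can think of (a sketch; the batch handling here is an assumption and may need adjusting) is to push one collated validation batch through the model and see whether a loss comes back at all:

import torch

# Sanity-check sketch: run one collated validation batch through the model
# and confirm that a loss is actually returned. If out.loss is None, the
# Trainer has nothing to report as "eval_loss".
batch = collate_fn([val_set[0], val_set[1]])
batch = {k: v.to(model.device) for k, v in batch.items()}
with torch.no_grad():
    out = model(**batch)
print(out.loss)

If the loss comes back as None here, the problem would be in the collator or the model's forward pass rather than in the TrainingArguments.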
