Evaluation Scores

#5
by bdaniela - opened

Hi, I ran inference with your model on the CNN/DailyMail dataset and obtained a BERTScore of 87.68, which is considerably higher than the reported score of 74.92. Could you please clarify whether the reported score was computed on the validation set or the test set? Additionally, did you evaluate the model on the entire dataset or only a subset?
Furthermore, did you preprocess the data in any way before running inference? If possible, could you share the code you used for inference?
Thanks.
