Discrepancy in Model Performance Using HuggingFace Pipeline Utility


Hi @nielsr

I am attempting to reproduce the performance metrics of the models using the checkpoints from the Hugging Face Hub and comparing against the original AST GitHub repository, but I am getting different results. The metrics I recorded are as follows:

| Checkpoint | mAP | AUC-ROC |
|---|---|---|
| MIT/ast-finetuned-audioset-16-16-0.442 | 0.4040 | 0.9671 |
| MIT/ast-finetuned-audioset-10-10-0.4593 | 0.4256 | 0.9737 |
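
For context, here is a minimal sketch of the evaluation loop I am running. It assumes a local AudioSet eval set exposed as `eval_clips`, an iterable of `(waveform, multi_hot_label)` pairs resampled to 16 kHz (the loader and resampling are my own code, not part of transformers). I call the model directly rather than through `pipeline()` so I keep the full 527-dimensional score vector:

```python
import numpy as np
import torch
from sklearn.metrics import average_precision_score, roc_auc_score
from transformers import ASTFeatureExtractor, ASTForAudioClassification

checkpoint = "MIT/ast-finetuned-audioset-10-10-0.4593"
extractor = ASTFeatureExtractor.from_pretrained(checkpoint)
model = ASTForAudioClassification.from_pretrained(checkpoint).eval()

all_scores, all_labels = [], []
for waveform, label in eval_clips:  # eval_clips: my assumed AudioSet eval loader
    inputs = extractor(waveform, sampling_rate=16000, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Sigmoid, not softmax: AudioSet tagging is a multi-label task
    all_scores.append(torch.sigmoid(logits)[0].numpy())
    all_labels.append(label)

scores, labels = np.stack(all_scores), np.stack(all_labels)
# Macro-average over the 527 classes, as in the AST paper
print("mAP:", average_precision_score(labels, scores, average="macro"))
print("AUC-ROC:", roc_auc_score(labels, scores, average="macro"))
```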

These results fall noticeably short of the expected performance (the mAP values encoded in the checkpoint names: 0.442 and 0.4593, respectively). Additionally, the number of parameters differs: 86.6M for MIT/ast-finetuned-audioset-10-10-0.4593 compared to 88.1M in the original implementation. For reference, I downloaded AudioSet from this repo.
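
For completeness, this is how I am counting parameters on the Hub checkpoint; it is the standard PyTorch idiom, so the 86.6M figure should not be an artifact of how I count:

```python
from transformers import ASTForAudioClassification

model = ASTForAudioClassification.from_pretrained("MIT/ast-finetuned-audioset-10-10-0.4593")
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")  # prints ~86.6M for this checkpoint
```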

I have also opened an issue in the author's GitHub repository. Do you have any insights or thoughts on this matter? Any assistance would be greatly appreciated.
