Note: These are only the weights of the classifier trained on the whisper-small embeddings.

Results of the classifier on Rob's human-annotated dataset (data/voicemail_human_eval.csv):

Results for chunk size 1 second:

  • Accuracy: 0.7480
  • Precision: 0.8681
  • Recall: 0.7396
  • F1 Score: 0.7987

Results for chunk size 2 seconds:

  • Accuracy: 0.7880
  • Precision: 0.9085
  • Recall: 0.7633
  • F1 Score: 0.8296

Results for chunk size 5 seconds:

  • Accuracy: 0.8480
  • Precision: 0.9456
  • Recall: 0.8225
  • F1 Score: 0.8797

Results for chunk size 10 seconds:

  • Accuracy: 0.8720
  • Precision: 0.9790
  • Recall: 0.8284
  • F1 Score: 0.8974

Results for full audio samples:

  • Accuracy: 0.8760
  • Precision: 0.9929
  • Recall: 0.8225
  • F1 Score: 0.8997
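
As a sanity check on the numbers above, the F1 score can be recomputed from the reported precision and recall using the standard definition F1 = 2PR / (P + R). The snippet below is a minimal sketch, not part of the original evaluation code; the helper name `f1_score` and the `reported` dictionary are illustrative, and the values are copied from the results listed above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the results above.
# The dictionary keys are illustrative labels, not identifiers from the model repo.
reported = {
    "1 s": (0.8681, 0.7396, 0.7987),
    "2 s": (0.9085, 0.7633, 0.8296),
    "5 s": (0.9456, 0.8225, 0.8797),
    "10 s": (0.9790, 0.8284, 0.8974),
    "full audio": (0.9929, 0.8225, 0.8997),
}

for label, (p, r, f1) in reported.items():
    recomputed = f1_score(p, r)
    # Small differences are expected because P and R are already rounded to 4 decimals.
    print(f"{label:>10}: recomputed F1 = {recomputed:.4f}, reported = {f1:.4f}")
```

The recomputed values agree with the reported ones up to rounding. Across the listed settings, precision rises steadily with chunk size, while recall stays between roughly 0.82 and 0.83 from 5 seconds onward.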
