trip-fontaine committed on
Commit
954a00f
1 Parent(s): ea9549d

readme update

Files changed (1)
  1. README.md +4 -3
README.md CHANGED
@@ -618,7 +618,7 @@ The model has been tested for both in-distribution (Common Voice 17 and Multilin
 
 ### Short-Form
 
-| Model Name | RTF | Common Voice 17 | Multilingual Librispeech | Voxpopuli | Fleurs |
+| Model Name | RTFx | Common Voice 17 | Multilingual Librispeech | Voxpopuli | Fleurs |
 | :----------------: | :-----: | :-------------: | :----------------------: | :-------: | :----: |
 | distil-large-v3-fr | 310.127 | 12.681 | 5.865 | 10.851 | 7.984 |
 | whisper-tiny | 280.576 | 56.757 | 37.512 | 32.505 | 46.173 |
@@ -627,12 +627,13 @@ The model has been tested for both in-distribution (Common Voice 17 and Multilin
 | whisper-medium | 170.9 | 15.432 | 9.602 | 11.92 | 9.155 |
 | whisper-large-v3 | 150.719 | 11.024 | 4.783 | 9.948 | 5.624 |
 
-*the above datasets correspond to test splits, RTF co
+*the above datasets correspond to test splits
 
+*$RTFx =\frac{1}{RTF}$, where RTF is the [Real Time Factor](https://openvoice-tech.net/wiki/Real-time-factor). To be interpreted as audio processed (in seconds) per second of processing.
 ### Long-Form
 
 
-| Model Name | RTF | [long-form test set](https://huggingface.co/datasets/eustlb/french-long-form-test) |
+| Model Name | RTFx | [long-form test set](https://huggingface.co/datasets/eustlb/french-long-form-test) |
 | :----------------: | :-----: | :--------------------------------------------------------------------------------: |
 | distil-large-v3-fr | 169.692 | 11.385 |
 | whisper-tiny | 125.367 | 28.277 |
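
The relationship between RTF and RTFx stated in the added footnote can be illustrated with a minimal sketch. The function names and the timing values below are hypothetical and are not taken from the repository's evaluation code; the sketch only assumes that RTF is processing time divided by audio duration, so that RTFx is audio duration divided by processing time.

```python
import math


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF: seconds of processing needed per second of audio (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def inverse_real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTFx = 1 / RTF: seconds of audio processed per second of processing (higher is faster)."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Hypothetical measurement: 60 s of audio transcribed in 0.5 s.
rtf = real_time_factor(processing_seconds=0.5, audio_seconds=60.0)
rtfx = inverse_real_time_factor(processing_seconds=0.5, audio_seconds=60.0)

print(f"RTF  = {rtf:.5f}")
print(f"RTFx = {rtfx:.1f}")
# The two quantities are reciprocals, up to floating-point rounding.
print(math.isclose(rtfx, 1.0 / rtf))
```

Read this way, a higher value in the RTFx column means faster inference, which is why the column header was renamed rather than the numbers changed.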