eustlb/distil-large-v3-fr

@@ -83,11 +83,11 @@ The result is a distilled model that performs within **2% WER of [Whisper large-
 | Model                  | Params (M) | Rel. Latency | Short-Form WER | Long-Form WER |
 | :--------------------- | :--------: | :----------: | :------------: | :-----------: |
-| whisper-tiny           |    37.8    |     4.7      |     43.73      |     28.158    |
-| whisper-base           |    72.6    |     3.7      |     30.57      |     18.665    |
-| whisper-small          |    242     |     2.3      |     16.20      |     12.557    |
-| whisper-medium         |    764     |     1.3      |     11.720     |     11.023    |
-| whisper-large-v3       |    1540    |     1.0      |      7.81      |      9.008    |
+| whisper-tiny           |    37.8    |     4.7      |     43.73      |    28.158     |
+| whisper-base           |    72.6    |     3.7      |     30.57      |    18.665     |
+| whisper-small          |    242     |     2.3      |     16.20      |    12.557     |
+| whisper-medium         |    764     |     1.3      |     11.720     |    11.023     |
+| whisper-large-v3       |    1540    |     1.0      |      7.81      |     9.008     |
 | **distil-large-v3-fr** |  **756**   |   **5.9**    |    **9.34**    |   **11.13**   |
 *latencies benchmarked to generate 128 tokens on A100 40GB with a batch size of 1. More details about inference performances in [inference speed](#inference-speed) section.
@@ -634,14 +634,14 @@ The model has been tested for both in-distribution (Common Voice 17 and Multilin
 ### Long-Form
-|     Model Name     |  RTFx   | [long-form test set](https://huggingface.co/datasets/eustlb/french-long-form-test) |
-| :----------------: | :-----: | :--------------------------------------------------------------------------------: |
-|    whisper-tiny    | 121.389 |                                       28.158                                       |
-|    whisper-base    | 109.366 |                                       18.665                                       |
-|   whisper-small    | 83.049  |                                       12.557                                       |
-|   whisper-medium   | 47.807  |                                       11.023                                       |
-|  whisper-large-v3  | 38.294  |                                       9.008                                        |
-| distil-large-v3-fr | 101.326 |                                       11.13                                        |
+|       Model Name       |    RTFx     | [long-form test set](https://huggingface.co/datasets/eustlb/french-long-form-test) |
+| :--------------------: | :---------: | :--------------------------------------------------------------------------------: |
+|      whisper-tiny      |   121.389   |                                       28.158                                       |
+|      whisper-base      |   109.366   |                                       18.665                                       |
+|     whisper-small      |   83.049    |                                       12.557                                       |
+|     whisper-medium     |   47.807    |                                       11.023                                       |
+|    whisper-large-v3    |   38.294    |                                       9.008                                        |
+| **distil-large-v3-fr** | **101.326** |                                     **11.13**                                      |