Spaces: hf-audio/open_asr_leaderboard

Define RTF

#12

by sanchit-gandhi - opened Feb 10, 2024

Files changed (1)

constants.py

@@ -60,7 +60,7 @@ Example: If it takes an ASR system 10 seconds to transcribe 10 seconds of speech
 If it takes 20 seconds to transcribe the same 10 seconds of speech, the RTF is 2.
 ```
-For the benchmark, we report RTF averaged over a 10 minute audio sample with 5 warm up batches followed 3 graded batches.
 ## How to reproduce our results

 If it takes 20 seconds to transcribe the same 10 seconds of speech, the RTF is 2.
 ```
+For the benchmark, we report RTF averaged over a 10-minute audio sample that is chunked into 30-second segments, mimicking the [chunked long-form transcription strategy](https://huggingface.co/blog/asr-chunking) used in Transformers. We measure RTF on an A100 80GB GPU (Driver Version: 535.54.03, CUDA Version: 12.2), performing 5 warm-up runs and 3 graded runs, over which the RTF is averaged to get the final result.
 ## How to reproduce our results
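
To make the protocol in the added paragraph concrete, here is a minimal sketch of how the measurement could be structured. It is not the leaderboard's actual benchmarking script: `chunk_bounds`, `measure_rtf`, and the `transcribe` callable are hypothetical names introduced for illustration, and the chunking uses simple non-overlapping windows rather than the strided, overlapping chunks described in the linked blog post.

```python
import time
from statistics import mean


def chunk_bounds(total_s, chunk_s=30.0):
    """Split [0, total_s) into consecutive (start, end) windows of chunk_s seconds.

    Simplification: windows do not overlap, unlike the strided chunking
    described in the asr-chunking blog post.
    """
    bounds = []
    start = 0.0
    while start < total_s:
        bounds.append((start, min(start + chunk_s, total_s)))
        start += chunk_s
    return bounds


def measure_rtf(transcribe, audio_duration_s, n_warmup=5, n_graded=3,
                clock=time.perf_counter):
    """Return the mean RTF (processing time / audio duration) over graded runs.

    `transcribe` is a hypothetical zero-argument callable that transcribes
    the whole sample. On a GPU it should block until results are ready
    (for example by synchronizing the device), otherwise the timer only
    measures kernel launch time.
    """
    # Warm-up runs are executed but not timed.
    for _ in range(n_warmup):
        transcribe()

    rtfs = []
    for _ in range(n_graded):
        t0 = clock()
        transcribe()
        rtfs.append((clock() - t0) / audio_duration_s)
    return mean(rtfs)


if __name__ == "__main__":
    # A 10-minute sample in 30-second windows gives 20 segments.
    print(len(chunk_bounds(600.0)))
```

Passing the clock as a parameter is only a convenience for testing the averaging logic without real audio; in a real run the default `time.perf_counter` would be used.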