alvanlii committed on
Commit
4ec307d
1 Parent(s): a22264f

Update README.md

Files changed (1)
  1. README.md +13 -9
README.md CHANGED
@@ -23,14 +23,17 @@ model-index:
  metrics:
  - name: Normalized CER
  type: cer
- value: 9.77
  ---
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

  # Distil-Whisper Small zh-HK - Alvin

- This model is a distilled version of [alvanlii/whisper-small-cantonese](https://huggingface.co/alvanlii/whisper-small-cantonese) on the Cantonese language. It achieves a 9.77 CER (without punctuations), 11.7 CER (with punctuations) on Common Voice 16.0. It has 6 decoder layers instead of 12.

  ## Training and evaluation data
  For training,
@@ -40,14 +43,15 @@ For training,

  For evaluation, Common Voice 16.0 yue Test set is used.

- ## Results
- - CER (lower is better): 0.117 (compared to 0.107 for `alvanlii/whisper-small-cantonese`)
- - GPU Inference with Fast Attention (sdpa): 0.039s/sample (down from 0.055s)
- - Note all GPU evaluations are done on RTX 3090 GPU
- - GPU Inference: 0.041s/sample (down from 0.308s)
- - CPU Inference: 1.7s/sample (down from 2.57s)
- - GPU VRAM: ~2 GB

  ## Using the Model
  ```
 
  metrics:
  - name: Normalized CER
  type: cer
+ value: 9.7
  ---
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

  # Distil-Whisper Small zh-HK - Alvin

+ - This model is a distilled version of [alvanlii/whisper-small-cantonese](https://huggingface.co/alvanlii/whisper-small-cantonese) on the Cantonese language.
+ - Achieves a 9.7 CER (without punctuations), 11.59 CER (with punctuations) on Common Voice 16.0.
+ - Has 3 decoder layers instead of regular 12 of the Whisper small model.
+ - Uses ~2GB of GPU VRAM
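
The card reports CER both with and without punctuation but does not say how the two figures were computed. As a rough illustration only, the sketch below computes a corpus-level character error rate and an optional punctuation-stripped variant. The normalization rule (dropping Unicode punctuation and whitespace) and the helper names `strip_punctuation`, `edit_distance`, and `cer` are assumptions made for this example, not the author's evaluation script.

```python
import unicodedata


def strip_punctuation(text: str) -> str:
    """Drop Unicode punctuation and whitespace (an assumed normalization rule)."""
    return "".join(
        ch for ch in text
        if not unicodedata.category(ch).startswith("P") and not ch.isspace()
    )


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance using a rolling DP row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def cer(refs, hyps, normalize=False):
    """Corpus-level CER: total edits divided by total reference characters."""
    if normalize:
        refs = [strip_punctuation(r) for r in refs]
        hyps = [strip_punctuation(h) for h in hyps]
    edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    chars = sum(len(r) for r in refs)
    return edits / chars


# Illustrative strings only, not taken from Common Voice.
refs = ["我哋去食飯。"]
hyps = ["我地去食飯"]
raw_cer = cer(refs, hyps)                          # punctuation counts as characters
normalized_cer = cer(refs, hyps, normalize=True)   # punctuation removed first
```

The function returns a fraction; multiplying by 100 gives the percentage form used in the bullet above. Because a missing or wrong punctuation mark counts as an edit, the raw figure is typically higher than the normalized one.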

  ## Training and evaluation data
  For training,
@@ -40,14 +43,15 @@ For training,

  For evaluation, Common Voice 16.0 yue Test set is used.

+ ## Comparisons to Whisper Small
+ ||`alvanlii/distil-whisper-small-cantonese`|`alvanlii/whisper-small-cantonese`|
+ |--|--|--|
+ |CER (lower is better)|0.116|0.107|
+ |GPU Inference time (sdpa) [s/sample]|0.039|0.055|
+ |GPU Inference (regular) [s/sample]|0.041|0.308|
+ |CPU Inference [s/sample]|1.7|2.57|

+ - inference time is calculated by taking the average inference time for the CV16 yue test set
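
The benchmark script behind these averages is not part of the commit. One plausible way to obtain a per-sample average is sketched below: time each call with a wall-clock timer and take the mean. The function name `average_inference_time`, the `transcribe` callable, and the warm-up step are hypothetical choices for illustration.

```python
import time
from statistics import mean


def average_inference_time(transcribe, samples, warmup=1):
    """Return mean wall-clock seconds per sample for the `transcribe` callable."""
    samples = list(samples)
    for sample in samples[:warmup]:
        transcribe(sample)  # warm-up calls are executed but not timed
    durations = []
    for sample in samples:
        start = time.perf_counter()
        transcribe(sample)
        durations.append(time.perf_counter() - start)
    return mean(durations)


# Speed-up ratios implied by the comparison table (baseline time / distilled time).
table = {
    "GPU (sdpa)": (0.039, 0.055),
    "GPU (regular)": (0.041, 0.308),
    "CPU": (1.7, 2.57),
}
speedups = {name: base / distilled for name, (distilled, base) in table.items()}
```

On a GPU, kernels may run asynchronously, so a real measurement would likely need to synchronize the device (for example with `torch.cuda.synchronize()`) before reading the timer; that step is omitted here to keep the sketch dependency-free.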

  ## Using the Model
  ```