agnesluhtaru committed on
Commit
276864c
1 Parent(s): e9c4c9b

Create README.md

Files changed (1)
  1. README.md +47 -0
README.md ADDED
@@ -0,0 +1,47 @@
+ ---
+ license: apache-2.0
+ tags:
+ - generated_from_trainer
+ - whisper-event
+ metrics:
+ - wer
+ ---
+
+ # whisper-medium-et with ERR2020 data
+
+ This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on the following datasets: Common Voice 11, VoxPopuli, FLEURS and [ERR2020](http://bark.phon.ioc.ee/lw/korpused/ERR2020.html).
+ Training was stopped a little early because the Whisper fine-tuning event was ending :)
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ Estonian data from the Common Voice 11, VoxPopuli, FLEURS and ERR2020 corpora was used for both the training and validation sets. The model was tested on the Common Voice 11 test set.
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 1e-05
+ - train_batch_size: 32
+ - eval_batch_size: 16
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 500
+ - training_steps: 6000
+ - mixed_precision_training: Native AMP
+
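The learning-rate settings listed above (`learning_rate: 1e-05`, `lr_scheduler_type: linear`, `lr_scheduler_warmup_steps: 500`, `training_steps: 6000`) can be read as a linear warmup followed by a linear decay. The sketch below is only an illustration of that schedule in plain Python, assuming the usual behaviour of a linear scheduler with warmup (ramp from 0 to the peak rate, then decay to 0 at the final step). The function name `linear_schedule_lr` is hypothetical and is not taken from the training script.

```python
def linear_schedule_lr(step, base_lr=1e-5, warmup_steps=500, total_steps=6000):
    """Learning rate at optimizer step `step` for linear warmup then linear decay.

    Defaults mirror the hyperparameters listed in this model card; the shape of
    the schedule itself is an assumption about how a "linear" scheduler behaves.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at a few illustrative steps.
    for step in (0, 250, 500, 3250, 6000):
        print(step, linear_schedule_lr(step))
```

Under these assumptions the rate peaks at step 500 and is back at half the peak at step 3250. The card says training was stopped a little early, so the rate actually reached at the last step may differ from what this sketch gives for step 6000.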
+ ### Framework versions
+
+ - Transformers 4.26.0.dev0
+ - PyTorch 1.12.1+rocm5.1.1
+ - Datasets 2.7.1.dev0
+ - Tokenizers 0.13.2