Update README.md
README.md (changed)
### Training hyperparameters

The following config and hyperparameters were used during training:

```python
from transformers import Wav2Vec2ForCTC

model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/wav2vec2-xls-r-1b",
    attention_dropout=0.05,
    hidden_dropout=0.05,
    feat_proj_dropout=0.05,
    mask_time_prob=0.55,
    mask_feature_prob=0.10,
    layerdrop=0.05,
    ctc_zero_infinity=True,
    ctc_loss_reduction="mean",
    pad_token_id=processor.tokenizer.pad_token_id,
    vocab_size=len(processor.tokenizer),
)
```
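The config above references a `processor` that is defined elsewhere in the training script and is not part of this card. A minimal sketch of how such a processor is typically assembled for XLS-R CTC fine-tuning (the vocab file name, special tokens, and feature-extractor settings here are assumptions, not values taken from this model card):

```python
from transformers import (
    Wav2Vec2CTCTokenizer,
    Wav2Vec2FeatureExtractor,
    Wav2Vec2Processor,
)

# Assumption: "vocab.json" maps the dataset's characters to ids;
# it is built from the training data and is not part of this card.
tokenizer = Wav2Vec2CTCTokenizer(
    "vocab.json",
    unk_token="[UNK]",
    pad_token="[PAD]",
    word_delimiter_token="|",
)

# Standard settings for 16 kHz wav2vec 2.0 checkpoints.
feature_extractor = Wav2Vec2FeatureExtractor(
    feature_size=1,
    sampling_rate=16_000,
    padding_value=0.0,
    do_normalize=True,
    return_attention_mask=True,
)

processor = Wav2Vec2Processor(
    feature_extractor=feature_extractor,
    tokenizer=tokenizer,
)
```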
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir=repo_name,
    group_by_length=True,
    per_device_train_batch_size=32,
    gradient_accumulation_steps=2,
    evaluation_strategy="steps",
    num_train_epochs=50,
    gradient_checkpointing=True,
    fp16=True,
    save_steps=400,
    eval_steps=400,
    logging_steps=400,
    learning_rate=5.5e-05,
    warmup_steps=500,
    save_total_limit=2,
    push_to_hub=True,
    report_to="tensorboard",
)
```
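With `per_device_train_batch_size=32` and `gradient_accumulation_steps=2`, the effective batch size per optimizer step works out as follows (a quick sanity check, assuming a single GPU; multiply by the device count for multi-GPU training):

```python
# Values from the TrainingArguments above.
per_device_train_batch_size = 32
gradient_accumulation_steps = 2

# Samples contributing to each optimizer step on one device.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 64
```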

### Training results

### Framework versions

- Transformers 4.17.0.dev0
- Pytorch 1.10.2+cu102
- Datasets 1.18.3
- Tokenizers 0.11.0