jaekookang committed on
Commit
1f4da1a
1 Parent(s): 9207f1b

Update model

Files changed (20)
  1. README.md +328 -3
  2. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/RESULTS.md +33 -0
  3. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/config.yaml +226 -0
  4. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/acc.png +0 -0
  5. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/backward_time.png +0 -0
  6. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/cer.png +0 -0
  7. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/cer_ctc.png +0 -0
  8. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/forward_time.png +0 -0
  9. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/gpu_max_cached_mem_GB.png +0 -0
  10. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/iter_time.png +0 -0
  11. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/loss.png +0 -0
  12. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/loss_att.png +0 -0
  13. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/loss_ctc.png +0 -0
  14. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/optim0_lr0.png +0 -0
  15. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/optim_step_time.png +0 -0
  16. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/train_time.png +0 -0
  17. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/wer.png +0 -0
  18. exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/valid.acc.ave_10best.pth +3 -0
  19. exp/asr_stats_raw_en_char_sp/train/feats_stats.npz +0 -0
  20. meta.yaml +8 -0
README.md CHANGED
@@ -1,3 +1,328 @@
- ---
- license: mit
- ---
+ ---
+ tags:
+ - espnet
+ - audio
+ - automatic-speech-recognition
+ language: noinfo
+ datasets:
+ - librispeech_100
+ license: cc-by-4.0
+ ---
+
+ ## ESPnet2 ASR model
+
+ ### `jkang/espnet2_librispeech_100_conformer_char`
+
+ This model was trained by jaekookang using librispeech_100 recipe in [espnet](https://github.com/espnet/espnet/).
+
+ ### Demo: How to use in ESPnet2
+
+ ```bash
+ cd espnet
+ git checkout 82a0a0fa97b8a4a578f0a2c031ec49b3afec1504
+ pip install -e .
+ cd egs2/librispeech_100/asr1
+ ./run.sh --skip_data_prep false --skip_train true --download_model jkang/espnet2_librispeech_100_conformer_char
+ ```
+
+ <!-- Generated by scripts/utils/show_asr_result.sh -->
+ # RESULTS
+ ## Environments
+ - date: `Thu Feb 24 17:47:04 KST 2022`
+ - python version: `3.9.7 (default, Sep 16 2021, 13:09:58) [GCC 7.5.0]`
+ - espnet version: `espnet 0.10.7a1`
+ - pytorch version: `pytorch 1.10.1`
+ - Git hash: `82a0a0fa97b8a4a578f0a2c031ec49b3afec1504`
+ - Commit date: `Wed Feb 23 08:06:47 2022 +0900`
+
+ ## asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_asr_model_valid.acc.ave/dev_clean|2703|54402|93.9|5.6|0.5|0.7|6.8|57.1|
+ |decode_asr_asr_model_valid.acc.ave/dev_other|2864|50948|82.5|15.7|1.8|1.9|19.3|82.6|
+ |decode_asr_asr_model_valid.acc.ave/test_clean|2620|52576|93.8|5.7|0.6|0.7|6.9|58.4|
+ |decode_asr_asr_model_valid.acc.ave/test_other|2939|52343|82.2|15.9|2.0|1.7|19.5|83.6|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_asr_model_valid.acc.ave/dev_clean|2703|288456|98.3|1.0|0.7|0.7|2.4|57.1|
+ |decode_asr_asr_model_valid.acc.ave/dev_other|2864|265951|93.3|4.1|2.6|1.9|8.7|82.6|
+ |decode_asr_asr_model_valid.acc.ave/test_clean|2620|281530|98.3|1.0|0.7|0.6|2.3|58.4|
+ |decode_asr_asr_model_valid.acc.ave/test_other|2939|272758|93.2|4.1|2.7|1.8|8.6|83.6|
+
+ ### TER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+
+ ## ASR config
+
+ <details><summary>expand</summary>
+
+ ```
+ config: conf/train_asr_char.yaml
+ print_config: false
+ log_level: INFO
+ dry_run: false
+ iterator_type: sequence
+ output_dir: exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char
+ ngpu: 1
+ seed: 2022
+ num_workers: 4
+ num_att_plot: 0
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: null
+ dist_rank: null
+ local_rank: 0
+ dist_master_addr: null
+ dist_master_port: null
+ dist_launcher: null
+ multiprocessing_distributed: false
+ unused_parameters: false
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: false
+ cudnn_deterministic: false
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 70
+ patience: null
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ -   - valid
+     - acc
+     - max
+ keep_nbest_models: 10
+ nbest_averaging_interval: 0
+ grad_clip: 5.0
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 4
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: true
+ log_interval: 400
+ use_matplotlib: true
+ use_tensorboard: true
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ pretrain_path: null
+ init_param: []
+ ignore_init_mismatch: false
+ freeze_param: []
+ num_iters_per_epoch: null
+ batch_size: 20
+ valid_batch_size: null
+ batch_bins: 1600000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/asr_stats_raw_en_char_sp/train/speech_shape
+ - exp/asr_stats_raw_en_char_sp/train/text_shape.char
+ valid_shape_file:
+ - exp/asr_stats_raw_en_char_sp/valid/speech_shape
+ - exp/asr_stats_raw_en_char_sp/valid/text_shape.char
+ batch_type: numel
+ valid_batch_type: null
+ fold_length:
+ - 80000
+ - 150
+ sort_in_batch: descending
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ train_data_path_and_name_and_type:
+ -   - dump/raw/train_clean_100_sp/wav.scp
+     - speech
+     - kaldi_ark
+ -   - dump/raw/train_clean_100_sp/text
+     - text
+     - text
+ valid_data_path_and_name_and_type:
+ -   - dump/raw/dev/wav.scp
+     - speech
+     - kaldi_ark
+ -   - dump/raw/dev/text
+     - text
+     - text
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ valid_max_cache_size: null
+ optim: adam
+ optim_conf:
+     lr: 0.002
+     weight_decay: 1.0e-06
+ scheduler: warmuplr
+ scheduler_conf:
+     warmup_steps: 15000
+ token_list:
+ - <blank>
+ - <unk>
+ - <space>
+ - E
+ - T
+ - A
+ - O
+ - N
+ - I
+ - H
+ - S
+ - R
+ - D
+ - L
+ - U
+ - M
+ - C
+ - W
+ - F
+ - G
+ - Y
+ - P
+ - B
+ - V
+ - K
+ - ''''
+ - X
+ - J
+ - Q
+ - Z
+ - <sos/eos>
+ init: null
+ input_size: null
+ ctc_conf:
+     dropout_rate: 0.0
+     ctc_type: builtin
+     reduce: true
+     ignore_nan_grad: true
+ joint_net_conf: null
+ model_conf:
+     ctc_weight: 0.3
+     lsm_weight: 0.1
+     length_normalized_loss: false
+ use_preprocessor: true
+ token_type: char
+ bpemodel: null
+ non_linguistic_symbols: null
+ cleaner: null
+ g2p: null
+ speech_volume_normalize: null
+ rir_scp: null
+ rir_apply_prob: 1.0
+ noise_scp: null
+ noise_apply_prob: 1.0
+ noise_db_range: '13_15'
+ frontend: default
+ frontend_conf:
+     n_fft: 512
+     win_length: 400
+     hop_length: 160
+     fs: 16k
+ specaug: specaug
+ specaug_conf:
+     apply_time_warp: true
+     time_warp_window: 5
+     time_warp_mode: bicubic
+     apply_freq_mask: true
+     freq_mask_width_range:
+     - 0
+     - 27
+     num_freq_mask: 2
+     apply_time_mask: true
+     time_mask_width_ratio_range:
+     - 0.0
+     - 0.05
+     num_time_mask: 5
+ normalize: global_mvn
+ normalize_conf:
+     stats_file: exp/asr_stats_raw_en_char_sp/train/feats_stats.npz
+ preencoder: null
+ preencoder_conf: {}
+ encoder: conformer
+ encoder_conf:
+     output_size: 256
+     attention_heads: 4
+     linear_units: 1024
+     num_blocks: 12
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     attention_dropout_rate: 0.1
+     input_layer: conv2d
+     normalize_before: true
+     macaron_style: true
+     rel_pos_type: latest
+     pos_enc_layer_type: rel_pos
+     selfattention_layer_type: rel_selfattn
+     activation_type: swish
+     use_cnn_module: true
+     cnn_module_kernel: 31
+ postencoder: null
+ postencoder_conf: {}
+ decoder: transformer
+ decoder_conf:
+     attention_heads: 4
+     linear_units: 2048
+     num_blocks: 6
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     self_attention_dropout_rate: 0.1
+     src_attention_dropout_rate: 0.1
+ required:
+ - output_dir
+ - token_list
+ version: 0.10.7a1
+ distributed: false
+ ```
+
+ </details>
+
+
+
+ ### Citing ESPnet
+
+ ```BibTex
+ @inproceedings{watanabe2018espnet,
+   author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+   title={{ESPnet}: End-to-End Speech Processing Toolkit},
+   year={2018},
+   booktitle={Proceedings of Interspeech},
+   pages={2207--2211},
+   doi={10.21437/Interspeech.2018-1456},
+   url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
+ }
+
+
+
+
+ ```
+
+ or arXiv:
+
+ ```bibtex
+ @misc{watanabe2018espnet,
+   title={ESPnet: End-to-End Speech Processing Toolkit},
+   author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+   year={2018},
+   eprint={1804.00015},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+ ```
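Note on the schedule in the README's ASR config (`scheduler: warmuplr`, `lr: 0.002`, `warmup_steps: 15000`): the sketch below assumes the Noam-style rule that ESPnet documents for `WarmupLR`, a linear ramp up to the base rate followed by inverse-square-root decay. The function name `warmup_lr` is hypothetical and not part of ESPnet; the defaults are copied from the config.

```python
def warmup_lr(step: int, base_lr: float = 0.002, warmup_steps: int = 15000) -> float:
    """Learning rate at a given optimizer update (steps start at 1).

    Assumed rule (hypothetical helper, not ESPnet code):
        base_lr * warmup_steps**0.5 * min(step**-0.5, step * warmup_steps**-1.5)
    """
    if step < 1:
        raise ValueError("step counts optimizer updates and starts at 1")
    return base_lr * warmup_steps ** 0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


if __name__ == "__main__":
    # Ramp, peak, and decay points for the values in the config.
    for step in (1, 7500, 15000, 60000):
        print(step, f"{warmup_lr(step):.6f}")
```

Under that assumption the rate peaks at `lr` exactly at `warmup_steps` and halves again by four times that step count.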
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/RESULTS.md ADDED
@@ -0,0 +1,33 @@
+ <!-- Generated by scripts/utils/show_asr_result.sh -->
+ # RESULTS
+ ## Environments
+ - date: `Thu Feb 24 17:47:04 KST 2022`
+ - python version: `3.9.7 (default, Sep 16 2021, 13:09:58) [GCC 7.5.0]`
+ - espnet version: `espnet 0.10.7a1`
+ - pytorch version: `pytorch 1.10.1`
+ - Git hash: `82a0a0fa97b8a4a578f0a2c031ec49b3afec1504`
+ - Commit date: `Wed Feb 23 08:06:47 2022 +0900`
+
+ ## asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_asr_model_valid.acc.ave/dev_clean|2703|54402|93.9|5.6|0.5|0.7|6.8|57.1|
+ |decode_asr_asr_model_valid.acc.ave/dev_other|2864|50948|82.5|15.7|1.8|1.9|19.3|82.6|
+ |decode_asr_asr_model_valid.acc.ave/test_clean|2620|52576|93.8|5.7|0.6|0.7|6.9|58.4|
+ |decode_asr_asr_model_valid.acc.ave/test_other|2939|52343|82.2|15.9|2.0|1.7|19.5|83.6|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_asr_model_valid.acc.ave/dev_clean|2703|288456|98.3|1.0|0.7|0.7|2.4|57.1|
+ |decode_asr_asr_model_valid.acc.ave/dev_other|2864|265951|93.3|4.1|2.6|1.9|8.7|82.6|
+ |decode_asr_asr_model_valid.acc.ave/test_clean|2620|281530|98.3|1.0|0.7|0.6|2.3|58.4|
+ |decode_asr_asr_model_valid.acc.ave/test_other|2939|272758|93.2|4.1|2.7|1.8|8.6|83.6|
+
+ ### TER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
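Note on reading the WER table: the columns appear to be percentages of reference words, so `Err` should equal `Sub + Del + Ins` and `Corr + Sub + Del` should come to 100, up to one-decimal rounding. The sketch below checks this for the four WER rows copied from the table; the names `WER_ROWS` and `row_is_consistent` are hypothetical, and the tolerance is an assumption that allows for rounding of each column.

```python
# WER rows copied from RESULTS.md (values are percentages).
WER_ROWS = {
    "dev_clean": {"corr": 93.9, "sub": 5.6, "dele": 0.5, "ins": 0.7, "err": 6.8},
    "dev_other": {"corr": 82.5, "sub": 15.7, "dele": 1.8, "ins": 1.9, "err": 19.3},
    "test_clean": {"corr": 93.8, "sub": 5.7, "dele": 0.6, "ins": 0.7, "err": 6.9},
    "test_other": {"corr": 82.2, "sub": 15.9, "dele": 2.0, "ins": 1.7, "err": 19.5},
}


def row_is_consistent(row: dict, tol: float = 0.2) -> bool:
    """Check Err ~= Sub + Del + Ins and Corr + Sub + Del ~= 100 within rounding."""
    err_ok = abs(row["sub"] + row["dele"] + row["ins"] - row["err"]) <= tol
    ref_ok = abs(row["corr"] + row["sub"] + row["dele"] - 100.0) <= tol
    return err_ok and ref_ok


if __name__ == "__main__":
    for name, row in WER_ROWS.items():
        print(name, row_is_consistent(row))
```

Insertions are counted against the reference length, which is why `Err` can exceed `100 - Corr`.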
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/config.yaml ADDED
@@ -0,0 +1,226 @@
+ config: conf/train_asr_char.yaml
+ print_config: false
+ log_level: INFO
+ dry_run: false
+ iterator_type: sequence
+ output_dir: exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char
+ ngpu: 1
+ seed: 2022
+ num_workers: 4
+ num_att_plot: 0
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: null
+ dist_rank: null
+ local_rank: 0
+ dist_master_addr: null
+ dist_master_port: null
+ dist_launcher: null
+ multiprocessing_distributed: false
+ unused_parameters: false
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: false
+ cudnn_deterministic: false
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 70
+ patience: null
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ -   - valid
+     - acc
+     - max
+ keep_nbest_models: 10
+ nbest_averaging_interval: 0
+ grad_clip: 5.0
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 4
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: true
+ log_interval: 400
+ use_matplotlib: true
+ use_tensorboard: true
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ pretrain_path: null
+ init_param: []
+ ignore_init_mismatch: false
+ freeze_param: []
+ num_iters_per_epoch: null
+ batch_size: 20
+ valid_batch_size: null
+ batch_bins: 1600000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/asr_stats_raw_en_char_sp/train/speech_shape
+ - exp/asr_stats_raw_en_char_sp/train/text_shape.char
+ valid_shape_file:
+ - exp/asr_stats_raw_en_char_sp/valid/speech_shape
+ - exp/asr_stats_raw_en_char_sp/valid/text_shape.char
+ batch_type: numel
+ valid_batch_type: null
+ fold_length:
+ - 80000
+ - 150
+ sort_in_batch: descending
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ train_data_path_and_name_and_type:
+ -   - dump/raw/train_clean_100_sp/wav.scp
+     - speech
+     - kaldi_ark
+ -   - dump/raw/train_clean_100_sp/text
+     - text
+     - text
+ valid_data_path_and_name_and_type:
+ -   - dump/raw/dev/wav.scp
+     - speech
+     - kaldi_ark
+ -   - dump/raw/dev/text
+     - text
+     - text
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ valid_max_cache_size: null
+ optim: adam
+ optim_conf:
+     lr: 0.002
+     weight_decay: 1.0e-06
+ scheduler: warmuplr
+ scheduler_conf:
+     warmup_steps: 15000
+ token_list:
+ - <blank>
+ - <unk>
+ - <space>
+ - E
+ - T
+ - A
+ - O
+ - N
+ - I
+ - H
+ - S
+ - R
+ - D
+ - L
+ - U
+ - M
+ - C
+ - W
+ - F
+ - G
+ - Y
+ - P
+ - B
+ - V
+ - K
+ - ''''
+ - X
+ - J
+ - Q
+ - Z
+ - <sos/eos>
+ init: null
+ input_size: null
+ ctc_conf:
+     dropout_rate: 0.0
+     ctc_type: builtin
+     reduce: true
+     ignore_nan_grad: true
+ joint_net_conf: null
+ model_conf:
+     ctc_weight: 0.3
+     lsm_weight: 0.1
+     length_normalized_loss: false
+ use_preprocessor: true
+ token_type: char
+ bpemodel: null
+ non_linguistic_symbols: null
+ cleaner: null
+ g2p: null
+ speech_volume_normalize: null
+ rir_scp: null
+ rir_apply_prob: 1.0
+ noise_scp: null
+ noise_apply_prob: 1.0
+ noise_db_range: '13_15'
+ frontend: default
+ frontend_conf:
+     n_fft: 512
+     win_length: 400
+     hop_length: 160
+     fs: 16k
+ specaug: specaug
+ specaug_conf:
+     apply_time_warp: true
+     time_warp_window: 5
+     time_warp_mode: bicubic
+     apply_freq_mask: true
+     freq_mask_width_range:
+     - 0
+     - 27
+     num_freq_mask: 2
+     apply_time_mask: true
+     time_mask_width_ratio_range:
+     - 0.0
+     - 0.05
+     num_time_mask: 5
+ normalize: global_mvn
+ normalize_conf:
+     stats_file: exp/asr_stats_raw_en_char_sp/train/feats_stats.npz
+ preencoder: null
+ preencoder_conf: {}
+ encoder: conformer
+ encoder_conf:
+     output_size: 256
+     attention_heads: 4
+     linear_units: 1024
+     num_blocks: 12
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     attention_dropout_rate: 0.1
+     input_layer: conv2d
+     normalize_before: true
+     macaron_style: true
+     rel_pos_type: latest
+     pos_enc_layer_type: rel_pos
+     selfattention_layer_type: rel_selfattn
+     activation_type: swish
+     use_cnn_module: true
+     cnn_module_kernel: 31
+ postencoder: null
+ postencoder_conf: {}
+ decoder: transformer
+ decoder_conf:
+     attention_heads: 4
+     linear_units: 2048
+     num_blocks: 6
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     self_attention_dropout_rate: 0.1
+     src_attention_dropout_rate: 0.1
+ required:
+ - output_dir
+ - token_list
+ version: 0.10.7a1
+ distributed: false
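Note on the time resolution implied by `frontend_conf` (`fs: 16k`, `win_length: 400`, `hop_length: 160`) and `input_layer: conv2d`: the sketch below works out the window and hop in milliseconds and a rough encoder frame count. Two things are assumptions rather than facts from this commit: that the STFT is center-padded (so `1 + samples // hop` frames), and that the `conv2d` input layer is two stride-2 convolutions with kernel 3, giving roughly 4x subsampling. All function names are hypothetical.

```python
FS = 16000        # fs: 16k
WIN_LENGTH = 400  # samples per analysis window
HOP_LENGTH = 160  # samples between successive windows


def samples_to_ms(num_samples: int, fs: int = FS) -> float:
    """Duration of num_samples at sampling rate fs, in milliseconds."""
    return 1000 * num_samples / fs


def stft_frames(num_samples: int, hop: int = HOP_LENGTH) -> int:
    """Frame count, assuming a center-padded STFT."""
    return 1 + num_samples // hop


def conv2d_subsampled(frames: int) -> int:
    """Length after two kernel-3, stride-2 convolutions without padding (assumed)."""
    return ((frames - 1) // 2 - 1) // 2


if __name__ == "__main__":
    print("window ms:", samples_to_ms(WIN_LENGTH))
    print("hop ms:", samples_to_ms(HOP_LENGTH))
    one_second = stft_frames(FS)
    print("frames per second:", one_second, "->", conv2d_subsampled(one_second))
```

Under these assumptions each encoder output covers about 40 ms of audio.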
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/acc.png ADDED
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/backward_time.png ADDED
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/cer.png ADDED
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/cer_ctc.png ADDED
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/forward_time.png ADDED
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/gpu_max_cached_mem_GB.png ADDED
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/iter_time.png ADDED
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/loss.png ADDED
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/loss_att.png ADDED
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/loss_ctc.png ADDED
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/optim0_lr0.png ADDED
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/optim_step_time.png ADDED
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/train_time.png ADDED
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/images/wer.png ADDED
exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/valid.acc.ave_10best.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:07e65103c5ea68052a6045f7163c4777891c94d9b1a0185987cad82e33c21f85
+ size 121958821
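Note on the checkpoint entry: what the diff records is a Git LFS pointer, not the weights themselves; the pointer carries a SHA-256 digest and a byte size for the real file. A minimal sketch of checking a downloaded `valid.acc.ave_10best.pth` against it follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the pointer text is copied from the diff above.

```python
import hashlib

POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:07e65103c5ea68052a6045f7163c4777891c94d9b1a0185987cad82e33c21f85
size 121958821
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split the 'key value' lines of a Git LFS pointer into typed fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Hash the file in chunks and compare digest and size with the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(block)
            size += len(block)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    print(parse_lfs_pointer(POINTER))
```

A size mismatch usually means the pointer file itself was downloaded instead of the roughly 122 MB checkpoint.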
exp/asr_stats_raw_en_char_sp/train/feats_stats.npz ADDED
Binary file (1.4 kB)
meta.yaml ADDED
@@ -0,0 +1,8 @@
+ espnet: 0.10.7a1
+ files:
+   asr_model_file: exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/valid.acc.ave_10best.pth
+ python: "3.9.7 (default, Sep 16 2021, 13:09:58) \n[GCC 7.5.0]"
+ timestamp: 1645692425.526674
+ torch: 1.10.1
+ yaml_files:
+   asr_train_config: exp/asr_conformer_lr2e-3_warmup15k_amp_nondeterministic_char/config.yaml
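Note on `timestamp` in `meta.yaml`: the value looks like seconds since the Unix epoch. Converting it to Korea Standard Time (UTC+9, assumed here because RESULTS.md reports its date in KST) gives a packing time about one second after the date in RESULTS.md. The helper name `packed_at` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

KST = timezone(timedelta(hours=9), "KST")  # assumed zone, taken from RESULTS.md
META_TIMESTAMP = 1645692425.526674         # copied from meta.yaml


def packed_at(timestamp: float, tz: timezone = KST) -> datetime:
    """Interpret a Unix-epoch timestamp as an aware datetime in tz."""
    return datetime.fromtimestamp(timestamp, tz=tz)


if __name__ == "__main__":
    print(packed_at(META_TIMESTAMP).strftime("%Y-%m-%d %H:%M:%S %Z"))
```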