ESPnet
multilingual
audio
speaker-recognition
jungjee committed on
Commit
cfca21b
1 Parent(s): e3c03ac

Update model

Files changed (23)
  1. README.md +292 -0
  2. meta.yaml +8 -0
  3. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/9epoch.pth +3 -0
  4. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/RESULTS.md +17 -0
  5. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/config.yaml +201 -0
  6. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/backward_time.png +0 -0
  7. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/clip.png +0 -0
  8. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/eer.png +0 -0
  9. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/forward_time.png +0 -0
  10. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/gpu_max_cached_mem_GB.png +0 -0
  11. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/grad_norm.png +0 -0
  12. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/iter_time.png +0 -0
  13. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/loss.png +0 -0
  14. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/loss_scale.png +0 -0
  15. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/mindcf.png +0 -0
  16. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/n_trials.png +0 -0
  17. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/nontrg_mean.png +0 -0
  18. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/nontrg_std.png +0 -0
  19. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/optim0_lr0.png +0 -0
  20. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/optim_step_time.png +0 -0
  21. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/train_time.png +0 -0
  22. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/trg_mean.png +0 -0
  23. save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/trg_std.png +0 -0
README.md ADDED
@@ -0,0 +1,292 @@
+ ---
+ tags:
+ - espnet
+ - audio
+ - speaker-recognition
+ language: multilingual
+ datasets:
+ - voxceleb
+ license: cc-by-4.0
+ ---
+
+ ## ESPnet2 SPK model
+
+ ### `espnet/voxcelebs12_ska_mel`
+
+ This model was trained by Jungjee using voxceleb recipe in [espnet](https://github.com/espnet/espnet/).
+
+ ### Demo: How to use in ESPnet2
+
+ Follow the [ESPnet installation instructions](https://espnet.github.io/espnet/installation.html)
+ if you haven't done that already.
+
+ ```bash
+ cd espnet
+ git checkout ea74d1c7482bf5b3b4f90410d1ca8521fd9a566b
+ pip install -e .
+ cd egs2/voxceleb/spk1
+ ./run.sh --skip_data_prep false --skip_train true --download_model espnet/voxcelebs12_ska_mel
+ ```
+
+ <!-- Generated by scripts/utils/show_spk_result.py -->
+ # RESULTS
+ ## Environments
+ date: 2023-12-05 14:44:39.571659
+
+ - python version: 3.9.16 (main, Mar 8 2023, 14:00:05) [GCC 11.2.0]
+ - espnet version: 202310
+ - pytorch version: 2.0.1
+
+ | | Mean | Std |
+ |---|---|---|
+ | Target | -0.7834 | 0.1328 |
+ | Non-target | 0.0857 | 0.0857 |
+
+ | Model name | EER(%) | minDCF |
+ |---|---|---|
+ | conf/tuning/train_ska_Vox12_emb192_torchmelspec_subcentertopk | 0.755 | 0.04722 |
+
+ ## SPK config
+
+ <details><summary>expand</summary>
+
+ ```
+ config: conf/tuning/train_ska_Vox12_emb192_torchmelspec_subcentertopk.yaml
+ print_config: false
+ log_level: INFO
+ drop_last_iter: true
+ dry_run: false
+ iterator_type: category
+ valid_iterator_type: sequence
+ output_dir: exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp
+ ngpu: 1
+ seed: 0
+ num_workers: 6
+ num_att_plot: 0
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: 4
+ dist_rank: 0
+ local_rank: 0
+ dist_master_addr: localhost
+ dist_master_port: 34991
+ dist_launcher: null
+ multiprocessing_distributed: true
+ unused_parameters: false
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: true
+ cudnn_deterministic: false
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 40
+ patience: null
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ -   - valid
+     - eer
+     - min
+ keep_nbest_models: 3
+ nbest_averaging_interval: 0
+ grad_clip: 9999
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 1
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: true
+ log_interval: 100
+ use_matplotlib: true
+ use_tensorboard: true
+ create_graph_in_tensorboard: false
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ use_lora: false
+ save_lora_only: true
+ lora_conf: {}
+ pretrain_path: null
+ init_param: []
+ ignore_init_mismatch: false
+ freeze_param: []
+ num_iters_per_epoch: null
+ batch_size: 512
+ valid_batch_size: 40
+ batch_bins: 1000000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/spk_stats_16k_sp/train/speech_shape
+ valid_shape_file:
+ - exp/spk_stats_16k_sp/valid/speech_shape
+ batch_type: folded
+ valid_batch_type: null
+ fold_length:
+ - 120000
+ sort_in_batch: descending
+ shuffle_within_batch: false
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ chunk_excluded_key_prefixes: []
+ chunk_default_fs: null
+ train_data_path_and_name_and_type:
+ -   - dump/raw/voxceleb12_devs_sp/wav.scp
+     - speech
+     - sound
+ -   - dump/raw/voxceleb12_devs_sp/utt2spk
+     - spk_labels
+     - text
+ valid_data_path_and_name_and_type:
+ -   - dump/raw/voxceleb1_test/trial.scp
+     - speech
+     - sound
+ -   - dump/raw/voxceleb1_test/trial2.scp
+     - speech2
+     - sound
+ -   - dump/raw/voxceleb1_test/trial_label
+     - spk_labels
+     - text
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ allow_multi_rates: false
+ valid_max_cache_size: null
+ exclude_weight_decay: false
+ exclude_weight_decay_conf: {}
+ optim: adam
+ optim_conf:
+     lr: 0.001
+     weight_decay: 5.0e-05
+     amsgrad: false
+ scheduler: cosineannealingwarmuprestarts
+ scheduler_conf:
+     first_cycle_steps: 71280
+     cycle_mult: 1.0
+     max_lr: 0.001
+     min_lr: 5.0e-06
+     warmup_steps: 1000
+     gamma: 0.75
+ init: null
+ use_preprocessor: true
+ input_size: null
+ target_duration: 3.0
+ spk2utt: dump/raw/voxceleb12_devs_sp/spk2utt
+ spk_num: 21615
+ sample_rate: 16000
+ num_eval: 10
+ rir_scp: ''
+ model_conf:
+     extract_feats_in_collect_stats: false
+ frontend: melspec_torch
+ frontend_conf:
+     preemp: true
+     n_fft: 512
+     log: true
+     win_length: 400
+     hop_length: 160
+     n_mels: 80
+     normalize: mn
+ specaug: null
+ specaug_conf: {}
+ normalize: null
+ normalize_conf: {}
+ encoder: ska_tdnn
+ encoder_conf:
+     model_scale: 8
+     ndim: 1024
+     ska_dim: 128
+     output_size: 1536
+ pooling: chn_attn_stat
+ pooling_conf: {}
+ projector: ska_tdnn
+ projector_conf:
+     output_size: 192
+ preprocessor: spk
+ preprocessor_conf:
+     target_duration: 3.0
+     sample_rate: 16000
+     num_eval: 5
+     noise_apply_prob: 0.5
+     noise_info:
+     -   - 1.0
+         - dump/raw/musan_speech.scp
+         -   - 4
+             - 7
+         -   - 13
+             - 20
+     -   - 1.0
+         - dump/raw/musan_noise.scp
+         -   - 1
+             - 1
+         -   - 0
+             - 15
+     -   - 1.0
+         - dump/raw/musan_music.scp
+         -   - 1
+             - 1
+         -   - 5
+             - 15
+     rir_apply_prob: 0.5
+     rir_scp: dump/raw/rirs.scp
+ loss: aamsoftmax_sc_topk
+ loss_conf:
+     margin: 0.3
+     scale: 30
+     K: 3
+     mp: 0.06
+     k_top: 5
+ required:
+ - output_dir
+ version: '202310'
+ distributed: true
+ ```
+
+ </details>
+
+
+
+ ### Citing ESPnet
+
+ ```BibTex
+ @inproceedings{watanabe2018espnet,
+   author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+   title={{ESPnet}: End-to-End Speech Processing Toolkit},
+   year={2018},
+   booktitle={Proceedings of Interspeech},
+   pages={2207--2211},
+   doi={10.21437/Interspeech.2018-1456},
+   url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
+ }
+
+
+
+
+
+
+ ```
+
+ or arXiv:
+
+ ```bibtex
+ @misc{watanabe2018espnet,
+   title={ESPnet: End-to-End Speech Processing Toolkit},
+   author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+   year={2018},
+   eprint={1804.00015},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+ ```
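The `frontend_conf`, `sample_rate`, and `target_duration` values in the SPK config above determine the shape of the features that the encoder sees for each training segment. The arithmetic can be sketched as below. The function name `frontend_geometry` is hypothetical, and the frame count assumes a centered STFT (the usual default in PyTorch-based mel-spectrogram frontends); this is an illustration of the numbers, not ESPnet's own code.

```python
def frontend_geometry(sample_rate=16000, duration_s=3.0, win_length=400,
                      hop_length=160, n_fft=512, center=True):
    """Derive window/hop durations and frame counts for one training segment.

    Defaults are copied from the config in this commit. `center` is an
    assumption about how the STFT pads its input.
    """
    n_samples = int(round(sample_rate * duration_s))
    if center:
        # A centered STFT pads n_fft // 2 samples on each side of the signal.
        n_frames = 1 + n_samples // hop_length
    else:
        # Without padding, only windows that fit entirely are counted.
        n_frames = 1 + (n_samples - n_fft) // hop_length
    return {
        "n_samples": n_samples,
        "window_ms": 1000.0 * win_length / sample_rate,
        "hop_ms": 1000.0 * hop_length / sample_rate,
        "n_freq_bins": n_fft // 2 + 1,
        "n_frames": n_frames,
    }


geometry = frontend_geometry()
print(geometry)
```

Under these assumptions a 3-second segment is 48000 samples, analysed with a 25 ms window and a 10 ms hop, and each frame is then reduced to `n_mels: 80` mel bins.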
meta.yaml ADDED
@@ -0,0 +1,8 @@
+ espnet: '202310'
+ files:
+   model_file: save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/9epoch.pth
+ python: "3.9.16 (main, Mar 8 2023, 14:00:05) \n[GCC 11.2.0]"
+ timestamp: 1704235279.886841
+ torch: 2.0.1
+ yaml_files:
+   train_config: save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/config.yaml
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/9epoch.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a653c2dc23885cb2b29f1ed397f713362ff02b0709af8ae722d3a266361cd275
+ size 193459364
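The three lines added for `9epoch.pth` are a Git LFS pointer, not the checkpoint itself: they record the spec version, the SHA-256 digest, and the byte size of the real file. A minimal sketch of checking a downloaded checkpoint against such a pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and only the standard library is used.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, oid and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, oid = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": oid,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest named by the pointer."""
    digest = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so a ~190 MB checkpoint is never held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and digest.hexdigest() == pointer["oid"]


# Pointer text copied from this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:a653c2dc23885cb2b29f1ed397f713362ff02b0709af8ae722d3a266361cd275
size 193459364
"""
pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("9epoch.pth", pointer)` on a local copy (the path here is illustrative) would confirm that the download is complete and uncorrupted.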
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/RESULTS.md ADDED
@@ -0,0 +1,17 @@
+ <!-- Generated by scripts/utils/show_spk_result.py -->
+ # RESULTS
+ ## Environments
+ date: 2023-12-05 14:44:39.571659
+
+ - python version: 3.9.16 (main, Mar 8 2023, 14:00:05) [GCC 11.2.0]
+ - espnet version: 202310
+ - pytorch version: 2.0.1
+
+ | | Mean | Std |
+ |---|---|---|
+ | Target | -0.7834 | 0.1328 |
+ | Non-target | 0.0857 | 0.0857 |
+
+ | Model name | EER(%) | minDCF |
+ |---|---|---|
+ | conf/tuning/train_ska_Vox12_emb192_torchmelspec_subcentertopk | 0.755 | 0.04722 |
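The EER column above is the equal error rate over the verification trials: the operating point where the false-acceptance rate equals the false-rejection rate. A minimal sketch of that computation is shown below. It is not the script ESPnet uses to produce RESULTS.md; the function name `compute_eer` is hypothetical, it assumes that a higher score means "more likely the same speaker", and it does not treat tied scores specially.

```python
def compute_eer(scores, labels):
    """Approximate the equal error rate by sweeping a threshold over sorted scores.

    labels: 1 for a target (same-speaker) trial, 0 for a non-target trial.
    Assumes higher scores indicate higher similarity.
    """
    n_target = sum(labels)
    n_nontarget = len(labels) - n_target
    # Visit trials from highest to lowest score; each step lowers the
    # acceptance threshold just enough to accept one more trial.
    trials = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    false_accepts = 0      # non-target trials accepted so far
    misses = n_target      # target trials still rejected
    best_gap, eer = float("inf"), None
    for _, label in trials:
        if label == 1:
            misses -= 1
        else:
            false_accepts += 1
        far = false_accepts / n_nontarget
        frr = misses / n_target
        if abs(far - frr) < best_gap:
            best_gap = abs(far - frr)
            eer = (far + frr) / 2
    return eer


# Toy trials, not taken from the VoxCeleb evaluation.
separable = compute_eer([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
overlapping = compute_eer([0.9, 0.6, 0.7, 0.1], [1, 1, 0, 0])
print(separable, overlapping)
```

On the scale of the table, the reported 0.755 corresponds to a rate of 0.00755 from a function like this one.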
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/config.yaml ADDED
@@ -0,0 +1,201 @@
+ config: conf/tuning/train_ska_Vox12_emb192_torchmelspec_subcentertopk.yaml
+ print_config: false
+ log_level: INFO
+ drop_last_iter: true
+ dry_run: false
+ iterator_type: category
+ valid_iterator_type: sequence
+ output_dir: exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp
+ ngpu: 1
+ seed: 0
+ num_workers: 6
+ num_att_plot: 0
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: 4
+ dist_rank: 0
+ local_rank: 0
+ dist_master_addr: localhost
+ dist_master_port: 34991
+ dist_launcher: null
+ multiprocessing_distributed: true
+ unused_parameters: false
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: true
+ cudnn_deterministic: false
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 40
+ patience: null
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ -   - valid
+     - eer
+     - min
+ keep_nbest_models: 3
+ nbest_averaging_interval: 0
+ grad_clip: 9999
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 1
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: true
+ log_interval: 100
+ use_matplotlib: true
+ use_tensorboard: true
+ create_graph_in_tensorboard: false
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ use_lora: false
+ save_lora_only: true
+ lora_conf: {}
+ pretrain_path: null
+ init_param: []
+ ignore_init_mismatch: false
+ freeze_param: []
+ num_iters_per_epoch: null
+ batch_size: 512
+ valid_batch_size: 40
+ batch_bins: 1000000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/spk_stats_16k_sp/train/speech_shape
+ valid_shape_file:
+ - exp/spk_stats_16k_sp/valid/speech_shape
+ batch_type: folded
+ valid_batch_type: null
+ fold_length:
+ - 120000
+ sort_in_batch: descending
+ shuffle_within_batch: false
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ chunk_excluded_key_prefixes: []
+ chunk_default_fs: null
+ train_data_path_and_name_and_type:
+ -   - dump/raw/voxceleb12_devs_sp/wav.scp
+     - speech
+     - sound
+ -   - dump/raw/voxceleb12_devs_sp/utt2spk
+     - spk_labels
+     - text
+ valid_data_path_and_name_and_type:
+ -   - dump/raw/voxceleb1_test/trial.scp
+     - speech
+     - sound
+ -   - dump/raw/voxceleb1_test/trial2.scp
+     - speech2
+     - sound
+ -   - dump/raw/voxceleb1_test/trial_label
+     - spk_labels
+     - text
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ allow_multi_rates: false
+ valid_max_cache_size: null
+ exclude_weight_decay: false
+ exclude_weight_decay_conf: {}
+ optim: adam
+ optim_conf:
+     lr: 0.001
+     weight_decay: 5.0e-05
+     amsgrad: false
+ scheduler: cosineannealingwarmuprestarts
+ scheduler_conf:
+     first_cycle_steps: 71280
+     cycle_mult: 1.0
+     max_lr: 0.001
+     min_lr: 5.0e-06
+     warmup_steps: 1000
+     gamma: 0.75
+ init: null
+ use_preprocessor: true
+ input_size: null
+ target_duration: 3.0
+ spk2utt: dump/raw/voxceleb12_devs_sp/spk2utt
+ spk_num: 21615
+ sample_rate: 16000
+ num_eval: 10
+ rir_scp: ''
+ model_conf:
+     extract_feats_in_collect_stats: false
+ frontend: melspec_torch
+ frontend_conf:
+     preemp: true
+     n_fft: 512
+     log: true
+     win_length: 400
+     hop_length: 160
+     n_mels: 80
+     normalize: mn
+ specaug: null
+ specaug_conf: {}
+ normalize: null
+ normalize_conf: {}
+ encoder: ska_tdnn
+ encoder_conf:
+     model_scale: 8
+     ndim: 1024
+     ska_dim: 128
+     output_size: 1536
+ pooling: chn_attn_stat
+ pooling_conf: {}
+ projector: ska_tdnn
+ projector_conf:
+     output_size: 192
+ preprocessor: spk
+ preprocessor_conf:
+     target_duration: 3.0
+     sample_rate: 16000
+     num_eval: 5
+     noise_apply_prob: 0.5
+     noise_info:
+     -   - 1.0
+         - dump/raw/musan_speech.scp
+         -   - 4
+             - 7
+         -   - 13
+             - 20
+     -   - 1.0
+         - dump/raw/musan_noise.scp
+         -   - 1
+             - 1
+         -   - 0
+             - 15
+     -   - 1.0
+         - dump/raw/musan_music.scp
+         -   - 1
+             - 1
+         -   - 5
+             - 15
+     rir_apply_prob: 0.5
+     rir_scp: dump/raw/rirs.scp
+ loss: aamsoftmax_sc_topk
+ loss_conf:
+     margin: 0.3
+     scale: 30
+     K: 3
+     mp: 0.06
+     k_top: 5
+ required:
+ - output_dir
+ version: '202310'
+ distributed: true
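The `scheduler_conf` values in this config describe a cosine learning-rate schedule with linear warmup and restarts, where the peak rate shrinks by `gamma` after each cycle. The sketch below shows the commonly used form of that rule with the values from this commit. The function name `lr_at` is hypothetical, the formula is an assumption about how `cosineannealingwarmuprestarts` behaves rather than a copy of ESPnet's scheduler, and it only covers the `cycle_mult: 1.0` case where every cycle has the same length.

```python
import math


def lr_at(step, first_cycle_steps=71280, max_lr=1e-3, min_lr=5e-6,
          warmup_steps=1000, gamma=0.75):
    """Learning rate at a global step, assuming equal-length cycles (cycle_mult = 1.0)."""
    cycle, pos = divmod(step, first_cycle_steps)
    # The peak of each cycle decays geometrically: max_lr * gamma ** cycle.
    peak = max_lr * gamma ** cycle
    if pos < warmup_steps:
        # Linear warmup from min_lr up to the cycle's peak.
        return min_lr + (peak - min_lr) * pos / warmup_steps
    # Cosine decay from the peak back down to min_lr over the rest of the cycle.
    progress = (pos - warmup_steps) / (first_cycle_steps - warmup_steps)
    return min_lr + (peak - min_lr) * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 1000, 36140, 71280, 72280):
    print(step, lr_at(step))
```

With these settings the first cycle peaks at `max_lr` once warmup ends at step 1000, and under the stated assumption the second cycle would peak at 0.75 times that value.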
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/backward_time.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/clip.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/eer.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/forward_time.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/gpu_max_cached_mem_GB.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/grad_norm.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/iter_time.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/loss.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/loss_scale.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/mindcf.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/n_trials.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/nontrg_mean.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/nontrg_std.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/optim0_lr0.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/optim_step_time.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/train_time.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/trg_mean.png ADDED
save_exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_raw_sp/images/trg_std.png ADDED