Saving weights and logs of step 35000 - epoch 0

Files changed (7)

README.md +4 -4
config.json +1 -1
flax_model.msgpack +1 -1
predictions/validation_clean_stortinget_no/step_35000.md +0 -0
predictions/validation_nst/step_35000.md +1 -1
runs/Jan10_10-18-43_t1v-n-a66b4568-w-7/events.out.tfevents.1704881923.t1v-n-a66b4568-w-7.91471.0.v2 +3 -0
training_state.bin +2 -2

README.md CHANGED

@@ -42,13 +42,13 @@ The following hyperparameters were used during training:
 - per_device_train_batch_size: 8
 - total_train_batch_size_per_node: 32
 - total_train_batch_size: 1024
-- total_optimization_steps: 50,000
-- starting_optimization_step: None
 - finishing_optimization_step: 50,000
 - num_train_dataset_workers: 32
 - num_hosts: 32
 - total_num_training_examples: 51,200,000
-- steps_per_epoch: 7482
 - num_beams: None
 - weight_decay: 0.01
 - adam_beta1: 0.9
@@ -69,7 +69,7 @@ The following hyperparameters were used during training:
 | 20000 | 0.4324              | 0.5187     | 2.3518             | 0.6945             | 2.9561                   | 0.7857                   | 0.7357                              | 8.9839                             | 5.7154                             | 11.8610                                  | 6.1664                                   |
 | 25000 | 0.4307              | 0.5158     | 2.3028             | 0.6712             | 2.9343                   | 0.7711                   | 0.7228                              | 9.1284                             | 5.8704                             | 11.9915                                  | 6.3161                                   |
 | 30000 | 0.4312              | 0.5108     | 2.2810             | 0.6656             | 2.8690                   | 0.7564                   | 0.7428                              | 8.9010                             | 5.6726                             | 11.8349                                  | 6.1305                                   |
-| 35000 | 0.4294              | 0.4957     | 2.2375             | 0.6870             | 2.8417                   | 0.7821                   | 0.7530                              | 8.9294                             | 5.6666                             | 11.7732                                  | 6.1110                                   |
 ### Framework versions

 - per_device_train_batch_size: 8
 - total_train_batch_size_per_node: 32
 - total_train_batch_size: 1024
+- total_optimization_steps: 15,000
+- starting_optimization_step: 35,000
 - finishing_optimization_step: 50,000
 - num_train_dataset_workers: 32
 - num_hosts: 32
 - total_num_training_examples: 51,200,000
+- steps_per_epoch: _To be computed after first epoch_
 - num_beams: None
 - weight_decay: 0.01
 - adam_beta1: 0.9
 | 20000 | 0.4324              | 0.5187     | 2.3518             | 0.6945             | 2.9561                   | 0.7857                   | 0.7357                              | 8.9839                             | 5.7154                             | 11.8610                                  | 6.1664                                   |
 | 25000 | 0.4307              | 0.5158     | 2.3028             | 0.6712             | 2.9343                   | 0.7711                   | 0.7228                              | 9.1284                             | 5.8704                             | 11.9915                                  | 6.3161                                   |
 | 30000 | 0.4312              | 0.5108     | 2.2810             | 0.6656             | 2.8690                   | 0.7564                   | 0.7428                              | 8.9010                             | 5.6726                             | 11.8349                                  | 6.1305                                   |
+| 35000 | 0.4299              | 0.4908     | 2.2320             | 0.6768             | 2.8417                   | 0.7729                   | 0.7513                              | 8.8015                             | 5.6123                             | 11.6854                                  | 6.0642                                   |
 ### Framework versions
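
The hyperparameter changes in the README.md hunks above are consistent with a run resumed at step 35,000. A minimal sketch of the arithmetic follows, using only values that appear in the diff. The relationships between the fields (for example, that the global batch is the per-node batch times `num_hosts`) are assumptions inferred from the numbers, not taken from the training script.

```python
# Values copied from the README.md hunk (new side of the diff).
per_device_train_batch_size = 8
total_train_batch_size_per_node = 32
total_train_batch_size = 1024
num_hosts = 32
starting_optimization_step = 35_000
finishing_optimization_step = 50_000
total_optimization_steps = 15_000
total_num_training_examples = 51_200_000

# Assumption: per-node batch = per-device batch * devices per node.
devices_per_node = total_train_batch_size_per_node // per_device_train_batch_size

# Assumption: global batch = per-node batch * number of hosts.
global_batch = total_train_batch_size_per_node * num_hosts

# Assumption: after resuming, total_optimization_steps counts the remaining steps.
remaining_steps = finishing_optimization_step - starting_optimization_step

# Examples consumed over the full schedule that ends at the finishing step.
examples_full_schedule = total_train_batch_size * finishing_optimization_step

print(devices_per_node, global_batch, remaining_steps, examples_full_schedule)
```

Under these assumptions, `total_num_training_examples` matches the full 50,000-step schedule rather than the 15,000 remaining steps.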

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "NbAiLab/nb-whisper-large-v3-RC4",
   "activation_dropout": 0.1,
   "activation_function": "gelu",
   "apply_spec_augment": false,

 {
+  "_name_or_path": "../../../nb-whisper-large-v0.8-vad3",
   "activation_dropout": 0.1,
   "activation_function": "gelu",
   "apply_spec_augment": false,

flax_model.msgpack CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5587cc4b2c6d432ebe48d8bb87232e0db65b5c8be8ff87ceb93b0d6b4a9f505d
 size 3087027463

 version https://git-lfs.github.com/spec/v1
+oid sha256:9edd3aeb5e9d5f0c7d1e5cd2a15e321917844bf963edba9fb3486ca628b9d129
 size 3087027463
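
Both sides of this diff are Git LFS pointer files, not the weights themselves: the `oid` changes while `size` stays the same. A minimal sketch of reading such a pointer and comparing the two sides follows. `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the two sides of the flax_model.msgpack diff.
old_pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:5587cc4b2c6d432ebe48d8bb87232e0db65b5c8be8ff87ceb93b0d6b4a9f505d
size 3087027463
"""
new_pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:9edd3aeb5e9d5f0c7d1e5cd2a15e321917844bf963edba9fb3486ca628b9d129
size 3087027463
"""

old = parse_lfs_pointer(old_pointer)
new = parse_lfs_pointer(new_pointer)
content_changed = old["digest"] != new["digest"]
size_changed = old["size"] != new["size"]
print(content_changed, size_changed)
```

An unchanged size with a different digest is what one would expect when the same architecture is saved with updated parameter values.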

predictions/validation_clean_stortinget_no/step_35000.md CHANGED

The diff for this file is too large to render. See raw diff

predictions/validation_nst/step_35000.md CHANGED

@@ -2,7 +2,7 @@
 | STEP| loss | wer |cer|
 | ---| --- | --- |--- |
-| **35000**| 0.429 | 2.237 |0.687 |
 | target                                                                                                                                                                         | prediction                                                                                                                                                                   |
 |:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|

 | STEP| loss | wer |cer|
 | ---| --- | --- |--- |
+| **35000**| 0.430 | 2.232 |0.677 |
 | target                                                                                                                                                                         | prediction                                                                                                                                                                   |
 |:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
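
The step 35000 summary row in this file differs only slightly between the two sides. A small sketch of the per-metric differences follows, using the values shown in the diff; `Decimal` is used here so the three-decimal figures subtract exactly, which is a choice made for this illustration.

```python
from decimal import Decimal

# Step 35000 summary row, removed side and added side of the diff.
old = {"loss": Decimal("0.429"), "wer": Decimal("2.237"), "cer": Decimal("0.687")}
new = {"loss": Decimal("0.430"), "wer": Decimal("2.232"), "cer": Decimal("0.677")}

# Signed change per metric (new minus old); negative means the metric went down.
deltas = {name: new[name] - old[name] for name in old}

for name, delta in deltas.items():
    print(f"{name}: {old[name]} -> {new[name]} ({delta:+})")
```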

runs/Jan10_10-18-43_t1v-n-a66b4568-w-7/events.out.tfevents.1704881923.t1v-n-a66b4568-w-7.91471.0.v2 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:069b7c00385cca4e48d3949c81b3583fa86c5acf98c71837bb15d7c2df69939b
+size 159483

training_state.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7f61e85aae06c429c1a19c9567352b4384613e81df264087cefef85265eb2aea
-size 4608

 version https://git-lfs.github.com/spec/v1
+oid sha256:8478fa617c3e9090af8312175443281b9a6198473ff55fea68cf25576da418c5
+size 4612