End of training

Files changed (5)

README.md CHANGED

@@ -1,4 +1,6 @@
 ---
 base_model: pszemraj/MiniLMv2-L6-H384_R-simplewiki
 tags:
 - generated_from_trainer
@@ -14,11 +16,11 @@ should probably proofread and complete it, then remove this comment. -->
 # MiniLMv2-L6-H384_R-simplewiki-fineweb-100k_en-med_512-vN
-This model is a fine-tuned version of [pszemraj/MiniLMv2-L6-H384_R-simplewiki](https://huggingface.co/pszemraj/MiniLMv2-L6-H384_R-simplewiki) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 4.0352
-- Accuracy: 0.3774
-- Num Input Tokens Seen: 157285376
 ## Model description

 ---
+language:
+- en
 base_model: pszemraj/MiniLMv2-L6-H384_R-simplewiki
 tags:
 - generated_from_trainer
 # MiniLMv2-L6-H384_R-simplewiki-fineweb-100k_en-med_512-vN
+This model is a fine-tuned version of [pszemraj/MiniLMv2-L6-H384_R-simplewiki](https://huggingface.co/pszemraj/MiniLMv2-L6-H384_R-simplewiki) on the BEE-spoke-data/fineweb-100k_en-med dataset.
 It achieves the following results on the evaluation set:
+- Loss: 4.0206
+- Accuracy: 0.3783
+- Num Input Tokens Seen: 162790400
 ## Model description

all_results.json ADDED

+{
+    "epoch": 1.9998993609419817,
+    "eval_accuracy": 0.37833401056481103,
+    "eval_loss": 4.020592212677002,
+    "eval_runtime": 6.7839,
+    "eval_samples": 300,
+    "eval_samples_per_second": 44.223,
+    "eval_steps_per_second": 5.602,
+    "num_input_tokens_seen": 162790400,
+    "perplexity": 55.734102497348196,
+    "total_flos": 1.059416318592e+16,
+    "train_loss": 4.377665276304727,
+    "train_runtime": 2972.3443,
+    "train_samples": 158982,
+    "train_samples_per_second": 106.974,
+    "train_steps_per_second": 0.836,
+    "train_tokens_per_second": 54770.764
+}

eval_results.json ADDED

+{
+    "epoch": 1.9998993609419817,
+    "eval_accuracy": 0.37833401056481103,
+    "eval_loss": 4.020592212677002,
+    "eval_runtime": 6.7839,
+    "eval_samples": 300,
+    "eval_samples_per_second": 44.223,
+    "eval_steps_per_second": 5.602,
+    "num_input_tokens_seen": 162790400,
+    "perplexity": 55.734102497348196
+}
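
The `perplexity` field above appears to be the exponential of `eval_loss`. This is a minimal sketch of that relationship, assuming the loss is a mean cross-entropy in nats. The variable names are illustrative and are not taken from the training script.

```python
import math

# Value copied from "eval_loss" in eval_results.json.
eval_loss = 4.020592212677002

# Assumption: perplexity is exp(mean cross-entropy loss in nats).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.3f}")
```

Under that assumption the computed value agrees with the reported `perplexity` of about 55.73.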

train_results.json ADDED

+{
+    "epoch": 1.9998993609419817,
+    "num_input_tokens_seen": 162790400,
+    "total_flos": 1.059416318592e+16,
+    "train_loss": 4.377665276304727,
+    "train_runtime": 2972.3443,
+    "train_samples": 158982,
+    "train_samples_per_second": 106.974,
+    "train_steps_per_second": 0.836,
+    "train_tokens_per_second": 54770.764
+}
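
As a rough consistency check, `train_samples_per_second` can be approximated from the other fields. This sketch assumes the run was configured for 2 epochs (the log reports `epoch` of about 1.9999) and that throughput is total samples processed divided by `train_runtime`. Both are assumptions, not facts confirmed by this commit.

```python
# Values copied from train_results.json.
train_samples = 158982
train_runtime = 2972.3443  # seconds

# Assumption: the run was configured for 2 full epochs.
assumed_epochs = 2

# Samples processed per second over the whole run.
samples_per_second = train_samples * assumed_epochs / train_runtime

print(f"samples/s ~ {samples_per_second:.3f}")
```

The result lands close to the reported 106.974, which suggests the reported metrics are internally consistent under these assumptions.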

trainer_state.json ADDED

The diff for this file is too large to render.