BEE-spoke-data/verysmol_llama-v11-KIx2

+---
+license: apache-2.0
+base_model: pszemraj/verysmol_llama-v10-rw3m_dd
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: verysmol_llama-v10-rw3m_dd-knowledge-inoc-concat-v1-vN
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# verysmol_llama-v10-rw3m_dd-knowledge-inoc-concat-v1-vN
+This model is a fine-tuned version of [pszemraj/verysmol_llama-v10-rw3m_dd](https://huggingface.co/pszemraj/verysmol_llama-v10-rw3m_dd) on an unspecified dataset (the Trainer recorded the dataset name as `None`).
+It achieves the following results on the evaluation set:
+- Loss: 2.8876
+- Accuracy: 0.4502
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.00014
+- train_batch_size: 16
+- eval_batch_size: 16
+- seed: 17514
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 128
+- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-06
+- lr_scheduler_type: inverse_sqrt
+- lr_scheduler_warmup_ratio: 0.05
+- num_epochs: 2.0
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 3.0681        | 0.03  | 150  | 3.0689          | 0.4259   |
+| 3.0113        | 0.07  | 300  | 3.0433          | 0.4278   |
+| 2.9468        | 0.1   | 450  | 3.0362          | 0.4288   |
+| 3.0162        | 0.13  | 600  | 3.0148          | 0.4326   |
+| 2.9531        | 0.17  | 750  | 3.0012          | 0.4341   |
+| 2.9282        | 0.2   | 900  | 2.9923          | 0.4358   |
+| 2.9485        | 0.23  | 1050 | 2.9845          | 0.4357   |
+| 2.9365        | 0.27  | 1200 | 2.9749          | 0.4375   |
+| 2.8875        | 0.3   | 1350 | 2.9652          | 0.4391   |
+| 2.8874        | 0.33  | 1500 | 2.9619          | 0.4402   |
+| 2.8733        | 0.37  | 1650 | 2.9574          | 0.4408   |
+| 2.8541        | 0.4   | 1800 | 2.9536          | 0.4403   |
+| 2.8958        | 0.43  | 1950 | 2.9491          | 0.4414   |
+| 2.8404        | 0.47  | 2100 | 2.9434          | 0.4427   |
+| 2.8635        | 0.5   | 2250 | 2.9404          | 0.4425   |
+| 2.9031        | 0.53  | 2400 | 2.9369          | 0.4428   |
+| 2.8237        | 0.57  | 2550 | 2.9330          | 0.4440   |
+| 2.832         | 0.6   | 2700 | 2.9318          | 0.4444   |
+| 2.8566        | 0.63  | 2850 | 2.9305          | 0.4450   |
+| 2.8817        | 0.67  | 3000 | 2.9286          | 0.4443   |
+| 2.8733        | 0.7   | 3150 | 2.9268          | 0.4442   |
+| 2.8009        | 0.73  | 3300 | 2.9227          | 0.4457   |
+| 2.9292        | 0.77  | 3450 | 2.9229          | 0.4450   |
+| 2.8562        | 0.8   | 3600 | 2.9193          | 0.4456   |
+| 2.8441        | 0.83  | 3750 | 2.9188          | 0.4460   |
+| 2.904         | 0.87  | 3900 | 2.9171          | 0.4458   |
+| 2.857         | 0.9   | 4050 | 2.9140          | 0.4461   |
+| 2.8344        | 0.93  | 4200 | 2.9134          | 0.4467   |
+| 2.8382        | 0.97  | 4350 | 2.9122          | 0.4467   |
+| 2.8227        | 1.0   | 4500 | 2.9104          | 0.4468   |
+| 2.8121        | 1.03  | 4650 | 2.9099          | 0.4472   |
+| 2.8127        | 1.07  | 4800 | 2.9082          | 0.4473   |
+| 2.8013        | 1.1   | 4950 | 2.9084          | 0.4478   |
+| 2.7983        | 1.14  | 5100 | 2.9069          | 0.4474   |
+| 2.811         | 1.17  | 5250 | 2.9076          | 0.4480   |
+| 2.7807        | 1.2   | 5400 | 2.9065          | 0.4471   |
+| 2.8512        | 1.24  | 5550 | 2.9056          | 0.4483   |
+| 2.8146        | 1.27  | 5700 | 2.9049          | 0.4478   |
+| 2.8101        | 1.3   | 5850 | 2.9024          | 0.4482   |
+| 2.7968        | 1.34  | 6000 | 2.9005          | 0.4484   |
+| 2.8197        | 1.37  | 6150 | 2.9001          | 0.4481   |
+| 2.8035        | 1.4   | 6300 | 2.8997          | 0.4488   |
+| 2.7905        | 1.44  | 6450 | 2.8996          | 0.4488   |
+| 2.8239        | 1.47  | 6600 | 2.8982          | 0.4487   |
+| 2.8579        | 1.5   | 6750 | 2.8975          | 0.4492   |
+| 2.7996        | 1.54  | 6900 | 2.8960          | 0.4492   |
+| 2.8337        | 1.57  | 7050 | 2.8984          | 0.4490   |
+| 2.8087        | 1.6   | 7200 | 2.8959          | 0.4492   |
+| 2.8066        | 1.64  | 7350 | 2.8952          | 0.4499   |
+| 2.7991        | 1.67  | 7500 | 2.8950          | 0.4492   |
+| 2.8215        | 1.7   | 7650 | 2.8943          | 0.4496   |
+| 2.7714        | 1.74  | 7800 | 2.8914          | 0.4501   |
+| 2.8132        | 1.77  | 7950 | 2.8913          | 0.4500   |
+| 2.8505        | 1.8   | 8100 | 2.8906          | 0.4502   |
+| 2.8294        | 1.84  | 8250 | 2.8901          | 0.4502   |
+| 2.7977        | 1.87  | 8400 | 2.8891          | 0.4499   |
+| 2.7501        | 1.9   | 8550 | 2.8878          | 0.4505   |
+| 2.8038        | 1.94  | 8700 | 2.8883          | 0.4504   |
+| 2.7547        | 1.97  | 8850 | 2.8876          | 0.4502   |
+### Framework versions
+- Transformers 4.33.3
+- Pytorch 2.2.0.dev20231017+cu121
+- Datasets 2.14.5
+- Tokenizers 0.13.3
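
The arithmetic behind the model card's numbers can be sketched in a few lines of Python. This is an illustrative sketch, not the training code: the function names are hypothetical, the single-device assumption is inferred only from 16 × 8 = 128, the perplexity conversion assumes the reported loss is mean cross-entropy in nats per token, and the schedule shown is one common formulation of inverse-square-root decay with linear warmup that may differ in detail from what the Trainer actually applied.

```python
import math


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer step (num_devices=1 is an assumption)."""
    return per_device * grad_accum * num_devices


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats per token) to perplexity."""
    return math.exp(loss)


def inverse_sqrt_factor(step: int, warmup_steps: int) -> float:
    """Learning-rate multiplier: linear warmup, then decay proportional to 1/sqrt(step).

    One common formulation; the scheduler used in training may differ in detail.
    """
    warmup_steps = max(1, warmup_steps)  # avoid division by zero when no warmup is set
    if step < warmup_steps:
        return step / warmup_steps
    return math.sqrt(warmup_steps / step)


if __name__ == "__main__":
    # train_batch_size=16 and gradient_accumulation_steps=8, as listed in the card
    print(effective_batch_size(16, 8))
    # final validation loss from the card
    print(round(perplexity(2.8876), 2))
    # multiplier at four times a hypothetical 100-step warmup
    print(inverse_sqrt_factor(400, 100))
```

Under these assumptions the batch-size product reproduces the reported `total_train_batch_size` of 128, and the final validation loss of 2.8876 corresponds to a perplexity of roughly 18.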

generation_config.json ADDED

	@@ -0,0 +1,6 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "transformers_version": "4.33.3"
+}
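
To show how the added file might be consumed, here is a minimal sketch that reads the special token IDs using only the standard library. The constant and function names are hypothetical; in practice the `transformers` library would normally load this file itself.

```python
import json

# Contents of the added generation_config.json, copied from the diff above.
GENERATION_CONFIG_TEXT = """
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "transformers_version": "4.33.3"
}
"""


def special_token_ids(config_text: str) -> tuple:
    """Return (bos_token_id, eos_token_id); a missing key yields None."""
    cfg = json.loads(config_text)
    return cfg.get("bos_token_id"), cfg.get("eos_token_id")


if __name__ == "__main__":
    bos, eos = special_token_ids(GENERATION_CONFIG_TEXT)
    print(f"bos={bos} eos={eos}")
```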

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2ccd44c69fea4c1d2beba09a7087eb6f82aabab5a1d6f5657cf1951ce46bc565
+oid sha256:2ea330c0052600266e95d7982e8edc79a84647cdaab554d3785e81cb8376b8dc
 size 232292512
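
Only the `oid` changes in this pointer while `size` stays the same, so a downloaded weight file can be checked against the pointer by comparing its byte length and SHA-256 digest. The following is a minimal sketch with hypothetical helper names; it hashes an in-memory byte string for brevity, whereas a file of this size would more sensibly be hashed in chunks.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Check that data has the size and SHA-256 digest recorded in the pointer."""
    algo, _, digest = pointer["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algo}")
    same_size = len(data) == int(pointer["size"])
    same_hash = hashlib.sha256(data).hexdigest() == digest
    return same_size and same_hash


if __name__ == "__main__":
    # Build a pointer for a small stand-in payload and verify it round-trips.
    payload = b"example weights"
    pointer_text = (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{hashlib.sha256(payload).hexdigest()}\n"
        f"size {len(payload)}\n"
    )
    print(matches_pointer(payload, parse_lfs_pointer(pointer_text)))
```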