MaziyarPanahi/Mistral-7B-Instruct-Aya-101

@@ -1,9 +1,115 @@
 ---
 license: apache-2.0
 base_model: mistralai/Mistral-7B-Instruct-v0.2
 tags:
 - axolotl
 - generated_from_trainer
 model-index:
 - name: Mistral-7B-Instruct-KhanAcademy-v0.2
   results: []
@@ -12,6 +118,60 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
 <details><summary>See axolotl config</summary>
@@ -91,58 +251,4 @@ special_tokens:
   unk_token: "<unk>"
 ```
-</details><br>
-# Mistral-7B-Instruct-KhanAcademy-v0.2
-This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 1.1502
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 5e-06
-- train_batch_size: 2
-- eval_batch_size: 2
-- seed: 42
-- distributed_type: multi-GPU
-- num_devices: 4
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 32
-- total_eval_batch_size: 8
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps: 10
-- num_epochs: 1
-### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 1.9039        | 0.0   | 1    | 3.1495          |
-| 0.9933        | 0.25  | 101  | 1.2402          |
-| 0.9439        | 0.5   | 202  | 1.1683          |
-| 0.9762        | 0.75  | 303  | 1.1502          |
-### Framework versions
-- Transformers 4.39.0.dev0
-- Pytorch 2.2.0+cu121
-- Datasets 2.17.0
-- Tokenizers 0.15.0

 ---
 license: apache-2.0
 base_model: mistralai/Mistral-7B-Instruct-v0.2
+datasets:
+  - CohereForAI/aya_dataset
 tags:
 - axolotl
+- mistral
+- 7b
 - generated_from_trainer
+language:
+  - afr
+  - amh
+  - ara
+  - aze
+  - bel
+  - ben
+  - bul
+  - cat
+  - ceb
+  - ces
+  - cym
+  - dan
+  - deu
+  - ell
+  - eng
+  - epo
+  - est
+  - eus
+  - fin
+  - fil
+  - fra
+  - fry
+  - gla
+  - gle
+  - glg
+  - guj
+  - hat
+  - hau
+  - heb
+  - hin
+  - hun
+  - hye
+  - ibo
+  - ind
+  - isl
+  - ita
+  - jav
+  - jpn
+  - kan
+  - kat
+  - kaz
+  - khm
+  - kir
+  - kor
+  - kur
+  - lao
+  - lav
+  - lat
+  - lit
+  - ltz
+  - mal
+  - mar
+  - mkd
+  - mlg
+  - mlt
+  - mon
+  - mri
+  - msa
+  - mya
+  - nep
+  - nld
+  - nor
+  - nso
+  - nya
+  - ory
+  - pan
+  - pes
+  - pol
+  - por
+  - pus
+  - ron
+  - rus
+  - sin
+  - slk
+  - slv
+  - smo
+  - sna
+  - snd
+  - som
+  - sot
+  - spa
+  - sqi
+  - srp
+  - sun
+  - swa
+  - swe
+  - tam
+  - tel
+  - tgk
+  - tha
+  - tur
+  - twi
+  - ukr
+  - urd
+  - uzb
+  - vie
+  - xho
+  - yid
+  - yor
+  - zho
+  - zul
 model-index:
 - name: Mistral-7B-Instruct-KhanAcademy-v0.2
   results: []
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# Mistral-7B-Instruct-KhanAcademy-v0.2
+This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.1502
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-06
+- train_batch_size: 2
+- eval_batch_size: 2
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 32
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_steps: 10
+- num_epochs: 1
+### Training results
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 1.9039        | 0.0   | 1    | 3.1495          |
+| 0.9933        | 0.25  | 101  | 1.2402          |
+| 0.9439        | 0.5   | 202  | 1.1683          |
+| 0.9762        | 0.75  | 303  | 1.1502          |
+### Framework versions
+- Transformers 4.39.0.dev0
+- Pytorch 2.2.0+cu121
+- Datasets 2.17.0
+- Tokenizers 0.15.0
 [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
 <details><summary>See axolotl config</summary>
   unk_token: "<unk>"
 ```
+</details><br>
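
Review note: the batch totals listed under "Training hyperparameters" are consistent with the per-device values. The check below is a minimal sketch, assuming the usual Trainer convention that the total train batch size is the per-device batch size times the number of devices times the gradient accumulation steps, and that evaluation uses no accumulation. The helper name `effective_batch_size` is hypothetical and does not come from the card or from axolotl.

```python
# Values copied from the "Training hyperparameters" list in the card.
train_batch_size = 2             # per-device train batch size
eval_batch_size = 2              # per-device eval batch size
num_devices = 4
gradient_accumulation_steps = 4


def effective_batch_size(per_device: int, devices: int, accumulation: int = 1) -> int:
    """Hypothetical helper: samples consumed per optimizer (or eval) step."""
    return per_device * devices * accumulation


# Training accumulates gradients over several micro-batches per device.
total_train = effective_batch_size(
    train_batch_size, num_devices, gradient_accumulation_steps
)
# Evaluation is assumed to run without gradient accumulation.
total_eval = effective_batch_size(eval_batch_size, num_devices)

print(total_train, total_eval)  # 32 8, matching the card
```

Both results agree with the `total_train_batch_size: 32` and `total_eval_batch_size: 8` entries, so the moved section carries no arithmetic inconsistency.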