artificialguybr committed on
Commit 211d86c
1 Parent(s): adb42c8

Upload README.md with huggingface_hub

Files changed (1): README.md (+78, -0)
README.md ADDED

---
language:
- en
- de
- fr
- it
- pt
- hi
- es
- th
library_name: transformers
pipeline_tag: text-generation
license: llama3.2
base_model: NousResearch/Llama-3.2-1B
tags:
- generated_from_trainer
- facebook
- meta
- pytorch
- llama
- llama-3
model-index:
- name: llama3.2-1b-synthia-i
  results: []
---

# Llama 3.2 1B - Synthia-v1.5-I - Redmond - Fine-tuned Model

This model is a fine-tuned version of [NousResearch/Llama-3.2-1B](https://huggingface.co/NousResearch/Llama-3.2-1B) on the [Synthia-v1.5-I](https://huggingface.co/datasets/migtissera/Synthia-v1.5-I) dataset.

Thanks to [RedmondAI](https://redmond.ai) for all the GPU support!

## Model Description

The base model is Llama 3.2 1B, a multilingual large language model developed by Meta. This version has been fine-tuned on the Synthia-v1.5-I instruction dataset to improve its instruction-following capabilities.
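
A minimal inference sketch with the `transformers` library is shown below. The repository id `artificialguybr/llama3.2-1b-synthia-i` is assumed from the model-index name above; adjust it to the actual Hub path if it differs.

```python
# Hedged usage sketch: the repo id below is assumed from the model-index name, not confirmed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "artificialguybr/llama3.2-1b-synthia-i"  # assumed repo id; replace if needed

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" needs the accelerate package; drop it to load on the default device.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

prompt = "Give me three tips for writing clear documentation."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```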

### Training Data

The model was fine-tuned on Synthia-v1.5-I, which contains:
- 20.7k training examples

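To inspect the training data, here is a minimal loading sketch with the `datasets` library; it assumes the public dataset id `migtissera/Synthia-v1.5-I` with its default configuration and a `train` split.

```python
# Minimal sketch for peeking at the training data; assumes the default
# configuration and split names of migtissera/Synthia-v1.5-I on the Hub.
from datasets import load_dataset

dataset = load_dataset("migtissera/Synthia-v1.5-I", split="train")
print(dataset)      # column names and number of rows
print(dataset[0])   # first training example
```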

### Training Procedure

The model was trained with the following hyperparameters (a sketch of these settings as Hugging Face `TrainingArguments` follows the list):
- Learning rate: 2e-05
- Train batch size: 1
- Eval batch size: 1
- Seed: 42
- Gradient accumulation steps: 8
- Total train batch size: 8
- Optimizer: Paged AdamW 8bit (betas=(0.9, 0.999), epsilon=1e-08)
- LR scheduler: Cosine with 100 warmup steps
- Number of epochs: 3
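
The training itself was run through Axolotl (see Training Infrastructure below), but as an illustration of how the listed values map onto `transformers.TrainingArguments`, here is a minimal sketch; the output directory and any options not listed above are placeholders.

```python
# Illustrative mapping of the listed hyperparameters onto transformers.TrainingArguments.
# The actual run used Axolotl 0.5.0; output_dir and unlisted options are placeholders.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3.2-1b-synthia-i",   # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=8,        # effective train batch size 1 x 8 = 8
    optim="paged_adamw_8bit",             # requires bitsandbytes
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    num_train_epochs=3,
)
```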

### Framework Versions

- Transformers 4.46.1
- PyTorch 2.3.1+cu121
- Datasets 3.0.1
- Tokenizers 0.20.3

## Intended Use

This model is intended for:
- Instruction-following tasks
- Conversational AI applications
- Research and development in natural language processing

## Training Infrastructure

The model was trained using the Axolotl framework, version 0.5.0.

## License

This model is subject to the Llama 3.2 Community License Agreement. Users must comply with all terms and conditions specified in the license.

[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)