ninyx committed efa89aa
1 Parent(s): 9608b7e

Model save

Files changed (2)
  1. README.md +74 -0
  2. adapter_model.safetensors +1 -1
README.md ADDED
@@ -0,0 +1,74 @@
+ ---
+ license: apache-2.0
+ library_name: peft
+ tags:
+ - trl
+ - sft
+ - generated_from_trainer
+ base_model: mistralai/Mistral-7B-Instruct-v0.2
+ datasets:
+ - generator
+ metrics:
+ - bleu
+ - rouge
+ model-index:
+ - name: Mistral-7B-Instruct-v0.2-advisegpt-v0.5
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # Mistral-7B-Instruct-v0.2-advisegpt-v0.5
+
+ This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) on the generator dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.0840
+ - Bleu: {'bleu': 0.9537910015397628, 'precisions': [0.9763005593772222, 0.9591762297332277, 0.9471223357463351, 0.9370695448087227], 'brevity_penalty': 0.9989373668126428, 'length_ratio': 0.9989379310075293, 'translation_length': 1022387, 'reference_length': 1023474}
+ - Rouge: {'rouge1': 0.9741038510844018, 'rouge2': 0.9550445541823809, 'rougeL': 0.9723656951648176, 'rougeLsum': 0.9736935611588988}
+ - Exact Match: {'exact_match': 0.0}
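Reviewer note: the fields of the reported `Bleu` dictionary are related by the standard BLEU definition, so the headline score can be cross-checked from its parts. The sketch below assumes the usual formulation (geometric mean of the four n-gram precisions multiplied by a brevity penalty); the function names are illustrative and are not taken from the training script.

```python
import math

# Values copied from the "Bleu" entry reported on the evaluation set.
precisions = [0.9763005593772222, 0.9591762297332277,
              0.9471223357463351, 0.9370695448087227]
translation_length = 1022387
reference_length = 1023474


def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty: 1.0 unless the hypothesis is shorter than the reference."""
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)


def bleu_from_parts(precisions, hyp_len: int, ref_len: int) -> float:
    """Geometric mean of the n-gram precisions, scaled by the brevity penalty."""
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return brevity_penalty(hyp_len, ref_len) * math.exp(log_mean)


bp = brevity_penalty(translation_length, reference_length)
bleu = bleu_from_parts(precisions, translation_length, reference_length)
print(f"brevity_penalty={bp:.6f} bleu={bleu:.6f}")
# prints: brevity_penalty=0.998937 bleu=0.953791
```

Both values agree with the reported `brevity_penalty` and `bleu` fields, which suggests the dictionary is internally consistent.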
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 3
+ - eval_batch_size: 1
+ - seed: 42
+ - gradient_accumulation_steps: 10
+ - total_train_batch_size: 30
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - num_epochs: 3
+ - mixed_precision_training: Native AMP
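Reviewer note: two of the hyperparameters above follow from the others. The sketch below shows how `total_train_batch_size` is derived and how a cosine schedule decays the learning rate. It assumes a single device and zero warmup steps, neither of which the card states, and it takes the total of 2487 optimizer steps from the final row of the training results table. `cosine_lr` is a hypothetical helper written for illustration, not the scheduler used in training.

```python
import math

train_batch_size = 3
gradient_accumulation_steps = 10
num_devices = 1  # assumption: not stated in the card, but consistent with 3 * 10 = 30

# Effective number of examples contributing to each optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # prints: 30


def cosine_lr(step: int, total_steps: int, base_lr: float = 2e-05,
              warmup_steps: int = 0) -> float:
    """Half-cosine decay from base_lr to 0, with optional linear warmup (assumed 0 here)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


total_steps = 2487  # final step in the training results table
for step in (0, 829, 1658, 2487):  # the evaluation steps, plus the start
    print(step, cosine_lr(step, total_steps))
```

Under these assumptions the learning rate starts at 2e-05 and reaches zero at step 2487, so most of the decay has already happened by the third epoch.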
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Bleu | Exact Match | Validation Loss | Rouge |
+ |:-------------:|:------:|:----:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:--------------------:|:---------------:|:---------------------------------------------------------------------------------------------------------------------------:|
+ | 0.069 | 0.9999 | 829 | {'bleu': 0.9459206747141892, 'brevity_penalty': 0.998656611989374, 'length_ratio': 0.9986575135274565, 'precisions': [0.9726768417963018, 0.9521542081327253, 0.9380288226144853, 0.9265355643009697], 'reference_length': 1023474, 'translation_length': 1022100} | {'exact_match': 0.0} | 0.0990 | {'rouge1': 0.9702189356306301, 'rouge2': 0.9472171244648081, 'rougeL': 0.9677029434775739, 'rougeLsum': 0.9695684693436178} |
+ | 0.0501 | 1.9999 | 1658 | {'bleu': 0.9537910015397628, 'brevity_penalty': 0.9989373668126428, 'length_ratio': 0.9989379310075293, 'precisions': [0.9763005593772222, 0.9591762297332277, 0.9471223357463351, 0.9370695448087227], 'reference_length': 1023474, 'translation_length': 1022387} | {'exact_match': 0.0} | 0.0840 | {'rouge1': 0.9741105562035488, 'rouge2': 0.9550205654651982, 'rougeL': 0.9723363685950056, 'rougeLsum': 0.9737013621980013} |
+ | 0.0479 | 2.9999 | 2487 | {'bleu': 0.9548514568958526, 'precisions': [0.9767648848122783, 0.9601353822381405, 0.9483682511725553, 0.9385079979703334], 'brevity_penalty': 0.9989676875356347, 'length_ratio': 0.9989682200036347, 'translation_length': 1022418, 'reference_length': 1023474} | {'exact_match': 0.0} | 0.0850 | {'rouge1': 0.9746456572659052, 'rouge2': 0.9560608145101823, 'rougeL': 0.9729518327172596, 'rougeLsum': 0.9742472834405176} |
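Reviewer note: the headline loss (0.0840) and BLEU (0.9537910015397628) in the card match the step 1658 row, not the final step. That is consistent with keeping the checkpoint with the lowest validation loss, although the card does not say which selection rule was used. The sketch below compares the three evaluation rows; the tuple layout is illustrative, and 0.0850 is read as the validation loss of the final row.

```python
# (step, validation_loss, bleu) copied from the three evaluation rows.
rows = [
    (829, 0.0990, 0.9459206747141892),
    (1658, 0.0840, 0.9537910015397628),
    (2487, 0.0850, 0.9548514568958526),
]

# Selecting by lowest validation loss and by highest BLEU gives different answers.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_bleu = max(rows, key=lambda r: r[2])

print("lowest validation loss at step", best_by_loss[0])  # prints: ... step 1658
print("highest BLEU at step", best_by_bleu[0])            # prints: ... step 2487
```

The two criteria disagree by a small margin here, so it may be worth stating in the card which one determined the saved adapter.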
+
+
+ ### Framework versions
+
+ - PEFT 0.10.0
+ - Transformers 4.40.2
+ - Pytorch 2.3.0+cu121
+ - Datasets 2.19.1
+ - Tokenizers 0.19.1
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:db463ba6f7ef15e816ecfdf26d1c02fe7152079e8a985ce8a52d510e3a9fc678
+ oid sha256:780a8185b44ae3cec05448cacf336e524b251c6feccebc5c06ceea170a8520b2
  size 872450448
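Reviewer note: only the `oid` changes in this pointer while `size` stays at 872450448 bytes, which is what one would expect when adapter weights are retrained without changing their shapes. A minimal sketch for checking a downloaded `adapter_model.safetensors` against the new pointer follows. `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for this note, not part of any Git LFS tooling.

```python
import hashlib

# New pointer contents, copied from the diff.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:780a8185b44ae3cec05448cacf336e524b251c6feccebc5c06ceea170a8520b2
size 872450448
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Stream the file and compare its size and hash with the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])  # prints: sha256 872450448
```

Calling `matches_pointer("adapter_model.safetensors", pointer)` on a local copy would then confirm that the file corresponds to this commit rather than its parent.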