DavidAU committed · verified
Commit b7eae1b · Parent(s): 1b3b0a7

Update README.md

Files changed (1): README.md (+81 −77)
---
base_model: []
library_name: transformers
tags:
- mergekit
- merge

---
# MN-12B-Celeste-V1.9-Instruct

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

Full model card and GGUF files are here:

[MN-18.5B-Celeste-V1.9-Story-Wizard-ED1-Instruct-GGUF](https://huggingface.co/DavidAU/MN-18.5B-Celeste-V1.9-Story-Wizard-ED1-Instruct-GGUF)

## Merge Details
### Merge Method

This model was merged using the passthrough merge method.
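The passthrough method concatenates the selected layer slices verbatim, with no weight averaging. A minimal sketch of that idea, with model labels and layer ranges taken from the configuration further down (the list-building itself is illustrative, not mergekit code):

```python
# Passthrough merging, conceptually: each slice copies a contiguous layer
# range from one source model, and the slices are stacked in order.
slices = [
    ("Mistral-Nemo-Instruct-2407-12B", range(0, 14)),
    ("MN-12B-Celeste-V1.9", range(8, 24)),
    ("Mistral-Nemo-Instruct-2407-12B", range(14, 22)),
    ("Mistral-Nemo-Instruct-2407-12B", range(22, 31)),
    ("MN-12B-Celeste-V1.9", range(24, 40)),
]

# The merged network is simply the concatenation of all slice layers.
merged = [(model, layer) for model, layers in slices for layer in layers]
print(len(merged))  # 63 layers, up from 40 in each 12B source
```

Stacking overlapping ranges from both sources is what grows the 12B parents into an ~18.5B model.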

### Models Merged

The following models were included in the merge:
* g:/11b/Mistral-Nemo-Instruct-2407-12B
* G:/11B/MN-12B-Celeste-V1.9

### Configuration

The following YAML configuration was used to produce this model:

```yaml
# SMB with instruct to help performance.

slices:
  - sources:
      - model: g:/11b/Mistral-Nemo-Instruct-2407-12B
        layer_range: [0, 14]
  - sources:
      - model: G:/11B/MN-12B-Celeste-V1.9
        layer_range: [8, 24]
    parameters:
      scale:
        - filter: o_proj
          value: 1
        - filter: down_proj
          value: 1
        - value: 1
  - sources:
      - model: g:/11b/Mistral-Nemo-Instruct-2407-12B
        layer_range: [14, 22]
    parameters:
      scale:
        - filter: o_proj
          value: 0.5
        - filter: down_proj
          value: 0.5
        - value: 1
  - sources:
      - model: g:/11b/Mistral-Nemo-Instruct-2407-12B
        layer_range: [22, 31]
    parameters:
      scale:
        - filter: o_proj
          value: 0.75
        - filter: down_proj
          value: 0.75
        - value: 1
  - sources:
      - model: G:/11B/MN-12B-Celeste-V1.9
        layer_range: [24, 40]
    parameters:
      scale:
        - filter: o_proj
          value: 1
        - filter: down_proj
          value: 1
        - value: 1
merge_method: passthrough
dtype: bfloat16
```
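The `scale` blocks damp specific weight matrices when a slice is copied: tensors whose names match a filter (`o_proj`, `down_proj`) are multiplied by that filter's value, and the bare `- value: 1` entry is the fallback for everything else. A rough stand-in for that behavior, using plain floats instead of real tensors (the helper name and example weight names are hypothetical):

```python
# Illustration only: per-tensor scaling applied when a slice is copied.
# The first rule whose filter substring matches the tensor name wins;
# unmatched tensors fall back to the default factor.
def scale_slice(weights, rules, default=1.0):
    """Return weights with each tensor scaled by its first matching rule."""
    scaled = {}
    for name, tensor in weights.items():
        factor = next((value for filt, value in rules if filt in name), default)
        scaled[name] = tensor * factor
    return scaled

# Rules from the instruct slice covering layers 14-22 above.
rules = [("o_proj", 0.5), ("down_proj", 0.5)]

# Hypothetical stand-in weights (floats in place of real tensors).
weights = {
    "layers.14.self_attn.o_proj.weight": 2.0,
    "layers.14.mlp.down_proj.weight": 4.0,
    "layers.14.mlp.gate_proj.weight": 1.0,
}
print(scale_slice(weights, rules))
# o_proj halved to 1.0, down_proj halved to 2.0, gate_proj left at 1.0
```

Here the 0.5 and 0.75 factors on the duplicated instruct slices presumably soften those repeated blocks' contribution to the residual stream, while the Celeste slices pass through at full strength.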