InferenceIllusionist committed
Update README.md

* Added mergekit configs for the 2 merge steps prior to the final SLERP, for transparency.
* QOL: Added links to the other models in the card itself for fast reference.
README.md CHANGED
@@ -67,13 +67,57 @@ This model was merged using the SLERP merge method.
 ### Models Merged
 
 The following models were included in the merge:
-* 
-* 
+* [ibm/merlinite-7b](https://huggingface.co/ibm/merlinite-7b)
+* [InferenceIllusionist/Magic-Dolphin-7b](https://huggingface.co/InferenceIllusionist/Magic-Dolphin-7b)
+* [SanjiWatsuki/Kunoichi-DPO-v2-7B](https://huggingface.co/SanjiWatsuki/Kunoichi-DPO-v2-7B)
+* [mlabonne/Monarch-7B](https://huggingface.co/mlabonne/Monarch-7B)
+* [bardsai/jaskier-7b-dpo-v6.1](https://huggingface.co/bardsai/jaskier-7b-dpo-v6.1)
 
 ### Configuration
 
-The following YAML configuration was used to produce this model:
+The following YAML configurations were used to produce this model:
 
+
+<b>merliniteX-blockB1</b>
+```yaml
+models:
+  - model: models/merlinite-7b
+    parameters:
+      weight: 1.0
+  - model: models/Kunoichi-DPO-v2-7B
+    parameters:
+      weight: 0.2
+  - model: models/jaskier-7b-dpo-v6.1
+    parameters:
+      weight: 0.6
+  - model: models/Monarch-7b
+    parameters:
+      weight: 0.4
+merge_method: linear
+dtype: float16
+```
+
+<b>merliniteX-blockF2</b>
+```yaml
+slices:
+  - sources:
+      - model: models/Magic-Dolphin-7b
+        layer_range: [0, 32]
+      - model: models/jaskier-7b-dpo-v6.1
+        layer_range: [0, 32]
+merge_method: slerp
+base_model: models/Magic-Dolphin-7b
+parameters:
+  t:
+    - filter: self_attn
+      value: [0, 0.5, 0.3, 0.7, 0.5, 1]
+    - filter: mlp
+      value: [1, 0.5, 0.7, 0.3, 0.5, 0]
+    - value: 0.5 # fallback for rest of tensors
+dtype: float16
+```
+
+<b>merliniteX-blockH1 (Excalibur-7b)</b>
 ```yaml
 slices:
   - sources:
@@ -81,10 +125,6 @@
         layer_range: [0, 32]
      - model: models/merliniteX-blockB1
        layer_range: [0, 32]
-# or, the equivalent models: syntax:
-# models:
-#   - model: psmathur/orca_mini_v3_13b
-#   - model: garage-bAInd/Platypus2-13B
 merge_method: slerp
 base_model: models/merliniteX-blockF2
 parameters: