# Kitchen Sink 103b

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/65a531bc7ec6af0f95c707b1/QFmPxADHAqMf3Wb_Xt1ry.jpeg)

This model is a rotating-stack merge of three 70b models in a 103b (120 layer) configuration inspired by Venus 103b. The result is a large model that contains a little bit of everything - including the kitchen sink.

Components of those models are purported to include: Nous-Hermes-Llama2-70b, Xwi…

# Sample output
Storywriting
```
Sample goes here
```
# Prompt format
Seems to have the strongest affinity for the Alpaca format, but Vicuna works as well. Considering the variety of components, most formats will probably work to some extent.
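
For reference, the standard Alpaca template looks like this, with `{prompt}` standing in for your instruction:

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
```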
# WTF is a rotating-stack merge?
Inspired by Undi's experiments with stacked merges, Jeb Carter found that output quality and model initiative could be significantly improved by reversing the model order in the stack, and then doing a linear merge between the original and reversed stacks. That is what I did here, creating three stacked merges with the three source models, and then doing a 1:1:1 linear merge of all three stacks. The exact merge configs can be found in the recipe.txt file.
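
As a minimal sketch of what one of those configs might look like - assuming the merges were done with mergekit, and with hypothetical model names, layer ranges, and weights rather than the actual recipe - a single 120-layer stack interleaves overlapping layer slices from the three 80-layer source models:

```
# Hypothetical stacked merge: overlapping slices from three 70b
# (80-layer) models, giving a 120-layer stack. Names are placeholders.
slices:
  - sources:
      - model: modelA
        layer_range: [0, 40]
  - sources:
      - model: modelB
        layer_range: [20, 60]
  - sources:
      - model: modelC
        layer_range: [40, 80]
merge_method: passthrough
dtype: float16
```

Each rotated stack is built the same way with the model order shifted, and the three stacks are then combined with an equal-weight linear merge:

```
# Hypothetical 1:1:1 linear merge of the three rotated stacks.
models:
  - model: stack-abc
    parameters:
      weight: 1.0
  - model: stack-bca
    parameters:
      weight: 1.0
  - model: stack-cab
    parameters:
      weight: 1.0
merge_method: linear
dtype: float16
```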