Update README.md
README.md
CHANGED
@@ -1,7 +1,41 @@
|
|
1 |
---
|
2 |
-
tags:
|
3 |
-
- merge
|
4 |
license: other
|
5 |
license_name: yi-license
|
6 |
license_link: https://huggingface.co/01-ai/Yi-34B-200K/blob/main/LICENSE
|
7 |
-
1 |
---
|
|
|
|
|
2 |
license: other
|
3 |
license_name: yi-license
|
4 |
license_link: https://huggingface.co/01-ai/Yi-34B-200K/blob/main/LICENSE
|
5 |
+
tags:
|
6 |
+
- merge
|
7 |
+
---
|
8 |
+
# Kyllene 34B v1.1
|
9 |
+
|
10 |
+
![image/png](https://huggingface.co/TeeZee/Kyllene-34B-v1.1/resolve/main/Kyllene_v1.1.jpg)
|
11 |
+
|
12 |
+
|
13 |
+
## Model Details
|
14 |
+
|
15 |
+
- The result of a new merge method provided by the [MergeMonster](https://github.com/Gryphe/MergeMonster/) tool, using an extended RPG preset.
|
16 |
+
- Models used for the merge:
|
17 |
+
[jondurbin/bagel-dpo-34b-v0.2](https://huggingface.co/jondurbin/bagel-dpo-34b-v0.2)
|
18 |
+
[NousResearch/Nous-Capybara-34B](https://huggingface.co/NousResearch/Nous-Capybara-34B)
|
19 |
+
[NousResearch/Nous-Hermes-2-Yi-34B](https://huggingface.co/NousResearch/Nous-Hermes-2-Yi-34B)
|
20 |
+
[SUSTech/SUS-Chat-34B](https://huggingface.co/SUSTech/SUS-Chat-34B)
|
21 |
+
- The method aims to maximize the probability of certain phrases and minimize the probability of others.
|
22 |
+
- The RPG preset was extended with examples of typical, nonsensical output of most models, like 'unbreakable bond', 'send shivers down her spine', etc.
|
23 |
+
- The resulting model has approximately 34 billion parameters.
|
24 |
+
- See [merge-config.yml](https://huggingface.co/TeeZee/Kyllene-34B-v1.1/resolve/main/merge-config.yml) for details on the merge method used and the RPG presets.
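
As a rough illustration of the idea of favoring some phrases and penalizing others, here is a toy sketch. It is not MergeMonster's actual implementation: the function names, the scoring rule (sum of unwanted-phrase probabilities minus sum of wanted-phrase probabilities), the "good" phrase, and all probability values are made up for the example.

```python
# Toy illustration only: hypothetical phrase probabilities, not real model output,
# and not the scoring code used by MergeMonster.

def preset_score(phrase_probs, bad_phrases, good_phrases):
    """Lower is better: penalize unwanted phrases, reward wanted ones."""
    bad = sum(phrase_probs.get(p, 0.0) for p in bad_phrases)
    good = sum(phrase_probs.get(p, 0.0) for p in good_phrases)
    return bad - good


def pick_best(candidates, bad_phrases, good_phrases):
    """Return the name of the candidate merge with the lowest score."""
    return min(
        candidates,
        key=lambda name: preset_score(candidates[name], bad_phrases, good_phrases),
    )


bad_phrases = ["unbreakable bond", "shivers down"]
good_phrases = ["she said"]  # hypothetical "wanted" phrase

# Hypothetical probabilities that each candidate merge assigns to each phrase.
candidates = {
    "merge_a": {"unbreakable bond": 0.20, "shivers down": 0.10, "she said": 0.30},
    "merge_b": {"unbreakable bond": 0.05, "shivers down": 0.02, "she said": 0.25},
}

best = pick_best(candidates, bad_phrases, good_phrases)
print(best)
```

In this made-up data the second candidate is preferred, because it assigns much less probability to the unwanted phrases while keeping most of the probability of the wanted one.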
|
25 |
+
|
26 |
+
**Warning: This model can produce NSFW content!**
|
27 |
+
|
28 |
+
## Results
|
29 |
+
|
30 |
+
- produces SFW and NSFW content without issues, switches context seamlessly
|
31 |
+
- good at following instructions
|
32 |
+
- different from [TeeZee/Kyllene-57B-v1.0](https://huggingface.co/TeeZee/Kyllene-57B-v1.0), but also surprisingly entertaining (more tests are needed)
|
33 |
+
|
34 |
+
## Side notes
|
35 |
+
|
36 |
+
- The [MergeMonster](https://github.com/Gryphe/MergeMonster/) method works; however, the project would benefit greatly from some more love from developers.
|
37 |
+
- In its current state, MergeMonster consumes insane amounts of RAM (256GB+) or VRAM and takes a really long time to process model data; this merge took 24 hours on 1xADA6000.
|
38 |
+
- MergeMonster is not a silver bullet; other experiments have shown that it can also produce incredibly stupid models.
|
39 |
+
|
40 |
+
All comments are greatly appreciated. Download, test, and if you appreciate my work, consider buying me my fuel:
|
41 |
+
<a href="https://www.buymeacoffee.com/TeeZee" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 60px !important;width: 217px !important;" ></a>
|