TeeZee committed on
Commit
459e636
·
verified ·
1 Parent(s): 36fff07

Update README.md

Files changed (1)
  1. README.md +37 -3
README.md CHANGED
@@ -1,7 +1,41 @@
1
  ---
2
- tags:
3
- - merge
4
  license: other
5
  license_name: yi-license
6
  license_link: https://huggingface.co/01-ai/Yi-34B-200K/blob/main/LICENSE
7
- ---
1
  ---
2
  license: other
3
  license_name: yi-license
4
  license_link: https://huggingface.co/01-ai/Yi-34B-200K/blob/main/LICENSE
5
+ tags:
6
+ - merge
7
+ ---
8
+ # Kyllene 34B v1.1
9
+
10
+ ![image/png](https://huggingface.co/TeeZee/Kyllene-34B-v1.1/resolve/main/Kyllene_v1.1.jpg)
11
+
12
+
13
+ ## Model Details
14
+
15
+ - A result of a new merge method provided by the [MergeMonster](https://github.com/Gryphe/MergeMonster/) tool with an extended RPG preset.
16
+ - Models used for the merge:
17
+ [jondurbin/bagel-dpo-34b-v0.2](https://huggingface.co/jondurbin/bagel-dpo-34b-v0.2)
18
+ [NousResearch/Nous-Capybara-34B](https://huggingface.co/NousResearch/Nous-Capybara-34B)
19
+ [NousResearch/Nous-Hermes-2-Yi-34B](https://huggingface.co/NousResearch/Nous-Hermes-2-Yi-34B)
20
+ [SUSTech/SUS-Chat-34B](https://huggingface.co/SUSTech/SUS-Chat-34B)
21
+ - The method aims to maximize the probability of certain phrases and minimize the probability of others.
22
+ - The RPG preset was extended with examples of typical, nonsensical output of most models, such as 'unbreakable bond', 'send shivers down her spine', etc.
23
+ - The resulting model has approximately 34 billion parameters.
24
+ - See [merge-config.yml](https://huggingface.co/TeeZee/Kyllene-34B-v1.1/resolve/main/merge-config.yml) for details on the merge method used and RPG presets.
25
+
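The phrase-probability goal described in the list above can be illustrated with a toy scoring function. This is only a sketch of the general idea, not MergeMonster's actual code: the function names, the scoring formula, and the per-token probabilities are all hypothetical and chosen for illustration.

```python
import math


def phrase_logprob(token_probs):
    """Log-probability of a phrase, given the probability of each of its tokens."""
    return sum(math.log(p) for p in token_probs)


def preset_score(wanted, unwanted):
    """Hypothetical objective: higher when wanted phrases are likely
    and unwanted phrases are unlikely."""
    wanted_total = sum(phrase_logprob(p) for p in wanted)
    unwanted_total = sum(phrase_logprob(p) for p in unwanted)
    return wanted_total - unwanted_total


# Made-up per-token probabilities for two candidate merges (illustrative only).
candidate_a = {"wanted": [[0.5, 0.4]], "unwanted": [[0.2, 0.1]]}
candidate_b = {"wanted": [[0.5, 0.4]], "unwanted": [[0.6, 0.5]]}

score_a = preset_score(**candidate_a)
score_b = preset_score(**candidate_b)

# Keep whichever candidate scores higher under this toy objective.
best = "a" if score_a > score_b else "b"
print(best, round(score_a, 3), round(score_b, 3))
```

Under this toy objective, a candidate that assigns lower probability to a cliché such as 'unbreakable bond' scores higher, which is the behaviour the extended preset is meant to encourage.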
26
+ **Warning: This model can produce NSFW content!**
27
+
28
+ ## Results
29
+
30
+ - Produces SFW and NSFW content without issues and switches context seamlessly.
31
+ - Good at following instructions.
32
+ - Different from [TeeZee/Kyllene-57B-v1.0](https://huggingface.co/TeeZee/Kyllene-57B-v1.0), but also surprisingly entertaining (more tests are needed).
33
+
34
+ ## Side notes
35
+
36
+ - The [MergeMonster](https://github.com/Gryphe/MergeMonster/) method works; however, the project would benefit greatly from some more love from its developers.
37
+ - In its current state, MergeMonster consumes insane amounts of RAM (256GB+) or VRAM and takes a really long time to process model data; this merge took 24 hours on 1xADA6000.
38
+ - MergeMonster is not a silver bullet; other experiments have shown that it can also produce incredibly stupid models.
39
+
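As a rough sanity check on the memory figure above, here is a back-of-the-envelope estimate. It assumes 16-bit weights (2 bytes per parameter) and that all four source models are held in memory at once; neither assumption is confirmed by the README, and the real footprint depends on how MergeMonster loads and caches data.

```python
# Figures taken from the model card: ~34 billion parameters, four source models.
PARAMS_PER_MODEL = 34e9
NUM_SOURCE_MODELS = 4

# Assumption: fp16/bf16 weights, i.e. 2 bytes per parameter.
BYTES_PER_PARAM = 2
GB = 1e9

per_model_gb = PARAMS_PER_MODEL * BYTES_PER_PARAM / GB
all_models_gb = per_model_gb * NUM_SOURCE_MODELS

print(f"one model:   {per_model_gb:.0f} GB")
print(f"four models: {all_models_gb:.0f} GB")
```

Under these assumptions the weights alone come to roughly 68 GB per model and 272 GB for all four, which is in the same range as the 256GB+ reported above.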
40
+ All comments are greatly appreciated. Download, test, and if you appreciate my work, consider buying me my fuel:
41
+ <a href="https://www.buymeacoffee.com/TeeZee" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 60px !important;width: 217px !important;" ></a>