Darkhn FluffyKaeloky commited on
Commit
666d21f
1 Parent(s): af2dfd3

Update README.md (#1)

Browse files

- Update README.md (46a88328f5d4ae90c5e21ae4aa47a0ba5803e745)


Co-authored-by: FluffyKaeloky <FluffyKaeloky@users.noreply.huggingface.co>

Files changed (1) hide show
  1. README.md +49 -46
README.md CHANGED
@@ -1,47 +1,50 @@
1
- ---
2
- license: other
3
- license_name: mrl
4
- language:
5
- - en
6
- tags:
7
- - chat
8
- pipeline_tag: text-generation
9
-
10
- library_name: transformers
11
- ---
12
- # Monstral 123B v2
13
- A Mistral-Large merge
14
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65a531bc7ec6af0f95c707b1/sf_mh-yR7V7ghi7M8UnPS.png)
15
-
16
- This model is a hybrid merge of Behemoth 1.2, Tess, and Magnum V4. The intention was to do a three-way slerp merge, which is technically
17
- not possible. To simulate the effeect of a menage-a-slerp, I slerped B1.2 with tess, then separately did B1.2 with magnum. I then did a
18
- model stock merge of those two slerps using B1.2 as the base. Somehow, it worked out spectacularly well. Sometimes dumb ideas pay off.
19
-
20
- Mergefuel:
21
- - TheDrummer/Behemoth-123B-v1.2
22
- - anthracite-org/magnum-v4-123b
23
- - migtissera/Tess-3-Mistral-Large-2-123B
24
-
25
- See recipe.txt for full details.
26
-
27
- Improvements over Monstral v1: Drummer's 1.2 tune of behemoth is a marked improvement over the original, and the addition ot tess to the
28
- mix really makes the creativity pop. I seem to have dialed out the rapey magnum influence, without stripping it of the ability to get mean
29
- and/or dirty when the situation actually calls for it. The RP output of this model shows a lot more flowery and "literary" description of
30
- scenes and activities. It's more colorful and vibrant. Repitition is dramatically reduced, as is slop (though to a lesser extent). The
31
- annoying tendency to double-describe things with "it was X, almost Y" is virtually gone. Do you like a slow-burn story that builds over
32
- time? Well good fucking news, because v2 excels at that.
33
-
34
- The only complaint I've received is occasional user impersonation with certain cards. I've not seen this myself on any of my cards, so I
35
- have to assume it's down to the specific formatting on specific cards. I don't want to say it's a skill issue, but...
36
-
37
- This model is uncensored and perfectly capable of generating objectionable material. I have not observed it injecting NSFW content into
38
- SFW scenarios, but no guarentees can be made. As with any LLM, no factual claims made by the model should be taken at face value. You
39
- know that boilerplate safety disclaimer that most professional models have? Assume this has it too. This model is for entertainment
40
- purposes only.
41
-
42
- GGUFs: https://huggingface.co/MarsupialAI/Monstral-123B-v2_GGUF
43
-
44
-
45
- # Prompt Format
46
- Metharme seems to work flawlessly. In theory, mistral V3 or possibly even chatml should work to some extent, but meth was providing such
 
 
 
47
  high quality output that I couldn't even be bothered to test the others. Just do meth, kids.
 
1
+ ---
2
+ license: other
3
+ license_name: mrl
4
+ language:
5
+ - en
6
+ tags:
7
+ - chat
8
+ pipeline_tag: text-generation
9
+ library_name: transformers
10
+ base_model:
11
+ - MarsupialAI/Monstral-123B-v2
12
+ base_model_relation: quantized
13
+ quantized_by: Darkhn
14
+ ---
15
+ # Monstral 123B v2
16
+ A Mistral-Large merge
17
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65a531bc7ec6af0f95c707b1/sf_mh-yR7V7ghi7M8UnPS.png)
18
+
19
+ This model is a hybrid merge of Behemoth 1.2, Tess, and Magnum V4. The intention was to do a three-way slerp merge, which is technically
20
+ not possible. To simulate the effeect of a menage-a-slerp, I slerped B1.2 with tess, then separately did B1.2 with magnum. I then did a
21
+ model stock merge of those two slerps using B1.2 as the base. Somehow, it worked out spectacularly well. Sometimes dumb ideas pay off.
22
+
23
+ Mergefuel:
24
+ - TheDrummer/Behemoth-123B-v1.2
25
+ - anthracite-org/magnum-v4-123b
26
+ - migtissera/Tess-3-Mistral-Large-2-123B
27
+
28
+ See recipe.txt for full details.
29
+
30
+ Improvements over Monstral v1: Drummer's 1.2 tune of behemoth is a marked improvement over the original, and the addition ot tess to the
31
+ mix really makes the creativity pop. I seem to have dialed out the rapey magnum influence, without stripping it of the ability to get mean
32
+ and/or dirty when the situation actually calls for it. The RP output of this model shows a lot more flowery and "literary" description of
33
+ scenes and activities. It's more colorful and vibrant. Repitition is dramatically reduced, as is slop (though to a lesser extent). The
34
+ annoying tendency to double-describe things with "it was X, almost Y" is virtually gone. Do you like a slow-burn story that builds over
35
+ time? Well good fucking news, because v2 excels at that.
36
+
37
+ The only complaint I've received is occasional user impersonation with certain cards. I've not seen this myself on any of my cards, so I
38
+ have to assume it's down to the specific formatting on specific cards. I don't want to say it's a skill issue, but...
39
+
40
+ This model is uncensored and perfectly capable of generating objectionable material. I have not observed it injecting NSFW content into
41
+ SFW scenarios, but no guarentees can be made. As with any LLM, no factual claims made by the model should be taken at face value. You
42
+ know that boilerplate safety disclaimer that most professional models have? Assume this has it too. This model is for entertainment
43
+ purposes only.
44
+
45
+ GGUFs: https://huggingface.co/MarsupialAI/Monstral-123B-v2_GGUF
46
+
47
+
48
+ # Prompt Format
49
+ Metharme seems to work flawlessly. In theory, mistral V3 or possibly even chatml should work to some extent, but meth was providing such
50
  high quality output that I couldn't even be bothered to test the others. Just do meth, kids.