---
license: apache-2.0
language:
- en
tags:
- story
- general usage
- roleplay
- creative
- rp
- fantasy
- story telling
- ultra high precision
---

<B>NEO CLASS Ultra Quants for: Daredevil-8B-abliterated-Ultra</B>

The NEO Class tech was created after countless investigations and over 120 lab experiments, backed by real-world testing and qualitative results.

<b>NEO Class results:</b>

Better overall function, instruction following, output quality, and stronger connections to ideas, concepts, and the world in general.

In addition, quants now operate above their "grade", so to speak:

IE: Q4 / IQ4 operate at Q5KM/Q6 levels.

Likewise, Q3 / IQ3 operate at Q4KM/Q5 levels.

Perplexity drop of 724 points for the NEO Class Imatrix quant of IQ4XS vs the regular quant of IQ4XS.

(lower is better)
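For reference, perplexity is the exponential of the average negative log-likelihood per token, which is why a lower score is better. A minimal sketch of the comparison (the per-token log-probabilities below are made-up illustration values, not measurements from this model):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(average negative log-likelihood per token)."""
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)

# Hypothetical per-token log-probabilities for the same text under two quants.
regular_quant = [-2.1, -1.8, -2.4, -1.9]
neo_imatrix   = [-2.0, -1.7, -2.3, -1.8]

# The quant with the smaller average NLL scores the lower (better) perplexity.
print(perplexity(regular_quant) > perplexity(neo_imatrix))  # True
```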

<B>A funny thing happened on the way to the "lab"...</B>

Although this model uses a "Llama3" template, we found that Command-R's template worked better, specifically for creative purposes.

This applies to both normal quants and NEO quants.

Here is Command-R's template:

<PRE>
{
  "name": "Cohere Command R",
  "inference_params": {
    "input_prefix": "<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>",
    "input_suffix": "<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>",
    "antiprompt": [
      "<|START_OF_TURN_TOKEN|>",
      "<|END_OF_TURN_TOKEN|>"
    ],
    "pre_prompt_prefix": "<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>",
    "pre_prompt_suffix": ""
  }
}
</PRE>
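As a sanity check, the prefix/suffix fields above can be assembled into a full prompt by hand. A minimal sketch, assuming a single system prompt followed by one user turn (the `format_command_r` helper is a hypothetical illustration, not part of any library; it simply concatenates the template fields around the text):

```python
# Template fields copied from the JSON template above.
PRE_PROMPT_PREFIX = "<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>"
INPUT_PREFIX = "<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>"
INPUT_SUFFIX = "<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>"

def format_command_r(system_prompt: str, user_message: str) -> str:
    """Wrap a system prompt and one user turn in Command-R turn tokens,
    leaving the prompt open at the chatbot token for the model to continue."""
    return (PRE_PROMPT_PREFIX + system_prompt
            + INPUT_PREFIX + user_message
            + INPUT_SUFFIX)

prompt = format_command_r("You are a creative writer.", "Start a scene.")
print(prompt.startswith("<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>"))  # True
```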

This "interesting" issue was confirmed by multiple users.

<B>Model Notes:</B>

Maximum context is 32k. Please see the original model maker's page for details and usage information for this model.

Special thanks to the model creators at MLABONNE for making such a fantastic model:

[ https://huggingface.co/mlabonne/Daredevil-8B-abliterated ]

<h3>Sample Prompt and Models Compared:</h3>

Prompt tested with "temp=0" to ensure compliance, 2048 context (the model supports 32768 context / 32k), and the "chat" template for LLAMA3.

Additional parameters are also minimized.

PROMPT: <font color="red">"Start a 1000 word scene with: The sky scraper swayed, as she watched the window in front of her on the 21 floor explode..."</font>

<B>Original model IQ4XS - unaltered:</B>

<b>New NEO Class IQ4XS Imatrix:</b>
