Update README.md
README.md CHANGED
@@ -12,6 +12,9 @@ This is significant enough to encourage you folks to test them, and provide feed
 The iMatrix I use is based on Group Merged V3 and enriched with a bit of French,
 a bit of Serbian, and a bit of Croatian languages.
 
+As usual, the name of the quants are a bit pompous,
+because they are numbered on the type of tensor quant mainly used as a base for the FFN.
+
 
 ARC and PPL-512 DATA (Get the last data on the main post of the PR thread) :
 
@@ -20,9 +23,13 @@ IQ1_XS - Unusable on <30B models
 PR
 1.94 GB (1.93 BPW)
 1.81 GiB (1.93 BPW)
-
 PPL over 564 chunks for n_ctx=512 = 40.0024 +/- 0.27710
 
+PR2
+1.98 GB (1.97 BPW)
+1.84 GiB (1.97 BPW)
+PPL over 564 chunks for n_ctx=512 = 33.5198 +/- 0.24187
+
 
 IQ1_S - Unusable on <30B models
 Master
@@ -35,6 +42,11 @@ PR
 1.91 GiB (2.04 BPW)
 PPL over 564 chunks for n_ctx=512 = 25.2524 +/- 0.17651
 
+PR2
+2.06 GB (2.05 BPW)
+1.91 GiB (2.05 BPW)
+PPL over 564 chunks for n_ctx=512 = 24.2661 +/- 0.16923
+
 
 IQ1_M
 Master
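The size and perplexity figures quoted in this change can be sanity-checked with a few lines of Python. This is a rough sketch, not part of the README: the GB/GiB conversion and bits-per-weight (BPW) arithmetic are standard, the helper names are mine, and the ~8B-class parameter count it recovers is only implied by the numbers, not stated anywhere in the diff.

```python
# Sanity checks on the figures quoted in the diff above.
# GB is decimal (10^9 bytes); GiB is binary (2^30 bytes).
# BPW = bits per weight averaged over the whole model file.

def gb_to_gib(size_gb: float) -> float:
    """Convert decimal gigabytes to binary gibibytes."""
    return size_gb * 1e9 / 2**30

def implied_weights(size_gb: float, bpw: float) -> float:
    """Approximate weight count implied by file size and BPW."""
    return size_gb * 1e9 * 8 / bpw

def ppl_drop(old_ppl: float, new_ppl: float) -> float:
    """Relative perplexity reduction, as a fraction of the old value."""
    return (old_ppl - new_ppl) / old_ppl

# IQ1_XS PR: 1.94 GB (1.93 BPW) -> should match the quoted 1.81 GiB
print(f"{gb_to_gib(1.94):.2f} GiB")                # 1.81 GiB

# ~8e9 weights implied, i.e. roughly an 8B-class model (my inference)
print(f"{implied_weights(1.94, 1.93):.2e} weights")

# PPL improvements from PR to PR2 (PPL over 564 chunks, n_ctx=512)
print(f"IQ1_XS: {ppl_drop(40.0024, 33.5198):.1%}")  # ~16.2% lower
print(f"IQ1_S:  {ppl_drop(25.2524, 24.2661):.1%}")  # ~3.9% lower
```

Both quant types shrink the gap between PR and PR2 at a cost of only ~0.04 BPW, with the IQ1_XS case showing by far the larger perplexity gain.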