Update README.md
README.md
CHANGED
@@ -20,9 +20,64 @@ You'll notice that these quantizations are slightly larger compared to others, b
- iMatrix quantization can be applied to all k quantizations, not just the i ones.
- 1-bit quants give garbage, but everything else, including 2xxs, is surprisingly coherent.

## Perplexity values

```./perplexity -m dolphin2m.gguf -f wiki.test.raw -ngl 34```
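
The command above covers a single file. To repeat the measurement for every quant, a small wrapper along these lines could be used. This is only a sketch: `perplexity_cmd` and `run_all` are hypothetical helper names, and it assumes that all `dolphin*.gguf` files sit in the current directory and that the same flags (`-f wiki.test.raw -ngl 34`) were used for each run, which the README does not state explicitly.

```python
import subprocess
from pathlib import Path


def perplexity_cmd(model, test_file="wiki.test.raw", gpu_layers=34, binary="./perplexity"):
    """Build the argument list for one run, using the same flags as the command above."""
    return [binary, "-m", str(model), "-f", str(test_file), "-ngl", str(gpu_layers)]


def run_all(model_dir="."):
    """Run the perplexity tool on every dolphin*.gguf file in model_dir, one at a time."""
    for model in sorted(Path(model_dir).glob("dolphin*.gguf")):
        print(f"== {model.name} ==", flush=True)
        # check=True stops the loop if one run fails (assumed to be the desired behaviour)
        subprocess.run(perplexity_cmd(model), check=True)
```

Calling `run_all()` from the directory that holds the models prints each tool's output in turn; redirect it to a file to keep a log like the one in this section.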

```
dolphinf16.gguf perplexity - [1]4.3052,[2]4.8421,[3]5.7401,[4]6.6554,[5]6.6552,[6]6.6580,[7]6.9198,[8]7.0918,[9]7.2503,[10]7.5712,[11]7.8367,[12]7.8476,
Final estimate: PPL = 7.8476 +/- 0.35984 THIS IS BASELINE

dolphin1bit.gguf perplexity - [1]59477.7292,[2]50746.4580,[3]53932.3131,[4]55797.8433,[5]45995.5032,[6]46595.4234,[7]45130.6779,[8]40769.8593,[9]41322.7842,[10]50644.7393,[11]50676.5808,[12]51939.5094,
Final estimate: PPL = 51939.5094 +/- 1339.29301 1BIT GIVES GARBAGE OUTPUT

dolphin2xxs.gguf perplexity - [1]5.4651,[2]6.7941,[3]7.8700,[4]8.7155,[5]8.3566,[6]8.3316,[7]8.6121,[8]8.7565,[9]8.9041,[10]9.3572,[11]9.6426,[12]9.5626,
Final estimate: PPL = 9.5626 +/- 0.43895 9.5 vs 7.8 at f16, means lossy but coherent

dolphin2s.gguf perplexity - [1]5.0014,[2]5.9477,[3]6.8424,[4]7.6348,[5]7.4755,[6]7.4667,[7]7.7625,[8]7.8807,[9]8.0374,[10]8.4086,[11]8.6475,[12]8.6427,
Final estimate: PPL = 8.6427 +/- 0.39501

dolphin2m.gguf perplexity - [1]4.5874,[2]5.3203,[3]6.2334,[4]7.1444,[5]7.1188,[6]7.1422,[7]7.4717,[8]7.6180,[9]7.7948,[10]8.1319,[11]8.3747,[12]8.4095,
Final estimate: PPL = 8.4095 +/- 0.38329

dolphin2k.gguf perplexity - [1]4.6331,[2]5.2648,[3]6.0493,[4]7.0165,[5]6.9300,[6]6.9177,[7]7.2362,[8]7.4417,[9]7.6292,[10]7.9640,[11]8.2121,[12]8.1930,
Final estimate: PPL = 8.1930 +/- 0.37241

dolphin2ks.gguf perplexity - [1]4.7995,[2]5.6653,[3]6.4331,[4]7.3841,[5]7.2724,[6]7.3161,[7]7.6567,[8]7.8423,[9]8.0129,[10]8.4033,[11]8.6636,[12]8.6391,
Final estimate: PPL = 8.6391 +/- 0.39315

dolphin3s.gguf perplexity - [1]4.3574,[2]4.9936,[3]5.8814,[4]6.8093,[5]6.8086,[6]6.7949,[7]7.0638,[8]7.2204,[9]7.3844,[10]7.6895,[11]7.9489,[12]7.9527,
Final estimate: PPL = 7.9527 +/- 0.36202

dolphin3xs.gguf perplexity - [1]4.3161,[2]4.9579,[3]5.8647,[4]6.8064,[5]6.7614,[6]6.7501,[7]7.0133,[8]7.2103,[9]7.3862,[10]7.7265,[11]7.9813,[12]7.9780,
Final estimate: PPL = 7.9780 +/- 0.36655

dolphin3xxs.gguf perplexity - [1]4.5418,[2]5.0902,[3]6.0117,[4]6.9852,[5]6.9329,[6]6.9165,[7]7.1853,[8]7.3359,[9]7.4923,[10]7.8122,[11]8.0696,[12]8.0592,
Final estimate: PPL = 8.0592 +/- 0.36502

dolphin3m.gguf perplexity - [1]4.3203,[2]4.9566,[3]5.8151,[4]6.7619,[5]6.7801,[6]6.7762,[7]7.0351,[8]7.2054,[9]7.3766,[10]7.6896,[11]7.9580,[12]7.9660,
Final estimate: PPL = 7.9660 +/- 0.36234

dolphin4km.gguf perplexity - [1]4.3331,[2]4.9129,[3]5.7915,[4]6.7030,[5]6.6921,[6]6.6978,[7]6.9570,[8]7.1284,[9]7.2854,[10]7.6098,[11]7.8696,[12]7.8767,
Final estimate: PPL = 7.8767 +/- 0.35875

dolphin4nl.gguf perplexity - [1]4.2682,[2]4.8494,[3]5.7530,[4]6.6890,[5]6.6672,[6]6.6637,[7]6.9332,[8]7.1126,[9]7.2821,[10]7.5998,[11]7.8733,[12]7.8875,
Final estimate: PPL = 7.8875 +/- 0.36227

dolphin4xs.gguf perplexity - [1]4.2986,[2]4.8610,[3]5.7658,[4]6.6906,[5]6.6621,[6]6.6608,[7]6.9321,[8]7.1140,[9]7.2892,[10]7.6085,[11]7.8806,[12]7.8921,
Final estimate: PPL = 7.8921 +/- 0.36258

dolphin5ks.gguf perplexity - [1]4.2557,[2]4.8249,[3]5.7413,[4]6.6671,[5]6.6611,[6]6.6686,[7]6.9389,[8]7.1079,[9]7.2707,[10]7.5962,[11]7.8529,[12]7.8627,
Final estimate: PPL = 7.8627 +/- 0.36124

dolphin5km.gguf perplexity - [1]4.3191,[2]4.8597,[3]5.7844,[4]6.7120,[5]6.6994,[6]6.6964,[7]6.9569,[8]7.1215,[9]7.2792,[10]7.6109,[11]7.8682,[12]7.8794,
Final estimate: PPL = 7.8794 +/- 0.36185

dolphin6k.gguf perplexity - [1]4.3264,[2]4.8531,[3]5.7574,[4]6.6741,[5]6.6707,[6]6.6795,[7]6.9362,[8]7.1076,[9]7.2678,[10]7.5864,[11]7.8496,[12]7.8628,
Final estimate: PPL = 7.8628 +/- 0.36075

dolphin8bit.gguf perplexity - [1]4.3063,[2]4.8463,[3]5.7347,[4]6.6499,[5]6.6471,[6]6.6531,[7]6.9160,[8]7.0899,[9]7.2509,[10]7.5705,[11]7.8357,[12]7.8466,
Final estimate: PPL = 7.8466 +/- 0.35948
```
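
If the results need to be compared programmatically, a log in this layout can be turned into a dictionary with a few lines of Python. The following is a minimal sketch, not part of the original tooling: `parse_results` is a hypothetical helper, and it assumes each model's block starts with a line beginning with the `.gguf` file name and is followed by a `Final estimate: PPL = ... +/- ...` line. The sample text is an abbreviated copy of two entries.

```python
import re

# A block starts with "<file>.gguf ..." and ends with a "Final estimate" line.
HEADER = re.compile(r"^(\S+\.gguf)\s")
FINAL = re.compile(r"Final estimate: PPL = ([0-9.]+) \+/- ([0-9.]+)")


def parse_results(log_text):
    """Return {model_file: (ppl, error)} for every complete block in log_text."""
    results = {}
    current = None
    for line in log_text.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            continue
        final = FINAL.search(line)
        if final and current is not None:
            # Trailing free-text notes after the numbers are ignored.
            results[current] = (float(final.group(1)), float(final.group(2)))
            current = None
    return results


# Abbreviated sample: the per-chunk values are shortened here for readability.
sample = """\
dolphinf16.gguf perplexity - [1]4.3052,[2]4.8421,
Final estimate: PPL = 7.8476 +/- 0.35984 THIS IS BASELINE

dolphin2xxs.gguf perplexity - [1]5.4651,[2]6.7941,
Final estimate: PPL = 9.5626 +/- 0.43895 9.5 vs 7.8 at f16, means lossy but coherent
"""

parsed = parse_results(sample)
print(parsed)
```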

As the numbers show, the 2-bit xxs quant made with this method is surprisingly coherent: its perplexity is 9.56, against 7.85 for the f16 baseline.
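
To put the gap to f16 into relative terms, the final estimates can be divided by the baseline. The snippet below is a small illustration using the final PPL values reported in this section; `relative_increase` is a hypothetical helper name, and the percentages are only as reliable as these 12-chunk estimates, whose reported error is about +/- 0.36 for most files.

```python
# Final PPL estimates copied from the results in this section.
BASELINE = 7.8476  # dolphinf16.gguf
FINAL_PPL = {
    "dolphin1bit.gguf": 51939.5094,
    "dolphin2xxs.gguf": 9.5626,
    "dolphin2s.gguf": 8.6427,
    "dolphin2m.gguf": 8.4095,
    "dolphin2k.gguf": 8.1930,
    "dolphin2ks.gguf": 8.6391,
    "dolphin3s.gguf": 7.9527,
    "dolphin3xs.gguf": 7.9780,
    "dolphin3xxs.gguf": 8.0592,
    "dolphin3m.gguf": 7.9660,
    "dolphin4km.gguf": 7.8767,
    "dolphin4nl.gguf": 7.8875,
    "dolphin4xs.gguf": 7.8921,
    "dolphin5ks.gguf": 7.8627,
    "dolphin5km.gguf": 7.8794,
    "dolphin6k.gguf": 7.8628,
    "dolphin8bit.gguf": 7.8466,
}


def relative_increase(ppl, baseline=BASELINE):
    """Perplexity increase over the baseline, in percent (negative means lower)."""
    return (ppl / baseline - 1.0) * 100.0


# Print the files from lowest to highest perplexity.
for name, ppl in sorted(FINAL_PPL.items(), key=lambda item: item[1]):
    print(f"{name:18s} {ppl:12.4f} {relative_increase(ppl):+10.2f}%")
```

With these values the 2xxs quant comes out roughly 22% above the f16 baseline, while the 4-bit and larger quants are all within about 1% of it.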