nisten committed on
Commit
b20a553
1 Parent(s): 0f5691d

Update README.md

Files changed (1)
  1. README.md +58 -3
README.md CHANGED
@@ -20,9 +20,64 @@ You'll notice that these quantizations are slightly larger compared to others, b
20
  - iMatrix quantization can be applied to all k quantizations, not just the i ones.
21
  - 1bit quant gives garbage, but all else, including 2xxs, is surprisingly coherent
22
 
23
- ## TODO
24
 
25
- - Upload perplexity benchmarks of each quantization vs f16.
26
 
 
 
27
 
28
- ```./perplexity -m dolphin2m.gguf -f wiki.test.raw -ngl 34```
 
20
  - iMatrix quantization can be applied to all k quantizations, not just the i ones.
21
  - 1bit quant gives garbage, but all else, including 2xxs, is surprisingly coherent
22
 
23
+ ## Perplexity values
24
 
25
+ ```./perplexity -m dolphin2m.gguf -f wiki.test.raw -ngl 34```
26
 
27
+ ```
+ dolphinf16.gguf perplexity - [1]4.3052,[2]4.8421,[3]5.7401,[4]6.6554,[5]6.6552,[6]6.6580,[7]6.9198,[8]7.0918,[9]7.2503,[10]7.5712,[11]7.8367,[12]7.8476,
28
+ Final estimate: PPL = 7.8476 +/- 0.35984 THIS IS BASELINE
29
 
30
+ dolphin1bit.gguf perplexity - [1]59477.7292,[2]50746.4580,[3]53932.3131,[4]55797.8433,[5]45995.5032,[6]46595.4234,[7]45130.6779,[8]40769.8593,[9]41322.7842,[10]50644.7393,[11]50676.5808,[12]51939.5094,
31
+ Final estimate: PPL = 51939.5094 +/- 1339.29301 1BIT GIVES GARBAGE OUTPUT
32
+
33
+ dolphin2xxs.gguf perplexity - [1]5.4651,[2]6.7941,[3]7.8700,[4]8.7155,[5]8.3566,[6]8.3316,[7]8.6121,[8]8.7565,[9]8.9041,[10]9.3572,[11]9.6426,[12]9.5626,
34
+ Final estimate: PPL = 9.5626 +/- 0.43895 9.56 vs 7.85 at f16, means lossy but coherent
35
+
36
+ dolphin2s.gguf perplexity - [1]5.0014,[2]5.9477,[3]6.8424,[4]7.6348,[5]7.4755,[6]7.4667,[7]7.7625,[8]7.8807,[9]8.0374,[10]8.4086,[11]8.6475,[12]8.6427,
37
+ Final estimate: PPL = 8.6427 +/- 0.39501
38
+
39
+ dolphin2m.gguf perplexity - [1]4.5874,[2]5.3203,[3]6.2334,[4]7.1444,[5]7.1188,[6]7.1422,[7]7.4717,[8]7.6180,[9]7.7948,[10]8.1319,[11]8.3747,[12]8.4095,
40
+ Final estimate: PPL = 8.4095 +/- 0.38329
41
+
42
+ dolphin2k.gguf perplexity - [1]4.6331,[2]5.2648,[3]6.0493,[4]7.0165,[5]6.9300,[6]6.9177,[7]7.2362,[8]7.4417,[9]7.6292,[10]7.9640,[11]8.2121,[12]8.1930,
43
+ Final estimate: PPL = 8.1930 +/- 0.37241
44
+
45
+ dolphin2ks.gguf perplexity - [1]4.7995,[2]5.6653,[3]6.4331,[4]7.3841,[5]7.2724,[6]7.3161,[7]7.6567,[8]7.8423,[9]8.0129,[10]8.4033,[11]8.6636,[12]8.6391,
46
+ Final estimate: PPL = 8.6391 +/- 0.39315
47
+
48
+ dolphin3s.gguf perplexity - [1]4.3574,[2]4.9936,[3]5.8814,[4]6.8093,[5]6.8086,[6]6.7949,[7]7.0638,[8]7.2204,[9]7.3844,[10]7.6895,[11]7.9489,[12]7.9527,
49
+ Final estimate: PPL = 7.9527 +/- 0.36202
50
+
51
+ dolphin3xs.gguf perplexity - [1]4.3161,[2]4.9579,[3]5.8647,[4]6.8064,[5]6.7614,[6]6.7501,[7]7.0133,[8]7.2103,[9]7.3862,[10]7.7265,[11]7.9813,[12]7.9780,
52
+ Final estimate: PPL = 7.9780 +/- 0.36655
53
+
54
+ dolphin3xxs.gguf perplexity - [1]4.5418,[2]5.0902,[3]6.0117,[4]6.9852,[5]6.9329,[6]6.9165,[7]7.1853,[8]7.3359,[9]7.4923,[10]7.8122,[11]8.0696,[12]8.0592,
55
+ Final estimate: PPL = 8.0592 +/- 0.36502
56
+
57
+ dolphin3m.gguf perplexity - [1]4.3203,[2]4.9566,[3]5.8151,[4]6.7619,[5]6.7801,[6]6.7762,[7]7.0351,[8]7.2054,[9]7.3766,[10]7.6896,[11]7.9580,[12]7.9660,
58
+ Final estimate: PPL = 7.9660 +/- 0.36234
59
+
60
+ dolphin4km.gguf perplexity - [1]4.3331,[2]4.9129,[3]5.7915,[4]6.7030,[5]6.6921,[6]6.6978,[7]6.9570,[8]7.1284,[9]7.2854,[10]7.6098,[11]7.8696,[12]7.8767,
61
+ Final estimate: PPL = 7.8767 +/- 0.35875
62
+
63
+ dolphin4nl.gguf perplexity - [1]4.2682,[2]4.8494,[3]5.7530,[4]6.6890,[5]6.6672,[6]6.6637,[7]6.9332,[8]7.1126,[9]7.2821,[10]7.5998,[11]7.8733,[12]7.8875,
64
+ Final estimate: PPL = 7.8875 +/- 0.36227
65
+
66
+ dolphin4xs.gguf perplexity - [1]4.2986,[2]4.8610,[3]5.7658,[4]6.6906,[5]6.6621,[6]6.6608,[7]6.9321,[8]7.1140,[9]7.2892,[10]7.6085,[11]7.8806,[12]7.8921,
67
+ Final estimate: PPL = 7.8921 +/- 0.36258
68
+
69
+ dolphin5ks.gguf perplexity - [1]4.2557,[2]4.8249,[3]5.7413,[4]6.6671,[5]6.6611,[6]6.6686,[7]6.9389,[8]7.1079,[9]7.2707,[10]7.5962,[11]7.8529,[12]7.8627,
70
+ Final estimate: PPL = 7.8627 +/- 0.36124
71
+
72
+ dolphin5km.gguf perplexity - [1]4.3191,[2]4.8597,[3]5.7844,[4]6.7120,[5]6.6994,[6]6.6964,[7]6.9569,[8]7.1215,[9]7.2792,[10]7.6109,[11]7.8682,[12]7.8794,
73
+ Final estimate: PPL = 7.8794 +/- 0.36185
74
+
75
+ dolphin6k.gguf perplexity - [1]4.3264,[2]4.8531,[3]5.7574,[4]6.6741,[5]6.6707,[6]6.6795,[7]6.9362,[8]7.1076,[9]7.2678,[10]7.5864,[11]7.8496,[12]7.8628,
76
+ Final estimate: PPL = 7.8628 +/- 0.36075
77
+
78
+ dolphin8bit.gguf perplexity - [1]4.3063,[2]4.8463,[3]5.7347,[4]6.6499,[5]6.6471,[6]6.6531,[7]6.9160,[8]7.0899,[9]7.2509,[10]7.5705,[11]7.8357,[12]7.8466,
79
+ Final estimate: PPL = 7.8466 +/- 0.35948
80
+ ```
81
+
82
+
83
+ As we can see, the 2bit xxs quant made with this method is surprisingly coherent: its PPL is 9.56 vs 7.85 for f16 over the same 12 chunks.
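
To compare quantizations at a glance, the "Final estimate" lines can be turned into a percentage increase over the f16 baseline. The snippet below is a minimal sketch, not part of the original benchmark run: the helper names `parse_results` and `relative_increase` are made up for illustration, and the sample log is a shortened copy of three entries from the results above (per-chunk lists truncated).

```python
import re

# Shortened sample in the same layout as the results above: one model line,
# then its "Final estimate" line. Final values are copied from this README;
# the per-chunk lists are truncated for brevity.
LOG = """\
dolphinf16.gguf perplexity - [1]4.3052,[2]4.8421,
Final estimate: PPL = 7.8476 +/- 0.35984
dolphin2xxs.gguf perplexity - [1]5.4651,[2]6.7941,
Final estimate: PPL = 9.5626 +/- 0.43895
dolphin4km.gguf perplexity - [1]4.3331,[2]4.9129,
Final estimate: PPL = 7.8767 +/- 0.35875
"""

# Matches a line that starts with a .gguf file name followed by "perplexity"
# (tolerates misspellings such as "perplxity").
MODEL_RE = re.compile(r"^(\S+\.gguf)\s+perpl\w*")
# Captures the final perplexity and its reported error.
FINAL_RE = re.compile(r"Final estimate: PPL = ([\d.]+) \+/- ([\d.]+)")


def parse_results(log):
    """Return {model_file: (ppl, error)} parsed from the log text."""
    results = {}
    current = None
    for line in log.splitlines():
        model = MODEL_RE.match(line)
        if model:
            current = model.group(1)
            continue
        final = FINAL_RE.search(line)
        if final and current is not None:
            results[current] = (float(final.group(1)), float(final.group(2)))
            current = None
    return results


def relative_increase(results, baseline="dolphinf16.gguf"):
    """Return {model_file: percent PPL increase over the baseline model}."""
    base_ppl = results[baseline][0]
    return {
        name: (ppl / base_ppl - 1.0) * 100.0
        for name, (ppl, _err) in results.items()
    }


if __name__ == "__main__":
    parsed = parse_results(LOG)
    for name, pct in sorted(relative_increase(parsed).items(), key=lambda kv: kv[1]):
        ppl, err = parsed[name]
        print(f"{name:20s} PPL {ppl:8.4f} +/- {err:.5f}  ({pct:+.2f}% vs f16)")
```

On the numbers above, 2xxs comes out at roughly 22% over f16. The reported +/- of about 0.36 is larger than the gap between f16 and most of the 3bit and higher quants, so a 12-chunk run probably cannot rank those reliably.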