DavidAU committed on
Commit
c22fc73
1 Parent(s): 6282909

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -40,7 +40,7 @@ pipeline_tag: text-generation
 - New specialized quants (in addition to the new refresh/upgrades): "max, max-cpu" (will include this in the file name) for quants "Q2K" (max cpu only), "IQ4_XS", "Q6_K" and "Q8_0"
 - "MAX": output tensor / embed at float 16. (better instruction following/output generation than standard quants)
 - "MAX-CPU": output tensor / embed at bfloat 16, which forces these on to the CPU (Nvidia cards / other will vary), this frees up vram at cost of token/second and you get better instruction following/output generation too.
-- Q8_0 (Max,Max-CPU) now clocks in at almost 10 bits (average).
+- Q8_0 (Max,Max-CPU) now clocks in at 9.5 bits per weight (average).
 
 <h2>L3-Dark-Planet-8B-GGUF</h2>
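The corrected 9.5 bits/weight figure can be sanity-checked with a back-of-the-envelope estimate. This is a sketch under assumptions not stated in the commit: Llama-3-8B dimensions (vocabulary 128256, hidden size 4096, roughly 8.03B total parameters) and Q8_0's nominal 8.5 bits per weight (int8 values in blocks of 32 plus one fp16 scale per block).

```python
# Rough average bits/weight for a Q8_0 "MAX" quant, where the token-embedding
# and output tensors are kept at float16 while the rest stays Q8_0.
# All model dimensions below are assumed Llama-3-8B values, not from the card.

VOCAB = 128_256          # assumed Llama-3 vocabulary size
HIDDEN = 4_096           # assumed Llama-3-8B hidden size
TOTAL_PARAMS = 8.03e9    # assumed approximate total parameter count

# Embedding + output (lm_head) tensors stored at float16: 16 bits/weight.
f16_params = 2 * VOCAB * HIDDEN

# Everything else at Q8_0: 8-bit values plus a 16-bit scale per 32-weight
# block -> 8 + 16/32 = 8.5 bits per weight.
q8_params = TOTAL_PARAMS - f16_params

avg_bits = (f16_params * 16 + q8_params * 8.5) / TOTAL_PARAMS
print(f"average bits/weight ~ {avg_bits:.2f}")  # lands close to 9.5
```

With these assumed dimensions the estimate comes out near 9.5, consistent with the new wording and noticeably below the "almost 10 bits" claim the commit removes.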