MarsupialAI committed
Commit 0e51047
1 Parent(s): 1f787c8

Update README.md

Files changed (1):
  README.md +6 -4
README.md CHANGED
@@ -10,11 +10,13 @@ So to test this crazy theory, I downloaded Undi95/Meta-Llama-3-8B-Instruct-hf an
  - "Auto" with no outtype specified

  I then quantized each of these conversions to Q4_K_M and ran perplexity tests on everything using my abbreviated wiki.short.raw
- text file
-
- The results:
-
+ text file. The results:

+ ````
+ FP16 specified: size 14.9GB PPL @ fp16 9.5158 +/- 0.15418 PPL @ Q4km 9.6414 +/- 0.15494
+ FP32 specified: size 29.9GB PPL @ fp32 9.5158 +/- 0.15418 PPL @ Q4km 9.6278 +/- 0.15466
+ None specified: size 29.9GB PPL @ ???? 9.5158 +/- 0.15418 PPL @ Q4km 9.6278 +/- 0.15466
+ ````


  As you can see, converting to fp32 has no meaningful effect on PPL compared to converting to fp16. There will no doubt be some
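For reference, here is a minimal sketch of how a convert/quantize/perplexity comparison like the one described above is typically run with llama.cpp. The script and binary names (`convert-hf-to-gguf.py`, `quantize`, `perplexity`) vary by llama.cpp version (newer builds ship `llama-quantize` and `llama-perplexity`), and the output file names are placeholders, not the author's exact invocation:

````
# Sketch only: assumes a llama.cpp checkout with the tools built.
# wiki.short.raw is the author's abbreviated wikitext test file.

# Convert the HF model three ways: fp16, fp32, and default (no outtype).
python convert-hf-to-gguf.py Meta-Llama-3-8B-Instruct-hf --outtype f16 --outfile llama3-f16.gguf
python convert-hf-to-gguf.py Meta-Llama-3-8B-Instruct-hf --outtype f32 --outfile llama3-f32.gguf
python convert-hf-to-gguf.py Meta-Llama-3-8B-Instruct-hf --outfile llama3-auto.gguf

# Quantize each conversion to Q4_K_M.
./quantize llama3-f16.gguf  llama3-f16-Q4_K_M.gguf  Q4_K_M
./quantize llama3-f32.gguf  llama3-f32-Q4_K_M.gguf  Q4_K_M
./quantize llama3-auto.gguf llama3-auto-Q4_K_M.gguf Q4_K_M

# Run perplexity on every variant against the same text file.
for m in llama3-*.gguf; do
  ./perplexity -m "$m" -f wiki.short.raw
done
````

Because every variant is measured with the same tool and the same text file, any PPL difference can only come from the conversion dtype, which is exactly what the table in the diff compares.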