Committed by jartine
Commit ade73f1 (1 parent: de1686b)

Update README.md

Files changed (1)
  1. README.md +9 -0
README.md CHANGED
@@ -268,6 +268,15 @@ context size to be available with llamafile for any given model, you can
  pass the `-c 0` flag. The default temperature for these llamafiles is 0.
  It can be changed, e.g. `--temp 0.8`.
 
+ ## Benchmarks
+
+ | cpu\_info | model\_filename | size | test | t/s |
+ | -----------------------------------------: | ---------------------------------------: | ---------: | ------------: | --------------: |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | granite-34b-code-instruct.Q8\_0 | 33.82 GiB | pp512 | 94.34 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | granite-34b-code-instruct.Q8\_0 | 33.82 GiB | tg16 | 5.61 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | granite-34b-code-instruct.Q5\_0 | 22.03 GiB | pp512 | 95.08 |
+ | AMD Ryzen Threadripper PRO 7995WX (znver4) | granite-34b-code-instruct.Q5\_0 | 22.03 GiB | tg16 | 7.78 |
+
  ## About Quantization
 
  Our own evaluation of this model leads us to believe that it works best
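
A note on reading the benchmark rows added in this commit. The sketch below compares the two quantizations using only the figures from the table. It assumes the `test` names follow the llama-bench convention (`pp512` for prompt processing of 512 tokens, `tg16` for generation of 16 tokens), which the commit itself does not state. The helper names `speed` and `relative_speed` are hypothetical and exist only for this illustration.

```python
# Figures copied from the benchmark table in this commit.
# Each row: (quantization, size in GiB, test name, tokens per second).
ROWS = [
    ("Q8_0", 33.82, "pp512", 94.34),
    ("Q8_0", 33.82, "tg16", 5.61),
    ("Q5_0", 22.03, "pp512", 95.08),
    ("Q5_0", 22.03, "tg16", 7.78),
]


def speed(quant: str, test: str) -> float:
    """Return tokens per second for one (quantization, test) pair."""
    for row_quant, _size, row_test, tokens_per_second in ROWS:
        if row_quant == quant and row_test == test:
            return tokens_per_second
    raise KeyError(f"no row for {quant} / {test}")


def relative_speed(test: str, quant: str = "Q5_0", baseline: str = "Q8_0") -> float:
    """Throughput of `quant` divided by throughput of `baseline` on `test`."""
    return speed(quant, test) / speed(baseline, test)


# File size of Q8_0 relative to Q5_0.
size_ratio = 33.82 / 22.03

if __name__ == "__main__":
    print(f"tg16  Q5_0 vs Q8_0: {relative_speed('tg16'):.2f}x")
    print(f"pp512 Q5_0 vs Q8_0: {relative_speed('pp512'):.2f}x")
    print(f"size  Q8_0 vs Q5_0: {size_ratio:.2f}x")
```

On these numbers, Q5\_0 generates tokens roughly 1.39 times as fast as Q8\_0, while prompt processing speed is nearly identical (about 1.01 times), and the Q8\_0 file is about 1.54 times the size of the Q5\_0 file. These ratios describe this one CPU only and say nothing about output quality.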