mobicham committed on
Commit
7fd9eec
1 Parent(s): 247e309

correct decoding time

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -19,8 +19,8 @@ This is an <a href="https://github.com/mobiusml/hqq/">HQQ</a> all 4-bit (group-s
  ## Model Decoding Speed
  | Models | fp16| HQQ 4-bit/gs-64|
  |:-------------------:|:--------:|:----------------:|
- | Decoding - short seq (tokens/sec)| 10.5 (tokens/sec)** | 10.7 (tokens/sec)* |
- | Decoding - long seq (tokens/sec)| 9.5 (tokens/sec)** | 9.7 (tokens/sec)*|
+ | Decoding - short seq (tokens/sec)| 10.5 (tokens/sec)** | 23 (tokens/sec)* |
+ | Decoding - long seq (tokens/sec)| 9.5 (tokens/sec)** | 19 (tokens/sec)*|
 
  **: 2xA100 80GB<br>
  *: 1xA100 80GB