Update README.md
README.md
CHANGED
@@ -21,17 +21,19 @@ Further details (performance, usage, etc.) should refer to GitHub project page:
 
 Metric: PPL, **lower is better**
 
-
-
-
-
-
-
-
-
-
-
-
+Model names with the `-im` suffix denote quantizations generated with an importance matrix, which generally perform better.
+
+| Quant | Size     |                 PPL |             PPL (`-im`) |
+| :---: | -------: | ------------------: | ----------------------: |
+| Q2_K  |  2.96 GB | 17.7212 +/- 0.59814 | **14.9583 +/- 0.50455** |
+| Q3_K  |  3.74 GB |  8.6303 +/- 0.28481 |  **8.4423 +/- 0.28087** |
+| Q4_0  |  4.34 GB |  8.2513 +/- 0.27102 |  **7.9077 +/- 0.25525** |
+| Q4_K  |  4.58 GB |  7.8897 +/- 0.25830 |  **7.8279 +/- 0.25542** |
+| Q5_0  |  5.21 GB |  7.7975 +/- 0.25639 |  **7.7724 +/- 0.25625** |
+| Q5_K  |  5.34 GB |  7.7062 +/- 0.25218 |  **7.6902 +/- 0.25170** |
+| Q6_K  |  6.14 GB |  7.6600 +/- 0.25043 |  **7.6412 +/- 0.24949** |
+| Q8_0  |  7.95 GB |  7.6512 +/- 0.25064 |      7.6512 +/- 0.25064 |
+| F16   | 14.97 GB |  7.6389 +/- 0.25001 |                     N/A |
 
 ## Others
 
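
The PPL table added in this change is easier to compare once the raw numbers are turned into relative differences. The following is a minimal Python sketch, not part of the commit: the helper names (`im_gain_percent`, `gap_to_f16_percent`, `summarize`) are hypothetical, and the values are copied by hand from the table. It reports, for each quant, how much the `-im` variant lowers PPL and how far the `-im` variant remains from the F16 baseline.

```python
# Values copied from the PPL table in this diff: (size in GB, PPL, PPL of the -im variant).
# F16 is the unquantized baseline and has no -im variant in the table.
ROWS = {
    "Q2_K": (2.96, 17.7212, 14.9583),
    "Q3_K": (3.74, 8.6303, 8.4423),
    "Q4_0": (4.34, 8.2513, 7.9077),
    "Q4_K": (4.58, 7.8897, 7.8279),
    "Q5_0": (5.21, 7.7975, 7.7724),
    "Q5_K": (5.34, 7.7062, 7.6902),
    "Q6_K": (6.14, 7.6600, 7.6412),
    "Q8_0": (7.95, 7.6512, 7.6512),
}
F16_PPL = 7.6389


def im_gain_percent(ppl, ppl_im):
    """Relative PPL reduction of the -im variant over the plain quant, in percent."""
    return (ppl - ppl_im) / ppl * 100.0


def gap_to_f16_percent(ppl):
    """How far a PPL value sits above the F16 baseline, in percent."""
    return (ppl - F16_PPL) / F16_PPL * 100.0


def summarize():
    """Map each quant to (gain from -im in %, remaining gap of -im to F16 in %)."""
    return {
        name: (
            round(im_gain_percent(ppl, ppl_im), 2),
            round(gap_to_f16_percent(ppl_im), 2),
        )
        for name, (_size_gb, ppl, ppl_im) in ROWS.items()
    }


if __name__ == "__main__":
    print(f"{'Quant':<6} {'-im gain %':>11} {'gap to F16 %':>13}")
    for name, (gain, gap) in summarize().items():
        print(f"{name:<6} {gain:>11.2f} {gap:>13.2f}")
```

The sketch uses only the point estimates. For the higher-bit quants the difference between the two PPL columns is smaller than the reported +/- interval, so those gains should be read with caution.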