Commit 3d47c83 by lavawolfiee (1 parent: 55e801c)
Update README.md

README.md CHANGED
@@ -10,4 +10,6 @@ library_name: transformers
 tags:
 - mixtral
 - text-generation-inference
----
+---
+Attention quantization: HQQ 4-bit, groupsize 64, compress zero, compress scale with groupsize 256 \
+Experts quantization: HQQ 2-bit, groupsize 16, compress zero, compress scale with groupsize 128
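To give a rough sense of what the two settings added in this commit imply for storage, here is a minimal sketch that estimates the effective bits per weight for each. It does not call the HQQ library; `QuantSpec` and `bits_per_weight` are hypothetical names introduced for illustration. It assumes one scale and one zero point per weight group, that "compress" means storing those values at 8 bits, that compressed scales carry an fp16 scale and zero per scale group, and that the extra metadata for compressed zero points is negligible. None of these details are stated in the README itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantSpec:
    """Hypothetical container for one line of the README's quantization settings."""
    nbits: int             # bits per quantized weight
    group_size: int        # weights sharing one scale and one zero point
    scale_group_size: int  # group size used when compressing the scales
    meta_nbits: int = 8    # ASSUMPTION: compressed scales/zeros are stored at 8 bits
    fp_bits: int = 16      # ASSUMPTION: second-level metadata is kept in fp16


def bits_per_weight(spec: QuantSpec) -> float:
    """Estimate storage cost per weight, including per-group metadata."""
    # One compressed scale per weight group, plus an fp scale and fp zero
    # shared by every `scale_group_size` compressed scales.
    scale_bits = spec.meta_nbits + 2 * spec.fp_bits / spec.scale_group_size
    # ASSUMPTION: zero points are compressed with negligible second-level overhead.
    zero_bits = spec.meta_nbits
    return spec.nbits + (scale_bits + zero_bits) / spec.group_size


# Values taken from the two README lines added in this commit.
ATTENTION = QuantSpec(nbits=4, group_size=64, scale_group_size=256)
EXPERTS = QuantSpec(nbits=2, group_size=16, scale_group_size=128)

if __name__ == "__main__":
    for name, spec in (("attention", ATTENTION), ("experts", EXPERTS)):
        print(f"{name}: ~{bits_per_weight(spec):.3f} bits per weight")
```

Under these assumptions the small expert group size (16) makes metadata a large share of the total, which is presumably why the scales and zero points are compressed rather than kept in full precision.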