Update README.md
README.md
CHANGED
@@ -3,8 +3,8 @@ license: other
 ---
 5 bit quantization of airoboros 70b 1.4.1 (https://huggingface.co/jondurbin/airoboros-l2-70b-gpt4-1.4.1), using exllama2.
 
-
+Update 21/09/23
 
-
+Re-quanted with latest exllamav2 version, which fixed some measurement issues.
 
-
+Also, now 5bpw works on 2x24GB VRAM cards, using gpu_split 21,21 and flash-attn (only Linux for now), for 4096 context and 1 GB to spare, to try for more.
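
As a rough sanity check on the 2x24GB claim, the weight footprint at 5 bpw can be estimated with a few lines of Python. This is only a back-of-the-envelope sketch, not something taken from exllamav2: the parameter count (about 69e9 for a Llama-2-70B-class model) is an assumption, the helper `weight_gib` is hypothetical, and the estimate ignores the KV cache, activations, and any per-layer overhead.

```python
# Back-of-the-envelope size of quantized weights (sketch, assumptions labeled).
GIB = 1024 ** 3


def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB (hypothetical helper)."""
    return n_params * bits_per_weight / 8 / GIB


N_PARAMS = 69e9          # assumption: approximate parameter count, not from the README
BPW = 5.0                # from the README: 5 bpw quantization
GPU_SPLIT = [21, 21]     # from the README: gpu_split 21,21

weights = weight_gib(N_PARAMS, BPW)
budget = sum(GPU_SPLIT)

print(f"estimated weights: {weights:.1f} GiB")
print(f"gpu_split budget:  {budget} (sum of the two split values)")
print(f"left for cache and activations under this estimate: {budget - weights:.1f}")
```

Under these assumptions the weights alone come to roughly 40 GiB, which is consistent with the split above leaving only a small margin for a 4096-token context.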