Panchovix committed on
Commit
64d454b
1 Parent(s): 1f50f54

Update README.md

Files changed (1)
  1. README.md +5 -1
README.md CHANGED
@@ -3,4 +3,8 @@ license: other
  ---
  5 bit quantization of airoboros 70b 1.4.1, using exllama2.
 
- On 2x4090, 3072 ctx seems to work fine with 21.5,22.5 gpu_split and max_attention_size = 1024 ** 2 instead of 2048 ** 2.
+ On 2x4090, 3072 ctx seems to work fine with 21.5,22.5 gpu_split and max_attention_size = 1024 ** 2 instead of 2048 ** 2.
+
+ 4096 ctx may be feasible on a single 48GB VRAM GPU (like A6000).
+
+ Tests are welcome.
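
For readers who want to sanity-check the numbers in the README line above, here is a minimal Python sketch of the arithmetic. It does not call exllama2; the helper `parse_gpu_split` is a hypothetical name, and reading the `gpu_split` values as per-GPU VRAM budgets in GB is an assumption based on how that setting is commonly described.

```python
def parse_gpu_split(spec: str) -> list[float]:
    """Parse a comma-separated list of per-GPU budgets, e.g. "21.5,22.5".

    Hypothetical helper for illustration; the values are assumed to be GB of VRAM.
    """
    return [float(part) for part in spec.split(",") if part.strip()]


# Values quoted in the README.
gpu_split = parse_gpu_split("21.5,22.5")
total_gb = sum(gpu_split)

reduced_attention = 1024 ** 2   # value the README reports as working
replaced_attention = 2048 ** 2  # value the README says it replaces
ratio = replaced_attention // reduced_attention

print(f"per-GPU budget (GB): {gpu_split}, total: {total_gb}")
print(f"max_attention_size: {reduced_attention} vs {replaced_attention} ({ratio}x smaller)")
```

The split leaves part of each 24 GB card unallocated, and the lower max_attention_size is a quarter of the value it replaces, which is consistent with the README's report that 3072 ctx fits on 2x4090.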