kaiokendev committed on
Commit: b9ae2ab
1 Parent(s): ca2e92e

Update README.md

Files changed (1)
  1. README.md +5 -2
README.md CHANGED
```diff
@@ -7,12 +7,15 @@ license: mit
 This is a second prototype of SuperHOT, this time 30B with 8K context and no RLHF, using the same technique described in [the github blog](https://kaiokendev.github.io/til#extending-context-to-8k).
 Tests have shown that the model does indeed leverage the extended context at 8K.
 
-You will need to **use either the monkeypatch** or, if you are already using the monkeypatch, **change the scaling factor to 0.25 and the maximum sequence length to 8192**
-
 #### Looking for Merged & Quantized Models?
 - 30B 4-bit CUDA: [tmpupload/superhot-30b-8k-4bit-safetensors](https://huggingface.co/tmpupload/superhot-30b-8k-4bit-safetensors)
 - 30B 4-bit CUDA 128g: [tmpupload/superhot-30b-8k-4bit-128g-safetensors](https://huggingface.co/tmpupload/superhot-30b-8k-4bit-128g-safetensors)
 
+#### Using the monkey-patch?
+You will **NEED** to **apply the monkeypatch** or, if you are already using the monkeypatch, **change the scaling factor to 0.25 and the maximum sequence length to 8192**
+
+#### Using Oobabooga with Exllama?
+- `python server.py --max_seq_len 8192 --compress_pos_emb 4 --loader exllama_hf`
+
 
 #### Training Details
 I trained the LoRA with the following configuration:
```
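
For context on what the scaling factor and maximum sequence length control, below is a minimal sketch of the position-interpolation idea described in the linked blog post, written against a LLaMA-style rotary embedding in PyTorch. The class name and constructor arguments are illustrative; this is not the actual monkeypatch distributed with the model, which should be used instead.

```python
import torch


class ScaledRotaryEmbedding(torch.nn.Module):
    """Rotary embedding with interpolated positions.

    Illustrative sketch only, not the repository's actual monkeypatch.
    """

    def __init__(self, dim, max_position_embeddings=8192, base=10000, scale=0.25, device=None):
        super().__init__()
        # Standard RoPE inverse frequencies.
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, device=device).float() / dim))
        self.register_buffer("inv_freq", inv_freq)
        # Scale the position indices so an 8192-token window maps back onto
        # the original pre-training range: 8192 * 0.25 == 2048.
        t = torch.arange(max_position_embeddings, device=device, dtype=torch.float32) * scale
        freqs = torch.einsum("i,j->ij", t, self.inv_freq)
        emb = torch.cat((freqs, freqs), dim=-1)
        self.register_buffer("cos_cached", emb.cos()[None, None, :, :])
        self.register_buffer("sin_cached", emb.sin()[None, None, :, :])

    def forward(self, x, seq_len):
        # Return the cos/sin tables truncated to the current sequence length.
        return (
            self.cos_cached[:, :, :seq_len, ...].to(dtype=x.dtype),
            self.sin_cached[:, :, :seq_len, ...].to(dtype=x.dtype),
        )
```

A patch of this kind would swap in for the model's rotary-embedding module before loading; if you already apply such a patch, the commit's note amounts to setting its scaling factor to 0.25 and its maximum sequence length to 8192.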