kaiokendev committed b9ae2ab (parent: ca2e92e): Update README.md

README.md
@@ -7,12 +7,15 @@ license: mit
 This is a second prototype of SuperHOT, this time 30B with 8K context and no RLHF, using the same technique described in [the github blog](https://kaiokendev.github.io/til#extending-context-to-8k).
 Tests have shown that the model does indeed leverage the extended context at 8K.
 
-You will need to **use either the monkeypatch** or, if you are already using the monkeypatch, **change the scaling factor to 0.25 and the maximum sequence length to 8192**
-
 #### Looking for Merged & Quantized Models?
 - 30B 4-bit CUDA: [tmpupload/superhot-30b-8k-4bit-safetensors](https://huggingface.co/tmpupload/superhot-30b-8k-4bit-safetensors)
 - 30B 4-bit CUDA 128g: [tmpupload/superhot-30b-8k-4bit-128g-safetensors](https://huggingface.co/tmpupload/superhot-30b-8k-4bit-128g-safetensors)
 
+#### Using the monkey-patch?
+You will **NEED** to **apply the monkeypatch** or, if you are already using the monkeypatch, **change the scaling factor to 0.25 and the maximum sequence length to 8192**
+
+#### Using Oobabooga with Exllama?
+- `python server.py --max_seq_len 8192 --compress_pos_emb 4 --loader exllama_hf`
 
 #### Training Details
 I trained the LoRA with the following configuration:
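
The monkeypatch itself is not part of this commit, so the following is only a minimal sketch of the technique it applies (linear position interpolation, per the linked blog post). The class name `ScaledRotaryEmbedding` and its interface are hypothetical, loosely mirroring a LLaMA-style rotary-embedding module; the one substantive change is the line where position indices are multiplied by the 0.25 scaling factor.

```python
import torch


class ScaledRotaryEmbedding(torch.nn.Module):
    """Hypothetical sketch of rotary embeddings with linear position
    interpolation, the technique the monkeypatch applies."""

    def __init__(self, dim, max_position_embeddings=8192, scale=0.25, base=10000):
        super().__init__()
        # Standard RoPE inverse frequencies.
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        # The one meaningful change: compress position indices by the
        # scaling factor (0.25) before building the cos/sin cache, so
        # 8192 positions span the base model's original 0..2048 range.
        t = torch.arange(max_position_embeddings).float() * scale
        freqs = torch.outer(t, inv_freq)
        emb = torch.cat((freqs, freqs), dim=-1)
        self.register_buffer("cos_cached", emb.cos(), persistent=False)
        self.register_buffer("sin_cached", emb.sin(), persistent=False)

    def forward(self, x, seq_len):
        # Return the cos/sin tables for the first `seq_len` positions,
        # cast to the dtype of the incoming hidden states.
        return (self.cos_cached[:seq_len].to(x.dtype),
                self.sin_cached[:seq_len].to(x.dtype))
```

With `scale=0.25` and `max_position_embeddings=8192`, the last position (8191) is embedded as if it were position ~2048, which is why the scaling factor and the maximum sequence length have to change together.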
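To connect the two new sections: as far as I can tell from Oobabooga's loader options, `--compress_pos_emb 4` expresses the same interpolation as the monkeypatch's 0.25 scaling factor (it is the reciprocal, dividing positions by 4 rather than multiplying by 1/4), and `--max_seq_len 8192` mirrors the 8192 maximum sequence length.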