v2ray committed on
Commit a8d1b89 · verified · 1 Parent(s): d6627ad

Uploaded better trained version.
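
The README change in this commit raises the training length from 800 steps (about 1 epoch) to 4000 steps (about 5 epochs) at a global batch size of 64 on 8x H100. As a quick sanity check, the arithmetic below shows that both descriptions imply the same rough dataset size. It assumes one optimizer step consumes exactly one global batch of training sequences; the variable names are illustrative, and since the epoch counts are stated as approximate, the result is only an estimate.

```python
# Figures taken from the README diff in this commit.
GLOBAL_BATCH_SIZE = 64
NUM_GPUS = 8


def samples_seen(steps, global_batch_size=GLOBAL_BATCH_SIZE):
    """Sequences consumed after `steps` optimizer steps (assumes one global batch per step)."""
    return steps * global_batch_size


old_samples = samples_seen(800)   # previous upload, ~1 epoch
new_samples = samples_seen(4000)  # this upload, ~5 epochs

# Dataset size implied by each (approximate) epoch count.
implied_epoch_old = old_samples / 1
implied_epoch_new = new_samples / 5

# Sequences per GPU per step. How this splits into micro-batches and
# gradient accumulation is not stated in the README.
per_gpu = GLOBAL_BATCH_SIZE // NUM_GPUS

print(old_samples, new_samples, implied_epoch_old, implied_epoch_new, per_gpu)
```

Both runs point to roughly 51,200 sequences per epoch, so the new figures are consistent with the old ones: five times the steps, five times the epochs.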

README.md CHANGED
@@ -10,9 +10,11 @@ pipeline_tag: text-generation
  library_name: transformers
  ---
  # GPT4chan 24B

  This model is [mistralai/Mistral-Small-24B-Base-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Base-2501) merged with [v2ray/GPT4chan-24B-QLoRA](https://huggingface.co/v2ray/GPT4chan-24B-QLoRA).

- Trained using 8x H100 with global batch size 64, using 2e-4 learning rate, for 800 steps, which is approximately 1 epoch.
  ## Prompt Format
  ```
  board<|start_header_id|>id<|end_header_id|>content<|start_header_id|>id<|end_header_id|>content...<|start_header_id|>id<|end_header_id|>

  library_name: transformers
  ---
  # GPT4chan 24B
+ ![GPT4chan Banner](https://huggingface.co/v2ray/GPT4chan-24B-QLoRA/resolve/main/images/banner.avif)
+
  This model is [mistralai/Mistral-Small-24B-Base-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Base-2501) merged with [v2ray/GPT4chan-24B-QLoRA](https://huggingface.co/v2ray/GPT4chan-24B-QLoRA).

+ Trained using 8x H100 with global batch size 64, using 2e-4 learning rate, for 4000 steps, which is approximately 5 epochs.
  ## Prompt Format
  ```
  board<|start_header_id|>id<|end_header_id|>content<|start_header_id|>id<|end_header_id|>content...<|start_header_id|>id<|end_header_id|>
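
The prompt format shown in the README context lines can be assembled mechanically: the board name, then each post as a header-wrapped id followed by its content, then one trailing header for the post the model should write. The sketch below is one way to do that. `build_prompt`, the board name `"g"`, and the id values are hypothetical; the visible README excerpt does not say what the ids look like or whether any separators are expected, so treat this as an illustration of the layout only.

```python
START = "<|start_header_id|>"
END = "<|end_header_id|>"


def build_prompt(board, posts, next_id):
    """Assemble a prompt in the layout shown in the README.

    board   -- board name placed at the very start of the prompt
    posts   -- iterable of (post_id, content) pairs already in the thread
    next_id -- id for the post the model is asked to generate
    """
    parts = [board]
    for post_id, content in posts:
        parts.append(f"{START}{post_id}{END}{content}")
    # The trailing header is left open so generation continues with content.
    parts.append(f"{START}{next_id}{END}")
    return "".join(parts)


# Hypothetical values, for illustration only.
prompt = build_prompt("g", [("1", "first post"), ("2", "a reply")], "3")
print(prompt)
```

Leaving the final header without content mirrors the README layout, which ends on `<|end_header_id|>`.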
model-00001-of-00005.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:238b79ec430b6a38102cdb646116189ac45a08e9941952b0524319c00450363e
  size 9898729408

  version https://git-lfs.github.com/spec/v1
+ oid sha256:41cd0718baa0ae975171e5b50ad14a8ca4ffaeb1e59d4c882a9d56dcc953ee4b
  size 9898729408
model-00002-of-00005.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:224400892a8ad196ee01ec1511e5cfd73a6cc57e161ded8b22be316287092a8f
  size 9668064384

  version https://git-lfs.github.com/spec/v1
+ oid sha256:a2ee0bfa89960973d0847f66143a758f242876312b844f9d047f67c92c774650
  size 9668064384
model-00003-of-00005.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:3178a006772bbf5de3755e368f54c0c93183923dbda13bcc0d6e5c4be4699590
  size 9668064400

  version https://git-lfs.github.com/spec/v1
+ oid sha256:01308b5bb3f5f86f97dd3e67388d86197b2fc31561a9da6cfa42a346fa036657
  size 9668064400
model-00004-of-00005.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:3b084372d1d3d9e4fe2e2dabb9cb8517efb894dc113f053ba92f60d7b5900f8f
  size 9961665680

  version https://git-lfs.github.com/spec/v1
+ oid sha256:477879690661c77bfb25cac5f024ff30c27d274cc087b45ca0f22ef8aa9d0cd9
  size 9961665680
model-00005-of-00005.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:50f5c63282d89cf85efddba394c23647493cc0f3cf1b5db54fdf057608f089ab
  size 7948365856

  version https://git-lfs.github.com/spec/v1
+ oid sha256:348f10663354eda68d077984ee70059172ee7c675e5799ed923fc47b2c15763d
  size 7948365856
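
Each safetensors entry above is a Git LFS pointer file: three `key value` lines giving the spec version, a SHA-256 object id, and the object size in bytes. In every shard the size is unchanged while the oid differs, which is what one would expect when weights are retrained without changing tensor shapes. A minimal sketch of reading such a pointer and totalling the shard sizes follows; `parse_lfs_pointer` is a hypothetical helper, and it handles only the three keys that appear in this diff.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# New pointer for model-00001-of-00005.safetensors, copied from the diff.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:41cd0718baa0ae975171e5b50ad14a8ca4ffaeb1e59d4c882a9d56dcc953ee4b
size 9898729408
"""
info = parse_lfs_pointer(pointer)

# Sizes of all five shards as listed in the diff (identical before and after).
shard_sizes = [9898729408, 9668064384, 9668064400, 9961665680, 7948365856]
total_bytes = sum(shard_sizes)

print(info["algo"], info["size"], total_bytes)
```

The five shards total 47,144,889,728 bytes, or about 47.1 GB.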