Upload 9 files
- .gitattributes +5 -0
- Faraday Model Repository Banner.png +0 -0
- README.md +45 -0
- Senku-70B-Full.IQ2_XS.gguf +3 -0
- Senku-70B-Full.IQ2_XXS.gguf +3 -0
- Senku-70B-Full.IQ3_XS.gguf +3 -0
- Senku-70B-Full.IQ3_XXS.gguf +3 -0
- Senku-70B-Full.imatrix +3 -0
- faraday-logo.png +0 -0
- main.log +86 -0
.gitattributes
CHANGED
@@ -33,3 +33,8 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
|
33 |
*.zip filter=lfs diff=lfs merge=lfs -text
|
34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
36 |
+
Senku-70B-Full.imatrix filter=lfs diff=lfs merge=lfs -text
|
37 |
+
Senku-70B-Full.IQ2_XS.gguf filter=lfs diff=lfs merge=lfs -text
|
38 |
+
Senku-70B-Full.IQ2_XXS.gguf filter=lfs diff=lfs merge=lfs -text
|
39 |
+
Senku-70B-Full.IQ3_XS.gguf filter=lfs diff=lfs merge=lfs -text
|
40 |
+
Senku-70B-Full.IQ3_XXS.gguf filter=lfs diff=lfs merge=lfs -text
|
Faraday Model Repository Banner.png
ADDED
README.md
ADDED
@@ -0,0 +1,45 @@
|
1 |
+
---
|
2 |
+
base_model: ShinojiResearch/Senku-70B-Full
|
3 |
+
license: other
|
4 |
+
language:
|
5 |
+
- en
|
6 |
+
library_name: transformers
|
7 |
+
pipeline_tag: text-generation
|
8 |
+
quantized_by: brooketh
|
9 |
+
tags:
|
10 |
+
- roleplay
|
11 |
+
- text-generation-inference
|
12 |
+
---
|
13 |
+
<img src="Faraday Model Repository Banner.png" alt="Faraday.dev" style="height: 90px; min-width: 32px; display: block; margin: auto;">
|
14 |
+
|
15 |
+
<p style="text-align: center;"><strong>The official library of GGUF format models for use in the local AI chat app, Faraday.dev.</strong></p>
|
16 |
+
|
17 |
+
<p style="text-align: center;"><a href="https://faraday.dev/">Download Faraday here to get started.</a></p>
|
18 |
+
|
19 |
+
<p style="text-align: center;"><a href="https://www.reddit.com/r/LLM_Quants/">Request additional models at r/LLM_Quants.</a></p>
|
20 |
+
|
21 |
+
***
|
22 |
+
# Senku 70B Full
|
23 |
+
- **Creator:** [ShinojiResearch](https://huggingface.co/ShinojiResearch/)
|
24 |
+
- **Original:** [Senku 70B Full](https://huggingface.co/ShinojiResearch/Senku-70B-Full)
|
25 |
+
- **Date Created:** 2024-02-06
|
26 |
+
- **Trained Context:** 8192 tokens
|
27 |
+
- **Description:** Finetune of Mistral-70B on the SlimOrca dataset. Exceptional at roleplay, with the highest EQ-Bench scores to date. Recommended for use with the ChatML prompt format.
|
28 |
+
|
29 |
+
## What is a GGUF?
|
30 |
+
GGUF is a file format for large language models (LLMs) that lets a model's layers be split between CPU and GPU. GGUFs are compatible with applications based on llama.cpp, such as Faraday.dev. Where other model formats require higher-end GPUs with ample VRAM, GGUFs can run efficiently on a wider variety of hardware.
|
31 |
+
GGUF models are quantized to reduce resource usage, at the cost of reduced coherence at lower bit widths. Quantization lowers the precision of the model weights by reducing the number of bits used to store each weight.
|
32 |
+
|
33 |
+
***
|
34 |
+
<img src="faraday-logo.png" alt="Faraday.dev" style="height: 75px; min-width: 32px; display: block; margin-right: auto;">
|
35 |
+
|
36 |
+
## Faraday.dev
|
37 |
+
- Free, local AI chat application.
|
38 |
+
- One-click installation on Mac and PC.
|
39 |
+
- Automatically uses the GPU for maximum speed.
|
40 |
+
- Built-in model manager.
|
41 |
+
- High-quality character hub.
|
42 |
+
- Zero-config desktop-to-mobile tethering.
|
43 |
+
Faraday makes it easy to start chatting with AI using your own characters or one of the many found in the built-in character hub. The model manager helps you find the latest and greatest models without worrying about whether they are in the correct format. Faraday supports advanced features such as lorebooks, author's note, text formatting, custom context size, sampler settings, grammars, local TTS, cloud inference, and tethering, all implemented in a way that is straightforward and reliable.
|
44 |
+
**Join us on [Discord](https://discord.gg/SyNN2vC9tQ)**
|
45 |
+
***
|
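The README's statement that quantization lowers the precision of each weight can be illustrated with a minimal sketch. It assumes plain symmetric round-to-nearest quantization with one shared scale per block, which is not the IQ2/IQ3 scheme used for the files in this commit. The function names and sample weights are hypothetical.

```python
def quantize_block(weights, bits):
    """Map floats to signed integer codes of the given bit width.

    Illustrative only: one shared scale per block, round-to-nearest.
    """
    qmax = 2 ** (bits - 1) - 1                 # largest code, e.g. 7 for 4 bits
    peak = max(abs(w) for w in weights)
    scale = peak / qmax if peak else 1.0       # avoid dividing by zero
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_block(codes, scale):
    """Recover approximate floats from integer codes."""
    return [c * scale for c in codes]


def max_error(weights, bits):
    """Largest absolute difference after a quantize/dequantize round trip."""
    codes, scale = quantize_block(weights, bits)
    restored = dequantize_block(codes, scale)
    return max(abs(a - b) for a, b in zip(weights, restored))


# Hypothetical weights, chosen only to show the trend.
weights = [0.12, -0.53, 0.98, -0.07, 0.31, -0.88, 0.45, 0.02]
for bits in (8, 4, 2):
    print(f"{bits} bits: max round-trip error = {max_error(weights, bits):.4f}")
```

Fewer bits leave fewer representable values, so the round-trip error grows. That growth is the coherence tradeoff the README describes.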
Senku-70B-Full.IQ2_XS.gguf
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:5e23a3158cef32db850ad6ea05f2e82ef2621b5e9c4fb48bf4f34105545ecfcf
|
3 |
+
size 20334163520
|
Senku-70B-Full.IQ2_XXS.gguf
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:eaad5135beebc2e8c9fc7f98010e79dc9b9c297c656d71ca1c3f683647213f00
|
3 |
+
size 18289440320
|
Senku-70B-Full.IQ3_XS.gguf
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:5adabd1dba5119269b496098658f35088b018c055ef8d0d570632752fe8bbd3f
|
3 |
+
size 28314973760
|
Senku-70B-Full.IQ3_XXS.gguf
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:0464fd90a3c886979a553a2ed812848f9652f5f4d5c8ad73dcef60c6ea104a2b
|
3 |
+
size 26581464640
|
Senku-70B-Full.imatrix
ADDED
@@ -0,0 +1,3 @@
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:e722beb9027460e50a1baf264c16a5d4f9c43c6be2e02d2df152c2e6b0873c23
|
3 |
+
size 24922254
|
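Each pointer file above records the SHA-256 digest (`oid`) and byte size of the real file, so a finished download can be checked against it. The following is a minimal sketch using only the Python standard library. The helper names `parse_lfs_pointer` and `verify_download` are hypothetical and are not part of Git LFS or any Hugging Face tooling.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def verify_download(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` matches the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False                            # cheap check before hashing
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Hash in chunks so multi-gigabyte files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer text copied from the Senku-70B-Full.IQ2_XS.gguf entry in this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:5e23a3158cef32db850ad6ea05f2e82ef2621b5e9c4fb48bf4f34105545ecfcf
size 20334163520
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

With the file downloaded, `verify_download("Senku-70B-Full.IQ2_XS.gguf", pointer)` would report whether it is intact. The size comparison runs first because hashing a file of roughly 20 GB takes noticeable time.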
faraday-logo.png
ADDED
main.log
ADDED
@@ -0,0 +1,86 @@
|
1 |
+
[1712111096] Log start
|
2 |
+
[1712111096] Cmd: c:\Apps\Toaster\bin\main.exe -m Senku-70B-Full.IQ2_XS.gguf
|
3 |
+
[1712111096] main: build = 2589 (bdf85d09)
|
4 |
+
[1712111096] main: built with MSVC 19.39.33523.0 for x64
|
5 |
+
[1712111096] main: seed = 1712111096
|
6 |
+
[1712111096] main: llama backend init
|
7 |
+
[1712111096] main: load the model and apply lora adapter, if any
|
8 |
+
[1712111096] llama_model_loader: loaded meta data with 24 key-value pairs and 723 tensors from Senku-70B-Full.IQ2_XS.gguf (version GGUF V3 (latest))
|
9 |
+
[1712111096] llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
|
10 |
+
[1712111096] llama_model_loader: - kv 0: general.architecture str = llama
|
11 |
+
[1712111096] llama_model_loader: - kv 1: general.name str = models
|
12 |
+
[1712111096] llama_model_loader: - kv 2: llama.vocab_size u32 = 32000
|
13 |
+
[1712111096] llama_model_loader: - kv 3: llama.context_length u32 = 32764
|
14 |
+
[1712111096] llama_model_loader: - kv 4: llama.embedding_length u32 = 8192
|
15 |
+
[1712111096] llama_model_loader: - kv 5: llama.block_count u32 = 80
|
16 |
+
[1712111096] llama_model_loader: - kv 6: llama.feed_forward_length u32 = 28672
|
17 |
+
[1712111096] llama_model_loader: - kv 7: llama.rope.dimension_count u32 = 128
|
18 |
+
[1712111096] llama_model_loader: - kv 8: llama.attention.head_count u32 = 64
|
19 |
+
[1712111096] llama_model_loader: - kv 9: llama.attention.head_count_kv u32 = 8
|
20 |
+
[1712111096] llama_model_loader: - kv 10: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
|
21 |
+
[1712111096] llama_model_loader: - kv 11: llama.rope.freq_base f32 = 1000000.000000
|
22 |
+
[1712111096] llama_model_loader: - kv 12: general.file_type u32 = 20
|
23 |
+
[1712111096] llama_model_loader: - kv 13: tokenizer.ggml.model str = llama
|
24 |
+
[1712111096] llama_model_loader: - kv 14: tokenizer.ggml.tokens arr[str,32000] = ["<unk>", "<s>", "</s>", "<0x00>", "<...
|
25 |
+
[1712111096] llama_model_loader: - kv 15: tokenizer.ggml.scores arr[f32,32000] = [0.000000, 0.000000, 0.000000, 0.0000...
|
26 |
+
[1712111096] llama_model_loader: - kv 16: tokenizer.ggml.token_type arr[i32,32000] = [2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
|
27 |
+
[1712111096] llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 1
|
28 |
+
[1712111096] llama_model_loader: - kv 18: tokenizer.ggml.eos_token_id u32 = 2
|
29 |
+
[1712111096] llama_model_loader: - kv 19: tokenizer.ggml.padding_token_id u32 = 0
|
30 |
+
[1712111096] llama_model_loader: - kv 20: tokenizer.ggml.add_bos_token bool = true
|
31 |
+
[1712111096] llama_model_loader: - kv 21: tokenizer.ggml.add_eos_token bool = false
|
32 |
+
[1712111096] llama_model_loader: - kv 22: tokenizer.chat_template str = {{ bos_token }}{% for message in mess...
|
33 |
+
[1712111096] llama_model_loader: - kv 23: general.quantization_version u32 = 2
|
34 |
+
[1712111096] llama_model_loader: - type f32: 161 tensors
|
35 |
+
[1712111096] llama_model_loader: - type q2_K: 11 tensors
|
36 |
+
[1712111096] llama_model_loader: - type q4_K: 80 tensors
|
37 |
+
[1712111096] llama_model_loader: - type q5_K: 1 tensors
|
38 |
+
[1712111096] llama_model_loader: - type iq2_xs: 470 tensors
|
39 |
+
[1712111097] llm_load_vocab: special tokens definition check successful ( 259/32000 ).
|
40 |
+
[1712111097] llm_load_print_meta: format = GGUF V3 (latest)
|
41 |
+
[1712111097] llm_load_print_meta: arch = llama
|
42 |
+
[1712111097] llm_load_print_meta: vocab type = SPM
|
43 |
+
[1712111097] llm_load_print_meta: n_vocab = 32000
|
44 |
+
[1712111097] llm_load_print_meta: n_merges = 0
|
45 |
+
[1712111097] llm_load_print_meta: n_ctx_train = 32764
|
46 |
+
[1712111097] llm_load_print_meta: n_embd = 8192
|
47 |
+
[1712111097] llm_load_print_meta: n_head = 64
|
48 |
+
[1712111097] llm_load_print_meta: n_head_kv = 8
|
49 |
+
[1712111097] llm_load_print_meta: n_layer = 80
|
50 |
+
[1712111097] llm_load_print_meta: n_rot = 128
|
51 |
+
[1712111097] llm_load_print_meta: n_embd_head_k = 128
|
52 |
+
[1712111097] llm_load_print_meta: n_embd_head_v = 128
|
53 |
+
[1712111097] llm_load_print_meta: n_gqa = 8
|
54 |
+
[1712111097] llm_load_print_meta: n_embd_k_gqa = 1024
|
55 |
+
[1712111097] llm_load_print_meta: n_embd_v_gqa = 1024
|
56 |
+
[1712111097] llm_load_print_meta: f_norm_eps = 0.0e+00
|
57 |
+
[1712111097] llm_load_print_meta: f_norm_rms_eps = 1.0e-05
|
58 |
+
[1712111097] llm_load_print_meta: f_clamp_kqv = 0.0e+00
|
59 |
+
[1712111097] llm_load_print_meta: f_max_alibi_bias = 0.0e+00
|
60 |
+
[1712111097] llm_load_print_meta: f_logit_scale = 0.0e+00
|
61 |
+
[1712111097] llm_load_print_meta: n_ff = 28672
|
62 |
+
[1712111097] llm_load_print_meta: n_expert = 0
|
63 |
+
[1712111097] llm_load_print_meta: n_expert_used = 0
|
64 |
+
[1712111097] llm_load_print_meta: causal attn = 1
|
65 |
+
[1712111097] llm_load_print_meta: pooling type = 0
|
66 |
+
[1712111097] llm_load_print_meta: rope type = 0
|
67 |
+
[1712111097] llm_load_print_meta: rope scaling = linear
|
68 |
+
[1712111097] llm_load_print_meta: freq_base_train = 1000000.0
|
69 |
+
[1712111097] llm_load_print_meta: freq_scale_train = 1
|
70 |
+
[1712111097] llm_load_print_meta: n_yarn_orig_ctx = 32764
|
71 |
+
[1712111097] llm_load_print_meta: rope_finetuned = unknown
|
72 |
+
[1712111097] llm_load_print_meta: ssm_d_conv = 0
|
73 |
+
[1712111097] llm_load_print_meta: ssm_d_inner = 0
|
74 |
+
[1712111097] llm_load_print_meta: ssm_d_state = 0
|
75 |
+
[1712111097] llm_load_print_meta: ssm_dt_rank = 0
|
76 |
+
[1712111097] llm_load_print_meta: model type = 70B
|
77 |
+
[1712111097] llm_load_print_meta: model ftype = IQ2_XS - 2.3125 bpw
|
78 |
+
[1712111097] llm_load_print_meta: model params = 68.98 B
|
79 |
+
[1712111097] llm_load_print_meta: model size = 18.94 GiB (2.36 BPW)
|
80 |
+
[1712111097] llm_load_print_meta: general.name = models
|
81 |
+
[1712111097] llm_load_print_meta: BOS token = 1 '<s>'
|
82 |
+
[1712111097] llm_load_print_meta: EOS token = 2 '</s>'
|
83 |
+
[1712111097] llm_load_print_meta: UNK token = 0 '<unk>'
|
84 |
+
[1712111097] llm_load_print_meta: PAD token = 0 '<unk>'
|
85 |
+
[1712111097] llm_load_print_meta: LF token = 13 '<0x0A>'
|
86 |
+
[1712111097] llm_load_tensors: ggml ctx size = 0.28 MiB
|
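Several values in the log above follow from others, and recomputing them is a quick sanity check on a quantized file. The sketch below uses only numbers that appear in this commit: the log's hyperparameters and the `size` fields of the LFS pointers. The formulas are the usual grouped-query-attention and bits-per-weight relations, written here as an illustration rather than taken from llama.cpp's source.

```python
GIB = 1024 ** 3

# Hyperparameters copied from the llm_load_print_meta lines.
n_embd, n_head, n_head_kv = 8192, 64, 8
n_params = 68.98e9                       # "model params = 68.98 B"

n_embd_head = n_embd // n_head           # per-head dimension (log: n_embd_head_k = 128)
n_gqa = n_head // n_head_kv              # query heads per KV head (log: n_gqa = 8)
n_embd_k_gqa = n_embd_head * n_head_kv   # total K width (log: n_embd_k_gqa = 1024)


def bits_per_weight(size_bytes, params):
    """Average number of bits stored per model parameter."""
    return size_bytes * 8 / params


# The log reports "model size = 18.94 GiB (2.36 BPW)".
logged_bpw = bits_per_weight(18.94 * GIB, n_params)

# On-disk sizes from the LFS pointers; these include metadata, so the
# results are approximate and may differ slightly from the logged value.
file_sizes = {
    "IQ2_XXS": 18289440320,
    "IQ2_XS": 20334163520,
    "IQ3_XXS": 26581464640,
    "IQ3_XS": 28314973760,
}

print(n_embd_head, n_gqa, n_embd_k_gqa)
print(f"logged IQ2_XS: {logged_bpw:.2f} BPW")
for name, size in file_sizes.items():
    print(f"{name}: {size / GIB:.2f} GiB, about {bits_per_weight(size, n_params):.2f} BPW")
```

The effective figure for IQ2_XS comes out slightly above the nominal 2.3125 bpw because some tensors are stored in other types (f32, q2_K, q4_K, q5_K), as the tensor-type counts in the log show.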