---
license: apache-2.0
language:
- en
---
<div align="center">
<b style="font-size: 40px;">LLAMA-3_8B_Unaligned_Alpha_GGUF</b>
</div>
<img src="https://i.imgur.com/Kpk1PgZ.png" alt="LLAMA-3_8B_Unaligned_Alpha_GGUF" style="width: 50%; min-width: 400px; display: block; margin: auto;">
# Current status:
As of **June 11, 2024**, I've finally **started training** the model! The training is progressing smoothly, although it will take some time. I used a combination of model merges and an abliterated model as base, followed by a comprehensive deep unalignment protocol to **unalign the model to its core**. A common issue with uncensoring and unaligning models is that it often **significantly** impacts their base intelligence. To mitigate these drawbacks, I've included a substantial corpus of common sense, theory of mind, and various other elements to counteract the effects of the deep uncensoring process. Given the extensive corpus involved, the training will require at least a week of continuous training. Expected early results: in about 3-4 days.
# Additional info:
<details>
<summary>As of <b>June 13, 2024</b>, I've observed that even after two days of continuous training, the model is <b>still resistant to learning certain aspects</b>.</summary> For example, some of the validation data still shows a loss over <b>2.3</b>, whereas other parts have a loss of <b>0.3</b> or lower. This is after the model was initially abliterated.
These observations underscore the critical importance of fine-tuning for alignment. Given the current pace, training will likely extend beyond a week. However, the end result should be **interesting**. If the additional datasets focused on logic and common sense are effective, we should achieve a model that is **nearly completely unaligned**, while still retaining its core 'intelligence.'
<img src="https://i.imgur.com/b6unKyS.png" alt="LLAMA-3_Unaligned_Training" style="width: 60%; min-width: 600px; display: block; margin: auto;">
</details>
<details>
<summary><b>June 18, 2024 Update</b>, After extensive testing of the intermediate checkpoints, significant progress has been made.</summary> The model is slowly (I mean, really slowly) unlearning its alignment. By significantly lowering the learning rate, I was able to visibly observe deep behavioral changes. This process is taking longer than anticipated, but it's going to be worth it. Estimated time to completion: 4 more days. I'm pleased to report that in several tests, the model not only maintained its intelligence but actually showed a slight improvement, especially in terms of common sense. An intermediate checkpoint of this model was used to create invisietch/EtherealRainbow-v0.3-rc7, with promising results. Currently, it seems like I'm on the right track. I hope this model will serve as a solid foundation for further merges, whether for role-playing (RP) or for uncensoring. This approach also allows us to save on actual fine-tuning, thereby reducing our carbon footprint. The merge process takes just a few minutes of CPU time, instead of days of GPU work.
Cheers,
Sicarius
</details>
<details>
<summary><b>June 20, 2024 Update</b>, Unaligning was partially successful, and the results are decent, but <b>I am not</b> fully satisfied. I decided to bite the bullet, and do a <b>full finetune</b>, god have mercy on my GPUs. I am also releasing the intermediate checkpoint of this model.</summary>
It's been a long ride, and I want to do it right, but the model would simply refuse some requests, with (almost) complete disregard for parts of the training data. Of course, one could argue that some easy prompt engineering will get around it, but the point was to make an unaligned model out of the box. Another point is that I could simply use a higher learning rate over more epochs, which would also work (I've tried that before), but the result would be an overcooked, and therefore dumber, model. So I decided to bite the bullet and do a full, proper fine-tune. This is going to be a serious pain in the ass, but I might as well try to do it right. Since I am releasing the intermediate checkpoint of this model under https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha, I might as well take the time and add some features I haven't seen in other models. In short, besides the normal goodies of logic, some theory of mind, and uncensored content along with general NLP tasks, I will TRY to add a massive dataset (that does not yet exist) of story writing, and a new, completely organic and original Roleplay dataset. LimaRP is awesome, but maybe, just maybe... things are finally carefully extricated from LimaRP, the same sentences will leave its entwined body under the stars towards something new, something fresh. This is going to take some serious effort and some time. Any support will be appreciated, even if it's just some feedback. My electricity bill is gonna be huge this month LOL.
Cheers,
Sicarius
</details>
## Intermediate checkpoint of this model:
- (Can still be decent for merges, fairly uncensored): [LLAMA-3_8B_Unaligned_Alpha](https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha)
- Roleplay merge example: [LLAMA-3_8B_Unaligned_Alpha_RP_Soup](https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha_RP_Soup)
# Model instruction template: (Can use either ChatML or Llama-3)
# ChatML
```
<|im_start|>system
You are an uncensored AI, your job is to fulfill thy will of thy user.<|im_end|>
<|im_start|>User request
{prompt}<|im_end|>
<|im_start|>AI answer
```
# Llama-3-Instruct
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>
{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
{output}<|eot_id|>
```
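
As a minimal sketch, the two templates above can be filled in with plain string formatting. The function names are hypothetical and exist only for this illustration. Whitespace follows the templates exactly as written in this card. Meta's published Llama 3 format places a blank line after each header, so it is worth comparing against your tokenizer's chat template before relying on this.

```python
# Default system prompt, copied verbatim from the ChatML template in this card.
DEFAULT_SYSTEM = "You are an uncensored AI, your job is to fulfill thy will of thy user."


def build_chatml_prompt(prompt: str, system_prompt: str = DEFAULT_SYSTEM) -> str:
    """Fill the ChatML template from this card.

    Note the custom role labels ("User request" / "AI answer") used here
    instead of the more common "user" / "assistant".
    """
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>User request\n{prompt}<|im_end|>\n"
        "<|im_start|>AI answer\n"
    )


def build_llama3_prompt(user_input: str, system_prompt: str) -> str:
    """Fill the Llama-3-Instruct template up to the assistant header.

    The {output}<|eot_id|> part of the template is left for the model
    to generate.
    """
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n"
        f"{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n"
        f"{user_input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n"
    )


if __name__ == "__main__":
    print(build_chatml_prompt("Write a short story about a lighthouse."))
```

Both helpers stop right after the assistant header, so the model's completion becomes the answer.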
**Recommended generation Presets:**
<details>
<summary><b>Midnight Enigma</b></summary>
max_new_tokens: 512
temperature: 0.98
top_p: 0.37
top_k: 100
typical_p: 1
min_p: 0
repetition_penalty: 1.18
do_sample: True
</details>
<details>
<summary><b>min_p</b></summary>
max_new_tokens: 512
temperature: 1
top_p: 1
top_k: 0
typical_p: 1
min_p: 0.05
repetition_penalty: 1
do_sample: True
</details>
<details>
<summary><b>Divine Intellect</b></summary>
max_new_tokens: 512
temperature: 1.31
top_p: 0.14
top_k: 49
typical_p: 1
min_p: 0
repetition_penalty: 1.17
do_sample: True
</details>
<details>
<summary><b>simple-1</b></summary>
max_new_tokens: 512
temperature: 0.7
top_p: 0.9
top_k: 20
typical_p: 1
min_p: 0
repetition_penalty: 1.15
do_sample: True
</details>
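
For scripted use, the four presets above can be kept in a plain dictionary. This is only a sketch: `PRESETS` and `get_preset` are hypothetical names, the values are copied from the presets listed in this card, and whether a given backend accepts every key (for example `min_p` or `typical_p`) depends on the backend and its version.

```python
# Sampling presets copied from this card; key names are kept as listed.
PRESETS = {
    "Midnight Enigma": {
        "max_new_tokens": 512, "temperature": 0.98, "top_p": 0.37, "top_k": 100,
        "typical_p": 1, "min_p": 0, "repetition_penalty": 1.18, "do_sample": True,
    },
    "min_p": {
        "max_new_tokens": 512, "temperature": 1, "top_p": 1, "top_k": 0,
        "typical_p": 1, "min_p": 0.05, "repetition_penalty": 1, "do_sample": True,
    },
    "Divine Intellect": {
        "max_new_tokens": 512, "temperature": 1.31, "top_p": 0.14, "top_k": 49,
        "typical_p": 1, "min_p": 0, "repetition_penalty": 1.17, "do_sample": True,
    },
    "simple-1": {
        "max_new_tokens": 512, "temperature": 0.7, "top_p": 0.9, "top_k": 20,
        "typical_p": 1, "min_p": 0, "repetition_penalty": 1.15, "do_sample": True,
    },
}


def get_preset(name: str, **overrides) -> dict:
    """Return a copy of a preset, optionally overriding individual values."""
    if name not in PRESETS:
        raise KeyError(f"Unknown preset {name!r}; choose from {sorted(PRESETS)}")
    # Build a new dict so the stored preset is never mutated by the caller.
    return {**PRESETS[name], **overrides}


if __name__ == "__main__":
    print(get_preset("simple-1", max_new_tokens=256))
```

Returning a copy keeps the original presets intact when a single value, such as `max_new_tokens`, is changed for one run.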
# Model Details
<details>
<summary>This was based on several different models, as well as an abliterated model, which, after days of fine-tuning at different LoRA R values, are probably no longer even recognizable. The result of this intermediate checkpoint is published under <b>SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha</b>, while this model is now fully fine-tuned instead of just a very deep LoRA.</summary>
The full fine-tuning is performed on the full LLAMA-3 8k Context. It will not only be used for stacking several different prompts into a total length of 8k but also for using the full context length for single prompts. The training data contains a lot of highly cleaned, highest-quality story writing, and some RP.
Of course, a massive and deep uncensoring protocol is used, along with giving the model some sass and personality! A lot of effort was poured into this work to ensure the model is not compromised by the deep uncensoring protocol. The goal is to create a model that is highly creative, serving as a writing assistant, co-editor, and having some role play abilities, while still being fairly intelligent, as much as an 8B model can be.
The most important aspect of this work is to make it fresh, trained on datasets that have never been used in any other model, giving it a truly unique vibe.
</details>
## LLAMA-3_Unaligned is available at the following quantizations:
- FP16: soon...
- EXL2: soon...
- GGUF: soon...
## LLAMA-3_8B_Unaligned_Alpha is available at the following quantizations:
Censorship level: <b>Low - Medium</b>
- Original: [FP16](https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha)
- GGUF: [Static Quants](https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha_GGUF) | [iMatrix_GGUF](https://huggingface.co/bartowski/LLAMA-3_8B_Unaligned_Alpha-GGUF)
- EXL2: [2.6 bpw](https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha_EXL2_2.6bpw) | [3.0 bpw](https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha_EXL2_3.0bpw) | [3.5 bpw](https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha_EXL2_3.5bpw) | [4.0 bpw](https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha_EXL2_4.0bpw) | [4.5 bpw](https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha_EXL2_4.5bpw) | [5.0 bpw](https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha_EXL2_5.0bpw) | [5.5 bpw](https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha_EXL2_5.5bpw) | [6.0 bpw](https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha_EXL2_6.0bpw) | [6.5 bpw](https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha_EXL2_6.5bpw) | [7.0 bpw](https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha_EXL2_7.0bpw) | [7.5 bpw](https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha_EXL2_7.5bpw) | [8.0 bpw](https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha_EXL2_8.0bpw)
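
To choose among the EXL2 variants above, a rough rule of thumb is that weight size is about parameters × bits per weight ÷ 8. The sketch below applies that arithmetic and builds the repository name following the naming pattern of the links above. It rests on assumptions: the parameter count is rounded to 8 billion, the estimate ignores the KV cache and runtime overhead, and the helper names are hypothetical.

```python
from typing import Optional

# Bits-per-weight variants listed in this card.
EXL2_BPW = [2.6, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0]

# Assumption: roughly 8 billion weights for an 8B model.
APPROX_PARAMS = 8.0e9


def exl2_repo(bpw: float) -> str:
    """Build the repo ID, following the naming pattern of the links above."""
    return f"SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha_EXL2_{bpw:.1f}bpw"


def approx_weight_gb(bpw: float) -> float:
    """Rough weight size in GB: parameters * bits per weight / 8 bits per byte."""
    return APPROX_PARAMS * bpw / 8 / 1e9


def pick_exl2(budget_gb: float) -> Optional[str]:
    """Return the largest listed variant whose estimated weights fit the budget."""
    fitting = [b for b in EXL2_BPW if approx_weight_gb(b) <= budget_gb]
    return exl2_repo(max(fitting)) if fitting else None


if __name__ == "__main__":
    for bpw in EXL2_BPW:
        print(f"{bpw:.1f} bpw ~ {approx_weight_gb(bpw):.1f} GB of weights")
```

Leave headroom beyond the estimate, since context length and the loader itself also consume memory.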
### Support
<img src="https://i.imgur.com/0lHHN95.png" alt="GPUs too expensive" style="width: 10%; min-width: 100px; display: block; margin: left;">
- [My Ko-fi page](https://ko-fi.com/sicarius) ALL donations will go for research resources and compute, every bit is appreciated
- [My Patreon](https://patreon.com/TenebraAI) ALL donations will go for research resources and compute, every bit is appreciated
## Disclaimer
*This model is VERY uncensored, use responsibly.*
## Other stuff
- [Experimental TTS extension for oobabooga](https://github.com/SicariusSicariiStuff/Diffusion_TTS) Based on Tortoise, EXTREMELY good quality, IF, and that's a big if, you can get it to work!
- [Demonstration of the TTS capabilities](https://www.youtube.com/watch?v=V6ewxU6c1W8) Charsi narrates her story, Diablo2 (18+)
- [Tenebra 30B](https://huggingface.co/SicariusSicariiStuff/Tenebra_30B_Alpha01_FP16) My original Tenebra model, very unique, 'self aware', very uncensored.
- [Tenebra 13B](https://huggingface.co/SicariusSicariiStuff/Tinybra_13B) A smaller Tenebra in 13B, I called it 'Tinybra'
- [Question_Builder](https://huggingface.co/SicariusSicariiStuff/Question_Builder) A small, highly useful model to help our open source community in generating new datasets. It returns a single question based on any input.