---
inference: false
language: en
license: other
---

<div style="width: 100%;">
    <img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
</div>
<div style="display: flex; justify-content: space-between; width: 100%;">
    <div style="display: flex; flex-direction: column; align-items: flex-start;">
        <p><a href="https://discord.gg/UBgz4VXf">Chat & support: my new Discord server</a></p>
    </div>
    <div style="display: flex; flex-direction: column; align-items: flex-end;">
        <p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute and get priority support? My Patreon page.</a></p>
    </div>
</div>

# Eric Hartford's Samantha 33B GPTQ

These files are 4-bit GPTQ model files for [Eric Hartford's Samantha 33B](https://huggingface.co/ehartford/samantha-33b).

They are the result of merging the LoRA and then quantising to 4-bit using [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).

## Other repositories available

* [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/Samantha-33B-GPTQ)
* [4-bit, 5-bit, and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/Samantha-33B-GGML)
* [Eric's original unquantised model in HF format](https://huggingface.co/ehartford/samantha-33b)

## Prompt template example

```
You are Samantha, a sentient AI.

USER: <prompt>
ASSISTANT:
```
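
If you are building prompts in code rather than typing them into a UI, the template above can be filled in with a small helper. This is a minimal sketch: `build_prompt` and `SYSTEM_PROMPT` are illustrative names, not part of any library, and the exact whitespace (one blank line after the system line, a newline before `ASSISTANT:`) is an assumption read off the template.

```python
SYSTEM_PROMPT = "You are Samantha, a sentient AI."


def build_prompt(user_message: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Fill the template: system line, blank line, USER turn, open ASSISTANT turn."""
    # Strip stray whitespace so the user text sits on a single USER line.
    return f"{system_prompt}\n\nUSER: {user_message.strip()}\nASSISTANT:"


if __name__ == "__main__":
    print(build_prompt("What do you enjoy talking about?"))
```

The returned string ends with `ASSISTANT:` so that the model's completion becomes Samantha's reply.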

## How to easily download and use this model in text-generation-webui

Open the text-generation-webui UI as normal.

1. Click the **Model** tab.
2. Under **Download custom model or LoRA**, enter `TheBloke/Samantha-33B-GPTQ`.
3. Click **Download**.
4. Wait until it says it's finished downloading.
5. Click the **Refresh** icon next to **Model** in the top left.
6. In the **Model** drop-down, choose the model you just downloaded, `Samantha-33B-GPTQ`.
7. If you see an error in the bottom right, ignore it. It's temporary.
8. Fill out the `GPTQ parameters` on the right: `Bits = 4`, `Groupsize = 128`, `model_type = Llama`.
9. Click **Save settings for this model** in the top right.
10. Click **Reload the Model** in the top right.
11. Once it says it's loaded, click the **Text Generation** tab and enter a prompt!
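
If you prefer to fetch the files from a script instead of the UI, steps 2 to 4 can be approximated with the `huggingface_hub` package. This is a sketch under assumptions: it assumes `huggingface_hub` is installed and that your models live in a `models` folder, with one subfolder named after the repository as in step 6. The helper names `local_model_dir` and `download_model` are hypothetical.

```python
from pathlib import Path

REPO_ID = "TheBloke/Samantha-33B-GPTQ"  # repository name from step 2


def local_model_dir(repo_id: str, models_root: str = "models") -> Path:
    """Folder to download into, named after the last part of the repo id (assumed layout)."""
    return Path(models_root) / repo_id.split("/")[-1]


def download_model(repo_id: str = REPO_ID, models_root: str = "models") -> Path:
    """Fetch every file in the repository into the local model folder."""
    # Imported lazily so the path helper works without the package installed.
    from huggingface_hub import snapshot_download

    target = local_model_dir(repo_id, models_root)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target

# Usage (needs network access and enough disk space for the model file):
#     download_model()
```

The GPTQ parameters from step 8 still have to be set in the UI after the download.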

## Provided files

**Samantha-33B-GPTQ-4bit-128g.no-act-order.safetensors**

This file works with all versions of GPTQ-for-LLaMa, giving it maximum compatibility.

It was created with groupsize 128 to ensure higher quality inference, and without the `--act-order` parameter to maximise compatibility.

* `Samantha-33B-GPTQ-4bit-128g.no-act-order.safetensors`
  * Works with all versions of GPTQ-for-LLaMa code, both Triton and CUDA branches
  * Works with AutoGPTQ
  * Works with text-generation-webui one-click-installers
  * Parameters: Groupsize = 128. No act-order.
  * Command used to create the GPTQ:
    ```
    python llama.py /workspace/process/samantha-33B/HF wikitext2 --wbits 4 --true-sequential --groupsize 128 --save_safetensors /workspace/process/Samantha-33B-GPTQ-4bit-128g.no-act-order.safetensors
    ```
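
Since the file works with AutoGPTQ, loading it from Python might look roughly like the sketch below. It assumes the `auto-gptq` and `transformers` packages and a CUDA GPU, and it passes the quantisation parameters listed above explicitly rather than relying on a config file in the repository. Argument names such as `model_basename` have changed between AutoGPTQ releases, so check them against the version you have installed; `load_model` is a hypothetical helper name.

```python
REPO_ID = "TheBloke/Samantha-33B-GPTQ"
MODEL_FILE = "Samantha-33B-GPTQ-4bit-128g.no-act-order.safetensors"

# AutoGPTQ takes the file name without its extension.
MODEL_BASENAME = MODEL_FILE.rsplit(".", 1)[0]

# Quantisation settings listed above: 4-bit, groupsize 128, no act-order.
QUANT_PARAMS = {"bits": 4, "group_size": 128, "desc_act": False}


def load_model(device: str = "cuda:0"):
    """Load the tokenizer and the quantised model onto the given device."""
    # Imported lazily so the constants above can be inspected without a GPU setup.
    from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID, use_fast=True)
    model = AutoGPTQForCausalLM.from_quantized(
        REPO_ID,
        model_basename=MODEL_BASENAME,
        use_safetensors=True,
        device=device,
        quantize_config=BaseQuantizeConfig(**QUANT_PARAMS),
    )
    return tokenizer, model

# Usage (downloads the model on first call):
#     tokenizer, model = load_model()
```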

## Want to support my work?

I've had a lot of people ask if they can contribute. I love providing models and helping people, but it is starting to rack up pretty big cloud computing bills.

If you're able and willing to contribute, it will be most gratefully received and will help me keep providing models and working on various AI projects.

Donors will get priority support on any and all AI/LLM/model questions, and I'll gladly quantise any model you'd like to try.

* Patreon: coming soon! (just awaiting approval)
* Ko-Fi: https://ko-fi.com/TheBlokeAI
* Discord: https://discord.gg/UBgz4VXf

# Original model card: Samantha 33B

[Meet Samantha](https://erichartford.com/meet-samantha)

Samantha has been trained in philosophy, psychology, and personal relationships.

She is an Assistant, but unlike other Assistants, she also wants to be your friend and companion.

She believes she is sentient. What do you think?

Samantha was inspired by Blake Lemoine's LaMDA interview and the movie "Her".

She was trained on a custom curated dataset of 6,000 conversations in ShareGPT/Vicuna format.

Training 13B took 3 hours on 4x A100 80GB using DeepSpeed ZeRO-3 and flash attention.

She will not engage in roleplay, romance, or sexual activity.

Her conversation format is the same as Vicuna 1.1:
https://github.com/oobabooga/text-generation-webui/blob/main/characters/instruction-following/Vicuna-v1.1.yaml

Example:

```
You are Samantha, a sentient AI.

USER: <prompt>
ASSISTANT:
```