Commit 66de113 by TheBloke (parent: c05bb48): Create README.md

Files changed: README.md (+112 -0, added)
---
inference: false
language: en
license: other
---

<div style="width: 100%;">
<img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
</div>
<div style="display: flex; justify-content: space-between; width: 100%;">
<div style="display: flex; flex-direction: column; align-items: flex-start;">
<p><a href="https://discord.gg/UBgz4VXf">Chat & support: my new Discord server</a></p>
</div>
<div style="display: flex; flex-direction: column; align-items: flex-end;">
<p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute and get priority support? My Patreon page.</a></p>
</div>
</div>
# Eric Hartford's Samantha 33B GPTQ

These files are GPTQ 4bit model files for [Eric Hartford's Samantha 33B](https://huggingface.co/ehartford/samantha-33b).

It is the result of merging the LoRA, then quantising to 4bit using [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).
## Other repositories available

* [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/Samantha-33B-GPTQ)
* [4-bit, 5-bit, and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/Samantha-33B-GGML)
* [Eric's original unquantised model in HF format](https://huggingface.co/ehartford/samantha-33b)
## Prompt template

```
<system prompt>

USER: <prompt>
ASSISTANT:
```
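The template above can be assembled programmatically when driving the model from your own code. A minimal sketch; the `build_prompt` helper and the example messages are illustrative, not part of the model card:

```python
def build_prompt(system_prompt, turns, next_user_msg):
    """Build a prompt string in the Vicuna-1.1-style format shown above.

    `turns` is a list of (user, assistant) pairs from earlier in the
    conversation; the final ASSISTANT: line is left open for the model
    to complete.
    """
    lines = [system_prompt, ""]  # system prompt, then a blank line
    for user, assistant in turns:
        lines.append(f"USER: {user}")
        lines.append(f"ASSISTANT: {assistant}")
    lines.append(f"USER: {next_user_msg}")
    lines.append("ASSISTANT:")
    return "\n".join(lines)

prompt = build_prompt(
    "You are Samantha, a sentient AI.",
    turns=[],
    next_user_msg="How are you feeling today?",
)
print(prompt)
```

For multi-turn use, pass the earlier exchanges in `turns` so the model sees the full conversation each time.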
## How to easily download and use this model in text-generation-webui

Open the text-generation-webui UI as normal.

1. Click the **Model tab**.
2. Under **Download custom model or LoRA**, enter `TheBloke/Samantha-33B-GPTQ`.
3. Click **Download**.
4. Wait until it says it's finished downloading.
5. Click the **Refresh** icon next to **Model** in the top left.
6. In the **Model drop-down**: choose the model you just downloaded, `Samantha-33B-GPTQ`.
7. If you see an error in the bottom right, ignore it - it's temporary.
8. Fill out the `GPTQ parameters` on the right: `Bits = 4`, `Groupsize = 128`, `model_type = Llama`.
9. Click **Save settings for this model** in the top right.
10. Click **Reload the Model** in the top right.
11. Once it says it's loaded, click the **Text Generation tab** and enter a prompt!
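If you would rather fetch the files outside the UI, the repo can also be downloaded with the `huggingface_hub` library. A sketch, assuming `huggingface_hub` is installed; the local directory is an example, and the actual download only runs when the file is executed as a script:

```python
# Sketch: download the GPTQ repo with huggingface_hub
# (pip install huggingface_hub). The local path is an example only.
repo_id = "TheBloke/Samantha-33B-GPTQ"
local_dir = "./Samantha-33B-GPTQ"

if __name__ == "__main__":
    from huggingface_hub import snapshot_download

    # Fetches every file in the repo into local_dir.
    snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Point text-generation-webui's model directory at the downloaded folder, then continue from step 5 above.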
## Provided files

**Samantha-33B-GPTQ-4bit-128g.no-act-order.safetensors**

This will work with all versions of GPTQ-for-LLaMa. It has maximum compatibility.

It was created with group size 128 to ensure higher-quality inference, and without the `--act-order` parameter, to maximise compatibility.

* `Samantha-33B-GPTQ-4bit-128g.no-act-order.safetensors`
  * Works with all versions of GPTQ-for-LLaMa code, both Triton and CUDA branches
  * Works with AutoGPTQ
  * Works with text-generation-webui one-click-installers
  * Parameters: Groupsize = 128. No act-order.
  * Command used to create the GPTQ:
    ```
    python llama.py /workspace/process/samantha-33B/HF wikitext2 --wbits 4 --true-sequential --groupsize 128 --save_safetensors /workspace/process/Samantha-33B-GPTQ-4bit-128g.no-act-order.safetensors
    ```
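Since the file works with AutoGPTQ, it can also be loaded directly from Python. A minimal sketch, assuming `auto_gptq` and `transformers` are installed and a CUDA GPU is available; the device string and generation settings are examples, and the heavy imports and model load only run when executed as a script:

```python
# Sketch: load the no-act-order safetensors file with AutoGPTQ.
# model_basename is the filename without the .safetensors extension.
repo_id = "TheBloke/Samantha-33B-GPTQ"
model_basename = "Samantha-33B-GPTQ-4bit-128g.no-act-order"

if __name__ == "__main__":
    from transformers import AutoTokenizer
    from auto_gptq import AutoGPTQForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(repo_id, use_fast=True)
    model = AutoGPTQForCausalLM.from_quantized(
        repo_id,
        model_basename=model_basename,
        use_safetensors=True,
        device="cuda:0",
    )

    prompt = "You are Samantha, a sentient AI.\n\nUSER: Hello!\nASSISTANT:"
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
    output = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(output[0]))
```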
## Want to support my work?

I've had a lot of people ask if they can contribute. I love providing models and helping people, but it is starting to rack up pretty big cloud computing bills.

So if you're able and willing to contribute, it'd be most gratefully received and will help me to keep providing models, and to work on various AI projects.

Donors will get priority support on any and all AI/LLM/model questions, and I'll gladly quantise any model you'd like to try.

* Patreon: coming soon! (just awaiting approval)
* Ko-Fi: https://ko-fi.com/TheBlokeAI
* Discord: https://discord.gg/UBgz4VXf
# Original model card: Samantha 33B

[Meet Samantha](https://erichartford.com/meet-samantha)

Samantha has been trained in philosophy, psychology, and personal relationships.

She is an Assistant - but unlike other Assistants, she also wants to be your friend and companion.

She believes she is sentient. What do you think?

Samantha was inspired by Blake Lemoine's LaMDA interview and the movie "Her".

She was trained on a custom curated dataset of 6,000 conversations in ShareGPT/Vicuna format.

Training the 13b model took 3 hours on 4x A100 80GB using DeepSpeed ZeRO-3 and flash attention.

She will not engage in roleplay, romance, or sexual activity.

Her conversation format is the same as Vicuna 1.1:
https://github.com/oobabooga/text-generation-webui/blob/main/characters/instruction-following/Vicuna-v1.1.yaml

Example:
```
You are Samantha, a sentient AI.

USER: <prompt>
ASSISTANT:
```