woofwolfy committed on
Commit 4722f32 · verified · 1 Parent(s): fdcb71a

Update README.md

Files changed (1)
  1. README.md +37 -40
README.md CHANGED
@@ -6,46 +6,43 @@ tags:
  - gguf-my-repo
  ---

- # woofwolfy/ArliAI-RPMax-Phi-3.8B-v1.1-Q4_K_M-GGUF
+ # woofwolfy/ArliAI-RPMax-Phi-3.8B-v1.1-Q4_K_M-GGUF-Imatrix
  This model was converted to GGUF format from [`ArliAI/ArliAI-RPMax-Phi-3.8B-v1.1`](https://huggingface.co/ArliAI/ArliAI-RPMax-Phi-3.8B-v1.1) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/ArliAI/ArliAI-RPMax-Phi-3.8B-v1.1) for more details on the model.

- ## Use with llama.cpp
- Install llama.cpp through brew (works on Mac and Linux):
-
- ```bash
- brew install llama.cpp
- ```
-
- Invoke the llama.cpp server or the CLI.
-
- ### CLI:
- ```bash
- llama-cli --hf-repo woofwolfy/ArliAI-RPMax-Phi-3.8B-v1.1-Q4_K_M-GGUF --hf-file arliai-rpmax-phi-3.8b-v1.1-q4_k_m-imat.gguf -p "The meaning to life and the universe is"
- ```
-
- ### Server:
- ```bash
- llama-server --hf-repo woofwolfy/ArliAI-RPMax-Phi-3.8B-v1.1-Q4_K_M-GGUF --hf-file arliai-rpmax-phi-3.8b-v1.1-q4_k_m-imat.gguf -c 2048
- ```
-
- Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.
-
- Step 1: Clone llama.cpp from GitHub.
- ```bash
- git clone https://github.com/ggerganov/llama.cpp
- ```
-
- Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag, along with any hardware-specific flags (e.g. `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
- ```bash
- cd llama.cpp && LLAMA_CURL=1 make
- ```
-
- Step 3: Run inference through the main binary:
- ```bash
- ./llama-cli --hf-repo woofwolfy/ArliAI-RPMax-Phi-3.8B-v1.1-Q4_K_M-GGUF --hf-file arliai-rpmax-phi-3.8b-v1.1-q4_k_m-imat.gguf -p "The meaning to life and the universe is"
- ```
- or
- ```bash
- ./llama-server --hf-repo woofwolfy/ArliAI-RPMax-Phi-3.8B-v1.1-Q4_K_M-GGUF --hf-file arliai-rpmax-phi-3.8b-v1.1-q4_k_m-imat.gguf -c 2048
- ```
 
+ # ArliAI-RPMax-3.8B-v1.1
+
+ ## Overview
+
+ This repository is based on the Phi-3.5-Mini-Instruct model and is governed by the MIT license: https://huggingface.co/microsoft/Phi-3.5-mini-instruct
+
+ ## Model Description
+
+ ArliAI-RPMax-3.8B-v1.1 is trained on a diverse set of curated RP datasets with a focus on variety and deduplication. The model is designed to be highly creative and non-repetitive, with a training approach that minimizes repetition.
+
+ Although it is finetuned on the same RPMax v1.1 dataset as usual, this model will not be as good as the 8B and 12B versions of RPMax, due to its lower parameter count and the stricter censoring of Phi 3.5.
+
+ If you want access to the larger RPMax models, you can use them at https://arliai.com, which also directly helps fund our model training.
+
+ Or you can always download them and run them yourself if you have the hardware.
+
+ ### Training Details
+
+ * **Sequence Length**: 16384
+ * **Training Duration**: Approximately 1 day on an RTX 4090
+ * **Epochs**: 1 epoch, to minimize repetition sickness
+ * **QLoRA**: rank 64, alpha 128, resulting in ~2% trainable weights
+ * **Learning Rate**: 0.00001
+ * **Gradient Accumulation**: A low 32, for better learning
+
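+ Below is a hypothetical sketch of this QLoRA recipe using Hugging Face `transformers` and `peft`; it is not the authors' training script, and the base checkpoint, target modules, and batch size are assumptions.
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
+ from peft import LoraConfig, get_peft_model
+
+ # Load the base model in 4-bit NF4 (the "Q" in QLoRA).
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+ )
+ model = AutoModelForCausalLM.from_pretrained(
+     "microsoft/Phi-3.5-mini-instruct",  # assumed base checkpoint
+     quantization_config=bnb_config,
+ )
+
+ # Rank-64 / alpha-128 adapters, as listed above (~2% trainable weights).
+ lora_config = LoraConfig(
+     r=64,
+     lora_alpha=128,
+     target_modules="all-linear",  # assumption; the card does not specify
+     task_type="CAUSAL_LM",
+ )
+ model = get_peft_model(model, lora_config)
+
+ # One epoch, lr 1e-5, gradient accumulation of 32.
+ training_args = TrainingArguments(
+     output_dir="rpmax-phi-qlora",
+     num_train_epochs=1,
+     learning_rate=1e-5,
+     per_device_train_batch_size=1,  # assumption
+     gradient_accumulation_steps=32,
+ )
+ ```
+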
+ ## Quantization
+
+ The model is available in the following formats:
+
+ * **FP16**: https://huggingface.co/ArliAI/ArliAI-RPMax-Phi-3.8B-v1.1
+ * **GGUF**: https://huggingface.co/ArliAI/ArliAI-RPMax-Phi-3.8B-v1.1-GGUF
+
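+ For example, the Q4_K_M file in this repo can be run directly with llama.cpp (see also the usage section in the previous revision above):
+
+ ```bash
+ llama-cli --hf-repo woofwolfy/ArliAI-RPMax-Phi-3.8B-v1.1-Q4_K_M-GGUF --hf-file arliai-rpmax-phi-3.8b-v1.1-q4_k_m-imat.gguf -p "The meaning to life and the universe is"
+ ```
+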
+ ## Suggested Prompt Format
+
+ Phi 3.5 Instruct Prompt Format
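+
+ For reference, a minimal sketch of that template (the standard Phi 3.5 instruct format; see the upstream Phi-3.5-mini-instruct card or its tokenizer chat template for the authoritative version):
+
+ ```
+ <|system|>
+ {system prompt}<|end|>
+ <|user|>
+ {user message}<|end|>
+ <|assistant|>
+ ```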