georgesung/llama3_8b_chat_uncensored


georgesung committed on Apr 30

Commit b42db76 (1 parent: a890433)

Update README.md

Files changed (1)

README.md +27 -0


@@ -40,3 +40,30 @@ python train.py configs/llama3_8b_chat_uncensored.yaml
 # Fine-tuning guide
 https://georgesung.github.io/ai/qlora-ift/

+# Ollama inference
+First, install [Ollama](https://ollama.com/). Then, following Ollama's [instructions for importing from GGUF](https://github.com/ollama/ollama/blob/main/README.md#import-from-gguf), download the quantized model file:
+```
+cd $MODEL_DIR_OF_CHOICE
+wget https://huggingface.co/georgesung/llama3_8b_chat_uncensored/resolve/main/llama3_8b_chat_uncensored_q4_0.gguf
+```
+In the same directory, create a file called `llama3-uncensored.modelfile` with the following contents:
+```
+FROM ./llama3_8b_chat_uncensored_q4_0.gguf
+TEMPLATE """{{ .System }}
+### HUMAN:
+{{ .Prompt }}
+### RESPONSE:
+"""
+PARAMETER stop "### HUMAN:"
+PARAMETER stop "### RESPONSE:"
+```
+Then create the model and start an interactive session:
+```
+ollama create llama3-uncensored -f llama3-uncensored.modelfile
+ollama run llama3-uncensored
+```
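
Once the model has been created, it can also be queried programmatically instead of through the interactive `ollama run` prompt. The following is a minimal sketch, not part of the README, that assumes the Ollama server is running locally on its default port (11434) and that the model was created under the name `llama3-uncensored` as above. It calls Ollama's `/api/generate` endpoint using only the Python standard library; the helper names `build_request` and `generate` are illustrative.

```python
import json
import urllib.request

# Assumptions (not from the README): Ollama's default local address, and the
# model name given to `ollama create` in the commands above.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_NAME = "llama3-uncensored"


def build_request(prompt, system=None):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {
        "model": MODEL_NAME,
        "prompt": prompt,   # fills {{ .Prompt }} in the Modelfile template
        "stream": False,    # ask for a single JSON object, not a stream
    }
    if system:
        payload["system"] = system  # fills {{ .System }} in the template
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, system=None):
    """Send the request and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, system)) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


# Example call (requires a running Ollama server):
# print(generate("Why is the sky blue?"))
```

Because the Modelfile already wraps the prompt in the `### HUMAN:` / `### RESPONSE:` template, the caller passes only the raw user prompt.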