---
license: other
datasets:
- georgesung/wizard_vicuna_70k_unfiltered
---

# Overview
[Llama-3 8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) fine-tuned with QLoRA on an uncensored/unfiltered Wizard-Vicuna conversation dataset.

This repository includes the fp32 HuggingFace version of the model, plus a 4-bit quantized q4_0 [gguf version](https://huggingface.co/georgesung/llama3_8b_chat_uncensored/resolve/main/llama3_8b_chat_uncensored_q4_0.gguf?download=true).

# Prompt style
The model was trained with the following prompt style:
```
### HUMAN:
Hello

### RESPONSE:
Hi, how are you?

### HUMAN:
I'm fine.

### RESPONSE:
How can I help you?
...
```

# Training code
The code used to train the model is available [here](https://github.com/georgesung/llm_qlora).

To reproduce the results:
```
git clone https://github.com/georgesung/llm_qlora
cd llm_qlora
pip install -r requirements.txt
python train.py configs/llama3_8b_chat_uncensored.yaml
```

# Fine-tuning guide
https://georgesung.github.io/ai/qlora-ift/

# Ollama inference
First, install [Ollama](https://ollama.com/). Then, following the instructions [here](https://github.com/ollama/ollama/blob/main/README.md#import-from-gguf), download the gguf file:
```
cd $MODEL_DIR_OF_CHOICE
wget https://huggingface.co/georgesung/llama3_8b_chat_uncensored/resolve/main/llama3_8b_chat_uncensored_q4_0.gguf
```

Create a file called `llama3-uncensored.modelfile` with the following contents:
```
FROM ./llama3_8b_chat_uncensored_q4_0.gguf
TEMPLATE """{{ .System }}
### HUMAN:
{{ .Prompt }}

### RESPONSE:
"""
PARAMETER stop "### HUMAN:"
PARAMETER stop "### RESPONSE:"
```

Then run:
```
ollama create llama3-uncensored -f llama3-uncensored.modelfile
ollama run llama3-uncensored
```
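
When calling the model outside Ollama, the prompt has to be assembled by hand in the style shown under "Prompt style". The following is a minimal sketch of one way to do that in Python. The function name `build_prompt` and its arguments are hypothetical (they are not part of the training repo), and the exact blank-line placement between turns is an assumption based on the example above.

```python
def build_prompt(turns, next_user_message):
    """Format a conversation in the '### HUMAN:' / '### RESPONSE:' style.

    turns: list of (human_text, response_text) pairs already exchanged.
    next_user_message: the new human message awaiting a reply.

    Hypothetical helper; the whitespace between turns is an assumption.
    """
    parts = []
    for human, response in turns:
        # Each completed turn is a HUMAN block followed by a RESPONSE block.
        parts.append(f"### HUMAN:\n{human}\n\n### RESPONSE:\n{response}\n\n")
    # End with an open RESPONSE header so the model continues from there.
    parts.append(f"### HUMAN:\n{next_user_message}\n\n### RESPONSE:\n")
    return "".join(parts)


if __name__ == "__main__":
    history = [("Hello", "Hi, how are you?")]
    print(build_prompt(history, "I'm fine."))
```

When generating with a prompt built this way, it likely makes sense to stop on `### HUMAN:` and `### RESPONSE:`, mirroring the `PARAMETER stop` lines in the modelfile, so the model does not continue into the next turn on its own.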