
QuantFactory/Luna-AI-Llama2-Uncensored-GGUF

This is a quantized version of Tap-M/Luna-AI-Llama2-Uncensored, created using llama.cpp.
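
Below is a minimal sketch of downloading one of the GGUF files from this repository and loading it with llama-cpp-python. The exact file name and quantization level are assumptions; check the repository's file listing for the variants actually provided.

```python
# Sketch: fetch a GGUF file from the Hub and load it with llama-cpp-python.
# The filename below is an assumed example (Q4_K_M); pick one of the
# quantization levels listed in this repo.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="QuantFactory/Luna-AI-Llama2-Uncensored-GGUF",
    filename="Luna-AI-Llama2-Uncensored.Q4_K_M.gguf",  # assumed filename
)

# n_ctx is illustrative; Llama 2 models support up to a 4096-token context.
llm = Llama(model_path=model_path, n_ctx=2048)
```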

Original Model Card

Model Description

“Luna AI Llama2 Uncensored” is a Llama2-based chat model fine-tuned on over 40,000 long-form chat discussions.
The model was fine-tuned by Tap, the creator of Luna AI.

Model Training

The fine-tuning process was performed on an 8x A100 80GB machine.
The model was trained on synthetic outputs that include multiple rounds of chat between a human and an AI.

A 4-bit GPTQ version, provided by @TheBloke, is available for GPU inference.
A GGML version, provided by @TheBloke, is available for CPU inference.

Prompt Format

The model follows the Vicuna 1.1 / OpenChat format:

USER: I have difficulties in making friends, and I really need someone to talk to. Would you be my friend?

ASSISTANT: Of course! Friends are always here for each other. What do you like to do?
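
The sketch below shows one way to assemble USER/ASSISTANT turns in this format and generate a reply with the model loaded above. The helper function, sampling parameters, and stop string are illustrative assumptions, not part of the original card.

```python
# Sketch of the Vicuna 1.1 / OpenChat turn format used by this model.
# Assumes `llm` is a loaded llama_cpp.Llama instance (see the loading sketch above).
def build_prompt(history):
    """history: list of (user_text, assistant_text_or_None) pairs."""
    parts = []
    for user, assistant in history:
        parts.append(f"USER: {user}")
        if assistant is not None:
            parts.append(f"ASSISTANT: {assistant}")
    parts.append("ASSISTANT:")  # cue the model to answer the last user turn
    return "\n".join(parts)

prompt = build_prompt([
    ("I have difficulties in making friends, and I really need someone to talk to. "
     "Would you be my friend?", None),
])

# Sampling settings and the stop string are assumptions chosen for illustration.
out = llm(prompt, max_tokens=256, temperature=0.7, stop=["USER:"])
print(out["choices"][0]["text"].strip())
```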

Benchmark Results

| Task          | Version | Metric   | Value   | Stderr |
|---------------|---------|----------|---------|--------|
| arc_challenge | 0       | acc_norm | 0.5512  | 0.0146 |
| hellaswag     | 0       |          |         |        |
| mmlu          | 1       | acc_norm | 0.46521 | 0.036  |
| truthfulqa_mc | 1       | mc2      | 0.4716  | 0.0155 |
| Average       | -       | -        | 0.5114  | 0.0150 |
Available Quantizations

Model size: 6.74B params. Architecture: llama. GGUF files are provided at 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit quantization levels.