
QuantFactory/Luna-AI-Llama2-Uncensored-GGUF

This is a quantized version of Tap-M/Luna-AI-Llama2-Uncensored, created using llama.cpp.
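
Below is a minimal sketch of downloading one of the GGUF files from this repository and loading it with llama-cpp-python. The exact file name and quantization level are assumptions; check the repository's file listing for the variants actually provided.

```python
# Sketch: fetch a GGUF file from the Hub and load it with llama-cpp-python.
# The filename below is an assumed example (Q4_K_M); pick one of the
# quantization levels listed in this repo.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="QuantFactory/Luna-AI-Llama2-Uncensored-GGUF",
    filename="Luna-AI-Llama2-Uncensored.Q4_K_M.gguf",  # assumed filename
)

# n_ctx is illustrative; Llama 2 models support up to a 4096-token context.
llm = Llama(model_path=model_path, n_ctx=2048)
```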

Original Model Card

Model Description

“Luna AI Llama2 Uncensored” is a Llama2-based chat model fine-tuned on over 40,000 long-form chat discussions.
The model was fine-tuned by Tap, the creator of Luna AI.

Model Training

The fine-tuning process was performed on an 8x A100 80GB machine.
The model was trained on synthetic outputs that include multiple rounds of chat between a human and an AI.

A 4-bit GPTQ version, provided by @TheBloke, is available for GPU inference.
A GGML version, provided by @TheBloke, is available for CPU inference.

Prompt Format

The model follows the Vicuna 1.1 / OpenChat format:

USER: I have difficulties in making friends, and I really need someone to talk to. Would you be my friend?

ASSISTANT: Of course! Friends are always here for each other. What do you like to do?
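
The sketch below shows one way to assemble USER/ASSISTANT turns in this format and generate a reply with the model loaded above. The helper function, sampling parameters, and stop string are illustrative assumptions, not part of the original card.

```python
# Sketch of the Vicuna 1.1 / OpenChat turn format used by this model.
# Assumes `llm` is a loaded llama_cpp.Llama instance (see the loading sketch above).
def build_prompt(history):
    """history: list of (user_text, assistant_text_or_None) pairs."""
    parts = []
    for user, assistant in history:
        parts.append(f"USER: {user}")
        if assistant is not None:
            parts.append(f"ASSISTANT: {assistant}")
    parts.append("ASSISTANT:")  # cue the model to answer the last user turn
    return "\n".join(parts)

prompt = build_prompt([
    ("I have difficulties in making friends, and I really need someone to talk to. "
     "Would you be my friend?", None),
])

# Sampling settings and the stop string are assumptions chosen for illustration.
out = llm(prompt, max_tokens=256, temperature=0.7, stop=["USER:"])
print(out["choices"][0]["text"].strip())
```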

Benchmark Results

| Task          | Version | Metric   | Value   | Stderr |
|---------------|---------|----------|---------|--------|
| arc_challenge | 0       | acc_norm | 0.5512  | 0.0146 |
| hellaswag     | 0       |          |         |        |
| mmlu          | 1       | acc_norm | 0.46521 | 0.036  |
| truthfulqa_mc | 1       | mc2      | 0.4716  | 0.0155 |
| Average       | -       | -        | 0.5114  | 0.0150 |
Available Quantizations

Model size: 6.74B params. Architecture: llama. GGUF files are provided at 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit quantization levels.