
GGUF of Replete-AI Llama 3 11.5B Instruct V2

Quantized with llama.cpp commits b2710, b2780, and b2876; verified that llama.cpp reports no warnings.

Simple PPL comparison
perplexity.exe -m [MODEL] -f wiki.test.raw -b 512 -ngl 99

Replete-AI_Llama-3-11.5B-Instruct-V2-Q6_K.gguf - Final estimate: PPL = 8.4438 +/- 0.06271
Meta-Llama-3-8B-Instruct-Q6_K - Final estimate: PPL = 8.4727 +/- 0.06308
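
To put the two numbers side by side, here is a minimal sketch that computes the gap between the estimates and compares it with their combined uncertainty. The PPL values and error bars are copied from the results above; combining the errors in quadrature assumes the two estimates are independent, which is a simplification since both runs use the same test file. The function name `compare_ppl` is made up for this illustration.

```python
import math

# Values copied from the perplexity results above.
UPSCALED_PPL, UPSCALED_ERR = 8.4438, 0.06271   # Llama-3-11.5B-Instruct-V2 Q6_K
BASE_PPL, BASE_ERR = 8.4727, 0.06308           # Meta-Llama-3-8B-Instruct Q6_K


def compare_ppl(ppl_a, err_a, ppl_b, err_b):
    """Return (difference, combined uncertainty) for two PPL estimates.

    Assumption: the errors are treated as independent and combined in
    quadrature. This is only a rough guide, not a significance test.
    """
    diff = ppl_a - ppl_b
    combined = math.hypot(err_a, err_b)
    return diff, combined


diff, combined = compare_ppl(UPSCALED_PPL, UPSCALED_ERR, BASE_PPL, BASE_ERR)
print(f"difference: {diff:+.4f}")
print(f"combined uncertainty: {combined:.4f}")
print("within uncertainty" if abs(diff) < combined else "outside uncertainty")
```

Under that assumption the gap is smaller than the combined uncertainty, so this run alone does not separate the two models on wiki.test.raw.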

Original model description below


Llama-3-11.5B-Instruct-v2

Thank you to Meta for the weights for Meta-Llama-3-8B-Instruct


This is an upscaling of the Meta-Llama-3-8B-Instruct model using techniques created for chargoddard/mistral-11b-slimorca. The model has been upscaled from 8B parameters to 11.5B parameters without any continued pretraining or fine-tuning.
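
As a rough check on the 8B and 11.5B figures, the sketch below counts parameters for a Llama-style decoder as a function of layer count. The configuration constants are the published Meta-Llama-3-8B-Instruct values and are not taken from this card. The 48-layer case is purely hypothetical: the card does not state how many layers the upscaled model has or how they are arranged, so treat that number as an illustration of the arithmetic only.

```python
# Published Meta-Llama-3-8B-Instruct configuration (assumed, not from this card).
HIDDEN = 4096
INTERMEDIATE = 14336
VOCAB = 128256
N_HEADS = 32
N_KV_HEADS = 8
BASE_LAYERS = 32


def layer_params():
    """Parameters in one decoder layer (grouped-query attention + gated MLP)."""
    head_dim = HIDDEN // N_HEADS
    kv_dim = head_dim * N_KV_HEADS
    attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim  # q, o and k, v projections
    mlp = 3 * HIDDEN * INTERMEDIATE                    # gate, up, down projections
    norms = 2 * HIDDEN                                 # two RMSNorm weight vectors
    return attn + mlp + norms


def total_params(n_layers):
    """Total parameters for a model with n_layers decoder layers."""
    embeddings = 2 * VOCAB * HIDDEN  # input embedding plus untied output head
    final_norm = HIDDEN
    return embeddings + final_norm + n_layers * layer_params()


print(f"{BASE_LAYERS} layers: {total_params(BASE_LAYERS) / 1e9:.2f}B")
# Hypothetical layer count, chosen only because it lands near 11.5B.
print(f"48 layers: {total_params(48) / 1e9:.2f}B")
```

Because the embeddings and output head are counted once no matter how many layers are stacked, adding 50% more layers grows the total by less than 50%.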

Unlike version 1, this model has no issues at fp16 or at any quantization level.

The model that was used to create this one is linked below:

https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct

GGUF
Model size: 11.5B params
Architecture: llama
Quantizations: 4-bit, 6-bit