Hermes-3-Llama-3.1-8B-lorablated-exl2
Model: Hermes-3-Llama-3.1-8B-lorablated
Created by: mlabonne
Based on: Hermes-3-Llama-3.1-8B
Quants
4bpw h6
4.5bpw h6
5bpw h6
6bpw h6
8bpw h8
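If you only need one quant, a single branch can be fetched with huggingface-cli. The branch names here are an assumption based on the list above (exl2 repos typically keep each bpw variant in its own branch), so verify them on the repo page first:
# Download only the 4bpw h6 quant into a local folder
# (branch name "4bpw-h6" is assumed from the quant list above)
huggingface-cli download cgus/Hermes-3-Llama-3.1-8B-lorablated-exl2 --revision 4bpw-h6 --local-dir Hermes-3-Llama-3.1-8B-lorablated-exl2-4bpw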
Quantization notes
Made with Exllamav2 0.1.8 with the default dataset.
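For reference, a quant like these can be produced with Exllamav2's convert.py. The paths and bpw value below are illustrative; omitting -c uses the default calibration dataset mentioned above:
# Quantize the FP16 model to 4 bpw with a 6-bit head
# (a measurement pass runs first and is cached in the work dir)
python convert.py -i /models/Hermes-3-Llama-3.1-8B-lorablated -o /tmp/exl2-work -cf /models/Hermes-3-Llama-3.1-8B-lorablated-exl2-4bpw -b 4.0 -hb 6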
I'm not sure how well it works with Text-Generation-WebUI: this model uses some unusual RoPE mechanics and I don't know how TGW handles them.
For whatever reason, the model ran extremely slowly with my TGW install but was perfectly fine with TabbyAPI.
How to run
I recommend using TabbyAPI for this model. It needs a decent Nvidia RTX card on Windows/Linux or a decent AMD GPU on Linux.
The model has to be fully loaded into VRAM to work, so if your GPU doesn't have enough VRAM, use a GGUF version instead.
The same goes for older Nvidia GTX cards: use GGUF instead.
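A rough sketch of a TabbyAPI setup, assuming its OpenAI-compatible server defaults (port 5000, API key taken from its generated api_tokens.yml); the exact entry point and config keys may differ between versions, so follow the TabbyAPI README:
# Get TabbyAPI and install its dependencies
git clone https://github.com/theroyallab/tabbyAPI
cd tabbyAPI
pip install -r requirements.txt
# Put the exl2 model folder under models/ and point config.yml at it, then start the server
python main.py
# Query the OpenAI-compatible chat endpoint (key comes from api_tokens.yml)
curl http://localhost:5000/v1/chat/completions -H "Content-Type: application/json" -H "Authorization: Bearer YOUR_API_KEY" -d '{"messages": [{"role": "user", "content": "Hello!"}], "max_tokens": 128}'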
Original model card
Hermes-3-Llama-3.1-8B-lorablated
This is an uncensored version of NousResearch/Hermes-3-Llama-3.1-8B using lorablation.
In the example from the original card, Hermes 3 refuses to answer a legitimate question while the abliterated model complies.
The recipe is based on @grimjim's grimjim/Llama-3.1-8B-Instruct-abliterated_via_adapter (special thanks):
- Extraction: We extract a LoRA adapter by comparing two models: a censored Llama 3.1 (meta-llama/Meta-Llama-3.1-8B-Instruct) and an abliterated Llama 3.1 (mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated).
- Merge: We merge this new LoRA adapter using task arithmetic to the censored NousResearch/Hermes-3-Llama-3.1-8B to abliterate it.
See this article to learn more about abliteration.
🧩 Configuration
This model was merged with the task arithmetic merge method, using NousResearch/Hermes-3-Llama-3.1-8B + Llama-3.1-8B-Instruct-abliterated-LORA as the base.
The following YAML configuration was used to produce this model:
base_model: NousResearch/Hermes-3-Llama-3.1-8B+Llama-3.1-8B-Instruct-abliterated-LORA
dtype: bfloat16
merge_method: task_arithmetic
parameters:
  normalize: false
slices:
- sources:
  - layer_range: [0, 32]
    model: NousResearch/Hermes-3-Llama-3.1-8B+Llama-3.1-8B-Instruct-abliterated-LORA
    parameters:
      weight: 1.0
You can reproduce this model using the following commands:
# Setup
git clone https://github.com/arcee-ai/mergekit.git
cd mergekit && pip install -e .
pip install bitsandbytes
# Extraction
mergekit-extract-lora mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated meta-llama/Meta-Llama-3.1-8B-Instruct Llama-3.1-8B-Instruct-abliterated-LORA --rank=64
# Merge using previous config
mergekit-yaml config.yaml Hermes-3-Llama-3.1-8B-lorablated --allow-crimes --lora-merge-cache=./cache