
ExLlamaV2 4.65bpw quantization of CausalLM-RP-34B from NeverSleep, quantized with the default calibration dataset.

Fits in 24GB of VRAM with 32k+ context. Make sure to enable the 4-bit cache option, or you'll run into OOM errors.
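For a rough sense of why the cache setting matters, here is a back-of-the-envelope estimate in Python. It is only a sketch under stated assumptions: the parameter count is taken as a nominal 34e9, and the layer count, KV-head count, and head dimension are assumed Yi-34B-style values that are not stated in this card. It also ignores activations, quantization scale data, and framework overhead, so real usage will be somewhat higher.

```python
GIB = 1024 ** 3  # treat "24GB" as 24 GiB (assumption)


def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_element: float) -> float:
    """Approximate KV cache size in bytes for one sequence."""
    # Factor 2: one key tensor and one value tensor per layer.
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_element


# Assumed values, NOT taken from this card.
N_PARAMS = 34e9                               # nominal "34B"
BITS_PER_WEIGHT = 4.65                        # from the quantization name
N_LAYERS, N_KV_HEADS, HEAD_DIM = 60, 8, 128   # assumed Yi-34B-style shape
CONTEXT_LEN = 32 * 1024                       # 32k tokens

weights = weight_bytes(N_PARAMS, BITS_PER_WEIGHT)
cache_fp16 = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT_LEN, 2.0)
cache_q4 = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT_LEN, 0.5)

print(f"weights:          {weights / GIB:.2f} GiB")
print(f"fp16 cache @ 32k: {cache_fp16 / GIB:.2f} GiB")
print(f"4-bit cache @ 32k: {cache_q4 / GIB:.2f} GiB")
print(f"total with fp16 cache:  {(weights + cache_fp16) / GIB:.2f} GiB")
print(f"total with 4-bit cache: {(weights + cache_q4) / GIB:.2f} GiB")
```

Under these assumptions, the weights plus an fp16 cache exceed 24 GiB at 32k context, while the weights plus a 4-bit cache stay below it, which matches the OOM warning above.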


Original Card

Description

This repo contains fp16 files of CausalLM-RP-34B, a finetune of CausalLM-34B Beta on multiple RP datasets.

Model used

Prompt template: ChatML

<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
{output}<|im_end|>
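
As an illustration, the template above can be filled in with a small Python helper. This is a minimal sketch: the function name `build_chatml_prompt` is hypothetical, and it assumes single-turn use where the prompt stops right after the assistant header so the model writes `{output}` and the closing `<|im_end|>` itself.

```python
def build_chatml_prompt(system_prompt: str, prompt: str) -> str:
    """Fill the ChatML template up to the point where the model starts writing."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


# Example values are placeholders, not part of the original card.
text = build_chatml_prompt("You are a helpful narrator.", "Describe the tavern.")
print(text)
```

When generating, `<|im_end|>` would typically be set as a stop string so the reply ends at the template's closing tag.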