Kooten/Echidna-13b-v0.3-4bpw-h8-exl2



Description

ExLlamaV2 quant of NeverSleep/Echidna-13b-v0.3

4 bits per weight (bpw), head bits set to 8

VRAM

My VRAM usage with 13B models is:

Bits per weight	Context	VRAM
8 bpw	8k	22 GB
8 bpw	4k	19 GB
6 bpw	8k	19 GB
6 bpw	4k	16 GB
4 bpw	8k	16 GB
4 bpw	4k	13 GB
3 bpw	8k	15 GB
3 bpw	4k	12 GB
I have rounded up, so these aren't exact numbers. They were measured on a Windows machine and should be slightly lower on Linux.
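
As a rough aid for picking a quant, the figures above can be put in a small lookup. This is only a sketch: the numbers are the rounded-up values from the table, and the names `VRAM_GB` and `largest_quant_that_fits` are made up for illustration, not part of any library.

```python
# Approximate VRAM use in GB, copied from the table above.
# Keys are (bits per weight, context length); values are rounded-up estimates.
VRAM_GB = {
    (8, "8k"): 22, (8, "4k"): 19,
    (6, "8k"): 19, (6, "4k"): 16,
    (4, "8k"): 16, (4, "4k"): 13,
    (3, "8k"): 15, (3, "4k"): 12,
}


def largest_quant_that_fits(available_gb, context):
    """Return the highest bpw whose listed VRAM use fits, or None if none fit.

    Hypothetical helper: it only consults the table and does no measurement.
    """
    candidates = [
        bpw
        for (bpw, ctx), gb in VRAM_GB.items()
        if ctx == context and gb <= available_gb
    ]
    return max(candidates) if candidates else None


if __name__ == "__main__":
    # A 16 GB card at 8k context fits the 4 bpw quant according to the table.
    print(largest_quant_that_fits(16, "8k"))  # 4
```

Since the table values are rounded up and were taken on Windows, treat the result as a conservative guide rather than a guarantee.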

Prompt template: Alpaca

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
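
For scripted use, the template above can be filled in with plain string formatting. This is a minimal sketch: `ALPACA_TEMPLATE` and `build_prompt` are illustrative names, and ending the prompt with a newline after `### Response:` is an assumption based on how the template is laid out above.

```python
# Alpaca-style template as shown above; {prompt} is replaced by the instruction.
# Assumption: the prompt ends with a newline after "### Response:".
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n"
    "### Response:\n"
)


def build_prompt(instruction):
    """Fill the Alpaca template with a single instruction (hypothetical helper)."""
    return ALPACA_TEMPLATE.format(prompt=instruction.strip())


if __name__ == "__main__":
    print(build_prompt("Describe an echidna in one sentence."))
```

The resulting string is what gets passed to the backend as the raw prompt; the model's reply is expected to continue after the `### Response:` line.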

