
Llamacpp Quantizations of Hyperion-3.0-Yi-34B

Using llama.cpp release b2440 for quantization.

Original model: https://huggingface.co/Locutusque/Hyperion-3.0-Yi-34B

Download a file (not the whole branch) from below:

| Filename | Quant type | File Size | Description |
| -------- | ---------- | --------- | ----------- |
| Hyperion-3.0-Yi-34B-Q8_0.gguf | Q8_0 | 36.54GB | Extremely high quality, generally unneeded but max available quant. |
| Hyperion-3.0-Yi-34B-Q6_K.gguf | Q6_K | 28.21GB | Very high quality, near perfect, recommended. |
| Hyperion-3.0-Yi-34B-Q5_K_M.gguf | Q5_K_M | 24.32GB | High quality, very usable. |
| Hyperion-3.0-Yi-34B-Q5_K_S.gguf | Q5_K_S | 23.70GB | High quality, very usable. |
| Hyperion-3.0-Yi-34B-Q5_0.gguf | Q5_0 | 23.70GB | High quality, older format, generally not recommended. |
| Hyperion-3.0-Yi-34B-Q4_K_M.gguf | Q4_K_M | 20.65GB | Good quality, similar to 4.25 bpw. |
| Hyperion-3.0-Yi-34B-Q4_K_S.gguf | Q4_K_S | 19.59GB | Slightly lower quality with small space savings. |
| Hyperion-3.0-Yi-34B-Q4_0.gguf | Q4_0 | 19.46GB | Decent quality, older format, generally not recommended. |
| Hyperion-3.0-Yi-34B-Q3_K_L.gguf | Q3_K_L | 18.13GB | Lower quality but usable, good for low RAM availability. |
| Hyperion-3.0-Yi-34B-Q3_K_M.gguf | Q3_K_M | 16.65GB | Even lower quality. |
| Hyperion-3.0-Yi-34B-Q3_K_S.gguf | Q3_K_S | 14.96GB | Low quality, not recommended. |
| Hyperion-3.0-Yi-34B-Q2_K.gguf | Q2_K | 12.82GB | Extremely low quality, not recommended. |
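
To fetch a single quant without cloning the whole branch, the `huggingface_hub` Python client can download one file by name. A minimal sketch, assuming `huggingface_hub` is installed and using the repo id `bartowski/Hyperion-3.0-Yi-34B-GGUF`; swap in whichever filename from the table above you want:

```python
from huggingface_hub import hf_hub_download

# Download just one GGUF file from the repo (not the whole branch).
# Repo id and filename come from this model card; adjust the quant to taste.
model_path = hf_hub_download(
    repo_id="bartowski/Hyperion-3.0-Yi-34B-GGUF",
    filename="Hyperion-3.0-Yi-34B-Q4_K_M.gguf",
    local_dir=".",  # optional: save alongside the script instead of the HF cache
)
print(f"Saved to {model_path}")
```

The returned path can then be handed to any GGUF-compatible runtime, such as llama.cpp.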

Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski
