metadata

license: mit
language:
  - en
datasets:
  - openbmb/UltraFeedback
model_creator: OpenBMB
model_name: UltraCM-13b
model_type: llama
base_model: openbmb/UltraCM-13b
library_name: transformers
pipeline_tag: text-generation
inference: false
tags:
  - dpo
  - rlaif
  - preference
  - ultrafeedback
quantized_by: alvarobartt

Model Card for UltraCM-13b-GGUF

UltraCM-13B is an LLM fine-tuned to critique completions: it evaluates LLM outputs on helpfulness, truthfulness, honesty, and how closely the answer follows the given instructions.

UltraCM-13B is a 13B-parameter LLM released by OpenBMB as part of their paper UltraFeedback: Boosting Language Models with High-quality Feedback.

This repository contains quantized variants of the model in the GGUF format, introduced by the llama.cpp team. It is heavily inspired by TheBloke's work on quantizing most of the LLMs out there.

Model Details

Model Description

Model type: Llama
Fine-tuned from model: Llama-2-13b-hf
Created by: Meta AI
Fine-tuned by: OpenBMB
Quantized by: alvarobartt
Language(s) (NLP): English
License: Apache 2.0

Model Files

Name	Quant method	Bits	Size	Max RAM required	Use case
UltraCM-13b.q4_0.gguf	Q4_0	4	3.83 GB	6.33 GB	legacy; small, very high quality loss - prefer using Q3_K_M
UltraCM-13b.q4_k_s.gguf	Q4_K_S	4	7.41 GB	9.91 GB	small, greater quality loss
UltraCM-13b.q4_k_m.gguf	Q4_K_M	4	7.87 GB	10.37 GB	medium, balanced quality - recommended
UltraCM-13b.q5_0.gguf	Q5_0	5	4.65 GB	7.15 GB	legacy; medium, balanced quality - prefer using Q4_K_M
UltraCM-13b.q5_k_s.gguf	Q5_K_S	5	8.97 GB	11.47 GB	large, low quality loss - recommended
UltraCM-13b.q5_k_m.gguf	Q5_K_M	5	9.23 GB	11.73 GB	large, very low quality loss - recommended

Note: the RAM figures above assume no GPU offloading. Offloading layers to the GPU reduces RAM usage and uses VRAM instead.
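As a rough illustration of how to read the table, the sketch below copies its figures and picks the largest file whose "Max RAM required" fits a given budget. The 2.5 GB overhead is only an observation inferred from the table (each RAM figure equals the file size plus 2.5 GB), not a documented llama.cpp constant. The helper names are hypothetical, and using file size as a proxy for quality is a simplification.

```python
from typing import Optional

# (file name, file size in GB, max RAM required in GB), copied from the table above.
MODEL_FILES = [
    ("UltraCM-13b.q4_0.gguf", 3.83, 6.33),
    ("UltraCM-13b.q4_k_s.gguf", 7.41, 9.91),
    ("UltraCM-13b.q4_k_m.gguf", 7.87, 10.37),
    ("UltraCM-13b.q5_0.gguf", 4.65, 7.15),
    ("UltraCM-13b.q5_k_s.gguf", 8.97, 11.47),
    ("UltraCM-13b.q5_k_m.gguf", 9.23, 11.73),
]

# Assumption: inferred from the table rows, not an official figure.
OVERHEAD_GB = 2.5


def estimated_max_ram_gb(size_gb: float) -> float:
    """Estimate CPU-only RAM usage as file size plus the assumed overhead."""
    return round(size_gb + OVERHEAD_GB, 2)


def largest_file_that_fits(available_ram_gb: float) -> Optional[str]:
    """Return the largest listed file whose max RAM fits, or None if none fit."""
    candidates = [
        (size_gb, name)
        for name, size_gb, max_ram_gb in MODEL_FILES
        if max_ram_gb <= available_ram_gb
    ]
    if not candidates:
        return None
    # max() compares by size first, so the largest fitting file wins.
    return max(candidates)[1]


if __name__ == "__main__":
    print(largest_file_that_fits(10.5))
```

With a 10.5 GB budget and no GPU offloading, this selects UltraCM-13b.q4_k_m.gguf. Offloading layers to the GPU would lower the RAM actually needed, so the helper is conservative in that case.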

For more information on quantization, I highly recommend checking out TheBloke's work and joining their Discord server.

Uses

Direct Use

[More Information Needed]

Citation

Since this is only a GGUF quantization of the original weights, please refer to and cite the original authors instead.

@misc{cui2023ultrafeedback,
      title={UltraFeedback: Boosting Language Models with High-quality Feedback}, 
      author={Ganqu Cui and Lifan Yuan and Ning Ding and Guanming Yao and Wei Zhu and Yuan Ni and Guotong Xie and Zhiyuan Liu and Maosong Sun},
      year={2023},
      eprint={2310.01377},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}