---
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - gguf
  - reasoning
  - Qwen2
  - Qwen
license: apache-2.0
language:
  - en
pipeline_tag: text-generation
---


## 🧀 Which quant is right for you?

- **Q4:** Best for very low-end devices such as older phones or older laptops, thanks to its very compact size. Quality is okay but fully usable.
- **Q6:** Suitable for most modern devices; good quality and very quick responses.
- **Q8:** Suitable for most modern devices. Responses are very high quality, but it is a little slower than Q6.
- **BF16:** This lossless quant should only be used if maximum quality is needed; it is slow, but text results are of the highest quality.
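The guidance above can be condensed into a tiny helper. This is only a toy illustration: the device tiers and the function name are our own labels, not part of the model or any official API.

```python
# Toy helper encoding the quant guidance above.
# The tiers ("low", "modern", "max") are illustrative labels, not an official API.
def recommend_quant(device_tier: str, prefer_quality: bool = False) -> str:
    if device_tier == "low":      # older phones / older laptops
        return "Q4"
    if device_tier == "modern":   # most modern devices
        return "Q8" if prefer_quality else "Q6"
    if device_tier == "max":      # maximum quality, speed secondary
        return "BF16"
    raise ValueError(f"unknown device tier: {device_tier!r}")

print(recommend_quant("modern"))        # Q6
print(recommend_quant("modern", True))  # Q8
```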

## Things you should be aware of when using PARM models (Pinkstack Accuracy Reasoning Models) 🧀

This PARM is based on Qwen 2.5 0.5B, which received extra fine-tuning so that its outputs resemble those of o1 Mini. We trained it with this dataset. It is designed to run on any device, from your phone to a high-end PC, which is why we've included a BF16 quant.

To use this model, you must use a service that supports the GGUF file format. Additionally, the model uses the Phi-3 prompt template:

```
{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}<|end|>
{{ end }}<|assistant|>
{{ .Response }}<|end|>
```

Or, if you are using an anti-prompt: `<|end|><|assistant|>`
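As a sanity check, the template above can be assembled in plain Python. This is a minimal sketch: `build_prompt` is our own helper name, and GGUF runtimes such as Ollama normally apply the template for you from the model metadata.

```python
# Minimal sketch of the Phi-3 chat template shown above.
# build_prompt is an illustrative helper, not part of this model's tooling.
def build_prompt(prompt: str, system: str = "", response: str = "") -> str:
    parts = []
    if system:  # {{ if .System }} block
        parts.append(f"<|system|>\n{system}<|end|>\n")
    if prompt:  # {{ if .Prompt }} block
        parts.append(f"<|user|>\n{prompt}<|end|>\n")
    # assistant turn; response is empty when prompting for a completion
    parts.append(f"<|assistant|>\n{response}<|end|>")
    return "".join(parts)

print(build_prompt("Why is the sky blue?", system="You are a helpful assistant."))
```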

It is highly recommended to use this model with a system prompt.

## Extra information

- **Developed by:** Pinkstack
- **License:** apache-2.0
- **Finetuned from model:** unsloth/qwen2.5-0.5b-instruct-bnb-4bit

This model was trained using Unsloth and Hugging Face's TRL library.

Used this model? Don't forget to leave a like :)