gobean: the q4_k_m in the main model repo was hitting some tokenizer issues - after about five prompt exchanges it goes wild with repeats and unprintable tokens. I made this for comparison, it's not as bad as the q4_k_m but it isn't perfect. Both models fail the knowledge benchmark "describe the difference between a bear credit spread and a poor man's covered call," but they come really really close to getting it (just describes a standard covered call). Overall not bad. Inference is fast with q4_0 on 24gb vram - and the bug could very likely be with llama.cpp, so I may start looking into other frontends.

~ Original Model Card ~

πŸ™ GitHub β€’ πŸ‘Ύ Discord β€’ 🐀 Twitter β€’ πŸ’¬ WeChat
πŸ“ Paper β€’ πŸ™Œ FAQ β€’ πŸ“— Learning Hub

Intro

Yi-1.5 is an upgraded version of Yi. It is continuously pre-trained on Yi with a high-quality corpus of 500B tokens and fine-tuned on 3M diverse fine-tuning samples.

Compared with Yi, Yi-1.5 delivers stronger performance in coding, math, reasoning, and instruction-following capability, while still maintaining excellent capabilities in language understanding, commonsense reasoning, and reading comprehension.

Model Context Length Pre-trained Tokens
Yi-1.5 4K 3.6T

Models

Benchmarks

  • Chat models

    Yi-1.5-34B-Chat is on par with or excels beyond larger models in most benchmarks.

    image/png

    Yi-1.5-9B-Chat is the top performer among similarly sized open-source models.

    image/png

  • Base models

    Yi-1.5-34B is on par with or excels beyond larger models in some benchmarks.

    image/png

    Yi-1.5-9B is the top performer among similarly sized open-source models.

    image/png

Quick Start

For getting up and running with Yi-1.5 models quickly, see README.

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.
The model cannot be deployed to the HF Inference API: The model has no library tag.