HF Leaderboard pegs this as one of the highest-ranked 32B-parameter models; how is the quantized Q4 version?

by bdutta - opened Jan 13

This model has several things going for it: it is one of the highest-ranked 32B-parameter models on the HF Leaderboard, and a Q4 quantized flavour is available, with MLX support too, which is just what I think I need. Is there any indication of how its performance differs from the unquantized flavour of Rombos-LLM-V2.5-Qwen-32B?
