HF Leaderboard pegs this as one of the highest-ranked 32B-parameter models; how is the quantized Q4 version?
#1, opened by bdutta
This model has several things going for it: it is one of the highest-ranked 32B-parameter models on the HF Leaderboard, and a Q4-quantized flavour is available with MLX support, which is just what I think I need. Is there any indication of how its performance compares with the unquantized flavour of Rombos-LLM-V2.5-Qwen-32B?