---
license: mit
library_name: transformers
datasets:
- AI-MO/NuminaMath-CoT
- KbsdJames/Omni-MATH
- RUC-AIBOX/STILL-3-Preview-RL-Data
- hendrycks/competition_math
language:
- en
base_model:
- agentica-org/DeepScaleR-1.5B-Preview
pipeline_tag: text-generation
---

# Melvin56/DeepScaleR-1.5B-Preview-GGUF

Original model: [agentica-org/DeepScaleR-1.5B-Preview](https://huggingface.co/agentica-org/DeepScaleR-1.5B-Preview)

All quants are made using the imatrix option.

|          | CPU (AVX2) | Metal  | cuBLAS | rocBLAS | SYCL     | CLBlast | Vulkan | Kompute |
| :------- | :--------: | :----: | :----: | :-----: | :------: | :-----: | :----: | :-----: |
| K-quants | ✅         | ✅     | ✅     | ✅      | ✅       | ✅ 🐢⁵  | ✅ 🐢⁵ | ❌      |
| I-quants | ✅ 🐢⁴     | ✅ 🐢⁴ | ✅     | ✅      | Partial¹ | ❌      | ❌     | ❌      |

```
✅: feature works
🚫: feature does not work
❓: unknown, please contribute if you can test it yourself
🐢: feature is slow
¹: IQ3_S and IQ1_S, see #5886
²: Only with -ngl 0
³: Inference is 50% slower
⁴: Slower than K-quants of comparable size
⁵: Slower than cuBLAS/rocBLAS on similar cards
⁶: Only q8_0 and iq4_nl
```

| Model   | Size (GB) |
| :------ | :-------: |
| Q2_K    | 0.75      |
| IQ3_XXS | 0.76      |
| IQ3_XS  | 0.83      |
| IQ3_S   | 0.86      |
| IQ3_M   | 0.87      |
| Q3_K_M  | 0.92      |
| IQ4_XS  | 1.01      |
| Q4_K_M  | 1.12      |
| Q5_K_M  | 1.28      |
| Q6_K    | 1.46      |
| Q8_0    | 1.89      |
| F16     | 3.55      |
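
For reference, below is a minimal sketch of loading one of these quants with the `llama-cpp-python` bindings. This is an assumption about the runtime (any llama.cpp-based tool can consume the GGUF files), and the local filename used here is illustrative rather than taken from the repository listing.

```python
# Minimal sketch, assuming the `llama-cpp-python` package is installed and one of
# the quantized GGUF files (e.g. the Q4_K_M quant) has been downloaded locally.
# The filename below is hypothetical; substitute the actual file you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="DeepScaleR-1.5B-Preview-Q4_K_M.gguf",  # path to the downloaded quant
    n_ctx=4096,       # context window; adjust to your memory budget
    n_gpu_layers=-1,  # offload all layers when a supported GPU backend is available
)

prompt = "Solve: If 3x + 5 = 20, what is x? Think step by step."
output = llm(prompt, max_tokens=512, temperature=0.6)
print(output["choices"][0]["text"])
```

Smaller quants from the table above (e.g. Q2_K or IQ3_XS) trade some accuracy for a lower memory footprint, while Q8_0 and F16 stay closest to the original weights.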