roleplaiapp committed commit 1dad0d7 (verified) · 1 parent: 23f2b3d

Update README.md

Files changed (1): README.md (+5 −5)
README.md CHANGED
@@ -12,7 +12,7 @@ tags:
 - llama-cpp
 - imatrix
 - gguf
-- Q8_0
+- IQ4_NL
 - 3b
 - SmallThinker
 - qwen
@@ -31,16 +31,16 @@ tags:
 # roleplaiapp/SmallThinker-3B-Preview-IQ4_NL-GGUF
 
 **Repo:** `roleplaiapp/SmallThinker-3B-Preview-IQ4_NL-GGUF`
-**Original Model:** `imatrix`
+**Original Model:** `SmallThinker-3B-Preview`
 **Organization:** `PowerInfer`
-**Quantized File:** `smallthinker-3b-preview-iq4_nl-imat.gguf`
+**Quantized File:** `smallthinker-3b-preview-iq_4nl-imat.gguf`
 **Quantization:** `GGUF`
-**Quantization Method:** `Q8_0`
+**Quantization Method:** `IQ4_NL`
 **Use Imatrix:** `True`
 **Split Model:** `False`
 
 ## Overview
-This is an GGUF Q8_0 quantized version of [imatrix](https://huggingface.co/PowerInfer/SmallThinker-3B-Preview).
+This is an imatrix GGUF IQ4_NL quantized version of [SmallThinker-3B-Preview](https://huggingface.co/PowerInfer/SmallThinker-3B-Preview).
 
 ## Quantization By
 I often have idle A100 GPUs while building/testing and training the RP app, so I put them to use quantizing models.
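
For context, an imatrix-based IQ4_NL quant like the one this commit describes is typically produced with llama.cpp's `llama-imatrix` and `llama-quantize` tools. A minimal sketch — the f16 source file name and `calibration.txt` corpus are placeholder assumptions, not artifacts from this repo:

```shell
# Sketch of an imatrix IQ4_NL quantization pipeline with llama.cpp
# (assumed paths; calibration.txt is a hypothetical calibration corpus)

# 1. Compute an importance matrix from an f16 GGUF of the base model
./llama-imatrix -m SmallThinker-3B-Preview-f16.gguf -f calibration.txt -o imatrix.dat

# 2. Quantize to IQ4_NL, weighting columns by the importance matrix
./llama-quantize --imatrix imatrix.dat \
    SmallThinker-3B-Preview-f16.gguf \
    smallthinker-3b-preview-iq4_nl-imat.gguf IQ4_NL
```

The resulting `.gguf` can then be loaded directly by llama.cpp, e.g. `llama-cli -m smallthinker-3b-preview-iq4_nl-imat.gguf`.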