roleplaiapp/SmallThinker-3B-Preview-Q3_K_M-GGUF

Repo: roleplaiapp/SmallThinker-3B-Preview-Q3_K_M-GGUF
Original Model: SmallThinker-3B
Organization: PowerInfer
Quantized File: smallthinker-3b-preview-q3_k_m.gguf
Quantization: GGUF
Quantization Method: Q3_K_M
Use Imatrix: False
Split Model: False

Overview

This is a GGUF Q3_K_M-quantized version of SmallThinker-3B.
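As a rough sketch, a GGUF file like this one is typically fetched from the Hub and run with llama.cpp. The commands below are an assumption, not an official recipe: they assume `huggingface-cli` and a recent llama.cpp build are installed, and the binary name (`llama-cli`) may differ between versions.

```shell
# Download the quantized file from the Hugging Face Hub.
huggingface-cli download roleplaiapp/SmallThinker-3B-Preview-Q3_K_M-GGUF \
  smallthinker-3b-preview-q3_k_m.gguf --local-dir .

# Run it with llama.cpp's CLI (flags: -m model path, -p prompt, -n max new tokens).
./llama-cli -m smallthinker-3b-preview-q3_k_m.gguf -p "Hello, who are you?" -n 64
```

Q3_K_M trades some quality for a smaller footprint, so a file at this quantization level generally fits in well under 4 GB of RAM/VRAM.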

Quantization By

I often have idle A100 GPUs while building, testing, and training the RP app, so I put them to work quantizing models. I hope the community finds these quantizations useful.

Andrew Webby @ RolePlai

Model size: 3.4B params
Architecture: qwen2
Precision: 3-bit (GGUF)


Model tree for roleplaiapp/SmallThinker-3B-Preview-Q3_K_M-GGUF

Base model: Qwen/Qwen2.5-3B
