Models: 426

Active filters: vllm

NousResearch/DeepHermes-3-Llama-3-8B-Preview

Text Generation • Updated 3 days ago • 2.58k • 149

mistralai/Mistral-Small-24B-Instruct-2501

Text Generation • Updated 14 days ago • 640k • 764

Almawave/Velvet-2B

Text Generation • Updated 4 days ago • 581 • 27

mistralai/Mistral-Small-24B-Base-2501

Text Generation • Updated 17 days ago • 16.6k • 216

Almawave/Velvet-14B

Text Generation • Updated 4 days ago • 4.35k • 118

bartowski/NousResearch_DeepHermes-3-Llama-3-8B-Preview-GGUF

Text Generation • Updated 3 days ago • 2.06k • 9

mistralai/Pixtral-12B-2409

Image-Text-to-Text • Updated Dec 26, 2024 • 604

mistralai/Ministral-8B-Instruct-2410

Updated Dec 6, 2024 • 50.6k • 426

mistralai/Mistral-Large-Instruct-2411

Updated Nov 19, 2024 • 9.54k • 204

SistInf/Velvet-14B-GGUF

Updated 4 days ago • 318 • 5

mistralai/Pixtral-12B-Base-2409

Updated 14 days ago • 84

stelterlab/Mistral-Small-24B-Instruct-2501-AWQ

Text Generation • Updated 17 days ago • 15k • 11

SistInf/Velvet-2B-GGUF

Updated 4 days ago • 194 • 4

mistralai/Mistral-Small-Instruct-2409

Updated Oct 16, 2024 • 123k • 379

mistralai/Pixtral-Large-Instruct-2411

Image-Text-to-Text • Updated Dec 26, 2024 • 5 • 395

mlx-community/Mistral-Small-24B-Instruct-2501-8bit

Updated 17 days ago • 362 • 2

Karsh-CAI/Mistral-Small-24B-Instruct-2501-Q8_0-GGUF

Updated 17 days ago • 483 • 2

neuralmagic/DeepSeek-R1-Distill-Qwen-32B-quantized.w4a16

Text Generation • Updated 4 days ago • 422 • 2

neuralmagic/DeepSeek-R1-Distill-Qwen-32B-quantized.w8a8

Text Generation • Updated 4 days ago • 2.16k • 2

neuralmagic/DeepSeek-R1-Distill-Llama-70B-quantized.w4a16

Text Generation • Updated 4 days ago • 861 • 2

ZeroAgency/Zero-Mistral-Small-24B-Instruct-2501

Text Generation • Updated about 8 hours ago • 2

neuralmagic/Meta-Llama-3-8B-Instruct-FP8

Text Generation • Updated Jul 18, 2024 • 12.3k • 21

neuralmagic/Meta-Llama-3-8B-Instruct-FP8-KV

Text Generation • Updated Jun 19, 2024 • 5.72k • 7

neuralmagic/Qwen2-72B-Instruct-FP8

Text Generation • Updated Jul 18, 2024 • 440 • 12

neuralmagic/gemma-2-9b-it-FP8

Text Generation • Updated Jul 18, 2024 • 610 • 6

neuralmagic/Mistral-Nemo-Instruct-2407-FP8

Text Generation • Updated Jul 19, 2024 • 33.5k • 18

neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8

Text Generation • Updated Oct 9, 2024 • 153k • 38

neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8-dynamic

Text Generation • Updated Oct 19, 2024 • 2.01k • 6

neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8

Text Generation • Updated 6 days ago • 113k • 41

neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8

Text Generation • Updated Oct 23, 2024 • 6.31k • 14