
Gemma-2-2b-Instruct-GGUF Introduction

Gemma 2 Instruct is the latest addition to Google's Gemma family of lightweight, state-of-the-art open models. Built from the same research and technology behind the Gemini models, this 2-billion-parameter model excels at a variety of text generation tasks while remaining compact enough for edge and low-power computing environments.

Key Features

  • Based on Gemini technology
  • 2 billion parameters
  • Trained on 2 trillion tokens of web documents, code, and mathematics
  • Suitable for edge devices and low-power compute
  • Versatile for text generation, coding, and mathematical tasks
  • Retains the large vocabulary from Gemma 1.1 for enhanced multilingual and coding capabilities

Applications

Gemma 2 Instruct is designed for a wide range of applications, including:

  • Content creation
  • Chatbots and conversational AI
  • Text summarization
  • Code generation
  • Mathematical problem-solving

For more details, see Google's blog post: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma/
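For chat and conversational use, Gemma instruct models expect their own turn-based prompt format. A minimal sketch of single-turn prompt formatting is below; the template string is the one commonly documented for Gemma instruct models, but you should verify it against the chat template shipped with the exact GGUF file you download.

```python
def format_gemma_prompt(user_message: str) -> str:
    """Format a single-turn prompt using the Gemma instruct chat template.

    Note: template reproduced from Gemma's published chat format;
    confirm against the model's own metadata before relying on it.
    """
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = format_gemma_prompt("Summarize this article in two sentences.")
print(prompt)
```

Most GGUF runtimes (including llama.cpp) can also apply this template automatically when run in chat/conversation mode.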

Quantized GGUF Models Benchmark

| Name | Quant method | Bits | Size | Use Cases |
|---|---|---|---|---|
| gemma-2-2b-it-Q2_K.gguf | Q2_K | 2 | 1.23 GB | fast but high quality loss; not recommended |
| gemma-2-2b-it-Q3_K_S.gguf | Q3_K_S | 3 | 1.36 GB | very high quality loss; not recommended |
| gemma-2-2b-it-Q3_K_M.gguf | Q3_K_M | 3 | 1.46 GB | moderate quality loss; not very recommended |
| gemma-2-2b-it-Q3_K_L.gguf | Q3_K_L | 3 | 1.55 GB | not very recommended |
| gemma-2-2b-it-Q4_0.gguf | Q4_0 | 4 | 1.63 GB | moderate speed; recommended |
| gemma-2-2b-it-Q4_1.gguf | Q4_1 | 4 | 1.76 GB | moderate speed; recommended |
| gemma-2-2b-it-Q4_K_S.gguf | Q4_K_S | 4 | 1.64 GB | fast and accurate; highly recommended |
| gemma-2-2b-it-Q4_K_M.gguf | Q4_K_M | 4 | 1.71 GB | fast; recommended |
| gemma-2-2b-it-Q5_0.gguf | Q5_0 | 5 | 1.88 GB | fast; recommended |
| gemma-2-2b-it-Q5_1.gguf | Q5_1 | 5 | 2.01 GB | very large; prefer Q4 |
| gemma-2-2b-it-Q5_K_S.gguf | Q5_K_S | 5 | 1.88 GB | large; recommended |
| gemma-2-2b-it-Q5_K_M.gguf | Q5_K_M | 5 | 1.92 GB | large; recommended |
| gemma-2-2b-it-Q6_K.gguf | Q6_K | 6 | 2.15 GB | very large; not very recommended |
| gemma-2-2b-it-Q8_0.gguf | Q8_0 | 8 | 2.78 GB | very large; not very recommended |
| gemma-2-2b-it-F16.gguf | F16 | 16 | 5.24 GB | extremely large; original precision |
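The file sizes above run slightly larger than the nominal bit width would suggest, because K-quants store per-block scales and some tensors (such as embeddings) are kept at higher precision. A rough back-of-the-envelope estimate is parameters × effective bits per weight ÷ 8; the overhead allowance below is an illustrative assumption, not a measured value.

```python
def estimate_gguf_size_gb(n_params: float, bits_per_weight: float,
                          overhead_gb: float = 0.1) -> float:
    """Rough GGUF file-size estimate in GB.

    bits_per_weight is the *effective* rate, which for K-quants is
    somewhat above the nominal bit width; overhead_gb is a loose
    allowance for metadata and non-quantized tensors (assumption).
    """
    return n_params * bits_per_weight / 8 / 1e9 + overhead_gb

# e.g. 2.61B params at ~5 effective bits/weight
print(round(estimate_gguf_size_gb(2.61e9, 5.0), 2))  # → 1.73
```

This lands close to the listed Q4_K sizes, which is consistent with 4-bit K-quants averaging around 4.5–5 effective bits per weight.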

Quantized with llama.cpp
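As a quick-start sketch (assuming llama.cpp is built and on your PATH, the `huggingface_hub` CLI is installed, and the file names above are unchanged), one of the recommended quants can be fetched and run like this:

```shell
# Download a single quant file from this repo
huggingface-cli download NexaAIDev/gemma-2-2b-it-GGUF \
    gemma-2-2b-it-Q4_K_S.gguf --local-dir .

# Chat with the model using llama.cpp's CLI (-cnv enables conversation mode)
llama-cli -m gemma-2-2b-it-Q4_K_S.gguf -cnv -p "You are a helpful assistant."
```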

Invitation to join our beta test to accelerate your on-device AI development

Sign up using this link: https://forms.gle/vuoktPjPmotnT4sM7

We're excited to invite you to join our beta test for a new platform designed to enhance on-device AI development.

By participating, you'll have the opportunity to connect with fellow developers and researchers and contribute to the open-source future of on-device AI.

Here's how you can get involved:

  1. Sign an NDA.
  2. Receive a link to our beta testing community on Discord.
  3. Join a brief 15-minute online chat to share your valuable feedback.

Your insights are invaluable to us as we build this platform together.

Model size: 2.61B params
Architecture: gemma2

Model tree for NexaAIDev/gemma-2-2b-it-GGUF

Base model: google/gemma-2-2b