sayhan/gemma-7b-GGUF-quantized

	---
	base_model: google/gemma-7b
	language:
	- en
	pipeline_tag: text-generation
	license: other
	model_type: gemma
	library_name: transformers
	inference: false
	---
	![image/webp](https://cdn-uploads.huggingface.co/production/uploads/65aa2d4b356bf23b4a4da247/NQAvp6NRHlNILyWWFlrA7.webp)
	## Google Gemma 7B
	- Model creator: [Google](https://huggingface.co/google)
	- Original model: [gemma-7b](https://huggingface.co/google/gemma-7b)
	- [Terms of use](https://www.kaggle.com/models/google/gemma/license/consent)
	<!-- description start -->
	## Description
	This repo contains GGUF-format model files for [Google's Gemma 7B](https://huggingface.co/google/gemma-7b).
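
As a quick sanity check after downloading one of these files, you can confirm that it really is a GGUF file by reading its header. The sketch below assumes the common little-endian layout, where the file starts with the four ASCII bytes `GGUF` followed by an unsigned 32-bit version number. The function name `read_gguf_version` is only an illustration and is not part of any library.

```python
import struct

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def read_gguf_version(path):
    """Return the GGUF format version, or raise ValueError if the magic is wrong.

    Assumes a little-endian file, which is the usual case.
    """
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # The version is stored as an unsigned 32-bit integer right after the magic.
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

A truncated or mislabeled download fails this check immediately, which is cheaper than waiting for a loader to reject it.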

	## Original model
	- Developed by: [Google](https://huggingface.co/google)

	### Description
	Gemma is a family of lightweight, state-of-the-art open models from Google,
	built from the same research and technology used to create the Gemini models.
	They are text-to-text, decoder-only large language models, available in English,
	with open weights, pre-trained variants, and instruction-tuned variants. Gemma
	models are well-suited for a variety of text generation tasks, including
	question answering, summarization, and reasoning. Their relatively small size
	makes it possible to deploy them in environments with limited resources such as
	a laptop, desktop or your own cloud infrastructure, democratizing access to
	state-of-the-art AI models and helping foster innovation for everyone.

	## Quantization types
	| quantization method | bits | size | description | recommended |
	|---------------------|------|---------|----------------------------------------|-------------|
	| Q2_K | 2 | 3.09 GB | very small, very high quality loss | ❌ |
	| Q3_K_S | 3 | 3.68 GB | very small, high quality loss | ❌ |
	| Q3_K_L | 3 | 4.4 GB | small, substantial quality loss | ❌ |
	| Q4_0 | 4 | 4.81 GB | legacy; small, very high quality loss | ❌ |
	| Q4_K_S | 4 | 4.84 GB | medium, balanced quality | ✅ |
	| Q4_K_M | 4 | 5.13 GB | medium, balanced quality | ✅ |
	| Q5_0 | 5 | 5.88 GB | legacy; medium, balanced quality | ❌ |
	| Q5_K_S | 5 | 5.88 GB | large, low quality loss | ✅ |
	| Q5_K_M | 5 | 6.04 GB | large, very low quality loss | ✅ |
	| Q6_K | 6 | 7.01 GB | very large, extremely low quality loss | ❌ |
	| Q8_0 | 8 | 9.08 GB | very large, extremely low quality loss | ❌ |
	| FP16 | 16 | 17.1 GB | enormous, negligible quality loss | ❌ |
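
If you are unsure which file to pick, the table above can be turned into a small lookup. This is a rough sketch, not a sizing guide: the sizes are file sizes copied from the table (the Q2_K entry is assumed to be in GB like the others), real memory use is higher than the file size, and `headroom_gb` is a hypothetical allowance for context and runtime overhead that you should tune for your setup.

```python
# (bits, file size in GB, recommended) copied from the table above.
# The Q2_K size is assumed to be in GB like the other rows.
QUANTS = {
    "Q2_K": (2, 3.09, False),
    "Q3_K_S": (3, 3.68, False),
    "Q3_K_L": (3, 4.4, False),
    "Q4_0": (4, 4.81, False),
    "Q4_K_S": (4, 4.84, True),
    "Q4_K_M": (4, 5.13, True),
    "Q5_0": (5, 5.88, False),
    "Q5_K_S": (5, 5.88, True),
    "Q5_K_M": (5, 6.04, True),
    "Q6_K": (6, 7.01, False),
    "Q8_0": (8, 9.08, False),
    "FP16": (16, 17.1, False),
}


def pick_quant(budget_gb, headroom_gb=1.0, recommended_only=True):
    """Return the largest listed file that fits in budget_gb minus headroom_gb.

    Returns None if nothing fits. headroom_gb is an assumed allowance,
    not a measured figure.
    """
    limit = budget_gb - headroom_gb
    fits = [
        (size, name)
        for name, (_bits, size, recommended) in QUANTS.items()
        if size <= limit and (recommended or not recommended_only)
    ]
    return max(fits)[1] if fits else None


def effective_bits_per_weight(name):
    """Estimate bits per weight by scaling the file size against the FP16 file."""
    return 16 * QUANTS[name][1] / QUANTS["FP16"][1]
```

With an 8 GB budget and the default 1 GB of headroom, `pick_quant(8)` returns `"Q5_K_M"`. The second helper shows why the nominal bit counts understate file size: `effective_bits_per_weight("Q4_K_M")` comes out at about 4.8 rather than 4, most likely because the K-quant formats store scaling data alongside the weights.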

	## Usage
	You can use this model with the latest builds of LM Studio and llama.cpp.
	If you're new to the world of _large language models_, I recommend starting with LM Studio.
	<!-- description end -->
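
For llama.cpp, a minimal way to try a file from a script is to call the command-line binary. The sketch below only assumes the long-standing `-m` (model path), `-p` (prompt) and `-n` (tokens to generate) options. The binary name `llama-cli` matches recent llama.cpp builds (older builds called it `main`), and the file name in the usage note is a placeholder, so adjust both to what you actually have.

```python
import subprocess


def build_llama_cli_command(model_path, prompt, n_predict=128, binary="llama-cli"):
    """Build an argument list for the llama.cpp command-line binary.

    binary is an assumption: recent builds ship `llama-cli`, older ones `main`.
    """
    return [binary, "-m", model_path, "-p", prompt, "-n", str(n_predict)]


def run_prompt(model_path, prompt, n_predict=128, binary="llama-cli"):
    """Run a single prompt and return whatever the binary printed to stdout."""
    command = build_llama_cli_command(model_path, prompt, n_predict, binary)
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, `run_prompt("gemma-7b.Q4_K_M.gguf", "Write a haiku about autumn.")` would run one short generation, where `gemma-7b.Q4_K_M.gguf` is a hypothetical local file name. Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or quotes.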