---
base_model:
- mshojaei77/Gemma-2-2b-fa
library_name: transformers
tags:
- mergekit
- merge
license: gemma
datasets:
- mshojaei77/Persian_sft
language:
- fa
---

# Persian Gemma 2b (v2)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6556b1bb85d43542fa1a8f91/NOSivA-mvcG_DhVkJQY3p.png)

This repository hosts **Persian Gemma 2b v2**, an optimized version of the initial Persian Gemma 2b model. This version has undergone further experimental improvements, notably **self-merging**, to enhance its performance and efficiency on Persian-language conversational tasks. It builds on Google's Gemma-2-2b-it and the initial fine-tuning work in `mshojaei77/Gemma-2b-fa`.

## Key Improvements in v2

* **Optimized Performance:** This version incorporates techniques aimed at improving Persian text generation and conversational quality.
* **Self-Merged:** The model has been merged with itself, potentially yielding a more robust and coherent representation (an illustrative merge-config sketch appears at the end of this card).

**Important Note:** This is still an **experimental model** under active development. Despite the optimizations, it retains the limitations inherent in a 2-billion-parameter model and in the early-stage nature of this project. **Output quality may vary, and critical evaluation is still necessary.**

## How to Use

You can use this model just like the original `mshojaei77/Gemma-2b-fa`, via the `transformers` library's `pipeline` API:

```python
import torch
from transformers import pipeline

# Initialize the text-generation pipeline
pipe = pipeline(
    "text-generation",
    model="mshojaei77/gemma-2-2b-fa-v2",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",  # or "mps" for Macs with Apple Silicon
)

# Prepare the conversation; the pipeline applies the Gemma chat template automatically
messages = [
    {"role": "user", "content": "سلام چطوری؟"},  # "Hi, how are you?"
]

# Generate a response with a maximum of 512 new tokens
outputs = pipe(messages, max_new_tokens=512)
assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
print(assistant_response)

# Example output (illustrative; quality may vary significantly):
# سلام! من خوبم، ممنون. شما چطوری؟ 😊  ("Hi! I'm fine, thanks. How are you?")
```

**Key Usage Notes:**

* **`model="mshojaei77/gemma-2-2b-fa-v2"`:** Make sure to specify the `v2` model name in your code.
* **Chat template:** When the input is a list of messages, the pipeline applies the Gemma chat template bundled with the tokenizer; no extra `chat_template` argument is needed.
* **Experimental nature:** Expect variable output quality; this is still an experimental model.
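If you prefer finer control than the pipeline offers, you can drive the model directly with `AutoModelForCausalLM` and the tokenizer's chat template. This is a minimal sketch, assuming the repository ships a Gemma-style chat template with its tokenizer; the generation settings are illustrative, not tuned values.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mshojaei77/gemma-2-2b-fa-v2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "سلام چطوری؟"}]  # "Hi, how are you?"

# Format the conversation with the bundled chat template and append the
# generation prompt so the model answers as the assistant
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```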
## Model Details (v2)

* **Base Model:** google/gemma-2-2b-it
* **Previous Version:** mshojaei77/Gemma-2b-fa
* **Optimization Techniques:** Self-merging and further optimization
* **Architecture:** Gemma2ForCausalLM (same as the base model)
* **Model Size:** 2 billion parameters

## Intended Use

This model is intended for:

* **Research and Experimentation:** Exploring the impact of self-merging techniques on Persian Gemma models.
* **Educational Purposes:** Demonstrating advanced fine-tuning and optimization methods.
* **Community Development:** Contributing to the growing ecosystem of Persian language models.
* **Prototyping (with caution):** Early-stage prototyping, acknowledging the model's experimental nature.

## Limitations

While optimized, this model still has limitations:

* **Experimental Stage:** Under ongoing development and refinement.
* **2b-Parameter Model:** Performance is limited by the model's size compared to larger models.
* **Variable Output Quality:** Outputs can still be inconsistent and require careful evaluation.
* **Potential for Imperfections:** May still exhibit fluency problems, factual inaccuracies, or biases.

**Use this model responsibly and be aware of its experimental nature.**

**Citation:** If you use this model in your research or applications, please cite it using the following DOI: [10.57967/hf/4772](https://doi.org/10.57967/hf/4772)
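## Appendix: What a Self-Merge Can Look Like

The exact merge recipe for v2 is not published in this card. For readers curious about the mechanics, the sketch below shows one common form of "self-merge": a mergekit passthrough merge that stitches together layer slices of a single model. Everything here is a hypothetical illustration; the source model, layer ranges, and CLI flags are assumptions, not the recipe used for v2.

```python
# Hypothetical example: write a mergekit passthrough ("self-merge") config
# and run the mergekit-yaml CLI (requires `pip install mergekit`).
import subprocess
from pathlib import Path

# Illustrative config: stitches together two overlapping slices of the
# model's 26 layers. These layer ranges are made up for demonstration only.
config = """\
slices:
  - sources:
      - model: mshojaei77/Gemma-2b-fa
        layer_range: [0, 20]
  - sources:
      - model: mshojaei77/Gemma-2b-fa
        layer_range: [6, 26]
merge_method: passthrough
dtype: bfloat16
"""

Path("self_merge.yml").write_text(config)

# --copy-tokenizer reuses the source model's tokenizer in the output
subprocess.run(
    ["mergekit-yaml", "self_merge.yml", "./gemma-2b-fa-self-merge", "--copy-tokenizer"],
    check=True,
)
```

Note that a passthrough merge that repeats layer slices increases the parameter count, so a recipe that keeps the model at 2 billion parameters (as this card states) would necessarily differ; the sketch only illustrates the config format.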