---
base_model:
- mshojaei77/Gemma-2-2b-fa
library_name: transformers
tags:
- mergekit
- merge
license: gemma
datasets:
- mshojaei77/Persian_sft
language:
- fa
---

# Persian Gemma 2b (v2)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6556b1bb85d43542fa1a8f91/NOSivA-mvcG_DhVkJQY3p.png)

This repository hosts **Persian Gemma 2b v2**, an optimized version of the initial Persian Gemma 2b model. This version has undergone further experimental improvements, notably **self-merging**, to enhance its performance and efficiency on Persian-language conversational tasks. It builds on Google's Gemma-2-2b-it and the initial fine-tuning work in `mshojaei77/Gemma-2b-fa`.

## Key Improvements in v2

* **Optimized Performance:** This version incorporates techniques aimed at improving Persian text generation and conversational quality.
* **Self-Merged:** The model has been merged with itself, potentially yielding a more robust and coherent representation (an illustrative merge-config sketch appears at the end of this card).

**Important Note:** This is still an **experimental model** under active development. Despite the optimizations, it retains the limitations inherent in a 2-billion-parameter model and in the early-stage nature of this project. **Output quality may vary, and critical evaluation is still necessary.**

## How to Use

You can use this model just like the original `mshojaei77/Gemma-2b-fa`, via the `transformers` library's `pipeline` API:

```python
import torch
from transformers import pipeline

# Initialize the text-generation pipeline
pipe = pipeline(
    "text-generation",
    model="mshojaei77/gemma-2-2b-fa-v2",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",  # or "mps" for Macs with Apple Silicon
)

# Prepare the conversation; the pipeline applies the Gemma chat template automatically
messages = [
    {"role": "user", "content": "سلام چطوری؟"},  # "Hi, how are you?"
]

# Generate a response with a maximum of 512 new tokens
outputs = pipe(messages, max_new_tokens=512)
assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
print(assistant_response)

# Example output (illustrative; quality may vary significantly):
# سلام! من خوبم، ممنون. شما چطوری؟ 😊  ("Hi! I'm fine, thanks. How are you?")
```

**Key Usage Notes:**

* **`model="mshojaei77/gemma-2-2b-fa-v2"`:** Make sure to specify the `v2` model name in your code.
* **Chat template:** When the input is a list of messages, the pipeline applies the Gemma chat template bundled with the tokenizer; no extra `chat_template` argument is needed.
* **Experimental nature:** Expect variable output quality; this is still an experimental model.
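If you prefer finer control than the pipeline offers, you can drive the model directly with `AutoModelForCausalLM` and the tokenizer's chat template. This is a minimal sketch, assuming the repository ships a Gemma-style chat template with its tokenizer; the generation settings are illustrative, not tuned values.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mshojaei77/gemma-2-2b-fa-v2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "سلام چطوری؟"}]  # "Hi, how are you?"

# Format the conversation with the bundled chat template and append the
# generation prompt so the model answers as the assistant
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```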
## Model Details (v2)

* **Base Model:** google/gemma-2-2b-it
* **Previous Version:** mshojaei77/Gemma-2b-fa
* **Optimization Techniques:** Self-merging and further optimization
* **Architecture:** Gemma2ForCausalLM (same as the base model)
* **Model Size:** 2 billion parameters

## Intended Use

This model is intended for:

* **Research and Experimentation:** Exploring the impact of self-merging techniques on Persian Gemma models.
* **Educational Purposes:** Demonstrating advanced fine-tuning and optimization methods.
* **Community Development:** Contributing to the growing ecosystem of Persian language models.
* **Prototyping (with caution):** Early-stage prototyping, acknowledging the model's experimental nature.

## Limitations

While optimized, this model still has limitations:

* **Experimental Stage:** Under ongoing development and refinement.
* **2b-Parameter Model:** Performance is limited by the model's size compared to larger models.
* **Variable Output Quality:** Outputs can still be inconsistent and require careful evaluation.
* **Potential for Imperfections:** May still exhibit fluency problems, factual inaccuracies, or biases.

**Use this model responsibly and be aware of its experimental nature.**

**Citation:** If you use this model in your research or applications, please cite it using the following DOI: [10.57967/hf/4772](https://doi.org/10.57967/hf/4772)
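## Appendix: What a Self-Merge Can Look Like

The exact merge recipe for v2 is not published in this card. For readers curious about the mechanics, the sketch below shows one common form of "self-merge": a mergekit passthrough merge that stitches together layer slices of a single model. Everything here is a hypothetical illustration; the source model, layer ranges, and CLI flags are assumptions, not the recipe used for v2.

```python
# Hypothetical example: write a mergekit passthrough ("self-merge") config
# and run the mergekit-yaml CLI (requires `pip install mergekit`).
import subprocess
from pathlib import Path

# Illustrative config: stitches together two overlapping slices of the
# model's 26 layers. These layer ranges are made up for demonstration only.
config = """\
slices:
  - sources:
      - model: mshojaei77/Gemma-2b-fa
        layer_range: [0, 20]
  - sources:
      - model: mshojaei77/Gemma-2b-fa
        layer_range: [6, 26]
merge_method: passthrough
dtype: bfloat16
"""

Path("self_merge.yml").write_text(config)

# --copy-tokenizer reuses the source model's tokenizer in the output
subprocess.run(
    ["mergekit-yaml", "self_merge.yml", "./gemma-2b-fa-self-merge", "--copy-tokenizer"],
    check=True,
)
```

Note that a passthrough merge that repeats layer slices increases the parameter count, so a recipe that keeps the model at 2 billion parameters (as this card states) would necessarily differ; the sketch only illustrates the config format.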