---
license: apache-2.0
language:
- sw
base_model:
- google/gemma-2-2b-it
pipeline_tag: text-generation
library_name: transformers
tags:
- swahili
- gemma2
- text-generation-inference
- text-generation
inference:
  parameters:
    temperature: 0.7
    top_p: 0.95
    max_new_tokens: 500
    do_sample: true
---

# Gemma2-2B-Swahili-IT

Gemma2-2B-Swahili-IT is a lightweight, efficient open variant of Google's Gemma2-2B-IT model, fine-tuned for natural Swahili language understanding and generation. This model provides a resource-efficient option for Swahili language tasks while maintaining strong performance.

## Model Details

- **Developer:** Alfaxad Eyembe
- **Base Model:** google/gemma-2-2b-it
- **Model Type:** Decoder-only transformer
- **Language(s):** Swahili
- **License:** Apache 2.0
- **Finetuning Approach:** Low-Rank Adaptation (LoRA)

## Training Data

The model was fine-tuned on a comprehensive dataset containing:
- 67,017 instruction-response pairs
- 16,273,709 total tokens
- An average of 242.83 tokens per example
- High-quality, naturally written Swahili content

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6375af60e3413701a9f01c0f/7XXsvi8_x5PXZwXcUD-kl.png)

## Performance

### Massive Multitask Language Understanding (MMLU) - Swahili
- Base Model: 31.58% accuracy
- Fine-tuned Model: 38.60% accuracy
- Improvement: +7.02%

### Sentiment Analysis
- Base Model: 84.85% accuracy
- Fine-tuned Model: 86.00% accuracy
- Improvement: +1.15%
- Response Validity: 100%

## Intended Use

This model is designed for:
- Basic Swahili text generation
- Question answering
- Sentiment analysis
- Simple creative writing
- General instruction following in Swahili
- Resource-constrained environments

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("alfaxadeyembe/gemma2-2b-swahili-it")
model = AutoModelForCausalLM.from_pretrained(
    "alfaxadeyembe/gemma2-2b-swahili-it",
    device_map="auto",
    torch_dtype=torch.bfloat16
)

# Always set to eval mode for inference
model.eval()

# Example usage
prompt = "Eleza dhana ya uchumi wa kidijitali na umuhimu wake katika ulimwengu wa leo."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=500,
        do_sample=True,
        temperature=0.7,
        top_p=0.95
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Training Details

- **Fine-tuning Method:** LoRA
- **Training Steps:** 400
- **Batch Size:** 2
- **Gradient Accumulation Steps:** 32
- **Learning Rate:** 2e-4
- **Training Time:** ~8 hours on A100 GPU
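For reference, the sketch below shows a PEFT/Transformers configuration consistent with the settings above. The LoRA rank, alpha, dropout, and target modules are assumptions (they are not documented on this card); the step count, batch size, gradient accumulation, and learning rate mirror the values listed.

```python
# Hypothetical fine-tuning configuration consistent with the Training Details above.
# LoRA rank, alpha, dropout, and target modules are assumptions; steps, batch size,
# gradient accumulation, and learning rate follow the values reported on this card.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                       # assumed rank
    lora_alpha=32,              # assumed scaling factor
    lora_dropout=0.05,          # assumed dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="gemma2-2b-swahili-it-lora",
    max_steps=400,                    # Training Steps
    per_device_train_batch_size=2,    # Batch Size
    gradient_accumulation_steps=32,   # Gradient Accumulation Steps
    learning_rate=2e-4,               # Learning Rate
    bf16=True,                        # assumed mixed precision on the A100
    logging_steps=25,
)
```

In a setup like this, the adapter would typically be attached to the base model with `peft.get_peft_model` before training with a standard `Trainer` or similar supervised fine-tuning loop.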
## Key Features

- Lightweight and efficient (2B parameters)
- Suitable for resource-constrained environments
- Good performance on basic language tasks
- Fast inference speed
- Low memory footprint

## Advantages

1. Resource Efficiency:
   - Small model size (2B parameters)
   - Lower memory requirements
   - Faster inference time
   - Suitable for deployment on less powerful hardware

2. Task Performance:
   - Strong sentiment analysis capabilities
   - Decent MMLU performance
   - Good instruction following
   - Natural Swahili generation

## Limitations

- Simpler responses compared to the 9B/27B variants

## Citation

```bibtex
@misc{gemma2-2b-swahili-it,
  author = {Alfaxad Eyembe},
  title = {Gemma2-2B-Swahili-IT: A Lightweight Swahili Variant of Gemma2-2B-IT},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub}
}
```

## Contact

For questions or feedback, please reach out through:
- HuggingFace: [@alfaxadeyembe](https://huggingface.co/alfaxad)
- Twitter: [@alfxad](https://twitter.com/alfxad)