---
base_model: silma-ai/SILMA-9B-Instruct-v1.0
datasets:
- MohammedNasser/ARabic_Reasoning_QA
language:
- ar
library_name: transformers
license: apache-2.0
metrics:
- accuracy
pipeline_tag: question-answering
---

# SILMA-9B-Instruct Fine-Tuned for Arabic Reasoning-QA

[![Generic badge](https://img.shields.io/badge/🤗-Hugging%20Face-blue.svg)](https://huggingface.co/MohammedNasser/silma_9b_instruct_ft)
[![License: Apache](https://img.shields.io/badge/License-Apache-yellow.svg)](https://opensource.org/licenses/Apache)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-red.svg)](https://www.python.org/downloads/release/python-390/)

This model is a fine-tuned version of [silma-ai/SILMA-9B-Instruct-v1.0](https://huggingface.co/silma-ai/SILMA-9B-Instruct-v1.0), optimized for Arabic question-answering tasks. It specializes in providing numerical answers to a wide range of questions in Arabic.

## Model Description

This fine-tuned model is based on silma-ai/SILMA-9B-Instruct-v1.0 and is designed to answer reasoning questions in Arabic with integer-based answers. The model has been fine-tuned on a custom Arabic Reasoning QA dataset, specifically tailored to handle questions ranging from easy to difficult across various topics.
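Because the model is intended to return integer-based answers, downstream code typically needs to pull the number out of the generated Arabic text. The helper below is a sketch (not part of the released code) that normalizes Arabic-Indic digits to ASCII and extracts the first integer:

```python
import re
from typing import Optional

# Translation table mapping Arabic-Indic digits (٠١٢٣٤٥٦٧٨٩) to ASCII digits.
ARABIC_DIGITS = str.maketrans("٠١٢٣٤٥٦٧٨٩", "0123456789")

def extract_integer(answer_text: str) -> Optional[int]:
    """Return the first integer found in a model answer, or None if absent."""
    normalized = answer_text.translate(ARABIC_DIGITS)
    match = re.search(r"-?\d+", normalized)
    return int(match.group()) if match else None

print(extract_integer("الإجابة هي ٢"))  # 2
```

This handles answers written with either Western or Arabic-Indic numerals; adapt the regex if the model is prompted to return more structured output.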
## Model Details

- **Model Name**: silma_9b_instruct_ft
- **Model Type**: Language Model
- **Language**: Arabic
- **Base Model**: silma-ai/SILMA-9B-Instruct-v1.0
- **Fine-Tuning Method**: PEFT with LoraConfig
- **Task**: Arabic Question Answering (Numerical Responses)
- **Training Data**: [Custom Arabic Reasoning QA dataset](https://huggingface.co/MohammedNasser/ARabic_Reasoning_QA)
- **Quantization**: 4-bit quantization using bitsandbytes

## Features

- Optimized for Arabic language understanding and generation
- Specialized in providing numerical answers to questions
- Efficient inference with 4-bit quantization
- Fine-tuned using PEFT with LoraConfig for parameter-efficient training

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.1356 | 0.04 | 10 | 1.4071 |
| 0.8079 | 0.08 | 20 | 0.2825 |
| 0.1592 | 0.12 | 30 | 0.1427 |
| 0.1202 | 0.16 | 40 | 0.1121 |
| 0.1095 | 0.2 | 50 | 0.1071 |
| 0.1024 | 0.24 | 60 | 0.1036 |
| 0.0993 | 0.28 | 70 | 0.1002 |
| 0.091 | 0.32 | 80 | 0.0992 |
| 0.1096 | 0.36 | 90 | 0.0965 |
| 0.0943 | 0.4 | 100 | 0.0916 |
| 0.0882 | 0.44 | 110 | 0.0896 |
| 0.0853 | 0.48 | 120 | 0.0848 |
| 0.0767 | 0.52 | 130 | 0.0808 |
| 0.0778 | 0.56 | 140 | 0.0765 |
| 0.0698 | 0.6 | 150 | 0.0734 |
| 0.0784 | 0.64 | 160 | 0.0694 |
| 0.0648 | 0.68 | 170 | 0.0658 |
| 0.0797 | 0.72 | 180 | 0.0630 |
| 0.0591 | 0.76 | 190 | 0.0604 |
| 0.0557 | 0.8 | 200 | 0.0582 |
| 0.0567 | 0.84 | 210 | 0.0561 |
| 0.057 | 0.88 | 220 | 0.0534 |
| 0.0505 | 0.92 | 230 | 0.0515 |
| 0.0483 | 0.96 | 240 | 0.0482 |
| 0.0463 | 1.0 | 250 | 0.0463 |

### Training Metrics

[Training Loss on wandb 🔗](https://wandb.ai/mohnasgbr/huggingface/reports/train-loss-24-09-07-03-41-58---Vmlldzo5MjgxMTY4)

## Usage

Here's a quick example of how to use the model:

```python
from transformers import pipeline
import torch

model_name = "MohammedNasser/silma_9b_instruct_ft"
user_question = "إذا كان 
لديك ثلاث سيارات، وبعت واحدة منها، كم سيارة ستبقى لديك؟"

# Create the text-generation pipeline
pipe = pipeline(
    "text-generation",
    model=model_name,
    torch_dtype=torch.bfloat16,
    device="cuda",
    return_full_text=False,
)

messages = [
    {"role": "user", "content": user_question},
]

# Generate an answer
response = pipe(messages, max_new_tokens=128)
assistant_response = response[0]["generated_text"]
print(f"Question: {user_question}")
print(f"Answer: {assistant_response}")
```

## Performance

The model demonstrates strong performance on Arabic QA tasks, particularly for questions requiring numerical answers. Key metric:

- **Eval Loss**: 0.046

## Limitations

- The model is optimized for numerical answers and may not perform as well on open-ended questions.
- Performance may vary for Arabic dialects or regional variations not well represented in the training data.
- The model may occasionally generate incorrect numerical answers for very complex or ambiguous questions.

## Fine-tuning Details

The model was fine-tuned using the following configuration:

- **LoRA Config**:
  - Alpha: 16
  - Dropout: 0.1
  - R: 4
- **Training Hyperparameters**:
  - Batch Size: 4
  - Learning Rate: 2e-4
  - Epochs: 3
- **Hardware**: 4 x NVIDIA A100 GPUs

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{gaber_2024,
  author    = {Gaber},
  title     = {silma_9b_instruct_ft (Revision e54c562)},
  year      = 2024,
  url       = {https://huggingface.co/MohammedNasser/silma_9b_instruct_ft},
  doi       = {10.57967/hf/3032},
  publisher = {Hugging Face}
}
```

Made with ❤️ by [M. N. Gaber/aiNarabic]
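As a footnote to the fine-tuning details above, the stated LoRA settings could be expressed with `peft` roughly as follows. This is a sketch only: `target_modules` and `task_type` are assumptions, since the card does not list them.

```python
from peft import LoraConfig

# Values taken from the card's Fine-tuning Details; task_type is an assumption.
lora_config = LoraConfig(
    r=4,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
)
```

The resulting config would be passed to `peft.get_peft_model` together with the quantized base model before training.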