---
base_model: silma-ai/SILMA-9B-Instruct-v1.0
datasets:
- MohammedNasser/ARabic_Reasoning_QA
language:
- ar
library_name: transformers
license: apache-2.0
metrics:
- accuracy
pipeline_tag: question-answering
---
# SILMA-9B-Instruct Fine-Tuned for Arabic Reasoning QA
This model is a fine-tuned version of silma-ai/SILMA-9B-Instruct-v1.0, optimized for Arabic Question Answering tasks. It excels at providing numerical answers to a wide range of questions in Arabic.
## Model Description

This model is based on silma-ai/SILMA-9B-Instruct-v1.0 and is designed to answer reasoning questions in Arabic with integer-based answers. It has been fine-tuned on a custom Arabic Reasoning QA dataset tailored to questions ranging from easy to difficult across various topics.
## Model Details
- Model Name: silma_9b_instruct_ft
- Model Type: Language Model
- Language: Arabic
- Base Model: silma-ai/SILMA-9B-Instruct-v1.0
- Fine-Tuning Method: PEFT with LoraConfig
- Task: Arabic Question Answering (Numerical Responses)
- Training Data: Custom Arabic Reasoning QA dataset
- Quantization: 4-bit quantization using bitsandbytes
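The card states that the model uses 4-bit quantization via bitsandbytes. A minimal loading sketch using `BitsAndBytesConfig` from `transformers` is shown below; the specific quantization settings (`nf4`, bfloat16 compute dtype) are illustrative assumptions, not documented values:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Illustrative 4-bit settings; the exact values used for this model are assumptions.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_name = "MohammedNasser/silma_9b_instruct_ft"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",  # place layers automatically across available devices
)
```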
## Features
- Optimized for Arabic language understanding and generation
- Specialized in providing numerical answers to questions
- Efficient inference with 4-bit quantization
- Fine-tuned using PEFT with LoraConfig for parameter-efficient training
## Training Results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
2.1356 | 0.04 | 10 | 1.4071 |
0.8079 | 0.08 | 20 | 0.2825 |
0.1592 | 0.12 | 30 | 0.1427 |
0.1202 | 0.16 | 40 | 0.1121 |
0.1095 | 0.2 | 50 | 0.1071 |
0.1024 | 0.24 | 60 | 0.1036 |
0.0993 | 0.28 | 70 | 0.1002 |
0.091 | 0.32 | 80 | 0.0992 |
0.1096 | 0.36 | 90 | 0.0965 |
0.0943 | 0.4 | 100 | 0.0916 |
0.0882 | 0.44 | 110 | 0.0896 |
0.0853 | 0.48 | 120 | 0.0848 |
0.0767 | 0.52 | 130 | 0.0808 |
0.0778 | 0.56 | 140 | 0.0765 |
0.0698 | 0.6 | 150 | 0.0734 |
0.0784 | 0.64 | 160 | 0.0694 |
0.0648 | 0.68 | 170 | 0.0658 |
0.0797 | 0.72 | 180 | 0.0630 |
0.0591 | 0.76 | 190 | 0.0604 |
0.0557 | 0.8 | 200 | 0.0582 |
0.0567 | 0.84 | 210 | 0.0561 |
0.057 | 0.88 | 220 | 0.0534 |
0.0505 | 0.92 | 230 | 0.0515 |
0.0483 | 0.96 | 240 | 0.0482 |
0.0463 | 1.0 | 250 | 0.0463 |
## Usage

Here's a quick example of how to use the model:

```python
import torch
from transformers import pipeline

model_name = "MohammedNasser/silma_9b_instruct_ft"
user_question = "إذا كان لديك ثلاث سيارات، وبعت واحدة منها، كم سيارة ستبقى لديك؟"

# Create a text-generation pipeline (this loads both model and tokenizer)
pipe = pipeline(
    "text-generation",
    model=model_name,
    torch_dtype=torch.bfloat16,
    device="cuda",
    return_full_text=False,
)

messages = [
    {"role": "user", "content": user_question},
]

# Generate an answer
response = pipe(messages, max_new_tokens=128)
assistant_response = response[0]["generated_text"]

print(f"Question: {user_question}")
print(f"Answer: {assistant_response}")
```
## Performance
Our model demonstrates strong performance on Arabic QA tasks, particularly for questions requiring numerical answers. Here are some key metrics:
- Eval Loss: 0.046
## Limitations
- The model is optimized for numerical answers and may not perform as well on open-ended questions.
- Performance may vary for dialects or regional variations of Arabic not well-represented in the training data.
- The model may occasionally generate incorrect numerical answers for very complex or ambiguous questions.
## Fine-tuning Details

The model was fine-tuned using the following configuration:

- LoRA Config:
  - Alpha: 16
  - Dropout: 0.1
  - R: 4
- Training Hyperparameters:
  - Batch Size: 4
  - Learning Rate: 2e-4
  - Epochs: 3
- Hardware: 4 x NVIDIA A100 GPUs
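The LoRA values above translate into a `peft.LoraConfig` roughly as follows. Note that `target_modules` is an assumption (typical attention projections for this model family); it is not documented in the card:

```python
from peft import LoraConfig

# Sketch of the LoRA setup from the listed values.
# target_modules is a hypothetical choice, not a documented one.
lora_config = LoraConfig(
    r=4,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```

This config would then be passed to `peft.get_peft_model(model, lora_config)` before training.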
## Citation

If you use this model in your research, please cite:

```bibtex
@misc{gaber_2024,
  author    = {{Gaber}},
  title     = {silma_9b_instruct_ft (Revision e54c562)},
  year      = 2024,
  url       = {https://huggingface.co/MohammedNasser/silma_9b_instruct_ft},
  doi       = {10.57967/hf/3032},
  publisher = {Hugging Face}
}
```
Made with ❤️ by [M. N. Gaber/aiNarabic]