|
---
library_name: transformers
tags: []
---
|
|
|
# Model Card for Llama-3.2-1B Fine-Tuned with ORPO
|
|
|
|
This model is a fine-tuned version of meta-llama/Llama-3.2-1B, trained with an ORPO (Odds Ratio Preference Optimization) trainer.

It was fine-tuned on the mlabonne/orpo-dpo-mix-40k dataset. Only 1,000 samples were used, to keep the ORPO training run short.
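Since the card's model id is a placeholder, the snippet below uses a hypothetical repository id; substitute the actual one. This is a minimal loading-and-generation sketch using the standard transformers API.

```python
# Minimal inference sketch. "your-username/llama-3.2-1b-orpo" is a
# hypothetical placeholder for the actual repository id of this model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/llama-3.2-1b-orpo"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Explain ORPO in one sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```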
|
|
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
|
|
|
The base model meta-llama/Llama-3.2-1B was fine-tuned with ORPO on a small subset (1,000 samples) of the mlabonne/orpo-dpo-mix-40k preference dataset.

Llama 3.2 is a collection of multilingual, text-only models; its instruction-tuned variants are optimized for dialogue use cases, including agentic retrieval and summarization tasks.

This fine-tuned version aims to improve the model's understanding of the context given in prompts, and thereby the quality of its responses.
|
|
|
|
|
- **Fine-tuned from model:** meta-llama/Llama-3.2-1B
- **Model size:** 1 billion parameters
- **Fine-tuning method:** ORPO (Odds Ratio Preference Optimization)
- **Dataset:** mlabonne/orpo-dpo-mix-40k (1,000-sample subset)
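
The training script is not included in this card; the sketch below shows how a comparable run could be set up with TRL's `ORPOTrainer`, under stated assumptions: the hyperparameters are illustrative (the actual settings are unpublished), and access to the gated meta-llama/Llama-3.2-1B checkpoint is assumed.

```python
# Illustrative ORPO fine-tuning sketch with TRL (pip install trl datasets).
# Hyperparameter values below are assumptions, not the settings actually used.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_id = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama defines no pad token
model = AutoModelForCausalLM.from_pretrained(model_id)

# Keep only 1,000 preference pairs, as described above.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train").select(range(1000))

config = ORPOConfig(
    output_dir="llama-3.2-1b-orpo",
    beta=0.1,  # weight of the odds-ratio penalty (assumed value)
    per_device_train_batch_size=2,
    num_train_epochs=1,
    learning_rate=8e-6,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,  # named `tokenizer=` in older TRL releases
)
trainer.train()
```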
|
|
|
## Evaluation |
|
|
|
The model was evaluated on the following benchmarks, with these results:
|
|Tasks    |Version|Filter|n-shot|Metric           |   |   Value|   |Stderr|
|---------|------:|------|-----:|-----------------|---|-------:|---|-----:|
|hellaswag|      1|none  |     0|acc              |↑  |  0.4772|±  |0.0050|
|         |       |none  |     0|acc_norm         |↑  |  0.6366|±  |0.0048|
|tinyMMLU |      0|none  |     0|acc_norm         |↑  |  0.4306|±  |   N/A|
|eq_bench |    2.1|none  |     0|eqbench          |↑  |-12.9709|±  |2.9658|
|         |       |none  |     0|percent_parseable|↑  | 92.9825|±  |1.9592|
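
The table follows the output format of EleutherAI's lm-evaluation-harness. A run of this shape could be reproduced along the lines of the sketch below; the model id is a hypothetical placeholder, and the harness version and exact flags actually used are not recorded in this card.

```python
# Evaluation sketch using EleutherAI's lm-evaluation-harness (pip install lm-eval).
# The repository id below is a hypothetical placeholder.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=your-username/llama-3.2-1b-orpo",  # placeholder
    tasks=["hellaswag", "tinyMMLU", "eq_bench"],
    num_fewshot=0,
)
print(results["results"])
```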
|
|