---
library_name: transformers
tags: []
---

# Model Card for Model ID

This model is a fine-tuned version of meta-llama/Llama-3.2-1B, trained with the ORPO (Odds Ratio Preference Optimization) Trainer on the mlabonne/orpo-dpo-mix-40k dataset. Only 1,000 data samples were used so that training with ORPO could be run quickly.

## Model Details

### Model Description

The base model meta-llama/Llama-3.2-1B has been fine-tuned using ORPO on a small subset of the mlabonne/orpo-dpo-mix-40k dataset. The Llama 3.2 instruction-tuned, text-only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. This fine-tuned version aims to improve the model's understanding of the context in prompts and thereby increase its interpretability.

- **Finetuned from model:** meta-llama/Llama-3.2-1B
- **Model size:** 1 billion parameters
- **Fine-tuning method:** ORPO
- **Dataset:** mlabonne/orpo-dpo-mix-40k

## Evaluation

The model was evaluated on the following benchmarks, with the following performance metrics:

| Tasks     | Version | Filter | n-shot | Metric            |   |    Value |   | Stderr |
|-----------|--------:|--------|-------:|-------------------|---|---------:|---|-------:|
| hellaswag |       1 | none   |      0 | acc               | ↑ |   0.4772 | ± | 0.0050 |
|           |         | none   |      0 | acc_norm          | ↑ |   0.6366 | ± | 0.0048 |
| tinyMMLU  |       0 | none   |      0 | acc_norm          | ↑ |   0.4306 | ± |    N/A |
| eq_bench  |     2.1 | none   |      0 | eqbench           | ↑ | -12.9709 | ± | 2.9658 |
|           |         | none   |      0 | percent_parseable | ↑ |  92.9825 | ± | 1.9592 |
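For intuition, ORPO augments the standard SFT loss with an odds-ratio penalty that pushes the model to prefer the chosen response over the rejected one. The sketch below illustrates only that odds-ratio term, using made-up probabilities (`p_chosen` and `p_rejected` are hypothetical values, not measurements from this model); it is a minimal illustration of the formula, not the TRL implementation.

```python
import math

def odds(p):
    """Odds of generating a response, given its probability p."""
    return p / (1.0 - p)

def orpo_odds_ratio_loss(p_chosen, p_rejected):
    """Odds-ratio term: -log sigmoid(log(odds(chosen) / odds(rejected)))."""
    log_odds_ratio = math.log(odds(p_chosen) / odds(p_rejected))
    sigmoid = 1.0 / (1.0 + math.exp(-log_odds_ratio))
    return -math.log(sigmoid)

# When the model already favors the chosen response the penalty is small;
# when it favors the rejected response the penalty grows.
print(round(orpo_odds_ratio_loss(0.6, 0.3), 4))  # → 0.2513
print(round(orpo_odds_ratio_loss(0.3, 0.6), 4))  # → 1.5041
```

In the full ORPO objective this term is scaled by a weight (lambda/beta in the TRL `ORPOTrainer`) and added to the negative log-likelihood on the chosen responses, which is why no separate reference model is needed.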