bhuvana-ak7/OrpoLlama-3.2-1B-V1

	---
	library_name: transformers
	tags: []
	---

	# Model Card for OrpoLlama-3.2-1B-V1

	<!-- Provide a quick summary of what the model is/does. -->
	This model is a fine-tuned version of meta-llama/Llama-3.2-1B, trained with the ORPO (Odds Ratio Preference Optimization) trainer
	on the mlabonne/orpo-dpo-mix-40k dataset.
	To keep training fast, only 1,000 samples from the dataset were used.
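
To make the ORPO objective concrete, here is a minimal sketch of the per-example loss as it is usually described: a standard negative log-likelihood term on the chosen response plus a weighted odds-ratio penalty that pushes the chosen response above the rejected one. The function names, the scalar inputs, and the weight `lam=0.1` are illustrative assumptions, not the settings used to train this model.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) where p = exp(avg_logp).

    avg_logp is assumed to be a length-normalized sequence
    log-probability, so it must be strictly negative.
    """
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_logp: float, rejected_logp: float,
              nll_chosen: float, lam: float = 0.1) -> float:
    """Sketch of the ORPO loss for one preference pair.

    lam is a hypothetical weight for the odds-ratio term.
    """
    log_odds_ratio = log_odds(chosen_logp) - log_odds(rejected_logp)
    # -log(sigmoid(x)) written as log(1 + exp(-x))
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return nll_chosen + lam * odds_ratio_term
```

When the chosen and rejected responses are equally likely, the odds-ratio term equals log 2; it shrinks as the chosen response becomes more likely than the rejected one.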


	## Model Details

	### Model Description

	<!-- Provide a longer summary of what this model is. -->

	The base model, meta-llama/Llama-3.2-1B, was fine-tuned with ORPO on a small subset of the mlabonne/orpo-dpo-mix-40k dataset.
	The Llama 3.2 instruction-tuned text-only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.
	This fine-tuned version aims to improve the model's understanding of prompt context and thereby increase its interpretability.


	- Fine-tuned from: meta-llama/Llama-3.2-1B
	- Model size: 1 billion parameters
	- Fine-tuning method: ORPO
	- Dataset: mlabonne/orpo-dpo-mix-40k
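
Since the card lists `transformers` as the library, loading the model might look like the sketch below. The repository ID is taken from this page; the helper name `generate`, the plain-text prompt, and the `max_new_tokens` default are assumptions, and the card does not say whether a chat template is expected.

```python
MODEL_ID = "bhuvana-ak7/OrpoLlama-3.2-1B-V1"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a continuation for `prompt` (hypothetical helper).

    transformers is imported lazily so that defining this function
    does not require the library or a model download.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize to PyTorch tensors and generate with default decoding.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the weights on first use):
# print(generate("Explain preference optimization in one sentence."))
```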

	## Evaluation

	The model was evaluated zero-shot on the following benchmarks:

	| Tasks     | Version | Filter | n-shot | Metric            |   | Value    |   | Stderr |
	|-----------|--------:|--------|-------:|-------------------|---|---------:|---|-------:|
	| hellaswag |       1 | none   |      0 | acc               | ↑ |   0.4772 | ± | 0.0050 |
	|           |         | none   |      0 | acc_norm          | ↑ |   0.6366 | ± | 0.0048 |
	| tinyMMLU  |       0 | none   |      0 | acc_norm          | ↑ |   0.4306 | ± |    N/A |
	| eq_bench  |     2.1 | none   |      0 | eqbench           | ↑ | -12.9709 | ± | 2.9658 |
	|           |         | none   |      0 | percent_parseable | ↑ |  92.9825 | ± | 1.9592 |
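
One way to read the Stderr column is to turn each value into an approximate 95% interval. The sketch below assumes a normal approximation (z = 1.96), which the card itself does not state; the helper name is hypothetical, and tinyMMLU is omitted because its standard error is reported as N/A.

```python
def confidence_interval(value: float, stderr: float, z: float = 1.96):
    """Return (low, high) for value ± z * stderr (normal approximation)."""
    half_width = z * stderr
    return (value - half_width, value + half_width)


# (task, metric) -> (value, stderr), copied from the table above
results = {
    ("hellaswag", "acc"): (0.4772, 0.0050),
    ("hellaswag", "acc_norm"): (0.6366, 0.0048),
    ("eq_bench", "eqbench"): (-12.9709, 2.9658),
    ("eq_bench", "percent_parseable"): (92.9825, 1.9592),
}

for (task, metric), (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{task}/{metric}: {value} in [{low:.4f}, {high:.4f}]")
```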