---
library_name: transformers
license: llama3
datasets:
- 2A2I/argilla-dpo-mix-7k-arabic
language:
- ar
pipeline_tag: text-generation
---

# 👳 Arabic ORPO LLAMA 3

## 👓 Story first

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) using [ORPO](https://github.com/xfactlab/orpo) on [2A2I/argilla-dpo-mix-7k-arabic](https://huggingface.co/datasets/2A2I/argilla-dpo-mix-7k-arabic).

I wanted to try ORPO and see whether it would better align an English-biased model like **llama3** to the Arabic language, or whether it would fail.

While the evaluations favour the base llama3 over my fine-tune, in practice I found my fine-tune was much better at producing coherent (mostly correct) Arabic text, which I find interesting.

I would encourage everyone to try out the model [here](https://huggingface.co/spaces/MohamedRashad/Arabic-Chatbot-Arena) and share their insights with me ^^

## 🤔 Evaluation and Results

These results were produced using [lighteval](https://github.com/huggingface/lighteval) with the __community|arabic_mmlu__ tasks.

| Community                         | Llama-3-8B-Instruct | Arabic-ORPO-Llama-3-8B-Instruct |
|-----------------------------------|---------------------|---------------------------------|
| **All**                           | **0.348**           | **0.317**                       |
| Abstract Algebra                  | 0.310               | 0.230                           |
| Anatomy                           | 0.385               | 0.348                           |
| Astronomy                         | 0.388               | 0.316                           |
| Business Ethics                   | 0.480               | 0.370                           |
| Clinical Knowledge                | 0.396               | 0.385                           |
| College Biology                   | 0.347               | 0.299                           |
| College Chemistry                 | 0.180               | 0.250                           |
| College Computer Science          | 0.250               | 0.190                           |
| College Mathematics               | 0.260               | 0.280                           |
| College Medicine                  | 0.231               | 0.249                           |
| College Physics                   | 0.225               | 0.216                           |
| Computer Security                 | 0.470               | 0.440                           |
| Conceptual Physics                | 0.315               | 0.404                           |
| Econometrics                      | 0.263               | 0.272                           |
| Electrical Engineering            | 0.414               | 0.359                           |
| Elementary Mathematics            | 0.320               | 0.272                           |
| Formal Logic                      | 0.270               | 0.214                           |
| Global Facts                      | 0.320               | 0.320                           |
| High School Biology               | 0.332               | 0.335                           |
| High School Chemistry             | 0.256               | 0.296                           |
| High School Computer Science      | 0.350               | 0.300                           |
| High School European History      | 0.224               | 0.242                           |
| High School Geography             | 0.323               | 0.364                           |
| High School Government & Politics | 0.352               | 0.285                           |
| High School Macroeconomics        | 0.290               | 0.285                           |
| High School Mathematics           | 0.237               | 0.278                           |
| High School Microeconomics        | 0.231               | 0.273                           |
| High School Physics               | 0.252               | 0.225                           |
| High School Psychology            | 0.316               | 0.330                           |
| High School Statistics            | 0.199               | 0.176                           |
| High School US History            | 0.284               | 0.250                           |
| High School World History         | 0.312               | 0.274                           |
| Human Aging                       | 0.369               | 0.430                           |
| Human Sexuality                   | 0.481               | 0.321                           |
| International Law                 | 0.603               | 0.405                           |
| Jurisprudence                     | 0.491               | 0.370                           |
| Logical Fallacies                 | 0.368               | 0.276                           |
| Machine Learning                  | 0.214               | 0.312                           |
| Management                        | 0.350               | 0.379                           |
| Marketing                         | 0.521               | 0.547                           |
| Medical Genetics                  | 0.320               | 0.330                           |
| Miscellaneous                     | 0.446               | 0.443                           |
| Moral Disputes                    | 0.422               | 0.306                           |
| Moral Scenarios                   | 0.248               | 0.241                           |
| Nutrition                         | 0.412               | 0.346                           |
| Philosophy                        | 0.408               | 0.328                           |
| Prehistory                        | 0.429               | 0.349                           |
| Professional Accounting           | 0.344               | 0.273                           |
| Professional Law                  | 0.306               | 0.244                           |
| Professional Medicine             | 0.228               | 0.206                           |
| Professional Psychology           | 0.337               | 0.315                           |
| Public Relations                  | 0.391               | 0.373                           |
| Security Studies                  | 0.469               | 0.335                           |
| Sociology                         | 0.498               | 0.408                           |
| US Foreign Policy                 | 0.590               | 0.490                           |
| Virology                          | 0.422               | 0.416                           |
| World Religions                   | 0.404               | 0.304                           |
| Average (All Communities)         | 0.348               | 0.317                           |
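
To see where the fine-tune gains or loses against the base model, the per-subject scores can be compared directly. The snippet below is only a minimal sketch: it uses six rows copied from the table above, and the names `ROWS`, `compare`, and `tally` are hypothetical helpers introduced here for illustration, not part of lighteval or of the original evaluation.

```python
# A few rows copied from the results table:
# (subject, Llama-3-8B-Instruct score, Arabic ORPO fine-tune score)
ROWS = [
    ("College Chemistry", 0.180, 0.250),
    ("Conceptual Physics", 0.315, 0.404),
    ("Global Facts", 0.320, 0.320),
    ("Human Sexuality", 0.481, 0.321),
    ("International Law", 0.603, 0.405),
    ("Machine Learning", 0.214, 0.312),
]


def compare(rows):
    """Return (subject, delta) pairs, sorted from largest gain to largest loss.

    delta = fine-tune score - base score, rounded to 3 decimals to match
    the precision of the table.
    """
    deltas = [(name, round(tuned - base, 3)) for name, base, tuned in rows]
    return sorted(deltas, key=lambda item: item[1], reverse=True)


def tally(deltas):
    """Count the subjects where the fine-tune gains, ties, or loses."""
    return {
        "gains": sum(1 for _, d in deltas if d > 0),
        "ties": sum(1 for _, d in deltas if d == 0),
        "losses": sum(1 for _, d in deltas if d < 0),
    }


deltas = compare(ROWS)
for name, delta in deltas:
    print(f"{name:<20} {delta:+.3f}")
print(tally(deltas))
```

Extending `ROWS` with the remaining subjects gives the full picture of the comparison; the six rows here were picked only to show a gain, a tie, and a loss.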