m-polignano-uniba
committed on
Update README.md
README.md
CHANGED
@@ -239,4 +239,17 @@ The model is an instruction-tuned version of [**Meta-Llama-3-8b-instruct**](http
 This model aims to be the **multilingual base-model** to further fine-tune in the Italian environment.
 
 
-The **ANITA project** *(**A**dvanced **N**atural-based interaction for the **ITA**lian language)* aims to provide Italian NLP researchers with an improved model for Italian language 🇮🇹 use cases.
+The **ANITA project** *(**A**dvanced **N**atural-based interaction for the **ITA**lian language)* aims to provide Italian NLP researchers with an improved model for Italian language 🇮🇹 use cases.
+
+<hr>
+
+**Model developers** Marco Polignano - University of Bari Aldo Moro, Italy
+
+**Variations** The released model was trained with **supervised fine-tuning (SFT)** using 4-bit **QLoRA** on a long list of instruction-based datasets. The **ORPO** approach over the *mlabonne/orpo-dpo-mix-40k* dataset is used to align the model with human preferences for helpfulness and safety.
+
+**Input** The model takes text input only.
+
+**Output** The model generates text and code only.
+
+**Model Architecture** *Llama 3 architecture*.
+
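The **Variations** entry names ORPO as the preference-alignment step. As a rough illustration of what that objective optimizes, the sketch below computes the ORPO loss for a single chosen/rejected pair from length-normalized log-probabilities. It is not the author's training code: the function names, the `lam` weight of 0.1, and the example probabilities are all hypothetical and chosen only for illustration.

```python
import math


def log_odds(avg_logprob: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logprob), with 0 < p < 1.

    avg_logprob is assumed to be the mean per-token log-probability
    of a response, so p is a length-normalized sequence likelihood.
    """
    # log(1 - p) computed as log1p(-p); undefined when p == 1.
    return avg_logprob - math.log1p(-math.exp(avg_logprob))


def orpo_loss(chosen_avg_logprob: float,
              rejected_avg_logprob: float,
              lam: float = 0.1) -> float:
    """ORPO loss for one preference pair: SFT term + lam * odds-ratio term.

    lam is a hypothetical weighting chosen for this sketch.
    """
    # SFT term: negative log-likelihood of the chosen response.
    sft_term = -chosen_avg_logprob
    # Log odds ratio between the chosen and the rejected response.
    log_odds_ratio = log_odds(chosen_avg_logprob) - log_odds(rejected_avg_logprob)
    # -log(sigmoid(z)) written as log(1 + exp(-z)).
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return sft_term + lam * odds_ratio_term


if __name__ == "__main__":
    # Illustrative values only: chosen response p = 0.5, rejected p = 0.2.
    print(orpo_loss(math.log(0.5), math.log(0.2)))
```

The odds-ratio term shrinks as the model assigns higher odds to the chosen response than to the rejected one, so preference alignment happens in the same pass as supervised fine-tuning rather than in a separate stage. A real training run would compute these log-probabilities in batches from model logits; this scalar version only shows the arithmetic.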