cosimoiaia committed
Commit 23d920e • 1 Parent: ce8dac3
Update README.md
README.md CHANGED
@@ -20,6 +20,9 @@ Model Card for Loquace-7B
 
 An exclusively Italian-speaking, instruction-finetuned Large Language Model. 🇮🇹
 
+The Loquace family of Italian LLM models was created as a proof of concept to evaluate how
+different model sizes can be fine-tuned using QLoRa on an instruction dataset in a specific language.
+
 ## Model Description
 
 Loquace-7B is the first 7B Italian Large Language Model trained using QLoRa on a large dataset of 102k question/answer pairs
@@ -57,7 +60,7 @@ model = LLaMAForCausalLM.from_pretrained(
 
 Loquace-7B was trained on a conversational dataset comprising 102k question/answer pairs in the Italian language.
 The training data was assembled from translations of the original Alpaca dataset and other sources such as the OpenAssistant dataset.
-The model was trained for only 3000 iterations and took
+The model was trained for only 3000 iterations and took 16 hours on a single RTX 3090, kindly provided by Genesis Cloud (https://gnsiscld.co/26qhlf).
 
 ## Limitations
 
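The change documents a QLoRa run: the base model's weights are loaded in 4-bit precision and kept frozen while small LoRA adapter matrices are trained, which is what makes fine-tuning a 7B model on a single RTX 3090 feasible. Below is a minimal sketch of that kind of setup using the Hugging Face `transformers`, `peft`, and `bitsandbytes` stack; the base checkpoint name and hyperparameters are illustrative assumptions, not the exact Loquace-7B configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Assumption: any LLaMA-7B checkpoint; the actual Loquace base model may differ.
base_model = "huggyllama/llama-7b"

# Quantize the frozen base weights to 4-bit NF4 (the "Q" in QLoRa).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach small trainable LoRA adapters on the attention projections;
# only these adapter weights receive gradients during fine-tuning.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the 7B weights
```

Because only the adapters are trained against 4-bit frozen weights, peak memory stays within a 24 GB card, which is consistent with the 16-hour single-RTX 3090 run described in the commit.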