cosimoiaia committed on
Commit
5f5bf8c
·
1 Parent(s): 3904ca4

Update README.md

  1. README.md +67 -2
README.md CHANGED
@@ -4,5 +4,70 @@ datasets:
4
  - cosimoiaia/Loquace-102k
5
  language:
6
  - it
7
- library_name: adapter-transformers
8
- ---
7
+ pipeline_tag: conversational
8
+ tags:
9
+ - alpaca
10
+ - llama
11
+ - llm
12
+ - finetune
13
+ - Italian
14
+ - qlora
15
+ ---
16
+
17
+ Model Card for Loquace-410m
18
+
19
+ # 🇮🇹 Loquace 🇮🇹
20
+
21
+ An exclusively Italian-speaking, instruction-finetuned large language model. 🇮🇹
22
+
23
+ ## Model Description
24
+
25
+ Loquace-410m is one of the smaller models of the Loquace family. It was fine-tuned using QLoRA on a dataset of 102k question/answer pairs
26
+ written exclusively in Italian.
27
+
28
+ The related code can be found at: https://github.com/cosimoiaia/Loquace
29
+
30
+ Loquace-410m is part of the big Loquace family:
31
+
32
+ https://huggingface.co/cosimoiaia/Loquace-70m - Based on pythia-70m
33
+ https://huggingface.co/cosimoiaia/Loquace-410m - Based on pythia-410m
34
+ https://huggingface.co/cosimoiaia/Loquace-7B - Based on Falcon-7B, the best-performing model of its class.
35
+ https://huggingface.co/cosimoiaia/Loquace-12B - Based on pythia-12B
36
+ https://huggingface.co/cosimoiaia/Loquace-20B - Based on gpt-neox-20B
37
+
38
+
39
+
40
+ ## Usage
41
+
42
+
43
+ ```python
44
+ # The Auto classes resolve the architecture from the repository config (the model is based on pythia-410m).
45
+ from transformers import AutoModelForCausalLM, AutoTokenizer
46
+
47
+ tokenizer = AutoTokenizer.from_pretrained("cosimoiaia/Loquace-410m")
48
+ model = AutoModelForCausalLM.from_pretrained(
49
+     "cosimoiaia/Loquace-410m",
50
+     load_in_8bit=True,  # 8-bit loading relies on bitsandbytes
51
+     device_map="auto",  # place weights on the available device(s)
52
+ )
53
+ ```
54
+
55
+
56
+ ## Training
57
+
58
+ Loquace-410m was trained on a conversational dataset comprising 102k question/answer pairs in Italian.
59
+ The training data was assembled from translations of the original Alpaca dataset and other sources such as the OpenAssistant dataset.
60
+ Training ran for only 3000 iterations and took 18 hours on a single RTX 3090, kindly provided by Genesis Cloud.
61
+
62
+ ## Limitations
63
+
64
+ - Loquace-410m may not handle complex or nuanced queries well and may struggle with ambiguous or poorly formatted inputs.
65
+ - The model may generate responses that are factually incorrect or nonsensical. It should be used with caution, and outputs should be carefully verified.
66
+ - The training data primarily consists of conversational examples and may not generalize well to other types of tasks or domains.
67
+
68
+ ## Dependencies
69
+
70
+ - PyTorch
71
+ - Transformers library by Hugging Face
72
+ bitsandbytes
73
+ QLoRA
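
The Usage section in this README stops once the model is loaded. A minimal sketch of a generation step might look like the following. It assumes the repository loads through the `transformers` Auto classes and that an Alpaca-style prompt is appropriate. The model card does not document the prompt template, so `build_prompt` and `generate` are hypothetical helpers, not part of the Loquace code.

```python
def build_prompt(instruction: str) -> str:
    """Hypothetical Alpaca-style template; the model card does not specify the real one."""
    return f"### Instruction:\n{instruction.strip()}\n\n### Response:\n"


def generate(instruction: str, max_new_tokens: int = 128) -> str:
    """Load the model and return the text generated for one instruction."""
    # Imported lazily so build_prompt stays usable without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = "cosimoiaia/Loquace-410m"
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated answer is returned.
    answer_ids = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(answer_ids, skip_special_tokens=True)
```

Calling `generate("Qual è la capitale d'Italia?")` would download the weights on first use. If the outputs look off, the prompt template is the first assumption to revisit.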