bmarie4i committed c34836a (parent: 8d9f729): Update README.md

Files changed (1): README.md (+134 −0)
---
license: cc-by-nc-4.0
datasets:
- bertin-project/alpaca-spanish
language:
- es
---

# Model Card for Spanish-Llama-2-7b

This model is Llama-2-7b-hf fine-tuned with an adapter on the Spanish Alpaca dataset.

## Model Details

### Model Description

This is a Spanish chat model fine-tuned on a Spanish instruction dataset.

The model expects a prompt containing the instruction, optionally followed by an input (see the examples below).
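
Concretely, the prompt follows the Alpaca-style template used in the generation code further down; the `### Input:` section is simply omitted when there is no input:

```
### Instruction:
{instruction}

### Input:
{input}

### Response:
```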

- **Developed by:** 4i Intelligent Insights
- **Model type:** Chat model
- **Language(s) (NLP):** Spanish
- **License:** cc-by-nc-4.0 (inherited from the alpaca-spanish dataset)
- **Fine-tuned from model:** Llama 2 7B ([license agreement](https://ai.meta.com/resources/models-and-libraries/llama-downloads/))

## Uses

The model is intended to be used directly, without further fine-tuning.

## Bias, Risks, and Limitations

This model inherits the biases, risks, and limitations of its base model, Llama 2, and of the dataset used for fine-tuning.
Note that the Spanish Alpaca dataset was obtained by translating the original Alpaca dataset; it contains translation errors that may have negatively affected the fine-tuning of the model.

## How to Get Started with the Model

Use the code below to get started with the model for inference. The adapter was merged directly into the original Llama 2 model, so this repository contains the full fine-tuned weights.
If you only wish to download the adapter, you will find it in a subdirectory of the model repository.
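
If you prefer to apply the adapter to the base model yourself instead of loading the merged weights, here is a minimal sketch using PEFT. The `adapter` subfolder name is an assumption; check the repository for the actual path.

```py
# A sketch, assuming the adapter was trained with PEFT/LoRA and lives in a
# subdirectory of the repository; the "adapter" subfolder name is hypothetical.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model = PeftModel.from_pretrained(base, "4i-ai/Spanish-Llama-2-7b", subfolder="adapter")
```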

The following code sample uses 4-bit quantization; you may load the model without it if you have enough VRAM (see the note after the expected output).

```py
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, GenerationConfig

model_name = "4i-ai/Spanish-Llama-2-7b"

# Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)

def create_and_prepare_model():
    # Load the model with 4-bit NF4 quantization, computing in float16
    compute_dtype = getattr(torch, "float16")
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=compute_dtype,
        bnb_4bit_use_double_quant=True,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_name, quantization_config=bnb_config, device_map={"": 0}
    )
    return model

model = create_and_prepare_model()

def generate(instruction, input=None):
    # Format the prompt to look like the training data
    if input is not None:
        prompt = "### Instruction:\n" + instruction + "\n\n### Input:\n" + input + "\n\n### Response:\n"
    else:
        prompt = "### Instruction:\n" + instruction + "\n\n### Response:\n"

    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"].cuda()

    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=GenerationConfig(temperature=1.0, top_p=0.75, top_k=40, num_beams=10),  # generation hyperparameters
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=150,  # maximum number of new tokens; increase for longer answers (up to 2048 minus the prompt length); generation "looks" slower for longer responses
    )
    for seq in generation_output.sequences:
        output = tokenizer.decode(seq, skip_special_tokens=True)
        print(output.split("### Response:")[1].strip())

# Example prompts in Spanish, the model's target language
generate("Háblame de la superconductividad.")
print("-----------")
generate("Encuentra la capital de España.")
print("-----------")
generate("Encuentra la capital de Portugal.")
print("-----------")
generate("Organiza los números dados en orden ascendente.", "2, 3, 0, 8, 4, 10")
print("-----------")
generate("Compila una lista de 5 estados de EE. UU. ubicados en el Oeste.")
print("-----------")
generate("¿Cuál es la color de una fresa?")
print("-----------")
generate("¿Cuál es la color de la siguiente fruta?", "fresa")
print("-----------")
```

Expected output:

```
La superconductividad es un fenómeno físico en el que algunos materiales se convierten en conductores de corriente eléctrica a temperaturas muy bajas. Esto significa que la corriente eléctrica puede fluir a través del material sin pérdida de energía. La superconductividad fue descubierta por primera vez en 1911 por el físico alemán Heike Kamerlingh Onnes, quien descubrió que algunos materiales se convierten en conductores de corriente eléctrica a temperaturas muy bajas. Desde entonces, la superconductividad se ha utiliz
-----------
La capital de España es Madrid.
-----------
La capital de Portugal es Lisboa.
-----------
2, 3, 4, 8, 10, 0
-----------
California, Oregón, Washington, Nevada y Arizona.
-----------
La color de una fresa es rosa.
-----------
La fresa es de color rosa.
```
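
As noted above, if you have enough VRAM (roughly 14 GB for a 7B model in half precision; an estimate, not a measured figure), you can skip the 4-bit quantization and load the merged model in float16 instead. A minimal sketch of the alternative loader:

```py
# A sketch: load the merged model in float16 without 4-bit quantization.
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map={"": 0}
)
```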

## Model Card Contact

info@4i.ai