munish0838 committed
Commit 2e47fe7
1 Parent(s): 4495248

Create README.md

Files changed (1):
README.md +75 -0

README.md ADDED
@@ -0,0 +1,75 @@
---
language:
- it
license: apache-2.0
tags:
- text-generation-inference
- text generation
datasets:
- DeepMount00/llm_ita_ultra
pipeline_tag: text-generation
base_model: DeepMount00/Mistral-Ita-7b
---

# QuantFactory/Mistral-Ita-7b-GGUF
This is a quantized version of [DeepMount00/Mistral-Ita-7b](https://huggingface.co/DeepMount00/Mistral-Ita-7b), created using llama.cpp.
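
As a quick illustration, GGUF files like these can be loaded with the `llama-cpp-python` bindings. This is a minimal sketch, not the repository's official usage; the quantized file name below is an assumption, so check the repository's file list for the actual names.

```python
# Minimal sketch: loading a GGUF quant with llama-cpp-python.
# The file name below is an assumption -- check the repo's "Files" tab.
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-ita-7b.Q4_K_M.gguf",  # hypothetical file name
    n_ctx=2048,  # context window
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Come si apre un file json in python?"}],
    max_tokens=200,
)
print(out["choices"][0]["message"]["content"])
```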

# Model Description
## Mistral-7B-v0.1 for Italian Language Text Generation

## Model Architecture
- **Base Model:** [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
- **Specialization:** Italian Language

## Evaluation

For a detailed comparison of model performance, check out the [Leaderboard for Italian Language Models](https://huggingface.co/spaces/FinancialSupport/open_ita_llm_leaderboard).

Here's a breakdown of the performance metrics:
| Metric                  | hellaswag_it acc_norm | arc_it acc_norm | m_mmlu_it 5-shot acc | Average |
|:------------------------|:----------------------|:----------------|:---------------------|:--------|
| **Accuracy Normalized** | 0.6731                | 0.5502          | 0.5364               | 0.5866  |
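
The Average column is the plain mean of the three benchmark scores, which you can verify directly:

```python
# Mean of the three benchmark scores reported above
scores = [0.6731, 0.5502, 0.5364]  # hellaswag_it, arc_it, m_mmlu_it
print(round(sum(scores) / len(scores), 4))  # 0.5866
```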

---

**Quantized 4-Bit Version Available**

A quantized 4-bit version of the model is available. Quantization reduces the precision of the model's weights to 4 bits, which lowers memory usage and can speed up inference. This is particularly useful for deploying the model on devices with limited computational power or memory.
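
As a rough back-of-envelope estimate (assuming roughly 7.2B parameters and ignoring activations and framework overhead), the savings look like this:

```python
# Back-of-envelope weight-memory estimate for a ~7.2B-parameter model.
# Real GGUF files keep some tensors at higher precision, so actual sizes
# differ; this only illustrates the scale of the savings.
params = 7.2e9
fp16_gb = params * 2 / 1e9    # 2 bytes per weight -> ~14.4 GB
q4_gb = params * 0.5 / 1e9    # 4 bits per weight  -> ~3.6 GB
print(f"fp16: ~{fp16_gb:.1f} GB, 4-bit: ~{q4_gb:.1f} GB")
```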

For more details and to access the model, visit the following link: [Mistral-Ita-7b-GGUF 4-bit version](https://huggingface.co/DeepMount00/Mistral-Ita-7b-GGUF).

---

## How to Use
How to use this model for Italian text generation:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Run on GPU when available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

MODEL_NAME = "DeepMount00/Mistral-Ita-7b"

# Load the model in bfloat16 and put it in inference mode
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16).eval()
model.to(device)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def generate_answer(prompt):
    messages = [
        {"role": "user", "content": prompt},
    ]
    # apply_chat_template wraps the prompt in the model's chat format
    model_inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(device)
    # temperature=0.001 makes sampling effectively deterministic
    generated_ids = model.generate(model_inputs, max_new_tokens=200, do_sample=True,
                                   temperature=0.001, eos_token_id=tokenizer.eos_token_id)
    decoded = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
    return decoded[0]

prompt = "Come si apre un file json in python?"  # "How do you open a JSON file in Python?"
answer = generate_answer(prompt)
print(answer)
```
---
## Developer
[Michele Montebovi]