Tags: Text Generation · Transformers · PyTorch · Italian · English · mistral · conversational · text-generation-inference · Inference Endpoints
galatolo committed 9d71359 (parent: e3796bb)

Update README.md

Files changed (1):
  1. README.md (+6 -8)
README.md CHANGED
@@ -9,23 +9,21 @@ language:
 - en
 pipeline_tag: text-generation
 ---
-
 # cerbero-7b Italian LLM 🚀
 
-> 📢 **cerbero-7b** is an **Italian Large Language Model** (LLM) with a large context length of **8192 tokens** which excels in linguistic benchmarks.
-
+> 📢 **Cerbero-7b** is the first **100% free and open-source** **Italian Large Language Model** (LLM), ready to be used for **research** or **commercial applications**.
 
 <p align="center">
 <img width="300" height="300" src="./README.md.d/cerbero.png">
 </p>
 
-Built on **mistral-7b**, which outperforms Llama2 13B across all benchmarks and surpasses Llama1 34B in numerous metrics.
+Built on [**mistral-7b**](https://mistral.ai/news/announcing-mistral-7b/), which outperforms Llama2 13B across all benchmarks and surpasses Llama1 34B in numerous metrics.
 
 **cerbero-7b** is specifically crafted to fill the void in Italy's AI landscape.
 
 A **Cambrian explosion** of **Italian Language Models** is essential for building advanced AI architectures that can cater to the diverse needs of the population.
 
-**cerbero-7b**, alongside companions like [**Camoscio**](https://github.com/teelinsan/camoscio) and [**Fauno**](https://github.com/RSTLess-research/Fauno-Italian-LLM), aims to kick-start this revolution in Italy, ushering in an era where sophisticated **AI solutions** can seamlessly interact with and understand the intricacies of the **Italian language**, thereby empowering **innovation** across **industries** and fostering a deeper **connection** between **technology** and the **people** it serves.
+**cerbero-7b**, alongside companions like [**Camoscio**](https://github.com/teelinsan/camoscio) and [**Fauno**](https://github.com/RSTLess-research/Fauno-Italian-LLM), aims to help **kick-start** this **revolution** in Italy, ushering in an era where sophisticated **AI solutions** can seamlessly interact with and understand the intricacies of the **Italian language**, thereby empowering **innovation** across **industries** and fostering a deeper **connection** between **technology** and the **people** it serves.
 
 **cerbero-7b** is released under the **permissive** Apache 2.0 **license**, allowing **unrestricted usage**, even **for commercial applications**.
 
@@ -45,11 +43,11 @@ The name "Cerbero," inspired by the three-headed dog that guards the gates of th
 ## Training Details 🚀
 
 cerbero-7b is **fully fine-tuned**, distinguishing itself from LoRA or QLoRA fine-tunes.
-The model is trained on an expansive Italian Large Language Model (LLM) using synthetic datasets generated through dynamic self-chat.
+The model is trained on an expansive corpus of synthetic Italian conversations generated through dynamic self-chat, using a large context window of **8192 tokens**.
 
 ### Dataset Composition 📊
 
-We employed the [Fauno training dataset](https://github.com/RSTLess-research/Fauno-Italian-LLM). The training data covers a broad spectrum, incorporating:
+We employed a **refined** version of the [Fauno training dataset](https://github.com/RSTLess-research/Fauno-Italian-LLM). The training data covers a broad spectrum, incorporating:
 
 - **Medical Data:** Capturing nuances in medical language. 🩺
 - **Technical Content:** Extracted from Stack Overflow to enhance the model's understanding of technical discourse. 💻
@@ -83,7 +81,7 @@ prompt = """Questa è una conversazione tra un umano ed un assistente AI.
 
 input_ids = tokenizer(prompt, return_tensors='pt').input_ids
 with torch.no_grad():
-    output_ids = model.generate(input_ids, max_new_tokens=1024)
+    output_ids = model.generate(input_ids, max_new_tokens=128)
 
 generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
 print(generated_text)
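
For context, the last hunk touches only a fragment of the card's usage example. Below is a minimal, self-contained sketch of the same generation flow, assuming the model is published under the Hub id `galatolo/cerbero-7b` (inferred from the commit author, not shown in the diff) and using illustrative human/assistant turn markers for the part of the prompt the diff does not show:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub id, inferred from the commit author; adjust if the repo differs.
model_name = "galatolo/cerbero-7b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision to fit a 7B model on one GPU
    device_map="auto",
)

# Only the first line of the prompt appears in the diff; the turn markers
# below are illustrative placeholders, not necessarily the card's template.
prompt = """Questa è una conversazione tra un umano ed un assistente AI.
[|Umano|] Presentati brevemente.
[|AI|]"""

input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
with torch.no_grad():
    # The commit lowers max_new_tokens from 1024 to 128, trading reply length for latency.
    output_ids = model.generate(input_ids, max_new_tokens=128)

generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(generated_text)
```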
 
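The training notes distinguish full fine-tuning from LoRA or QLoRA adapter tuning. A quick sketch of that difference using the `peft` library; the base checkpoint and hyperparameters here are placeholders, not cerbero-7b's actual training configuration:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Placeholder base checkpoint; cerbero-7b's actual recipe is not in the diff.
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# Full fine-tuning (cerbero-7b's approach): every weight receives gradients.
for param in base.parameters():
    param.requires_grad = True

# LoRA alternative: freeze the base and train small low-rank adapters instead.
lora_model = get_peft_model(
    base,
    LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"]),
)
lora_model.print_trainable_parameters()  # reports only a small fraction of weights as trainable
```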
 
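The dataset section attributes the synthetic conversations to "dynamic self-chat," which the diff does not define. One rough, illustrative way such a loop can work is to let a single generator model write both sides of a dialogue turn by turn; the generator checkpoint, turn markers, and stopping rule below are all assumptions rather than the authors' recipe:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stand-in generator; the diff does not say which model produced the data.
gen_name = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(gen_name)
generator = AutoModelForCausalLM.from_pretrained(
    gen_name, torch_dtype=torch.float16, device_map="auto"
)

def self_chat(seed_topic: str, turns: int = 4) -> str:
    """Let one model write both sides of a dialogue, turn by turn."""
    transcript = f"Questa è una conversazione tra un umano ed un assistente AI su {seed_topic}.\n"
    for i in range(turns):
        marker = "[|Umano|]" if i % 2 == 0 else "[|AI|]"  # hypothetical markers
        prompt = transcript + marker
        input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(generator.device)
        with torch.no_grad():
            output_ids = generator.generate(
                input_ids,
                max_new_tokens=128,
                do_sample=True,     # sampling keeps the synthetic turns diverse
                temperature=0.8,
            )
        # Keep only this turn's continuation, cut at the next speaker marker.
        continuation = tokenizer.decode(
            output_ids[0, input_ids.shape[1]:], skip_special_tokens=True
        )
        transcript = prompt + " " + continuation.split("[|")[0].strip() + "\n"
    return transcript

print(self_chat("la cucina italiana"))
```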