germanjke committed on
Commit
e0524af
1 Parent(s): 40dbcdd

Update README.md

Files changed (1)
  1. README.md +51 -4
README.md CHANGED
@@ -11,19 +11,41 @@ language:
 
 T-lite-it-1.0 is a model built upon the Qwen 2.5 model family and incorporates both continual pre-training and alignment techniques.
 
-Detailed model card’s coming soon…
-
 ### 📚 Dataset
 
-Detailed model card’s coming soon…
+Pre-training Stage 1:
+100B tokens, consisting of diverse Russian data from Common Crawl, books, code, and proprietary datasets, mixed with re-played English data (English added as it is the primary language of the base model).
+
+Pre-training Stage 2:
+40B tokens, a mix of instruction and pre-training data.
+
+Supervised Fine-Tuning (SFT):
+1B tokens, a mix of diverse instruction data.
+
+Preference Tuning:
+1B tokens, training the model to be helpful.
 
 ## 📊 Benchmarks
 
-Detailed model card’s coming soon…
+| Benchmark                                      | T-lite-it-1.0 | Qwen-2.5-7B-Instruct | GigaChat Pro 1.0.26.15 | RuAdapt-Qwen-7B-Instruct-v1 | gemma-2-9b-it |
+|------------------------------------------------|:-------------:|:--------------------:|:----------------------:|:---------------------------:|:-------------:|
+| [MERA](https://mera.a-ai.ru)                   | **0.552**     | 0.482                | 0.512                  | 0.468                       | 0.505         |
+| [MaMuRaMu](https://mera.a-ai.ru/ru/tasks/22)   | **0.775**     | 0.711                | 0.77                   | 0.7                         | 0.724         |
+| ruMMLU-PRO                                     | **0.497**     | 0.481                | -                      | 0.448                       | 0.405         |
+| ruGSM8K                                        | **0.856**     | 0.832                | 0.752                  | 0.795                       | 0.823         |
+| ruMATH                                         | **0.679**     | 0.671                | 0.418                  | 0.607                       | 0.473         |
+| ruMBPP                                         | 0.693         | 0.685                | 0.412                  | **0.696**                   | 0.63          |
+| [ruCodeEval](https://mera.a-ai.ru/ru/tasks/23) | 0.082 / 0.168 / 0.226 | 0.025 / 0.071 / 0.098 | 0.056 / 0.068 / 0.073 | 0.018 / 0.064 / 0.11 | **0.215 / 0.494 / 0.561** |
+| Arena-Hard-Ru                                  | **64.38**     | 54.29                | -                      | 52.77                       | 47.83         |
+| MT Bench Ru                                    | 7.87          | 7.33                 | **8.21**               | 7.62                        | 7.4           |
+| Alpaca Eval Ru                                 | **39.61**     | 25.61                | 29.83                  | 28.43                       | 36.87         |
+
+Detailed evaluation results can be found in our [habr post](https://habr.com/ru/companies/tbank/articles/865582/).
 
 
 ## 👨‍💻 Examples of usage
 
+### HF Usage
 
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
@@ -84,4 +106,29 @@ Output:
 Который ведёт к будущему, светлому и новому.
 Машинное обученье — наш проводник,
 В этом мире, где технологии царят.
 ```
+
+### vLLM Usage
+
+```python
+from transformers import AutoTokenizer
+from vllm import LLM, SamplingParams
+
+model_name = "t-tech/T-lite-it-1.0"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+llm = LLM(model=model_name)
+sampling_params = SamplingParams(temperature=0.3, max_tokens=8192)
+
+prompt = "Напиши стих про машинное обучение"
+messages = [
+    {"role": "system", "content": "Ты T-lite, виртуальный ассистент в Т-Технологии. Твоя задача - быть полезным диалоговым ассистентом."},
+    {"role": "user", "content": prompt}
+]
+
+prompt_token_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
+
+outputs = llm.generate(prompt_token_ids=prompt_token_ids, sampling_params=sampling_params)
+
+generated_text = [output.outputs[0].text for output in outputs]
+print(generated_text)
+```
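
The ruCodeEval row added in this commit reports three numbers per model, which look like pass@k scores at increasing k (an assumption; the diff does not state which k values the benchmark uses). For reference, the standard unbiased pass@k estimator from the HumanEval paper (Chen et al.) can be sketched as:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples, drawn without replacement from n generations of which c
    are correct, passes the tests."""
    if n - c < k:
        # Too few incorrect samples to fill a size-k draw, so every
        # draw contains at least one correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 10 generations per task, 2 of them correct: pass@1 is simply 2/10
print(pass_at_k(10, 2, 1))  # ≈ 0.2
```

This is a generic reference sketch, not the evaluation code used for the table above.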