zeren committed
Commit 61be248
1 Parent(s): 7e73e38

Update README.md


model card updated

Files changed (1): README.md (+163 -86)
---
license: llama3
language:
- tr
model-index:
- name: Kocdigital-LLM-8b-v0.1
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge TR
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc
      value: 44.03
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag TR
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc
      value: 46.73
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU TR
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 49.11
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA TR
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: acc
      value: 48.21
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande TR
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc
      value: 54.98
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k TR
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 51.78
      name: accuracy
---

<img src="https://huggingface.co/KOCDIGITAL/Kocdigital-LLM-8b-v0.1/resolve/main/icon.jpeg"
alt="KOCDIGITAL LLM" width="420"/>

# Kocdigital-LLM-8b-v0.1

This model is a fine-tuned version of the Llama3 8B large language model (LLM) for Turkish. It was trained on a high-quality Turkish instruction set built from various open-source and internal resources, carefully annotated so that the model carries out Turkish instructions accurately and in an organized manner. Training used the QLoRA method.

## Model Details

- **Base Model**: Llama3 8B based LLM
- **Training Dataset**: High-quality Turkish instruction sets
- **Training Method**: SFT with QLoRA

### QLoRA Fine-Tuning Configuration

- `lora_alpha`: 128
- `lora_dropout`: 0
- `r`: 64
- `target_modules`: "q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"
- `bias`: "none"
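
The hyperparameters above correspond to the keyword arguments one would pass to `peft.LoraConfig`. A minimal sketch (the original training script is not published, so this is a reconstruction from the listed values, shown as a plain dict to stay library-agnostic):

```python
# Keyword arguments reproducing the card's QLoRA adapter setup,
# e.g. peft.LoraConfig(**lora_kwargs, task_type="CAUSAL_LM").
lora_kwargs = {
    "r": 64,               # rank of the low-rank adapter matrices
    "lora_alpha": 128,     # scaling factor; effective scale = alpha / r = 2.0
    "lora_dropout": 0.0,   # no dropout on adapter layers
    "bias": "none",        # bias terms are not trained
    "target_modules": [    # all attention and MLP projections of Llama3
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}

print(lora_kwargs["lora_alpha"] / lora_kwargs["r"])  # adapter scaling factor
```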

## Usage Examples

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "KOCDIGITAL/Kocdigital-LLM-8b-v0.1",
    model_max_length=4096)
model = AutoModelForCausalLM.from_pretrained(
    "KOCDIGITAL/Kocdigital-LLM-8b-v0.1",
    load_in_4bit=True,   # 4-bit quantization (requires bitsandbytes)
    device_map="auto",
)

# System prompt: "You are a general-purpose assistant speaking Turkish. Always
# follow the user's instructions accurately, concisely and with good grammar."
system = 'Sen Türkçe konuşan genel amaçlı bir asistansın. Her zaman kullanıcının verdiği talimatları doğru, kısa ve güzel bir gramer ile yerine getir.'

# "###Talimat" = "###Instruction", "###Yanıt" = "###Response"
template = "{}\n\n###Talimat\n{}\n###Yanıt\n"
# Instruction: "Can you list the 3 largest provinces of Türkiye?"
content = template.format(system, "Türkiye'nin 3 büyük ilini listeler misin?")

conv = [{'role': 'user', 'content': content}]
prompt = tokenizer.apply_chat_template(conv,
                                       tokenize=False,
                                       add_generation_prompt=True)

print(prompt)

inputs = tokenizer([prompt],
                   return_tensors="pt",
                   add_special_tokens=False).to("cuda")

outputs = model.generate(**inputs,
                         max_new_tokens=512,
                         use_cache=True,
                         do_sample=True,
                         top_k=50,
                         top_p=0.60,
                         temperature=0.3,
                         repetition_penalty=1.1)

out_text = tokenizer.batch_decode(outputs)[0]
print(out_text)
```
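
The `###Talimat` / `###Yanıt` prompt layout used above can be inspected in isolation with plain string formatting; no model or GPU is needed. The English placeholder strings below are illustrative, not the card's actual Turkish prompt:

```python
# The card's template: system text, then the instruction under "###Talimat"
# ("###Instruction"), then an empty "###Yanıt" ("###Response") section that
# the model is expected to complete.
template = "{}\n\n###Talimat\n{}\n###Yanıt\n"

system = "You are a general-purpose Turkish-speaking assistant."
instruction = "List the 3 largest provinces of Türkiye."

content = template.format(system, instruction)
print(content)
```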

# Open LLM Turkish Leaderboard v0.2 Evaluation Results

| Metric                          | Value |
|---------------------------------|------:|
| Avg.                            | 49.11 |
| AI2 Reasoning Challenge_tr-v0.2 | 44.03 |
| HellaSwag_tr-v0.2               | 46.73 |
| MMLU_tr-v0.2                    | 49.11 |
| TruthfulQA_tr-v0.2              | 48.51 |
| Winogrande_tr-v0.2              | 54.98 |
| GSM8k_tr-v0.2                   | 51.78 |