Weyaxi committed a393e40 (verified) · 1 parent: 1a65662

model card

Files changed (1): README.md (+86 −45)
README.md CHANGED
@@ -104,11 +104,33 @@ model-index:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-LLama3.1-8B-Instruct
      name: Open LLM Leaderboard
 ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
 <details><summary>See axolotl config</summary>

 axolotl version: `0.4.1`
@@ -191,62 +213,81 @@ save_safetensors: true

 </details><br>

- # Humanish-LLama3.1-8B-Instruct

- This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on an unknown dataset.

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters

- The following hyperparameters were used during training:
- - learning_rate: 0.0002
- - train_batch_size: 2
- - eval_batch_size: 8
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 2
- - gradient_accumulation_steps: 8
- - total_train_batch_size: 32
- - total_eval_batch_size: 16
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 10
- - training_steps: 341

- ### Training results

- ### Framework versions

- - PEFT 0.13.0
- - Transformers 4.45.1
- - Pytorch 2.3.1+cu121
- - Datasets 2.21.0
- - Tokenizers 0.20.0

- # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
- Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_HumanLLMs__Humanish-LLama3.1-8B-Instruct)

- | Metric              | Value |
- |---------------------|------:|
- | Avg.                | 22.38 |
- | IFEval (0-Shot)     | 64.98 |
- | BBH (3-Shot)        | 28.01 |
- | MATH Lvl 5 (4-Shot) |  8.46 |
- | GPQA (0-shot)       |  0.78 |
- | MuSR (0-shot)       |  2.00 |
- | MMLU-PRO (5-shot)   | 30.02 |
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-LLama3.1-8B-Instruct
      name: Open LLM Leaderboard
 ---
+ <div align="center">
+ <img src="https://cdn-avatars.huggingface.co/v1/production/uploads/63da3d7ae697e5898cb86854/H-vpXOX6KZu01HnV87Jk5.jpeg" width="320" height="320" />
+ <h1>Enhancing Human-Like Responses in Large Language Models</h1>
+ </div>
+
+ <p align="center">
+ &nbsp;&nbsp;| 🤗 <a href="https://huggingface.co/collections/HumanLLMs/human-like-humanish-llms-6759fa68f22e11eb1a10967e">Models</a>&nbsp;&nbsp;|
+ &nbsp;&nbsp;📊 <a href="https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset">Dataset</a>&nbsp;&nbsp;|
+ &nbsp;&nbsp;📄 <a href="https://arxiv.org/abs/2501.05032">Paper</a>&nbsp;&nbsp;|
+ </p>
+
+ # 🚀 Human-Like-Llama3-8B-Instruct
+
+ This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct), specifically optimized to generate more human-like and conversational responses.
+
+ The fine-tuning process employed both [Low-Rank Adaptation (LoRA)](https://arxiv.org/abs/2106.09685) and [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) to enhance natural language understanding, conversational coherence, and emotional intelligence in interactions.
+
+ The process of creating these models is detailed in the research paper [“Enhancing Human-Like Responses in Large Language Models”](https://arxiv.org/abs/2501.05032).
+
+ # 🛠️ Training Configuration
+
+ - **Base Model:** Llama3-8B-Instruct
+ - **Framework:** Axolotl v0.4.1
+ - **Hardware:** 2x NVIDIA A100 (80 GB) GPUs
+ - **Training Time:** ~2 hours 20 minutes
+ - **Dataset:** Synthetic dataset with ≈11,000 samples across 256 diverse topics
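The step count implied by this setup can be cross-checked against the hyperparameter list earlier in this diff (micro-batch 2, 2 GPUs, 8 gradient-accumulation steps, `training_steps: 341`). A quick sanity check, assuming a single epoch over the 10,884-sample dataset (the single-epoch assumption is ours, not stated in the card):

```python
import math

# Values taken from the hyperparameter list shown earlier in this diff.
samples = 10_884        # dataset size
micro_batch = 2         # per-device train batch size
num_devices = 2         # 2x NVIDIA A100
grad_accum = 8          # gradient accumulation steps

effective_batch = micro_batch * num_devices * grad_accum
steps_per_epoch = math.ceil(samples / effective_batch)

print(effective_batch)   # 32, matching total_train_batch_size
print(steps_per_epoch)   # 341, matching training_steps
```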
 <details><summary>See axolotl config</summary>

 axolotl version: `0.4.1`
 
 </details><br>

+ # 💬 Prompt Template
+
+ You can use the Llama 3 prompt template with this model:
+
+ ### Llama3
+
+ ```
+ <|start_header_id|>system<|end_header_id|>
+ {system}<|eot_id|>
+
+ <|start_header_id|>user<|end_header_id|>
+ {user}<|eot_id|>
+
+ <|start_header_id|>assistant<|end_header_id|>
+ {assistant}<|eot_id|>
+ ```
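For illustration, the template above can be assembled by hand. `build_prompt` is a hypothetical helper, not part of the model or tokenizer, and it mirrors the card's template; the tokenizer's own chat template is the authoritative version and may differ slightly (e.g. a leading `<|begin_of_text|>`):

```python
# Hypothetical helper that fills the Llama 3 template shown above.
def build_prompt(system: str, user: str) -> str:
    return (
        f"<|start_header_id|>system<|end_header_id|>\n{system}<|eot_id|>\n\n"
        f"<|start_header_id|>user<|end_header_id|>\n{user}<|eot_id|>\n\n"
        "<|start_header_id|>assistant<|end_header_id|>\n"
    )

prompt = build_prompt("You are a helpful AI assistant.", "Hello!")
print(prompt)
```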
+
+ This prompt template is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating), which means you can format messages using the
+ `tokenizer.apply_chat_template()` method:
+
+ ```python
+ messages = [
+     {"role": "system", "content": "You are a helpful AI assistant."},
+     {"role": "user", "content": "Hello!"}
+ ]
+ gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
+ model.generate(gen_input)
+ ```
+
+ # 🤖 Models
+
+ | Model | Download |
+ |:---------------------:|:-----------------------------------------------------------------------:|
+ | Human-Like-Llama-3-8B-Instruct | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-LLama3-8B-Instruct) |
+ | Human-Like-Qwen-2.5-7B-Instruct | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-Qwen2.5-7B-Instruct) |
+ | Human-Like-Mistral-Nemo-Instruct | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-Mistral-Nemo-Instruct-2407) |
+
+ # 🎯 Benchmark Results
+
+ | **Group** | **Model** | **Average** | **IFEval** | **BBH** | **MATH Lvl 5** | **GPQA** | **MuSR** | **MMLU-PRO** |
+ |--------------------|----------------------------------|-------------|------------|-----------|----------------|-----------|----------|--------------|
+ | **Llama Models**   | Human-Like-Llama-3-8B-Instruct   | 22.37       | **64.97**  | 28.01     | 8.45           | 0.78      | **2.00** | 30.01        |
+ |                    | Llama-3-8B-Instruct              | 23.57       | 74.08      | 28.24     | 8.68           | 1.23      | 1.60     | 29.60        |
+ |                    | *Difference (Human-Like)*        | -1.20       | **-9.11**  | -0.23     | -0.23          | -0.45     | +0.40    | +0.41        |
+ | **Qwen Models**    | Human-Like-Qwen-2.5-7B-Instruct  | 26.66       | 72.84      | 34.48     | 0.00           | 6.49      | 8.42     | 37.76        |
+ |                    | Qwen-2.5-7B-Instruct             | 26.86       | 75.85      | 34.89     | 0.00           | 5.48      | 8.45     | 36.52        |
+ |                    | *Difference (Human-Like)*        | -0.20       | -3.01      | -0.41     | 0.00           | **+1.01** | -0.03    | **+1.24**    |
+ | **Mistral Models** | Human-Like-Mistral-Nemo-Instruct | 22.88       | **54.51**  | 32.70     | 7.62           | 5.03      | 9.39     | 28.00        |
+ |                    | Mistral-Nemo-Instruct            | 23.53       | 63.80      | 29.68     | 5.89           | 5.37      | 8.48     | 27.97        |
+ |                    | *Difference (Human-Like)*        | -0.65       | **-9.29**  | **+3.02** | **+1.73**      | -0.34     | +0.91    | +0.03        |
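The Average and Difference cells for the Human-Like Llama row can be spot-checked arithmetically (values copied from the table; the leaderboard may apply its own normalization, so this only verifies the internal consistency of the reported numbers):

```python
# IFEval, BBH, MATH Lvl 5, GPQA, MuSR, MMLU-PRO for Human-Like-Llama-3-8B-Instruct
human_like = [64.97, 28.01, 8.45, 0.78, 2.00, 30.01]
avg_human_like = round(sum(human_like) / len(human_like), 2)

base_avg = 23.57  # Llama-3-8B-Instruct average as reported in the table
difference = round(avg_human_like - base_avg, 2)

print(avg_human_like)  # 22.37, the Average cell
print(difference)      # -1.2, the reported -1.20 difference
```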
+
+ # 📊 Dataset
+
+ The dataset used for fine-tuning was generated using LLaMA 3 models. It contains 10,884 samples across 256 distinct topics, such as technology, daily life, science, history, and the arts. Each sample consists of:
+
+ - **Human-like responses:** Natural, conversational answers mimicking human dialogue.
+ - **Formal responses:** Structured and precise answers with a more formal tone.
+
+ The dataset has been open-sourced and is available at:
+
+ - 👉 [Human-Like-DPO-Dataset](https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset)
+
+ More details on the dataset creation process can be found in the accompanying research paper.
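To make the human-like/formal contrast concrete, here is an illustrative preference pair in the shape a DPO trainer typically consumes; the field names (`prompt`, `chosen`, `rejected`) and the example texts are assumptions for illustration, not the dataset's documented schema:

```python
# Illustrative DPO-style sample: the human-like answer is preferred ("chosen")
# over the formal one ("rejected"). Field names and texts are hypothetical.
sample = {
    "prompt": "What's your favourite season?",
    "chosen": "Oh, autumn, hands down! The crisp air and the colours get me every time.",
    "rejected": "Seasonal preference is subjective; autumn offers moderate temperatures.",
}

def is_preference_pair(s: dict) -> bool:
    # A valid pair needs all three DPO fields.
    return {"prompt", "chosen", "rejected"} <= s.keys()

print(is_preference_pair(sample))  # True
```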
 
 
 
 
 
 
+
+ # 📝 Citation
+
+ ```
+ @misc{çalık2025enhancinghumanlikeresponseslarge,
+       title={Enhancing Human-Like Responses in Large Language Models},
+       author={Ethem Yağız Çalık and Talha Rüzgar Akkuş},
+       year={2025},
+       eprint={2501.05032},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL},
+       url={https://arxiv.org/abs/2501.05032},
+ }
+ ```