---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
base_model: unsloth/llama-3-8b-bnb-4bit
datasets:
- cognitivecomputations/samantha-data
---
|
|
|
# Uploaded model |
|
|
|
- **Developed by:** ruslandev |
|
- **License:** apache-2.0 |
|
- **Finetuned from model:** unsloth/llama-3-8b-bnb-4bit
|
|
|
This model is fine-tuned on the [Samantha](https://erichartford.com/meet-samantha) dataset.

The prompt format is Alpaca; I used the same system prompt as the original Samantha.
|
```
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{SYSTEM_PROMPT}

### Input:
{QUESTION}

### Response:
```
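
For convenience, filling in this template programmatically looks like the sketch below. The system prompt used here is a placeholder, not the exact text used in training:

```python
# Minimal sketch of rendering the Alpaca-style prompt above.
# The system prompt passed in is a placeholder -- substitute the
# original Samantha system prompt when using the model.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{system_prompt}\n\n"
    "### Input:\n{question}\n\n"
    "### Response:\n"
)

def build_prompt(system_prompt: str, question: str) -> str:
    """Render a single prompt in the format this model was trained on."""
    return ALPACA_TEMPLATE.format(system_prompt=system_prompt, question=question)

print(build_prompt("You are Samantha, a helpful AI companion.",  # placeholder
                   "How are you today?"))
```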
|
|
|
# Training |
|
|
|
The [gptchain](https://github.com/RuslanPeresy/gptchain) framework was used for training.
|
|
|
## Training hyperparameters |
|
|
|
- learning_rate: 2e-4 |
|
- seed: 3407 |
|
- gradient_accumulation_steps: 4 |
|
- per_device_train_batch_size: 2 |
|
- optimizer: adamw_8bit |
|
- lr_scheduler_type: linear |
|
- warmup_steps: 5 |
|
- num_epochs: 2 |
|
- weight_decay: 0.01 |
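
For reference, here is roughly how these hyperparameters map onto an Unsloth + TRL setup. This is a sketch, not the actual gptchain training script; the LoRA configuration, `max_seq_length`, and the dataset-loading details are assumptions:

```python
# Sketch of the training setup implied by the hyperparameters above.
# Not the actual gptchain script; LoRA settings and max_seq_length are assumptions.
from unsloth import FastLanguageModel
from transformers import TrainingArguments
from trl import SFTTrainer
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # base model from this card
    max_seq_length=2048,                       # assumption
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(  # assumed LoRA config
    model, r=16, lora_alpha=16, lora_dropout=0, bias="none", random_state=3407,
)

# May need a data_files or config argument depending on the dataset layout.
dataset = load_dataset("cognitivecomputations/samantha-data", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumes prompts pre-rendered into a "text" column
    max_seq_length=2048,
    args=TrainingArguments(
        learning_rate=2e-4,
        seed=3407,
        gradient_accumulation_steps=4,
        per_device_train_batch_size=2,
        optim="adamw_8bit",
        lr_scheduler_type="linear",
        warmup_steps=5,
        num_train_epochs=2,
        weight_decay=0.01,
        output_dir="outputs",
    ),
)
trainer.train()
```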
|
|
|
## Training results |
|
|
|
| Training Loss | Epoch | Step |
|---------------|-------|------|
| 2.0778        | 0.0   | 1    |
| 0.6255        | 0.18  | 120  |
| 0.6208        | 0.94  | 620  |
| 0.6244        | 2.0   | 1306 |
|
|
|
Two epochs of fine-tuning from llama-3-8b took one hour on a single A100 with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
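
A minimal inference sketch is shown below. The repo id is a placeholder for wherever this checkpoint is hosted, and the system prompt is again a stand-in for the original Samantha one:

```python
# Inference sketch. The repo id is a placeholder for this model's actual
# Hugging Face repo; the system prompt is a stand-in for Samantha's.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "ruslandev/llama-3-8b-samantha"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

prompt = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\nYou are Samantha, a helpful AI companion.\n\n"  # placeholder
    "### Input:\nWhat do you enjoy talking about?\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```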
|
|
|
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) |