---
base_model: google/gemma-2-2b
datasets: mlabonne/TheTome
---

# Distil Gemma 2 2b

This is a Gemma 2 2B model distilled from google/gemma-2-9b-it and fine-tuned on The Tome dataset (mlabonne/TheTome).

![image/webp](https://cdn-uploads.huggingface.co/production/uploads/6455cc8d679315e4ef16fbec/89XFihSa8o08wWw8w53uh.webp)

## Prompt Template

ChatML

```
<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{user}<|im_end|>
<|im_start|>assistant
```
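
As a rough illustration, the ChatML template above can be filled in with plain string formatting. This is a minimal sketch, not an official helper: the function name `format_chatml` and the example messages are hypothetical, and it assumes single newlines between the template lines and a trailing newline after the open assistant turn.

```python
def format_chatml(system: str, user: str) -> str:
    """Build a ChatML prompt that ends with the open assistant turn.

    Hypothetical helper: the single-newline separators and the trailing
    newline after "assistant" are assumptions, not confirmed by the card.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


# Example messages are placeholders for illustration only.
prompt = format_chatml("You are a helpful assistant.", "What is distillation?")
print(prompt)
```

The resulting string can be tokenized and passed to the model for generation. If the repository's tokenizer ships a chat template, that template should take precedence over this sketch.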

## Training Information

This model was trained on 8x Nvidia H100 NVL GPUs for the equivalent of 120 GPU hours.

+ Loss Achieved: 0.27
+ Epochs: 3

Checkpoints are available in the repo for continued training.

## Evals

In progress.