---
base_model: checkpoint_global_step_200000
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: NoteChat-BioBART
  results: []
---
[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fine-tuning-large-language-model/huggingface/runs/avtpaj92)

# NoteChat-BioBART
This model is a fine-tuned version of [checkpoint_global_step_200000](https://huggingface.co/checkpoint_global_step_200000) on an unspecified dataset.

It achieves the following results on the evaluation set:

- Loss: 2.0020
- ROUGE-1: 0.0816
- ROUGE-2: 0.0373
- ROUGE-L: 0.0711
- ROUGE-Lsum: 0.0740
- Gen Len: 20.0 (mean generated length, in tokens)
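
Gen Len is pinned at 20.0 in every evaluation, which is consistent with generation being capped at the `transformers` default `max_length=20`; the ROUGE scores should be read with that cap in mind. Below is a minimal usage sketch, not taken from this card: it assumes the checkpoint is a BART-family seq2seq model, and the model path is a placeholder.

```python
# Minimal usage sketch (assumptions: BART-family seq2seq checkpoint,
# placeholder path; max_length raised past the 20-token evaluation cap).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_path = "path/to/NoteChat-BioBART"  # placeholder: local checkpoint directory
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSeq2SeqLM.from_pretrained(model_path)

source_text = "Replace with an input document or dialogue."  # illustrative only
inputs = tokenizer(source_text, return_tensors="pt", truncation=True, max_length=1024)
output_ids = model.generate(**inputs, max_length=256, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```
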
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged `Seq2SeqTrainingArguments` sketch follows the list):

- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
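
As a rough reconstruction, not taken from the original training script, the settings above map onto the standard `transformers` API as follows; `output_dir` is a placeholder, and the per-epoch evaluation cadence is inferred from the results table below.

```python
# Rough reconstruction of the reported hyperparameters (not the original
# script); output_dir is a placeholder, eval cadence inferred from the table.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="NoteChat-BioBART",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    evaluation_strategy="epoch",  # assumption: the table reports one eval per epoch
    predict_with_generate=True,   # required to compute ROUGE / gen_len during eval
)
```
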
### Training results

| Training Loss | Epoch | Step  | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:----------:|:-------:|
| 3.6113        | 1.0   | 3726  | 3.5345          | 0.0729  | 0.0245  | 0.0639  | 0.0667     | 20.0    |
| 3.2014        | 2.0   | 7452  | 3.1318          | 0.0701  | 0.0228  | 0.0619  | 0.0643     | 20.0    |
| 2.9394        | 3.0   | 11178 | 2.8865          | 0.0733  | 0.0255  | 0.0653  | 0.0678     | 20.0    |
| 2.7238        | 4.0   | 14904 | 2.6827          | 0.0759  | 0.0291  | 0.0673  | 0.0695     | 20.0    |
| 2.5805        | 5.0   | 18630 | 2.5151          | 0.0774  | 0.0311  | 0.0673  | 0.0700     | 20.0    |
| 2.4169        | 6.0   | 22356 | 2.3876          | 0.0799  | 0.0329  | 0.0686  | 0.0717     | 20.0    |
| 2.2721        | 7.0   | 26082 | 2.2933          | 0.0810  | 0.0345  | 0.0706  | 0.0734     | 20.0    |
| 2.2070        | 8.0   | 29808 | 2.2572          | 0.0812  | 0.0350  | 0.0710  | 0.0737     | 20.0    |
| 2.1144        | 9.0   | 33534 | 2.1707          | 0.0810  | 0.0352  | 0.0706  | 0.0735     | 20.0    |
| 2.0559        | 10.0  | 37260 | 2.1287          | 0.0814  | 0.0351  | 0.0694  | 0.0728     | 20.0    |
| 1.9991        | 11.0  | 40986 | 2.0978          | 0.0810  | 0.0356  | 0.0705  | 0.0734     | 20.0    |
| 1.9552        | 12.0  | 44712 | 2.0716          | 0.0812  | 0.0362  | 0.0709  | 0.0737     | 20.0    |
| 1.9006        | 13.0  | 48438 | 2.0657          | 0.0810  | 0.0364  | 0.0711  | 0.0739     | 20.0    |
| 1.8592        | 14.0  | 52164 | 2.0483          | 0.0812  | 0.0362  | 0.0704  | 0.0734     | 20.0    |
| 1.8453        | 15.0  | 55890 | 2.0314          | 0.0815  | 0.0375  | 0.0716  | 0.0744     | 20.0    |
| 1.8113        | 16.0  | 59616 | 2.0129          | 0.0810  | 0.0367  | 0.0708  | 0.0735     | 20.0    |
| 1.7864        | 17.0  | 63342 | 2.0055          | 0.0815  | 0.0371  | 0.0711  | 0.0740     | 20.0    |
| 1.7810        | 18.0  | 67068 | 2.0136          | 0.0809  | 0.0368  | 0.0708  | 0.0737     | 20.0    |
| 1.7774        | 19.0  | 70794 | 2.0024          | 0.0815  | 0.0372  | 0.0710  | 0.0739     | 20.0    |
| 1.7345        | 20.0  | 74520 | 2.0020          | 0.0816  | 0.0373  | 0.0711  | 0.0740     | 20.0    |
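
For reference, scores of this form are what the `evaluate` library's ROUGE metric reports; a small illustrative sketch follows (the example strings are placeholders, not data from this run).

```python
# Illustrative only: shows how ROUGE-1/2/L/Lsum scores of this form are
# typically computed with the evaluate library; the strings are made up.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the patient reports intermittent chest pain"],
    references=["patient presents with intermittent chest pain and dyspnea"],
)
# Returns F-measures in [0, 1] under keys rouge1, rouge2, rougeL, rougeLsum.
print(scores)
```
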
### Framework versions

- Transformers 4.41.0
- Pytorch 2.1.2
- Datasets 2.20.0
- Tokenizers 0.19.1