jmz2023/t5_eli5_v1
---
library_name: transformers
license: apache-2.0
base_model: google/flan-t5-base
tags:
  - generated_from_trainer
datasets:
  - eli5_category
metrics:
  - rouge
model-index:
  - name: flan-t5-base-finetuned-t5
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: eli5_category
          type: eli5_category
          config: default
          split: validation1
          args: default
        metrics:
          - name: Rouge1
            type: rouge
            value: 10.1877
---

flan-t5-base-finetuned-t5

This model is a fine-tuned version of google/flan-t5-base on the eli5_category dataset. It achieves the following results on the evaluation set:

  • Loss: nan
  • Rouge1: 10.1877
  • Rouge2: 0.0
  • RougeL: 10.1808
  • RougeLsum: 10.1824
  • Gen Len: 9.366
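Rouge1 measures unigram overlap between a generated answer and a reference answer. A minimal sketch of how that score is computed (pure Python with naive whitespace tokenization, not the `rouge_score`/`evaluate` implementation actually used during evaluation):

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: each reference unigram can be matched at most
    # as many times as it occurs in the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# All three predicted unigrams appear in the reference (precision 1.0),
# but only 3 of 6 reference unigrams are covered (recall 0.5).
score = rouge1_f1("the cat sat", "the cat sat on the mat")  # ≈ 0.667
```

Note that the Rouge1 value reported above (10.1877) is on the 0–100 scale the Trainer logs, i.e. the F1 fraction multiplied by 100.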

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
  • mixed_precision_training: Native AMP

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2 | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| 0.0           | 1.0   | 5736  | nan             | 10.1877 | 0.0    | 10.1808 | 10.1824   | 9.366   |
| 0.0           | 2.0   | 11472 | nan             | 10.1877 | 0.0    | 10.1808 | 10.1824   | 9.366   |
| 0.0           | 3.0   | 17208 | nan             | 10.1877 | 0.0    | 10.1808 | 10.1824   | 9.366   |
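The validation loss is `nan` at every epoch, the logged training loss is 0.0, and none of the ROUGE scores move between epochs, which suggests the run diverged rather than converged. One plausible (but unconfirmed) culprit is fp16 overflow under Native AMP: T5-family checkpoints are known to produce activations that can exceed the fp16 range. A small numpy illustration of the range limit, plus a guard one could add to catch a diverged step early (both are generic sketches, not code from this training run):

```python
import math
import numpy as np

# fp16 tops out at 65504; anything larger overflows to inf.
fp16_max = float(np.finfo(np.float16).max)   # 65504.0
overflowed = np.float16(70000.0)             # becomes inf

# An inf - inf intermediate (e.g. inside a softmax or cross-entropy)
# yields nan, which then poisons the loss and every downstream metric.
poisoned = overflowed - overflowed           # nan

def loss_is_finite(loss: float) -> bool:
    """Guard: flag a diverged step before it silently corrupts the logs."""
    return math.isfinite(loss)
```

If fp16 overflow was indeed the cause, rerunning in bf16 or full fp32 would be a reasonable first experiment before trusting the reported metrics.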

Framework versions

  • Transformers 4.44.2
  • Pytorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.19.1