GPT2 Fine Tuned Headline Generator

Model description

The model is GPT-2 fine-tuned for 2 epochs on 4k training samples from the abcnews dataset. This enables it to generate news-headline-like text from a simple prompt.
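
A minimal sketch of prompting the model for headlines, assuming the checkpoint loads from the Hugging Face Hub under the repository name shown on the model page and works with the standard text-generation pipeline. The helper name generate_headlines and all sampling settings are illustrative assumptions, not values taken from this card.

```python
MODEL_ID = "raghavbali/gpt2-finetuned-headliner"  # repository name from the model page

# Illustrative sampling settings (assumptions, not documented by the model card).
GENERATION_KWARGS = {
    "max_new_tokens": 20,        # headlines are short
    "do_sample": True,           # sample instead of greedy decoding
    "top_k": 50,
    "temperature": 0.9,
    "num_return_sequences": 3,   # return a few candidates per prompt
}


def generate_headlines(prompt, **overrides):
    """Return headline-like strings generated from a short prompt.

    The model is downloaded on first use, so this needs network access.
    """
    # Imported lazily because transformers is a heavy dependency.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    kwargs = {**GENERATION_KWARGS, **overrides}
    outputs = generator(prompt, **kwargs)
    # Each output is a dict whose "generated_text" includes the prompt.
    return [out["generated_text"] for out in outputs]
```

Calling generate_headlines("police") would return three candidate strings. Because sampling is enabled, the outputs differ between runs, and they may contain invented names or events.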

Intended uses & limitations

This model is for learning purposes only. It readily hallucinates names of people, locations, and other artifacts and incidents.

Training and evaluation data

The model is evaluated on 2k test samples.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 4
  • num_epochs: 2
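
As a rough sketch, the hyperparameters above could be expressed as Hugging Face TrainingArguments keyword arguments. This assumes a single-device run (so the listed batch sizes map to the per-device arguments) and that the Trainer API was used; the output directory name and the helper names build_training_args and linear_lr are hypothetical. The total of 130 optimizer steps is taken from the reported training results.

```python
# Keyword arguments mirroring the hyperparameters listed above.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 64,  # assumes one device
    "per_device_eval_batch_size": 64,   # assumes one device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 4,
    "num_train_epochs": 2,
}


def build_training_args(output_dir="headliner-out"):
    """Build TrainingArguments; output_dir is a hypothetical placeholder."""
    # Imported lazily because transformers is a heavy dependency.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


def linear_lr(step, base_lr=5e-5, warmup_steps=4, total_steps=130):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

With only 4 warmup steps, the learning rate reaches its 5e-05 peak almost immediately and then decays linearly over the remaining 126 steps.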

Training results

The final output after 2 epochs is as follows:

  TrainOutput(
    global_step=130,
    training_loss=5.044873604407678,
    metrics={
      'train_runtime': 140.587,
      'train_samples_per_second': 59.166,
      'train_steps_per_second': 0.925,
      'total_flos': 248723096358912.0,
      'train_loss': 5.044873604407678,
      'epoch': 2.0
    }
  )
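
The reported numbers can be cross-checked with a little arithmetic. The sketch below assumes the training loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language models. The perplexity figure is therefore only a rough average over the whole run, not an evaluation result.

```python
import math

# Values copied from the TrainOutput reported above.
GLOBAL_STEP = 130
NUM_EPOCHS = 2
TRAIN_BATCH_SIZE = 64
TRAIN_LOSS = 5.044873604407678
TRAIN_RUNTIME_S = 140.587
SAMPLES_PER_SECOND = 59.166

# 130 steps over 2 epochs gives 65 steps per epoch.
steps_per_epoch = GLOBAL_STEP // NUM_EPOCHS

# Upper bound on samples per epoch (the last batch may be partial): 4160.
max_samples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE

# Throughput times runtime approximates the samples seen over both epochs.
samples_processed = SAMPLES_PER_SECOND * TRAIN_RUNTIME_S

# Perplexity implied by the mean training loss (about 155).
approx_perplexity = math.exp(TRAIN_LOSS)

print(steps_per_epoch, max_samples_per_epoch)
print(round(samples_processed), round(approx_perplexity, 1))
```

Both estimates agree with the stated 4k training samples: roughly 4,160 samples per epoch at most, and a little over 8,300 samples processed across the two epochs.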

Framework versions

  • Transformers 4.44.2
  • Pytorch 2.3.1
  • Datasets 2.21.0
  • Tokenizers 0.19.1

Model size: 355M params (Safetensors, tensor type F32)

Model repository: raghavbali/gpt2-finetuned-headliner