distilgpt2-finetune-acl22

This model is a fine-tuned version of distilgpt2 on the ACL-anthology-corpus dataset. It achieves the following results on the evaluation set:

  • Loss: 3.4835

Model description

We fine-tune distilgpt2, a distilled version of GPT-2, on the full text of the papers in the ACL-anthology-corpus.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 256
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
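The hyperparameters above can be sketched as a Hugging Face `TrainingArguments` configuration. This is a minimal sketch, not the original training script: it assumes a single device (so that `train_batch_size` maps to `per_device_train_batch_size`), and the `build_training_args` helper and the `output_dir` value are hypothetical. Note that the `Trainer` default optimizer is AdamW, which the card reports as "Adam".

```python
# Hyperparameters copied from the list above, keyed by TrainingArguments field names.
# Assumption: one device, so the reported batch sizes are per-device values.
HYPERPARAMS = {
    "learning_rate": 2e-05,
    "per_device_train_batch_size": 256,
    "per_device_eval_batch_size": 64,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3,
}


def build_training_args(output_dir: str = "distilgpt2-finetune-acl22"):
    """Hypothetical helper: wrap the values above in TrainingArguments.

    transformers is imported lazily so the dictionary can be inspected
    without the library installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

Calling `build_training_args()` returns an object that can be passed to `Trainer(args=...)` together with a model and tokenized datasets, which this card does not specify.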

Training results

Training Loss | Epoch | Step  | Validation Loss
3.6676        | 1.0   | 9852  | 3.5623
3.5959        | 2.0   | 19704 | 3.4995
3.5719        | 3.0   | 29556 | 3.4835
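For readers who prefer perplexity to loss, the validation losses above can be converted with a short calculation. This sketch assumes the reported loss is the mean per-token cross-entropy in nats, as is usual for causal language models trained with the `Trainer`; the card itself does not state this, and the `perplexity` helper is illustrative.

```python
import math

# (epoch -> validation loss) pairs copied from the training results above
VALIDATION_LOSS = {1: 3.5623, 2: 3.4995, 3: 3.4835}


def perplexity(loss: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(loss)


for epoch, loss in VALIDATION_LOSS.items():
    print(f"epoch {epoch}: loss={loss:.4f} perplexity={perplexity(loss):.2f}")
```

Under that assumption the final validation loss of 3.4835 corresponds to a perplexity of roughly 32.6.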

Framework versions

  • Transformers 4.21.2
  • Pytorch 1.12.1
  • Datasets 2.4.0
  • Tokenizers 0.12.1

What can it do?

Write introductions and abstracts for NLP papers, given a title followed by a section name

  • Prompt : Toward Annotator Group Bias in Crowdsourcing. Introduction
  • Generation : Toward Annotator Group Bias in Crowdsourcing. Introduction Online platforms for crowdsourcing have received increasing scrutiny in recent years as platforms for online data analytics require an additional layer of content that allows users to interact and be informed about their quality.
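The prompt format above (paper title, a period, then the section name) can be reproduced with the `transformers` text-generation pipeline. This is a hedged sketch: `MODEL_ID` is a placeholder that must be replaced with the full Hub path of this model, the `build_prompt` and `generate_section` helpers are hypothetical, and the sampling settings are assumptions, since the card does not say how the example was decoded.

```python
# Placeholder: replace with the full Hub repository id ("<user>/<model-name>").
MODEL_ID = "distilgpt2-finetune-acl22"


def build_prompt(title: str, section: str = "Introduction") -> str:
    """Format a prompt as in the example above: '<title>. <section>'."""
    return f"{title.strip().rstrip('.')}. {section}"


def generate_section(title: str, section: str = "Introduction",
                     model_id: str = MODEL_ID, max_new_tokens: int = 60) -> str:
    """Generate a continuation for the given title and section name.

    transformers is imported lazily; the model is downloaded on first use.
    Sampling parameters are illustrative, not the author's settings.
    """
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    outputs = generator(
        build_prompt(title, section),
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_p=0.95,
    )
    # The pipeline returns a list of dicts; the text includes the prompt itself.
    return outputs[0]["generated_text"]
```

For example, `generate_section("Toward Annotator Group Bias in Crowdsourcing")` sends the same prompt as the example above; because sampling is enabled, the generated text will differ from run to run.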