# distilgpt2-finetuned
This model is a fine-tuned version of [distilbert/distilgpt2](https://huggingface.co/distilbert/distilgpt2) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 3.6391
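As a minimal usage sketch, the checkpoint can be loaded with the `transformers` text-generation pipeline. The repo id `Michaelj1/distilgpt2-finetuned` is taken from this card; the prompt and sampling settings are illustrative:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub
generator = pipeline("text-generation", model="Michaelj1/distilgpt2-finetuned")

# Sample a short continuation; prompt and sampling settings are illustrative
output = generator("Once upon a time", max_new_tokens=50, do_sample=True, top_p=0.95)
print(output[0]["generated_text"])
```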
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a sketch of the equivalent `TrainingArguments` follows the list):
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
- mixed_precision_training: Native AMP
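For readers who want to reproduce the setup, here is a minimal sketch of an equivalent `Trainer` configuration. The dataset variables are placeholders, since the training data is not documented in this card; the Adam betas and epsilon listed above match the `transformers` defaults, so they are not set explicitly.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("distilbert/distilgpt2")
tokenizer = AutoTokenizer.from_pretrained("distilbert/distilgpt2")

train_dataset = None  # placeholder: supply a tokenized training set
eval_dataset = None   # placeholder: supply a tokenized evaluation set

args = TrainingArguments(
    output_dir="distilgpt2-finetuned",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 16 * 2 = 32
    lr_scheduler_type="linear",
    num_train_epochs=1,
    fp16=True,  # "Native AMP" mixed precision; requires a CUDA device
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
# trainer.train()
```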
### Training results
| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 4.0748        | 0.0436 | 50   | 3.8923          |
| 3.8414        | 0.0871 | 100  | 3.8125          |
| 3.8957        | 0.1307 | 150  | 3.7769          |
| 3.8723        | 0.1743 | 200  | 3.7545          |
| 4.0205        | 0.2179 | 250  | 3.7336          |
| 3.7175        | 0.2614 | 300  | 3.7282          |
| 3.7778        | 0.3050 | 350  | 3.7111          |
| 3.7763        | 0.3486 | 400  | 3.6994          |
| 3.8142        | 0.3922 | 450  | 3.6945          |
| 3.7654        | 0.4357 | 500  | 3.6831          |
| 3.9636        | 0.4793 | 550  | 3.6773          |
| 3.7030        | 0.5229 | 600  | 3.6692          |
| 3.6114        | 0.5664 | 650  | 3.6647          |
| 3.6269        | 0.6100 | 700  | 3.6591          |
| 3.6930        | 0.6536 | 750  | 3.6564          |
| 3.7969        | 0.6972 | 800  | 3.6529          |
| 3.6011        | 0.7407 | 850  | 3.6491          |
| 3.4943        | 0.7843 | 900  | 3.6466          |
| 3.7543        | 0.8279 | 950  | 3.6440          |
| 3.8610        | 0.8715 | 1000 | 3.6406          |
| 3.5354        | 0.9150 | 1050 | 3.6401          |
| 3.6661        | 0.9586 | 1100 | 3.6396          |
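Since the reported loss is the mean cross-entropy in nats per token, the final validation loss corresponds to a perplexity of roughly exp(3.6391) ≈ 38.1:

```python
import math

# Perplexity is the exponential of the mean cross-entropy loss.
print(math.exp(3.6391))  # ≈ 38.06
```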
### Framework versions
- Transformers 4.45.1
- Pytorch 2.4.0
- Datasets 3.0.1
- Tokenizers 0.20.0