---
base_model: microsoft/phi-2
library_name: peft
license: mit
tags:
  - generated_from_trainer
model-index:
  - name: peft-dialogue-summary-training-1722323475
    results: []
---

# peft-dialogue-summary-training-1722323475

This model is a fine-tuned version of microsoft/phi-2 on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 1.3175
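The card does not include usage instructions. Below is a minimal inference sketch, assuming from the model name that the adapter targets dialogue summarization, that it is published under the repo id matching this card's title (unverified), and that the standard `peft`/`transformers` loading path applies; the prompt format is also an assumption:

```python
# Hypothetical usage sketch: loading the PEFT adapter on top of microsoft/phi-2.
# The adapter repo id is assumed from the model name; verify before use.
BASE_MODEL_ID = "microsoft/phi-2"
ADAPTER_ID = "peft-dialogue-summary-training-1722323475"  # assumed repo id


def summarize(dialogue: str) -> str:
    # Heavy imports are deferred so the constants above can be inspected
    # without torch/peft/transformers installed.
    import torch
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    model = AutoPeftModelForCausalLM.from_pretrained(
        ADAPTER_ID, torch_dtype=torch.float16, device_map="auto"
    )
    # Prompt template is a guess; check the training data format before relying on it.
    prompt = f"Summarize the following conversation.\n\n{dialogue}\n\nSummary: "
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=100)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```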

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0002
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 1000
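As a sanity check on the list above, the reported total train batch size follows from the per-device batch size and gradient accumulation (single device assumed):

```python
# Effective (total) train batch size = per-device batch size
# x gradient-accumulation steps, assuming a single device.
train_batch_size = 1
gradient_accumulation_steps = 4
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 4, matching the value reported above
```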

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.6386        | 0.0500 | 25   | 1.3931          |
| 1.1913        | 0.1001 | 50   | 1.3853          |
| 1.4455        | 0.1501 | 75   | 1.3551          |
| 1.2051        | 0.2001 | 100  | 1.3615          |
| 1.4377        | 0.2501 | 125  | 1.3421          |
| 1.1359        | 0.3002 | 150  | 1.3620          |
| 1.4014        | 0.3502 | 175  | 1.3393          |
| 1.1479        | 0.4002 | 200  | 1.3471          |
| 1.4439        | 0.4502 | 225  | 1.3337          |
| 1.2241        | 0.5003 | 250  | 1.3379          |
| 1.4578        | 0.5503 | 275  | 1.3327          |
| 1.1638        | 0.6003 | 300  | 1.3351          |
| 1.4283        | 0.6503 | 325  | 1.3301          |
| 1.2007        | 0.7004 | 350  | 1.3300          |
| 1.3962        | 0.7504 | 375  | 1.3276          |
| 1.1819        | 0.8004 | 400  | 1.3338          |
| 1.4409        | 0.8504 | 425  | 1.3272          |
| 1.196         | 0.9005 | 450  | 1.3281          |
| 1.4465        | 0.9505 | 475  | 1.3246          |
| 1.1907        | 1.0005 | 500  | 1.3272          |
| 1.4095        | 1.0505 | 525  | 1.3236          |
| 1.1451        | 1.1006 | 550  | 1.3245          |
| 1.346         | 1.1506 | 575  | 1.3230          |
| 1.1855        | 1.2006 | 600  | 1.3240          |
| 1.3688        | 1.2506 | 625  | 1.3220          |
| 1.1166        | 1.3007 | 650  | 1.3235          |
| 1.3762        | 1.3507 | 675  | 1.3210          |
| 1.1249        | 1.4007 | 700  | 1.3227          |
| 1.4183        | 1.4507 | 725  | 1.3205          |
| 1.1583        | 1.5008 | 750  | 1.3198          |
| 1.4363        | 1.5508 | 775  | 1.3190          |
| 1.1389        | 1.6008 | 800  | 1.3194          |
| 1.399         | 1.6508 | 825  | 1.3184          |
| 1.1295        | 1.7009 | 850  | 1.3195          |
| 1.451         | 1.7509 | 875  | 1.3179          |
| 1.143         | 1.8009 | 900  | 1.3177          |
| 1.3621        | 1.8509 | 925  | 1.3175          |
| 1.1494        | 1.9010 | 950  | 1.3176          |
| 1.3383        | 1.9510 | 975  | 1.3175          |
| 1.1089        | 2.0010 | 1000 | 1.3175          |
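Although the training dataset is listed as unknown, its approximate size can be inferred from the step/epoch columns above: 1000 optimizer steps span 2.0010 epochs, so one epoch is roughly 500 steps, and with the effective batch size of 4 reported in the hyperparameters that implies on the order of 2000 training examples. A back-of-envelope check:

```python
# Inferred from the table: 1000 steps correspond to 2.0010 epochs,
# so ~500 steps per epoch; at an effective batch size of 4 that
# suggests roughly 2000 training examples (an estimate, not a fact
# stated on this card).
steps, epochs = 1000, 2.0010
steps_per_epoch = steps / epochs            # ~499.75
approx_examples = steps_per_epoch * 4       # effective batch size 4
print(round(steps_per_epoch), round(approx_examples))
```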

### Framework versions

- PEFT 0.12.0
- Transformers 4.43.3
- Pytorch 2.4.0
- Datasets 2.20.0
- Tokenizers 0.19.1
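To reproduce this environment, the versions listed above can be pinned directly, e.g. in a `requirements.txt` (a sketch; `torch` is the PyPI name for the Pytorch version listed):

```text
peft==0.12.0
transformers==4.43.3
torch==2.4.0
datasets==2.20.0
tokenizers==0.19.1
```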