khongtrunght/Qwen2-7B-Instruct-SPPO-Function-call-v2.9

Generated from Trainer

Qwen2-7B-Instruct-SPPO-Function-call-v2.9

According to the auto-generated card text, this model was trained from scratch; the training dataset is not recorded (the dataset field reads None). It achieves the following results on the evaluation set:

Loss: 0.1252
Rewards/chosen: -1.5039
Rewards/rejected: -10.1971
Rewards/accuracies: 0.9422
Rewards/margins: 8.6931
Logps/rejected: -449.7067
Logps/chosen: -174.9095
Logits/rejected: -1.1951
Logits/chosen: -1.2269
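
The reward figures above can be cross-checked against each other. The sketch below assumes the metrics follow the usual preference-optimization convention (as in DPO-style trainers), where Rewards/margins is the mean of the chosen reward minus the rejected reward; the card itself does not state how the metrics were computed, so treat this as a consistency check rather than a definition.

```python
# Values copied from the evaluation summary above.
rewards_chosen = -1.5039
rewards_rejected = -10.1971
reported_margin = 8.6931

# Assumption: margin = mean(chosen reward - rejected reward), so it should
# match the difference of the two reported means up to rounding.
computed_margin = rewards_chosen - rewards_rejected

print(f"computed margin: {computed_margin:.4f}")
print(f"reported margin: {reported_margin:.4f}")
print(f"difference:      {abs(computed_margin - reported_margin):.4f}")
```

Under that assumption the two values agree to within rounding of the four-decimal figures.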

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-07
train_batch_size: 2
eval_batch_size: 2
seed: 42
distributed_type: multi-GPU
num_devices: 8
total_train_batch_size: 16
total_eval_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 1
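
To make the batch-size and scheduler settings concrete, here is a minimal sketch. It assumes the total batch size is the per-device batch size times the number of devices times the gradient accumulation steps, and that the cosine schedule uses linear warmup followed by a half-cosine decay to zero. The function name `cosine_lr` and the `total_steps` value are illustrative and not taken from the card.

```python
import math

# Values from the hyperparameter list above.
train_batch_size = 2          # per device
num_devices = 8
total_train_batch_size = 16
peak_lr = 5e-07
warmup_ratio = 0.1

# Assumption: total = per-device batch * devices * gradient accumulation steps.
grad_accum_steps = total_train_batch_size // (train_batch_size * num_devices)
print("implied gradient accumulation steps:", grad_accum_steps)


def cosine_lr(step, total_steps, peak=peak_lr, ratio=warmup_ratio):
    """Linear warmup to `peak`, then a half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * ratio)
    if step < warmup_steps:
        return peak * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Hypothetical step count, chosen only for illustration.
total_steps = 2000
for step in (0, 100, 200, 1100, 2000):
    print(step, f"{cosine_lr(step, total_steps):.3e}")
```

With these settings the learning rate peaks at 5e-07 once warmup ends (10% of the run) and decays to zero by the final step.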

Training results

Training Loss	Epoch	Step	Validation Loss	Rewards/chosen	Rewards/rejected	Rewards/accuracies	Rewards/margins	Logps/rejected	Logps/chosen	Logits/rejected	Logits/chosen
0.1559	0.1145	250	0.1735	0.3086	-3.7350	0.9220	4.0436	-320.4661	-138.6583	-1.2768	-1.3532
0.1173	0.2290	500	0.1548	-0.3630	-6.6435	0.9364	6.2805	-378.6355	-152.0903	-1.2741	-1.3395
0.0758	0.3436	750	0.1379	-0.8688	-7.9323	0.9277	7.0635	-404.4112	-162.2059	-1.2435	-1.2953
0.0512	0.4581	1000	0.1346	-1.5788	-9.3134	0.9393	7.7346	-432.0334	-176.4072	-1.2333	-1.2741
0.1124	0.5726	1250	0.1356	-1.4131	-9.8385	0.9364	8.4254	-442.5365	-173.0933	-1.2057	-1.2424
0.0488	0.6871	1500	0.1362	-1.8112	-10.5330	0.9306	8.7217	-456.4250	-181.0552	-1.1979	-1.2280
0.1131	0.8016	1750	0.1261	-1.5432	-10.1867	0.9335	8.6435	-449.5005	-175.6950	-1.1964	-1.2291
0.0988	0.9162	2000	0.1252	-1.5039	-10.1971	0.9422	8.6931	-449.7067	-174.9095	-1.1951	-1.2269
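
The table can be read programmatically to find the checkpoint with the lowest validation loss and to estimate how many steps make up one epoch. The sketch below copies four columns from the table by hand; the steps-per-epoch figure is a rough estimate derived from the last logged row, not a value reported by the card.

```python
# (step, epoch, validation_loss, rewards_accuracy) copied from the table above.
rows = [
    (250, 0.1145, 0.1735, 0.9220),
    (500, 0.2290, 0.1548, 0.9364),
    (750, 0.3436, 0.1379, 0.9277),
    (1000, 0.4581, 0.1346, 0.9393),
    (1250, 0.5726, 0.1356, 0.9364),
    (1500, 0.6871, 0.1362, 0.9306),
    (1750, 0.8016, 0.1261, 0.9335),
    (2000, 0.9162, 0.1252, 0.9422),
]

# Checkpoint with the lowest validation loss.
best = min(rows, key=lambda r: r[2])
print("lowest validation loss:", best[2], "at step", best[0])

# Rough estimate of steps per epoch from the last logged row.
last_step, last_epoch = rows[-1][0], rows[-1][1]
steps_per_epoch = last_step / last_epoch
print("approx. steps per epoch:", round(steps_per_epoch))
```

The last logged row (step 2000) has both the lowest validation loss and the highest reward accuracy, and its values match the evaluation summary at the top of the card.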

Framework versions

Transformers 4.44.0
Pytorch 2.3.1+cu121
Datasets 2.20.0
Tokenizers 0.19.1
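
To compare a local environment against the versions listed above, a small check like the following may help. It is a sketch that assumes the PyPI distribution names `transformers`, `torch`, `datasets`, and `tokenizers`; the helper names `installed_version` and `compare` are illustrative.

```python
from importlib import metadata

# Versions listed above; keys are the assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.44.0",
    "torch": "2.3.1+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare(expected, lookup=installed_version):
    """Map each package name to (expected, installed, exact_match)."""
    result = {}
    for name, want in expected.items():
        have = lookup(name)
        result[name] = (want, have, have == want)
    return result


for name, (want, have, ok) in compare(EXPECTED).items():
    status = "OK" if ok else "differs"
    print(f"{name:<13} expected {want:<12} installed {have}  {status}")
```

An exact match is not necessarily required to load the model; the check only reports where the local environment differs from the one recorded on the card.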

Downloads last month: 2

Safetensors

Model size: 7.62B params
Tensor type: BF16