qwenvl-2B-cadica-stenosis-detect-scale4

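This model is a fine-tuned version of AdaptLLM/biomed-Qwen2-VL-2B-Instruct, trained on two datasets: CADICA狹窄分析選擇題scale4(TRAIN) (CADICA stenosis analysis, multiple-choice questions) and CADICA狹窄分析千問定位題scale4(Train) (CADICA stenosis analysis, Qwen localization questions). It achieves the following results on the evaluation set: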

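Loss: 0.4146 (the lowest validation loss logged during training, at step 2350; the final step, 3400, logged 0.4702)
Num Input Tokens Seen: 35706848 (cumulative count at the final step, 3400)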

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 1
eval_batch_size: 1
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 6
total_train_batch_size: 24
total_eval_batch_size: 4
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.05
training_steps: 3400
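
The derived values in this list follow from the base settings, and the short sketch below reproduces the arithmetic. It is an illustration, not the training code: the function names are hypothetical, and the schedule assumes linear warmup followed by a single half-cosine decay to zero, with the warmup length taken as the ceiling of warmup_ratio × training_steps.

```python
import math


def effective_batch_size(per_device: int, num_devices: int, grad_accum: int = 1) -> int:
    """Samples consumed per optimizer step (or per eval pass when grad_accum=1)."""
    return per_device * num_devices * grad_accum


def lr_at_step(step: int,
               base_lr: float = 1e-4,
               total_steps: int = 3400,
               warmup_ratio: float = 0.05) -> float:
    """Learning rate under an assumed linear-warmup + cosine-decay schedule."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Half-cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("total_train_batch_size:", effective_batch_size(1, 4, 6))
    print("total_eval_batch_size:", effective_batch_size(1, 4))
    for s in (0, 85, 170, 1785, 3400):
        print(f"step {s:>4}: lr = {lr_at_step(s):.2e}")
```

With the values listed above, train_batch_size 1 × num_devices 4 × gradient_accumulation_steps 6 gives the total_train_batch_size of 24, and a warmup ratio of 0.05 over 3400 steps corresponds to 170 warmup steps.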

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
1.3506	0.0129	50	1.1727	524648
0.7577	0.0258	100	0.7517	1049816
0.7018	0.0386	150	0.7310	1573376
0.7594	0.0515	200	0.7274	2099008
0.7346	0.0644	250	0.7182	2625352
0.6815	0.0773	300	0.7074	3149032
0.6932	0.0901	350	0.7045	3673152
0.7089	0.1030	400	0.6848	4197072
0.6732	0.1159	450	0.6526	4722528
0.6304	0.1288	500	0.6272	5246640
0.5596	0.1416	550	0.5833	5774880
0.5794	0.1545	600	0.6039	6299656
0.5959	0.1674	650	0.5542	6824008
0.5014	0.1803	700	0.5440	7347040
0.541	0.1931	750	0.5274	7870520
0.5291	0.2060	800	0.5219	8393696
0.5063	0.2189	850	0.5526	8919608
0.5251	0.2318	900	0.4960	9443616
0.5862	0.2447	950	0.5085	9970600
0.483	0.2575	1000	0.5305	10495592
0.4628	0.2704	1050	0.5040	11021656
0.4825	0.2833	1100	0.4799	11547008
0.4553	0.2962	1150	0.4538	12071088
0.454	0.3090	1200	0.4925	12596208
0.4725	0.3219	1250	0.4340	13125624
0.4794	0.3348	1300	0.4992	13650264
0.3994	0.3477	1350	0.4608	14173240
0.437	0.3605	1400	0.4662	14697136
0.4065	0.3734	1450	0.4395	15222568
0.4338	0.3863	1500	0.4548	15744848
0.3872	0.3992	1550	0.4417	16269896
0.3914	0.4121	1600	0.4658	16798768
0.3755	0.4249	1650	0.4727	17323232
0.3796	0.4378	1700	0.4555	17845600
0.3767	0.4507	1750	0.4234	18371928
0.4484	0.4636	1800	0.4194	18898888
0.3941	0.4764	1850	0.4510	19424688
0.2877	0.4893	1900	0.4512	19950480
0.365	0.5022	1950	0.4764	20476176
0.3814	0.5151	2000	0.5098	21002032
0.3389	0.5279	2050	0.4328	21527496
0.3983	0.5408	2100	0.4818	22050376
0.3687	0.5537	2150	0.4505	22574104
0.3232	0.5666	2200	0.4517	23101488
0.325	0.5794	2250	0.4991	23626952
0.3322	0.5923	2300	0.4960	24154624
0.3651	0.6052	2350	0.4146	24678768
0.3445	0.6181	2400	0.4281	25206168
0.3413	0.6310	2450	0.4691	25730432
0.363	0.6438	2500	0.4471	26252584
0.3195	0.6567	2550	0.4373	26776816
0.3075	0.6696	2600	0.4505	27301776
0.324	0.6825	2650	0.5080	27827480
0.3076	0.6953	2700	0.4338	28352368
0.2817	0.7082	2750	0.4440	28877960
0.3567	0.7211	2800	0.4282	29402040
0.3024	0.7340	2850	0.4704	29928032
0.3167	0.7468	2900	0.4632	30455480
0.2899	0.7597	2950	0.4720	30979312
0.3522	0.7726	3000	0.4726	31503344
0.3137	0.7855	3050	0.4747	32030016
0.2856	0.7984	3100	0.4740	32553288
0.2669	0.8112	3150	0.4687	33078960
0.3322	0.8241	3200	0.4703	33604328
0.2836	0.8370	3250	0.4657	34129832
0.3135	0.8499	3300	0.4714	34654480
0.3253	0.8627	3350	0.4715	35179104
0.3187	0.8756	3400	0.4702	35706848
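
To see which evaluation point gave the lowest validation loss, the Validation Loss column can be scanned directly. The sketch below copies that column from the table (one value every 50 steps, from step 50 to step 3400); the helper name `best_checkpoint` is hypothetical and is not part of any training framework.

```python
# Validation Loss column from the table above, in step order (step 50, 100, ..., 3400).
VAL_LOSS = [
    1.1727, 0.7517, 0.7310, 0.7274, 0.7182, 0.7074, 0.7045, 0.6848, 0.6526, 0.6272,
    0.5833, 0.6039, 0.5542, 0.5440, 0.5274, 0.5219, 0.5526, 0.4960, 0.5085, 0.5305,
    0.5040, 0.4799, 0.4538, 0.4925, 0.4340, 0.4992, 0.4608, 0.4662, 0.4395, 0.4548,
    0.4417, 0.4658, 0.4727, 0.4555, 0.4234, 0.4194, 0.4510, 0.4512, 0.4764, 0.5098,
    0.4328, 0.4818, 0.4505, 0.4517, 0.4991, 0.4960, 0.4146, 0.4281, 0.4691, 0.4471,
    0.4373, 0.4505, 0.5080, 0.4338, 0.4440, 0.4282, 0.4704, 0.4632, 0.4720, 0.4726,
    0.4747, 0.4740, 0.4687, 0.4703, 0.4657, 0.4714, 0.4715, 0.4702,
]
EVAL_EVERY = 50  # evaluation interval in optimizer steps, as in the Step column


def best_checkpoint(losses, eval_every=EVAL_EVERY):
    """Return (step, loss) for the evaluation point with the lowest validation loss."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return (idx + 1) * eval_every, losses[idx]


if __name__ == "__main__":
    step, loss = best_checkpoint(VAL_LOSS)
    print(f"lowest validation loss {loss} at step {step}")  # 0.4146 at step 2350
    print(f"final validation loss {VAL_LOSS[-1]} at step {len(VAL_LOSS) * EVAL_EVERY}")
```

The minimum, 0.4146 at step 2350, matches the loss reported at the top of the card. After that point the training loss keeps falling while the validation loss levels off around 0.47.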

Framework versions

PEFT 0.12.0
Transformers 4.47.0.dev0
Pytorch 2.5.1+cu121
Datasets 3.1.0
Tokenizers 0.20.3
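
Because PEFT appears among the framework versions, the repository presumably stores an adapter to be applied on top of the base model rather than merged weights; the card does not confirm this. Under that assumption, a minimal loading sketch could look like the following. The function name `load_model` is hypothetical, and the heavy imports are deferred so that defining the function does not require the libraries to be installed.

```python
BASE_MODEL_ID = "AdaptLLM/biomed-Qwen2-VL-2B-Instruct"
ADAPTER_ID = "ben81828/qwenvl-2B-cadica-stenosis-detect-scale4"


def load_model(adapter_id: str = ADAPTER_ID, base_model_id: str = BASE_MODEL_ID):
    """Load the base model and attach the fine-tuned adapter.

    Assumption: the repository contains a PEFT adapter, not merged weights.
    """
    # Deferred imports: only needed when the model is actually loaded.
    from peft import PeftModel
    from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

    # The processor (tokenizer + image preprocessing) is taken from the base model.
    processor = AutoProcessor.from_pretrained(base_model_id)
    base = Qwen2VLForConditionalGeneration.from_pretrained(
        base_model_id, torch_dtype="auto"
    )
    # Wrap the base model with the adapter weights from the fine-tuned repository.
    model = PeftModel.from_pretrained(base, adapter_id)
    return model.eval(), processor


if __name__ == "__main__":
    model, processor = load_model()  # downloads weights on first use
    print(type(model).__name__)
```

If the repository turns out to hold merged weights instead, loading it directly with `Qwen2VLForConditionalGeneration.from_pretrained` would be the simpler route.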

ben81828/qwenvl-2B-cadica-stenosis-detect-scale4
