qwenvl-2B-cadica-stenosis-classify-scale4-frozenVision

This model is a fine-tuned version of AdaptLLM/biomed-Qwen2-VL-2B-Instruct on the CADICA狹窄分析選擇題scale4(TRAIN) dataset (in English: "CADICA stenosis analysis multiple-choice questions scale4 (TRAIN)"). It achieves the following results on the evaluation set:

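Loss: 0.6319 (the lowest validation loss in the training results, logged at step 1600)
Num Input Tokens Seen: 39760664 (cumulative total at the final step, 3400)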

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 1
eval_batch_size: 1
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 6
total_train_batch_size: 24
total_eval_batch_size: 4
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.05
training_steps: 3400
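
The derived values in this list follow from the base settings, and the short sketch below reproduces that arithmetic. It also illustrates what a cosine schedule with a 0.05 warmup ratio looks like over 3400 steps. The schedule shape (linear warmup, then half-cosine decay to zero) and the rounding of the warmup step count are assumptions made for illustration, not details confirmed by this card; `lr_at` is a hypothetical helper name.

```python
import math

# Values copied from the hyperparameter list above.
train_batch_size = 1
eval_batch_size = 1
num_devices = 4
gradient_accumulation_steps = 6
learning_rate = 1e-4
warmup_ratio = 0.05
training_steps = 3400

# Effective batch sizes: per-device size x devices (x accumulation for training).
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

# Assumption: the warmup length is the warmup ratio applied to the total step count.
warmup_steps = round(training_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Assumed schedule: linear warmup, then half-cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, training_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print(total_train_batch_size, total_eval_batch_size, warmup_steps)
for step in (0, 85, 170, 1785, 3400):
    print(step, lr_at(step))
```

Under these assumptions the learning rate peaks at 0.0001 once warmup ends and decays to zero by step 3400.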

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.9157	0.0258	50	0.9192	584856
0.901	0.0515	100	0.9078	1169664
0.9032	0.0773	150	0.8962	1754512
0.9053	0.1030	200	0.8982	2339304
0.9098	0.1288	250	0.8956	2924016
0.8893	0.1545	300	0.8909	3508888
0.9075	0.1803	350	0.8920	4093688
0.9004	0.2060	400	0.9086	4678384
0.9076	0.2318	450	0.8962	5263304
0.8962	0.2575	500	0.8988	5848048
0.9013	0.2833	550	0.9001	6432936
0.9046	0.3090	600	0.9053	7017576
0.904	0.3348	650	0.9033	7602512
0.8972	0.3605	700	0.9029	8187320
0.9005	0.3863	750	0.8982	8772104
0.8881	0.4121	800	0.8973	9357016
0.9035	0.4378	850	0.8779	9941896
0.8961	0.4636	900	0.8914	10526712
0.8852	0.4893	950	0.8916	11111520
0.8635	0.5151	1000	0.8602	11696200
0.8844	0.5408	1050	0.8446	12281072
0.8427	0.5666	1100	0.7743	12865992
0.8185	0.5923	1150	0.7827	13450720
0.8061	0.6181	1200	0.7594	14035544
0.7917	0.6438	1250	0.7407	14620336
0.7724	0.6696	1300	0.7190	15205064
0.7278	0.6953	1350	0.7129	15789848
0.7359	0.7211	1400	0.6644	16374784
0.6291	0.7468	1450	0.7531	16959632
0.6021	0.7726	1500	0.6329	17544440
0.667	0.7984	1550	0.6618	18129304
0.6564	0.8241	1600	0.6319	18714072
0.5668	0.8499	1650	0.6635	19298848
0.5701	0.8756	1700	0.7144	19883504
0.546	0.9014	1750	0.6723	20468200
0.412	0.9271	1800	0.6769	21053080
0.4347	0.9529	1850	0.6808	21637848
0.3737	0.9786	1900	0.7730	22222632
0.3783	1.0041	1950	0.6983	22801512
0.3328	1.0299	2000	0.7485	23386232
0.3602	1.0556	2050	0.7191	23971048
0.3351	1.0814	2100	0.8075	24555904
0.3699	1.1071	2150	0.8524	25140752
0.4016	1.1329	2200	0.7535	25725560
0.3442	1.1586	2250	0.7066	26310336
0.3877	1.1844	2300	0.7277	26895096
0.3871	1.2101	2350	0.7660	27479840
0.3486	1.2359	2400	0.7411	28064552
0.2966	1.2617	2450	0.7486	28649256
0.3221	1.2874	2500	0.7222	29233968
0.3231	1.3132	2550	0.7146	29818856
0.2779	1.3389	2600	0.6957	30403640
0.2962	1.3647	2650	0.7657	30988344
0.3163	1.3904	2700	0.7473	31573240
0.164	1.4162	2750	0.7807	32158144
0.2939	1.4419	2800	0.7913	32743032
0.2848	1.4677	2850	0.8045	33327720
0.29	1.4934	2900	0.8113	33912440
0.2494	1.5192	2950	0.8177	34497216
0.2259	1.5449	3000	0.8406	35082056
0.2851	1.5707	3050	0.8474	35666976
0.2351	1.5964	3100	0.8651	36251744
0.2638	1.6222	3150	0.8634	36836560
0.312	1.6480	3200	0.8680	37421416
0.2785	1.6737	3250	0.8640	38006200
0.2752	1.6995	3300	0.8644	38591128
0.2674	1.7252	3350	0.8666	39175888
0.1797	1.7510	3400	0.8603	39760664
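
A log like this can be scanned for the checkpoint with the lowest validation loss and for the gap between training and validation loss. The following is a minimal sketch using a handful of rows copied from the table above; the `rows` layout and variable names are illustrative choices, not part of the training code.

```python
# (step, training_loss, validation_loss) rows copied from the table above;
# only a subset is included to keep the sketch short.
rows = [
    (1400, 0.7359, 0.6644),
    (1500, 0.6021, 0.6329),
    (1600, 0.6564, 0.6319),
    (1650, 0.5668, 0.6635),
    (2000, 0.3328, 0.7485),
    (3400, 0.1797, 0.8603),
]

# Row with the smallest validation loss.
best_step, best_train, best_val = min(rows, key=lambda r: r[2])

# Gap between validation and training loss at the last logged step.
final_step, final_train, final_val = rows[-1]
final_gap = final_val - final_train

print(f"lowest validation loss: {best_val} at step {best_step}")
print(f"validation minus training loss at step {final_step}: {final_gap:.4f}")
```

In the full table, validation loss is lowest at step 1600 and rises afterwards while training loss keeps falling, a pattern that usually points to overfitting in the later steps.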

Framework versions

PEFT 0.12.0
Transformers 4.47.0.dev0
Pytorch 2.5.1+cu121
Datasets 3.1.0
Tokenizers 0.20.3
