qwenvl-2B-cadica-stenosis-classify-scale4

This model is a fine-tuned version of AdaptLLM/biomed-Qwen2-VL-2B-Instruct on the CADICA狹窄分析選擇題scale4(TRAIN) dataset (CADICA stenosis-analysis multiple-choice questions, scale 4, TRAIN split). It achieves the following results on the evaluation set:

  • Loss: 0.1878
  • Num Input Tokens Seen: 39772368

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 6
  • total_train_batch_size: 24
  • total_eval_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.05
  • training_steps: 3400
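
The derived values above follow directly from the per-device settings. A minimal sanity-check sketch (variable names are illustrative, not taken from the training script):

```python
# Effective (total) train batch size = per-device batch × #GPUs × gradient accumulation.
per_device_train_batch = 1
num_devices = 4
grad_accum_steps = 6
total_train_batch = per_device_train_batch * num_devices * grad_accum_steps
print(total_train_batch)  # 24, matching total_train_batch_size above

# Warmup length implied by the warmup ratio over the fixed step budget.
training_steps = 3400
warmup_ratio = 0.05
warmup_steps = int(training_steps * warmup_ratio)
print(warmup_steps)  # 170 warmup steps before cosine decay
```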

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|---------------|--------|------|-----------------|-------------------|
| 0.9233        | 0.0258 | 50   | 0.9282          | 584856            |
| 0.9024        | 0.0515 | 100  | 0.9114          | 1169664           |
| 0.9045        | 0.0773 | 150  | 0.8935          | 1754512           |
| 0.904         | 0.1030 | 200  | 0.8980          | 2339304           |
| 0.9106        | 0.1288 | 250  | 0.8958          | 2924016           |
| 0.8901        | 0.1545 | 300  | 0.8932          | 3508888           |
| 0.9059        | 0.1803 | 350  | 0.8960          | 4093688           |
| 0.9033        | 0.2060 | 400  | 0.9063          | 4678384           |
| 0.9062        | 0.2318 | 450  | 0.9008          | 5263304           |
| 0.8576        | 0.2575 | 500  | 0.8269          | 5848048           |
| 0.8666        | 0.2833 | 550  | 0.7910          | 6432936           |
| 0.7997        | 0.3090 | 600  | 0.7877          | 7017576           |
| 0.8367        | 0.3348 | 650  | 0.7941          | 7602512           |
| 0.774         | 0.3605 | 700  | 0.7319          | 8187320           |
| 0.6751        | 0.3863 | 750  | 0.7322          | 8772104           |
| 0.6911        | 0.4121 | 800  | 0.7180          | 9357016           |
| 0.7455        | 0.4378 | 850  | 0.7039          | 9941896           |
| 0.7378        | 0.4636 | 900  | 0.7198          | 10526712          |
| 0.6825        | 0.4893 | 950  | 0.6831          | 11111520          |
| 0.5971        | 0.5151 | 1000 | 0.7079          | 11696200          |
| 0.6914        | 0.5408 | 1050 | 0.6824          | 12281072          |
| 0.5825        | 0.5666 | 1100 | 0.6432          | 12865992          |
| 0.5228        | 0.5923 | 1150 | 0.6230          | 13450720          |
| 0.5078        | 0.6181 | 1200 | 0.6184          | 14035544          |
| 0.5268        | 0.6438 | 1250 | 0.5497          | 14620336          |
| 0.4578        | 0.6696 | 1300 | 0.4947          | 15205064          |
| 0.4702        | 0.6953 | 1350 | 0.5248          | 15789848          |
| 0.4294        | 0.7211 | 1400 | 0.4732          | 16374784          |
| 0.4353        | 0.7468 | 1450 | 0.4350          | 16959632          |
| 0.3369        | 0.7726 | 1500 | 0.3964          | 17544440          |
| 0.4666        | 0.7984 | 1550 | 0.4266          | 18129304          |
| 0.3834        | 0.8241 | 1600 | 0.4477          | 18714072          |
| 0.475         | 0.8499 | 1650 | 0.3513          | 19298848          |
| 0.3752        | 0.8756 | 1700 | 0.3438          | 19883504          |
| 0.3233        | 0.9014 | 1750 | 0.3325          | 20468200          |
| 0.3279        | 0.9271 | 1800 | 0.3502          | 21053080          |
| 0.3221        | 0.9529 | 1850 | 0.2935          | 21637848          |
| 0.3781        | 0.9786 | 1900 | 0.2973          | 22222632          |
| 0.2845        | 1.0041 | 1950 | 0.2473          | 22801512          |
| 0.2272        | 1.0299 | 2000 | 0.2834          | 23386232          |
| 0.2924        | 1.0556 | 2050 | 0.2704          | 23971048          |
| 0.2805        | 1.0814 | 2100 | 0.3205          | 24555904          |
| 0.2536        | 1.1071 | 2150 | 0.3081          | 25140752          |
| 0.3184        | 1.1329 | 2200 | 0.2492          | 25725560          |
| 0.273         | 1.1586 | 2250 | 0.2201          | 26310336          |
| 0.2903        | 1.1844 | 2300 | 0.2940          | 26895096          |
| 0.2757        | 1.2101 | 2350 | 0.2621          | 27479840          |
| 0.2766        | 1.2359 | 2400 | 0.2361          | 28064552          |
| 0.3076        | 1.2617 | 2450 | 0.2372          | 28649256          |
| 0.257         | 1.2874 | 2500 | 0.2489          | 29233968          |
| 0.2192        | 1.3132 | 2550 | 0.2432          | 29818856          |
| 0.224         | 1.3389 | 2600 | 0.2026          | 30403640          |
| 0.2377        | 1.3647 | 2650 | 0.1878          | 30988344          |
| 0.2269        | 1.3904 | 2700 | 0.2400          | 31573240          |
| 0.1416        | 1.4162 | 2750 | 0.2472          | 32158144          |
| 0.2162        | 1.4419 | 2800 | 0.2771          | 32743032          |
| 0.1912        | 1.4677 | 2850 | 0.2647          | 33327720          |
| 0.2015        | 1.4934 | 2900 | 0.2392          | 33912440          |
| 0.2069        | 1.5192 | 2950 | 0.2639          | 34497216          |
| 0.2027        | 1.5449 | 3000 | 0.2371          | 35082056          |
| 0.1925        | 1.5707 | 3050 | 0.2484          | 35666976          |
| 0.2139        | 1.5964 | 3100 | 0.2747          | 36251744          |
| 0.204         | 1.6222 | 3150 | 0.2423          | 36836560          |
| 0.1851        | 1.6480 | 3200 | 0.2286          | 37421416          |
| 0.2072        | 1.6737 | 3250 | 0.2406          | 38006200          |
| 0.2145        | 1.6995 | 3300 | 0.2692          | 38591128          |
| 0.2158        | 1.7252 | 3350 | 0.2447          | 39175888          |
| 0.1488        | 1.7510 | 3400 | 0.2225          | 39760664          |
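
The evaluation loss reported at the top (0.1878) corresponds to the step-2650 entry in the log, the minimum validation loss rather than the final step. Selecting the best checkpoint from such a log can be sketched as follows (an illustrative snippet, not taken from the training code; only a few rows are excerpted):

```python
# (step, validation_loss) pairs excerpted from the table above.
eval_log = [
    (2550, 0.2432),
    (2600, 0.2026),
    (2650, 0.1878),
    (2700, 0.2400),
    (3400, 0.2225),
]

# Best checkpoint = the entry with the lowest validation loss.
best_step, best_loss = min(eval_log, key=lambda row: row[1])
print(best_step, best_loss)  # 2650 0.1878
```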

Framework versions

  • PEFT 0.12.0
  • Transformers 4.47.0.dev0
  • Pytorch 2.5.1+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.3

Safetensors

  • Model size: 2.21B params
  • Tensor type: F32

Model tree for ben81828/CADICA_qwenvl_stenosis_classify_scale4

  • Base model: Qwen/Qwen2-VL-2B
  • This model is a PEFT adapter on that base.