qwenvl-2B-cadica-direction-then-detect-and-classify-scale6
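This model is a fine-tuned version of ben81828/CADICA_qwenvl_direction, trained on two datasets: CADICA狹窄分析選擇題scale6(TRAIN) (CADICA stenosis analysis, multiple-choice questions, scale 6, train split) and CADICA狹窄分析千問定位但不分類題scale6(TRAIN) (CADICA stenosis analysis, Qwen localization-without-classification questions, scale 6, train split). It achieves the following results on the evaluation set: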
- Loss: 0.1728
- Num Input Tokens Seen: 35316128
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 6
- total_train_batch_size: 24
- total_eval_batch_size: 4
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.05
- training_steps: 3400
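The derived values in the list above follow from the per-device settings, and the learning-rate curve follows from the scheduler type and warmup ratio. The sketch below reproduces that arithmetic using only the numbers listed. The function names are hypothetical, and the schedule is the common linear-warmup plus half-cosine-decay shape; it assumes the warmup length is the ceiling of `training_steps * warmup_ratio`, which may differ in detail from what the training framework actually did.

```python
import math

# Values copied from the hyperparameter list above.
PER_DEVICE_TRAIN_BATCH = 1
PER_DEVICE_EVAL_BATCH = 1
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 6
PEAK_LR = 1e-4
WARMUP_RATIO = 0.05
TOTAL_STEPS = 3400


def effective_batch_sizes():
    """Return (total_train_batch_size, total_eval_batch_size)."""
    # Gradient accumulation only applies to training, not evaluation.
    train = PER_DEVICE_TRAIN_BATCH * NUM_DEVICES * GRAD_ACCUM_STEPS
    evaluation = PER_DEVICE_EVAL_BATCH * NUM_DEVICES
    return train, evaluation


def lr_at(step):
    """Learning rate at an optimizer step (assumed warmup + cosine shape)."""
    # Assumption: warmup length is ceil(total_steps * warmup_ratio).
    warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, warmup_steps)
    # Half-cosine decay from the peak down to 0 at the final step.
    progress = (step - warmup_steps) / max(1, TOTAL_STEPS - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_sizes())
for step in (0, 85, 170, 1785, 3400):
    print(step, lr_at(step))
```

Under these assumptions the warmup covers the first 170 optimizer steps, after which the rate decays toward zero at step 3400.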
Training results
Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
---|---|---|---|---|
1.3918 | 0.0148 | 50 | 1.0422 | 516240 |
0.8208 | 0.0295 | 100 | 0.8917 | 1030696 |
0.8125 | 0.0443 | 150 | 0.9009 | 1550792 |
0.7675 | 0.0591 | 200 | 0.9007 | 2071176 |
0.7558 | 0.0739 | 250 | 0.8108 | 2587272 |
0.78 | 0.0886 | 300 | 0.8194 | 3107200 |
0.6602 | 0.1034 | 350 | 0.7663 | 3625752 |
0.6739 | 0.1182 | 400 | 0.7039 | 4142592 |
0.5661 | 0.1329 | 450 | 0.7133 | 4663320 |
0.6283 | 0.1477 | 500 | 0.6505 | 5183664 |
0.5957 | 0.1625 | 550 | 0.6883 | 5703016 |
0.6331 | 0.1773 | 600 | 0.5883 | 6222736 |
0.5483 | 0.1920 | 650 | 0.6101 | 6743120 |
0.477 | 0.2068 | 700 | 0.5884 | 7262832 |
0.514 | 0.2216 | 750 | 0.4666 | 7779872 |
0.4239 | 0.2363 | 800 | 0.4822 | 8301976 |
0.4949 | 0.2511 | 850 | 0.6122 | 8822832 |
0.4852 | 0.2659 | 900 | 0.5606 | 9345160 |
0.4737 | 0.2806 | 950 | 0.4791 | 9863168 |
0.4005 | 0.2954 | 1000 | 0.5501 | 10379136 |
0.3991 | 0.3102 | 1050 | 0.4378 | 10897528 |
0.4624 | 0.3250 | 1100 | 0.5301 | 11413120 |
0.4432 | 0.3397 | 1150 | 0.4249 | 11933632 |
0.3296 | 0.3545 | 1200 | 0.2966 | 12456040 |
0.335 | 0.3693 | 1250 | 0.3185 | 12972696 |
0.3594 | 0.3840 | 1300 | 0.4716 | 13493264 |
0.3731 | 0.3988 | 1350 | 0.5566 | 14014736 |
0.388 | 0.4136 | 1400 | 0.3866 | 14532288 |
0.3131 | 0.4284 | 1450 | 0.4740 | 15050992 |
0.2928 | 0.4431 | 1500 | 0.4049 | 15572048 |
0.3588 | 0.4579 | 1550 | 0.2871 | 16091960 |
0.3879 | 0.4727 | 1600 | 0.3136 | 16609960 |
0.2698 | 0.4874 | 1650 | 0.4020 | 17130896 |
0.3904 | 0.5022 | 1700 | 0.3297 | 17650984 |
0.3173 | 0.5170 | 1750 | 0.4491 | 18169344 |
0.3127 | 0.5318 | 1800 | 0.3499 | 18691928 |
0.2828 | 0.5465 | 1850 | 0.3781 | 19212992 |
0.306 | 0.5613 | 1900 | 0.3766 | 19735976 |
0.2992 | 0.5761 | 1950 | 0.3468 | 20253288 |
0.2341 | 0.5908 | 2000 | 0.3366 | 20770728 |
0.2931 | 0.6056 | 2050 | 0.3386 | 21291664 |
0.1826 | 0.6204 | 2100 | 0.5386 | 21813984 |
0.2387 | 0.6352 | 2150 | 0.2581 | 22332144 |
0.2662 | 0.6499 | 2200 | 0.4840 | 22849552 |
0.2332 | 0.6647 | 2250 | 0.4966 | 23366784 |
0.2481 | 0.6795 | 2300 | 0.2418 | 23883032 |
0.2313 | 0.6942 | 2350 | 0.1870 | 24401256 |
0.262 | 0.7090 | 2400 | 0.3471 | 24921872 |
0.2412 | 0.7238 | 2450 | 0.3456 | 25439896 |
0.2382 | 0.7386 | 2500 | 0.2543 | 25961056 |
0.2364 | 0.7533 | 2550 | 0.3871 | 26477208 |
0.2082 | 0.7681 | 2600 | 0.3406 | 26997904 |
0.1736 | 0.7829 | 2650 | 0.2697 | 27521088 |
0.2225 | 0.7976 | 2700 | 0.4155 | 28042992 |
0.2501 | 0.8124 | 2750 | 0.4115 | 28561248 |
0.2507 | 0.8272 | 2800 | 0.3223 | 29079576 |
0.1928 | 0.8419 | 2850 | 0.2828 | 29600536 |
0.2029 | 0.8567 | 2900 | 0.3943 | 30118072 |
0.1692 | 0.8715 | 2950 | 0.2034 | 30637448 |
0.234 | 0.8863 | 3000 | 0.2556 | 31159736 |
0.2303 | 0.9010 | 3050 | 0.2253 | 31679080 |
0.1999 | 0.9158 | 3100 | 0.2710 | 32196176 |
0.2069 | 0.9306 | 3150 | 0.2029 | 32713824 |
0.2135 | 0.9453 | 3200 | 0.3564 | 33235872 |
0.1964 | 0.9601 | 3250 | 0.3081 | 33752488 |
0.2131 | 0.9749 | 3300 | 0.3541 | 34269496 |
0.1779 | 0.9897 | 3350 | 0.2255 | 34784784 |
0.2173 | 1.0044 | 3400 | 0.4078 | 35305984 |
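The validation loss in the table is noisy from one evaluation to the next, so it can help to scan the log programmatically rather than read off the last row. The sketch below parses rows in the same pipe-separated layout and picks the one with the lowest validation loss. The helper names are hypothetical, and only four rows copied from the table are included here for brevity; pasting the full table into `ROWS` would work the same way.

```python
# A few rows copied verbatim from the table above (not the full log).
ROWS = """\
0.3296 | 0.3545 | 1200 | 0.2966 | 12456040 |
0.2313 | 0.6942 | 2350 | 0.1870 | 24401256 |
0.1692 | 0.8715 | 2950 | 0.2034 | 30637448 |
0.2173 | 1.0044 | 3400 | 0.4078 | 35305984 |
"""


def parse_rows(text):
    """Parse 'train loss | epoch | step | val loss | tokens |' lines."""
    records = []
    for line in text.strip().splitlines():
        # Drop the empty cell produced by the trailing pipe.
        cells = [cell.strip() for cell in line.split("|") if cell.strip()]
        train_loss, epoch, step, val_loss, tokens = cells
        records.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "tokens": int(tokens),
        })
    return records


def best_checkpoint(records):
    """Return the logged row with the lowest validation loss."""
    return min(records, key=lambda record: record["val_loss"])


def tokens_per_step(record):
    """Average input tokens consumed per optimizer step up to this row."""
    return record["tokens"] / record["step"]


records = parse_rows(ROWS)
best = best_checkpoint(records)
print(best["step"], best["val_loss"])
print(round(tokens_per_step(records[-1]), 2))
```

Among the rows logged in the table, the lowest validation loss is 0.1870 at step 2350, while the final logged step (3400) reports 0.4078.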
Framework versions
- PEFT 0.12.0
- Transformers 4.47.0.dev0
- Pytorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3
Model tree for ben81828/cadica-qwenvl-direction-then-detect-classify-scale6
- Base model: Qwen/Qwen2-VL-2B
- Finetuned: Qwen/Qwen2-VL-2B-Instruct
- Finetuned: AdaptLLM/biomed-Qwen2-VL-2B-Instruct
- Adapter: ben81828/CADICA_qwenvl_direction