# AdaDecode
This model is a fine-tuned version of meta-llama/CodeLlama-34b-Instruct-hf on the meng-lab/CodeLlama-34B-Instruct-xsum dataset. It achieves the following results on the evaluation set (values from the final evaluation, step 7800, in the table below):
- Loss: 5.3547
- Loss Layer 6 Head: 1.5863
- Loss Layer 12 Head: 1.2384
- Loss Layer 18 Head: 1.0729
- Loss Layer 24 Head: 0.6857
- Loss Layer 30 Head: 0.4438
- Loss Layer 36 Head: 0.2842
- Loss Layer 42 Head: 0.1685
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
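## How to use

The card does not include a usage snippet, so the following is a minimal loading sketch. It assumes the checkpoint loads through the standard transformers causal-LM API; the repository id is a placeholder, not the actual repo name, and should be replaced with the model's id from the AdaDecode collection.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id (assumption): substitute the actual checkpoint
# name from the AdaDecode collection.
MODEL_ID = "meng-lab/<adadecode-checkpoint>"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # 34B parameters need roughly 68 GB in fp16
    device_map="auto",          # shard across available GPUs
)

prompt = "Summarize the following article:\n..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```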
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

### Training results
| Training Loss | Epoch | Step | Validation Loss | Loss Layer 6 Head | Loss Layer 12 Head | Loss Layer 18 Head | Loss Layer 24 Head | Loss Layer 30 Head | Loss Layer 36 Head | Loss Layer 42 Head |
|---|---|---|---|---|---|---|---|---|---|---|
| 5.7055 | 2.56 | 200 | 6.7354 | 1.7854 | 1.4923 | 1.4206 | 0.8735 | 0.6246 | 0.4658 | 0.4023 |
| 4.2132 | 5.12 | 400 | 6.1546 | 1.7830 | 1.3957 | 1.1440 | 0.7581 | 0.6526 | 0.3538 | 0.2167 |
| 4.081 | 7.68 | 600 | 6.0643 | 1.6946 | 1.4230 | 1.1566 | 0.8413 | 0.5291 | 0.3589 | 0.2186 |
| 3.5585 | 10.24 | 800 | 5.8829 | 1.6599 | 1.3383 | 1.1385 | 0.7602 | 0.5903 | 0.3677 | 0.2221 |
| 3.5251 | 12.8 | 1000 | 5.7000 | 1.6490 | 1.2994 | 1.0979 | 0.7252 | 0.5119 | 0.3438 | 0.2164 |
| 3.1679 | 15.36 | 1200 | 5.6536 | 1.6224 | 1.2553 | 1.1685 | 0.7247 | 0.5292 | 0.3125 | 0.1873 |
| 3.2193 | 17.92 | 1400 | 5.5506 | 1.5900 | 1.2721 | 1.0925 | 0.7382 | 0.4849 | 0.3224 | 0.1969 |
| 3.0832 | 20.48 | 1600 | 5.5640 | 1.5978 | 1.2975 | 1.1012 | 0.7319 | 0.4884 | 0.3065 | 0.1891 |
| 2.9621 | 23.04 | 1800 | 5.5682 | 1.6054 | 1.2700 | 1.1180 | 0.7373 | 0.4615 | 0.2985 | 0.2074 |
| 3.0878 | 25.6 | 2000 | 5.7224 | 1.6020 | 1.4047 | 1.1298 | 0.7446 | 0.4841 | 0.3109 | 0.1890 |
| 2.8619 | 28.16 | 2200 | 5.5169 | 1.5917 | 1.2565 | 1.0982 | 0.7340 | 0.4624 | 0.3221 | 0.2038 |
| 2.9146 | 30.72 | 2400 | 5.4960 | 1.6334 | 1.2661 | 1.0884 | 0.7008 | 0.4590 | 0.3066 | 0.1775 |
| 2.8805 | 33.28 | 2600 | 5.7326 | 1.7120 | 1.2473 | 1.1268 | 0.8572 | 0.5254 | 0.3132 | 0.1889 |
| 2.8492 | 35.84 | 2800 | 5.5193 | 1.6050 | 1.2626 | 1.0868 | 0.7980 | 0.4569 | 0.2897 | 0.1967 |
| 2.7414 | 38.4 | 3000 | 5.5041 | 1.5895 | 1.2722 | 1.1454 | 0.6997 | 0.4646 | 0.2958 | 0.1719 |
| 2.8092 | 40.96 | 3200 | 5.4876 | 1.5899 | 1.2512 | 1.0805 | 0.7123 | 0.4602 | 0.3544 | 0.1739 |
| 2.5986 | 43.52 | 3400 | 5.4265 | 1.5933 | 1.2407 | 1.0890 | 0.6999 | 0.4719 | 0.2914 | 0.1743 |
| 2.5645 | 46.08 | 3600 | 5.4640 | 1.5893 | 1.2546 | 1.0868 | 0.7156 | 0.4573 | 0.3096 | 0.1809 |
| 2.6286 | 48.64 | 3800 | 5.4074 | 1.5805 | 1.2430 | 1.0898 | 0.6973 | 0.4577 | 0.2949 | 0.1757 |
| 2.5402 | 51.2 | 4000 | 5.4498 | 1.6051 | 1.2551 | 1.0857 | 0.7044 | 0.4704 | 0.2965 | 0.1833 |
| 2.6027 | 53.76 | 4200 | 5.5040 | 1.6330 | 1.2577 | 1.0813 | 0.7198 | 0.5051 | 0.3221 | 0.1834 |
| 2.4852 | 56.32 | 4400 | 5.4356 | 1.5925 | 1.2526 | 1.0858 | 0.7114 | 0.4580 | 0.2926 | 0.1861 |
| 2.4804 | 58.88 | 4600 | 5.4179 | 1.5895 | 1.2417 | 1.0782 | 0.7668 | 0.4488 | 0.2870 | 0.1708 |
| 2.4591 | 61.44 | 4800 | 5.3843 | 1.5925 | 1.2437 | 1.0750 | 0.6884 | 0.4509 | 0.2912 | 0.1708 |
| 2.4773 | 64.0 | 5000 | 5.4038 | 1.5952 | 1.2450 | 1.0797 | 0.6915 | 0.4486 | 0.2933 | 0.1994 |
| 2.4562 | 66.56 | 5200 | 5.3922 | 1.5918 | 1.2485 | 1.0776 | 0.6968 | 0.4479 | 0.2871 | 0.1696 |
| 2.3506 | 69.12 | 5400 | 5.3768 | 1.5882 | 1.2454 | 1.0791 | 0.6869 | 0.4474 | 0.2867 | 0.1710 |
| 2.4044 | 71.68 | 5600 | 5.3605 | 1.5856 | 1.2385 | 1.0739 | 0.6914 | 0.4472 | 0.2856 | 0.1700 |
| 2.3106 | 74.24 | 5800 | 5.4110 | 1.5956 | 1.2418 | 1.0776 | 0.6972 | 0.4813 | 0.2891 | 0.1908 |
| 2.3976 | 76.8 | 6000 | 5.3686 | 1.5894 | 1.2410 | 1.0754 | 0.6877 | 0.4455 | 0.2856 | 0.1685 |
| 2.2507 | 79.36 | 6200 | 5.3727 | 1.5923 | 1.2414 | 1.0760 | 0.6877 | 0.4455 | 0.2852 | 0.1701 |
| 2.3297 | 81.92 | 6400 | 5.3620 | 1.5871 | 1.2407 | 1.0748 | 0.6867 | 0.4443 | 0.2855 | 0.1686 |
| 2.2224 | 84.48 | 6600 | 5.3621 | 1.5881 | 1.2408 | 1.0751 | 0.6865 | 0.4444 | 0.2846 | 0.1687 |
| 2.2312 | 87.04 | 6800 | 5.3594 | 1.5863 | 1.2400 | 1.0735 | 0.6862 | 0.4446 | 0.2846 | 0.1689 |
| 2.2597 | 89.6 | 7000 | 5.3562 | 1.5858 | 1.2387 | 1.0732 | 0.6860 | 0.4440 | 0.2844 | 0.1684 |
| 2.201 | 92.16 | 7200 | 5.3562 | 1.5867 | 1.2387 | 1.0733 | 0.6861 | 0.4438 | 0.2842 | 0.1684 |
| 2.2423 | 94.72 | 7400 | 5.3539 | 1.5862 | 1.2380 | 1.0726 | 0.6856 | 0.4438 | 0.2842 | 0.1686 |
| 2.2145 | 97.28 | 7600 | 5.3546 | 1.5863 | 1.2384 | 1.0728 | 0.6857 | 0.4437 | 0.2842 | 0.1686 |
| 2.2007 | 99.84 | 7800 | 5.3547 | 1.5863 | 1.2384 | 1.0729 | 0.6857 | 0.4438 | 0.2842 | 0.1685 |
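The per-head loss columns track separate language-modeling heads attached to intermediate decoder layers (6, 12, 18, 24, 30, 36, and 42), each trained with its own cross-entropy objective against the same next-token targets as the backbone. The sketch below illustrates how such per-layer head losses could be computed from hidden states exposed via `output_hidden_states`; the head placement and equal-weight aggregation are assumptions for illustration, not the documented training recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Exit layers taken from the table columns above.
EXIT_LAYERS = [6, 12, 18, 24, 30, 36, 42]

class EarlyExitHeads(nn.Module):
    """Illustrative per-layer LM heads producing one cross-entropy
    term per exit layer, analogous to the per-head losses reported
    in the training results table."""

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.heads = nn.ModuleDict(
            {str(l): nn.Linear(hidden_size, vocab_size, bias=False)
             for l in EXIT_LAYERS}
        )

    def forward(self, hidden_states, labels):
        # hidden_states: tuple of (batch, seq, hidden) tensors as returned
        # by a transformers model with output_hidden_states=True; index 0
        # is the embedding output and index k is the output of layer k.
        losses = {}
        for l in EXIT_LAYERS:
            logits = self.heads[str(l)](hidden_states[l])
            # Shift so position t predicts token t+1, as in a causal LM loss.
            loss = F.cross_entropy(
                logits[:, :-1].reshape(-1, logits.size(-1)),
                labels[:, 1:].reshape(-1),
                ignore_index=-100,
            )
            losses[f"loss_layer_{l}_head"] = loss
        # Equal weighting is an assumption; the actual aggregation into
        # the reported validation loss is not specified by the card.
        total = torch.stack(list(losses.values())).sum()
        return total, losses
```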