
NXAIR_M_mistral-7B

This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8751
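
The card ships without a usage example, so here is a minimal loading sketch. It assumes this repository hosts a PEFT adapter on top of mistralai/Mistral-7B-v0.1 (the PEFT 0.8.2 entry under Framework versions suggests this) and that the Hub repo id is codewizardUV/NXAIR_M_mistral-7B_old_version; the prompt and generation settings are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "mistralai/Mistral-7B-v0.1"
ADAPTER = "codewizardUV/NXAIR_M_mistral-7B_old_version"  # repo id assumed from this page

# Load the frozen base model, then attach the fine-tuned adapter weights.
# device_map="auto" requires the accelerate package.
tokenizer = AutoTokenizer.from_pretrained(BASE)
base = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER)
model.eval()

prompt = "Explain what a PEFT adapter is in one sentence."  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Keeping the adapter separate keeps the download small; if the adapter is LoRA-style, `model.merge_and_unload()` can bake it into the base weights when a standalone checkpoint is needed.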

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto code follows the list):

  • learning_rate: 0.00025
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 15
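
For concreteness, a hedged sketch of how these values might map onto transformers.TrainingArguments (4.37.2, per the versions below). The output_dir, the optimizer name, and the 100-step eval/logging cadence (read off the results table below) are assumptions, not taken from the original config; note also that a plain constant scheduler typically ignores the warmup ratio.

```python
from transformers import TrainingArguments

# A sketch only: values mirror the listed hyperparameters; anything
# not listed above (output_dir, eval/logging cadence, optim) is assumed.
training_args = TrainingArguments(
    output_dir="NXAIR_M_mistral-7B",  # assumed
    learning_rate=2.5e-4,             # 0.00025
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",              # betas=(0.9, 0.999), eps=1e-08 are the defaults
    lr_scheduler_type="constant",     # note: "constant" ignores warmup_ratio;
    warmup_ratio=0.03,                # "constant_with_warmup" would apply it
    num_train_epochs=15,
    evaluation_strategy="steps",      # evaluation every 100 steps, inferred from
    eval_steps=100,                   # the step column in the results table
    logging_steps=100,
)
```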

Training results

Training Loss | Epoch | Step | Validation Loss
1.0718        | 0.16  | 100  | 1.0160
0.9627        | 0.32  | 200  | 0.9874
0.8621        | 0.48  | 300  | 0.8852
0.8674        | 0.63  | 400  | 0.8725
0.8039        | 0.79  | 500  | 0.8270
0.7757        | 0.95  | 600  | 0.8043
0.5737        | 1.11  | 700  | 0.7233
0.6043        | 1.27  | 800  | 0.7233
0.5896        | 1.43  | 900  | 0.7176
0.5701        | 1.58  | 1000 | 0.7050
0.5474        | 1.74  | 1100 | 0.7020
0.5622        | 1.9   | 1200 | 0.6686
0.4321        | 2.06  | 1300 | 0.7203
0.4063        | 2.22  | 1400 | 0.7155
0.4318        | 2.38  | 1500 | 0.7143
0.4375        | 2.54  | 1600 | 0.7128
0.4377        | 2.69  | 1700 | 0.6971
0.4364        | 2.85  | 1800 | 0.7102
0.4224        | 3.01  | 1900 | 0.6962
0.3352        | 3.17  | 2000 | 0.7134
0.3973        | 3.33  | 2100 | 0.7228
0.3907        | 3.49  | 2200 | 0.7293
0.3843        | 3.65  | 2300 | 0.7406
0.3972        | 3.8   | 2400 | 0.7381
0.4118        | 3.96  | 2500 | 0.7100
0.3011        | 4.12  | 2600 | 0.7390
0.3211        | 4.28  | 2700 | 0.7564
0.3228        | 4.44  | 2800 | 0.7676
0.3051        | 4.6   | 2900 | 0.7419
0.3272        | 4.75  | 3000 | 0.7520
0.3758        | 4.91  | 3100 | 0.7169
0.2952        | 5.07  | 3200 | 0.8331
0.3521        | 5.23  | 3300 | 0.7892
0.3582        | 5.39  | 3400 | 0.8023
0.3583        | 5.55  | 3500 | 0.7672
0.38          | 5.71  | 3600 | 0.7964
0.3735        | 5.86  | 3700 | 0.7602
0.3332        | 6.02  | 3800 | 0.8012
0.2981        | 6.18  | 3900 | 0.8070
0.3074        | 6.34  | 4000 | 0.7881
0.3579        | 6.5   | 4100 | 0.7447
0.3639        | 6.66  | 4200 | 0.7517
0.3481        | 6.81  | 4300 | 0.7815
0.3784        | 6.97  | 4400 | 0.7393
0.2917        | 7.13  | 4500 | 0.7802
0.2979        | 7.29  | 4600 | 0.7772
0.3005        | 7.45  | 4700 | 0.8432
0.3142        | 7.61  | 4800 | 0.8144
0.3468        | 7.77  | 4900 | 0.7675
0.3559        | 7.92  | 5000 | 0.7737
0.3028        | 8.08  | 5100 | 0.8472
0.3284        | 8.24  | 5200 | 0.8341
0.3123        | 8.4   | 5300 | 0.8470
0.3408        | 8.56  | 5400 | 0.7995
0.3283        | 8.72  | 5500 | 0.8048
0.3483        | 8.87  | 5600 | 0.8527
0.281         | 9.03  | 5700 | 0.8267
0.2738        | 9.19  | 5800 | 0.8195
0.3095        | 9.35  | 5900 | 0.8311
0.2954        | 9.51  | 6000 | 0.8241
0.309         | 9.67  | 6100 | 0.7944
0.3125        | 9.83  | 6200 | 0.8135
0.3339        | 9.98  | 6300 | 0.8094
0.3295        | 10.14 | 6400 | 0.8286
0.341         | 10.3  | 6500 | 0.8858
0.3157        | 10.46 | 6600 | 0.8527
0.3264        | 10.62 | 6700 | 0.8476
0.3631        | 10.78 | 6800 | 0.8255
0.3428        | 10.94 | 6900 | 0.8423
0.2963        | 11.09 | 7000 | 0.8148
0.3594        | 11.25 | 7100 | 0.8159
0.3309        | 11.41 | 7200 | 0.8058
0.3535        | 11.57 | 7300 | 0.8440
0.3679        | 11.73 | 7400 | 0.8273
0.3684        | 11.89 | 7500 | 0.7772
0.2645        | 12.04 | 7600 | 0.8764
0.3003        | 12.2  | 7700 | 0.8540
0.3225        | 12.36 | 7800 | 0.8711
0.3479        | 12.52 | 7900 | 0.8292
0.3414        | 12.68 | 8000 | 0.8558
0.3338        | 12.84 | 8100 | 0.8511
0.3569        | 13.0  | 8200 | 0.8418
0.3182        | 13.15 | 8300 | 0.8521
0.3119        | 13.31 | 8400 | 0.9313
0.3432        | 13.47 | 8500 | 0.8739
0.3366        | 13.63 | 8600 | 0.8637
0.3639        | 13.79 | 8700 | 0.8404
0.3764        | 13.95 | 8800 | 0.8386
0.2987        | 14.1  | 8900 | 0.8915
0.3061        | 14.26 | 9000 | 0.8548
0.3217        | 14.42 | 9100 | 0.8387
0.3166        | 14.58 | 9200 | 0.8253
0.3369        | 14.74 | 9300 | 0.8607
0.3461        | 14.9  | 9400 | 0.8751

Framework versions

  • PEFT 0.8.2
  • Transformers 4.37.2
  • PyTorch 2.2.0+cu121
  • Datasets 2.17.0
  • Tokenizers 0.15.2