Meta-Llama-3.1-8B-Instruct-function-calling-json-mode-VisitorRequests_Lora

This model is a fine-tuned version of meta-llama/Meta-Llama-3.1-8B-Instruct. The training dataset is not specified. It achieves the following results on the evaluation set:

Loss: 0.6890
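
A minimal loading sketch follows, since the card does not show how to use the model. It assumes the repository holds PEFT (LoRA) adapter weights to be applied on top of the base model, which the "_Lora" suffix and the PEFT entry under Framework versions suggest but the card does not state. The function name load_model_with_adapter is hypothetical. Access to the base model may require accepting its license on the Hub.

```python
# Repository identifiers taken from this card.
BASE_MODEL_ID = "meta-llama/Meta-Llama-3.1-8B-Instruct"
ADAPTER_ID = (
    "mg11/Meta-Llama-3.1-8B-Instruct-function-calling-json-mode-VisitorRequests_Lora"
)


def load_model_with_adapter(base_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Load the base model and attach the adapter (hypothetical helper).

    Assumption: adapter_id points at PEFT adapter weights, not a merged model.
    """
    # Imported lazily so that defining this function needs no heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling load_model_with_adapter() downloads both the base model and the adapter, so it needs network access and enough memory for an 8B-parameter model.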

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 1
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 8
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant
num_epochs: 2
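
The relationship between train_batch_size, gradient_accumulation_steps and total_train_batch_size can be checked with a short sketch. The single-device setting (num_devices = 1) is an assumption; the card does not state the device count, but it is the value consistent with the reported total of 8. The mapping onto Transformers TrainingArguments keyword names is likewise an assumption about how the run was configured, not a record of it.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 1
gradient_accumulation_steps = 8
num_devices = 1  # assumption: not stated in the card


def effective_batch_size(per_device, accumulation_steps, devices):
    """Number of examples that contribute to one optimizer step."""
    return per_device * accumulation_steps * devices


total_train_batch_size = effective_batch_size(
    train_batch_size, gradient_accumulation_steps, num_devices
)

# Assumed mapping onto transformers.TrainingArguments keyword arguments.
training_kwargs = {
    "learning_rate": 3e-4,
    "per_device_train_batch_size": train_batch_size,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": gradient_accumulation_steps,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 2,
}

print(total_train_batch_size)
```

With a constant scheduler the learning rate stays at 0.0003 for every optimizer step, so the only effect of gradient accumulation here is to average gradients over 8 examples per update.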

Training results

Training Loss	Epoch	Step	Validation Loss
2.2123	0.0630	1	2.0355
2.031	0.1260	2	1.8189
1.8617	0.1890	3	1.2382
1.2165	0.2520	4	1.2213
1.2384	0.3150	5	1.3884
1.2876	0.3780	6	1.3734
1.3752	0.4409	7	1.0046
0.9925	0.5039	8	1.1719
1.1438	0.5669	9	0.9010
0.9124	0.6299	10	0.8452
0.8283	0.6929	11	0.7755
0.762	0.7559	12	0.7758
0.7601	0.8189	13	0.8326
0.7841	0.8819	14	0.7731
0.697	0.9449	15	0.7534
0.7392	1.0079	16	0.7244
0.6977	1.0709	17	0.7054
0.6216	1.1339	18	0.6978
0.9607	1.1969	19	0.7370
0.693	1.2598	20	0.8337
0.8311	1.3228	21	0.9197
0.8475	1.3858	22	0.8201
0.7663	1.4488	23	0.7467
0.6859	1.5118	24	0.7316
0.6419	1.5748	25	0.7193
0.6363	1.6378	26	0.7011
0.6569	1.7008	27	0.7019
0.6467	1.7638	28	0.6921
0.6779	1.8268	29	0.6918
0.6638	1.8898	30	0.6890
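
The table can be summarized with a small sketch that finds the step with the lowest validation loss and estimates the number of optimizer steps per epoch from the last logged row. The training-set size derived at the end is only a rough estimate: it assumes a total train batch size of 8 and that the Epoch column is simply step divided by steps per epoch.

```python
# Validation loss per step, copied from the table above (index 0 is step 1).
validation_loss = [
    2.0355, 1.8189, 1.2382, 1.2213, 1.3884, 1.3734, 1.0046, 1.1719, 0.9010, 0.8452,
    0.7755, 0.7758, 0.8326, 0.7731, 0.7534, 0.7244, 0.7054, 0.6978, 0.7370, 0.8337,
    0.9197, 0.8201, 0.7467, 0.7316, 0.7193, 0.7011, 0.7019, 0.6921, 0.6918, 0.6890,
]

# Step with the lowest validation loss (steps are 1-indexed).
best_index = min(range(len(validation_loss)), key=validation_loss.__getitem__)
best_step = best_index + 1
best_loss = validation_loss[best_index]

# Last logged row: step 30 at epoch 1.8898.
last_step, last_epoch = 30, 1.8898
steps_per_epoch = last_step / last_epoch

# Rough estimate only, assuming 8 examples per optimizer step.
approx_train_examples = round(steps_per_epoch * 8)

print(best_step, best_loss)
print(round(steps_per_epoch, 2), approx_train_examples)
```

The lowest validation loss falls on the final logged step, which matches the evaluation loss reported at the top of the card.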

Framework versions

PEFT 0.5.0
Transformers 4.44.0
Pytorch 2.1.0+cu118
Datasets 2.16.0
Tokenizers 0.19.1
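
To compare a local environment against the versions listed above, a sketch along these lines may help. The distribution names (for example torch for Pytorch) are the usual PyPI names and are an assumption about how the packages were installed. The helper compare_versions is hypothetical.

```python
from importlib import metadata

# Versions copied from the list above, keyed by assumed PyPI distribution name.
PINNED_VERSIONS = {
    "peft": "0.5.0",
    "transformers": "4.44.0",
    "torch": "2.1.0+cu118",
    "datasets": "2.16.0",
    "tokenizers": "0.19.1",
}


def compare_versions(pinned=PINNED_VERSIONS):
    """Return {package: (wanted, installed, matches)} for each pinned package."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions().items():
    status = "ok" if matches else "differs"
    print(f"{package}: wanted {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the adapter will fail to load; it only flags where the environment differs from the one recorded in the card.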
