
Mistral-7B-text-to-sql-flash-attention-2-FAISS-NEWPOC

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.3 on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5718

Model description

This is a PEFT adapter (see Framework versions below) for mistralai/Mistral-7B-Instruct-v0.3. The model name indicates fine-tuning for text-to-SQL generation with FlashAttention-2 and a FAISS component; beyond that, more information is needed.
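
A minimal loading sketch (not an official snippet from this card): the adapter and base-model IDs are taken from this page, while the dtype, device placement, and FlashAttention-2 settings are assumptions.

```python
# Sketch: load the base model, then attach this PEFT adapter.
# dtype/device/attention settings are assumptions, not from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-Instruct-v0.3"
adapter_id = "frankmorales2020/Mistral-7B-text-to-sql-flash-attention-2-FAISS-NEWPOC"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # requires flash-attn; drop if unavailable
    device_map="auto",
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()
```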

Intended uses & limitations

More information needed
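
The card does not document a prompt format, so the generation sketch below assumes a plain instruction-style prompt through the base model's chat template; the schema and question are hypothetical. It continues from the loading sketch above.

```python
# The prompt format is an assumption; the card does not document
# how training examples were formatted.
question = "List the names of customers who placed an order in 2023."
schema = (
    "CREATE TABLE customers (id INT, name TEXT); "
    "CREATE TABLE orders (id INT, customer_id INT, created_at DATE);"
)
messages = [{
    "role": "user",
    "content": f"Translate the question into SQL.\nSchema: {schema}\nQuestion: {question}",
}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```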

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1.25e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.03
  • lr_scheduler_warmup_steps: 15
  • num_epochs: 50
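
As a hedged reconstruction, these settings map onto Hugging Face TrainingArguments roughly as follows; the actual training script is not part of this card, and output_dir is a placeholder.

```python
from transformers import TrainingArguments

# Sketch of the listed hyperparameters. The default AdamW optimizer
# already uses betas=(0.9, 0.999) and epsilon=1e-08.
args = TrainingArguments(
    output_dir="mistral-7b-text-to-sql",  # hypothetical
    learning_rate=1.25e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,  # 32 * 8 = 256 total train batch size
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    warmup_steps=15,  # warmup_steps takes precedence over warmup_ratio when > 0
    num_train_epochs=50,
)
```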

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 1.6025        | 2.1053  | 10   | 1.3983          |
| 1.2594        | 4.2105  | 20   | 1.1677          |
| 1.1238        | 6.3158  | 30   | 1.0695          |
| 1.0331        | 8.4211  | 40   | 0.9917          |
| 0.9668        | 10.5263 | 50   | 0.9300          |
| 0.9064        | 12.6316 | 60   | 0.8783          |
| 0.8569        | 14.7368 | 70   | 0.8309          |
| 0.8099        | 16.8421 | 80   | 0.7842          |
| 0.7632        | 18.9474 | 90   | 0.7365          |
| 0.7188        | 21.0526 | 100  | 0.6991          |
| 0.6855        | 23.1579 | 110  | 0.6714          |
| 0.6587        | 25.2632 | 120  | 0.6492          |
| 0.6383        | 27.3684 | 130  | 0.6312          |
| 0.6206        | 29.4737 | 140  | 0.6171          |
| 0.6077        | 31.5789 | 150  | 0.6062          |
| 0.5964        | 33.6842 | 160  | 0.5973          |
| 0.5881        | 35.7895 | 170  | 0.5898          |
| 0.5805        | 37.8947 | 180  | 0.5831          |
| 0.5732        | 40.0    | 190  | 0.5771          |
| 0.5665        | 42.1053 | 200  | 0.5718          |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.42.3
  • PyTorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1