# collapse_gemma-2-2b_hs2_replace_iter19_sftsd2
This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.6313
- Num Input Tokens Seen: 4586296
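
A minimal sketch of loading this checkpoint with the standard Transformers causal LM API; the prompt and generation settings below are placeholders:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Usage sketch; adjust device placement and generation settings to your setup.
model_id = "RylanSchaeffer/collapse_gemma-2-2b_hs2_replace_iter19_sftsd2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```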
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- learning_rate: 8e-06
- train_batch_size: 8
- eval_batch_size: 16
- seed: 2
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1
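
As a rough guide, the list above maps onto Hugging Face `TrainingArguments` as sketched below; `output_dir` and any arguments not listed in this card are assumptions:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="collapse_gemma-2-2b_hs2_replace_iter19_sftsd2",  # assumed
    learning_rate=8e-06,
    per_device_train_batch_size=8,   # with 16 accumulation steps -> 128 total
    per_device_eval_batch_size=16,
    seed=2,
    gradient_accumulation_steps=16,
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
)
```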
### Training results
| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.3909          | 0                 |
| 1.6265        | 0.0513 | 5    | 1.2809          | 240576            |
| 0.8464        | 0.1026 | 10   | 1.3184          | 478736            |
| 0.464         | 0.1538 | 15   | 1.5648          | 714752            |
| 0.1587        | 0.2051 | 20   | 1.7575          | 949952            |
| 0.1115        | 0.2564 | 25   | 1.9885          | 1188848           |
| 0.0459        | 0.3077 | 30   | 2.2789          | 1425720           |
| 0.0281        | 0.3590 | 35   | 2.4173          | 1661440           |
| 0.0227        | 0.4103 | 40   | 2.5453          | 1883136           |
| 0.0234        | 0.4615 | 45   | 2.6143          | 2119680           |
| 0.0251        | 0.5128 | 50   | 2.6521          | 2357408           |
| 0.0212        | 0.5641 | 55   | 2.6451          | 2594336           |
| 0.0217        | 0.6154 | 60   | 2.6474          | 2834280           |
| 0.02          | 0.6667 | 65   | 2.6435          | 3067200           |
| 0.0212        | 0.7179 | 70   | 2.6357          | 3305472           |
| 0.0208        | 0.7692 | 75   | 2.6311          | 3547016           |
| 0.0198        | 0.8205 | 80   | 2.6427          | 3785376           |
| 0.0212        | 0.8718 | 85   | 2.6364          | 4022648           |
| 0.0194        | 0.9231 | 90   | 2.6294          | 4260808           |
| 0.0206        | 0.9744 | 95   | 2.6309          | 4491424           |
### Framework versions
- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1