update model card README.md
README.md CHANGED
@@ -5,7 +5,6 @@ tags:
 model-index:
 - name: llama-guanaco-7b-ingbetic
   results: []
-library_name: peft
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -15,12 +14,12 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [decapoda-research/llama-7b-hf](https://huggingface.co/decapoda-research/llama-7b-hf) on the None dataset.
 It achieves the following results on the evaluation set:
-- eval_loss: 1.
-- eval_runtime: 65.
-- eval_samples_per_second: 27.
-- eval_steps_per_second: 3.
+- eval_loss: 1.0322
+- eval_runtime: 65.492
+- eval_samples_per_second: 27.973
+- eval_steps_per_second: 3.497
 - epoch: 0.12
-- step:
+- step: 80
 
 ## Model description
 
@@ -43,8 +42,8 @@ The following hyperparameters were used during training:
 - train_batch_size: 4
 - eval_batch_size: 8
 - seed: 42
-- gradient_accumulation_steps:
-- total_train_batch_size:
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
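The two fields added in the last hunk follow from each other: train_batch_size 4 × gradient_accumulation_steps 8 = total_train_batch_size 32. As a rough illustration of how the listed hyperparameters map onto a `transformers` `TrainingArguments` object, here is a minimal sketch; `output_dir` and `learning_rate` are placeholders, since neither value appears in the hunks shown.

```python
# Sketch of the training configuration implied by the card's hyperparameter
# list. Fields not visible in this diff (output_dir, learning_rate) are
# placeholders, not values taken from the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-guanaco-7b-ingbetic",  # placeholder
    per_device_train_batch_size=4,   # train_batch_size: 4
    per_device_eval_batch_size=8,    # eval_batch_size: 8
    seed=42,                         # seed: 42
    gradient_accumulation_steps=8,   # added in this commit; 4 * 8 = 32 total
    learning_rate=2e-4,              # placeholder; not shown in the diff
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,               # epsilon=1e-08
    lr_scheduler_type="linear",      # lr_scheduler_type: linear
    warmup_steps=2,                  # lr_scheduler_warmup_steps: 2
)
```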
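Since the old front matter names a PEFT adapter (`library_name: peft`) fine-tuned from decapoda-research/llama-7b-hf, loading it could look like the sketch below. The adapter repo id is assumed from the `model-index` name and may differ on the Hub, and the prompt format is assumed from the guanaco dataset convention.

```python
# Minimal sketch: attach the PEFT adapter to the base LLaMA checkpoint.
# The adapter id below is assumed from the card's model-index name.
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

base_id = "decapoda-research/llama-7b-hf"
adapter_id = "llama-guanaco-7b-ingbetic"  # assumed Hub path

# The Llama classes are used directly; this older checkpoint's config and
# tokenizer_config are known to trip the Auto* classes.
tokenizer = LlamaTokenizer.from_pretrained(base_id)
base_model = LlamaForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Prompt format assumed from the guanaco dataset convention.
prompt = "### Human: What is gradient accumulation?### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```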