zgce committed
Commit 1117552 · verified · 1 Parent(s): ad54932

Update README.md

Files changed (1)
  1. README.md +9 -15
README.md CHANGED
@@ -8,20 +8,6 @@ license: mit
 
 This model is a fine-tuned version of [/root/LLaMA-Factory/models/Qwen2.5-14B-Instruct-GPTQ-Int8](https://huggingface.co//root/LLaMA-Factory/models/Qwen2.5-14B-Instruct-GPTQ-Int8) on the airboros-31_en and the airboros-31_zh datasets.
 
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
-## Training procedure
-
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
@@ -37,7 +23,15 @@ The following hyperparameters were used during training:
 - mixed_precision_training: Native AMP
 
 ### Training results
-
+{
+"epoch": 0.9997864616698697,
+"num_input_tokens_seen": 74083488,
+"total_flos": 4.6864422704480256e+17,
+"train_loss": 0.692321076499046,
+"train_runtime": 65496.9949,
+"train_samples_per_second": 1.144,
+"train_steps_per_second": 0.036
+}
 
 
 ### Framework versions
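
As a rough cross-check of the trainer summary added in the diff above, the reported rates and runtime are mutually consistent; the sketch below is plain arithmetic on the JSON values (variable names simply mirror the keys) and derives the approximate step count, sample count, and token throughput.

```python
# Back-of-the-envelope figures derived from the reported trainer summary.
train_runtime = 65496.9949           # seconds, roughly 18.2 hours
train_samples_per_second = 1.144
train_steps_per_second = 0.036
num_input_tokens_seen = 74_083_488

print(f"approx. optimizer steps:   {train_steps_per_second * train_runtime:,.0f}")       # ~2,358
print(f"approx. samples processed: {train_samples_per_second * train_runtime:,.0f}")     # ~74,929
print(f"approx. token throughput:  {num_input_tokens_seen / train_runtime:,.0f} tok/s")  # ~1,131
```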
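
A minimal inference sketch with Hugging Face Transformers follows, assuming the fine-tuned weights were released as a full (merged) model: the repository id used here is a placeholder, not the confirmed name, and loading the GPTQ-Int8 base additionally requires a GPTQ-capable backend (e.g. optimum) to be installed. If only a LoRA adapter was published, it would instead be attached to Qwen2.5-14B-Instruct-GPTQ-Int8 with PEFT.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository id; the actual repo name is not stated in this card.
model_id = "zgce/Qwen2.5-14B-airoboros"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" spreads the quantized layers over the available GPUs.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Qwen2.5-Instruct models are chat-tuned, so build the prompt via the chat template.
messages = [{"role": "user", "content": "Summarize the airoboros dataset in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```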