lordspline committed

Commit: fd322cb
Parent(s): ad34de3

End of training

Files changed (2):
1. README.md (+13 -7)
2. pytorch_model.bin (+1 -1)
README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-base_model: lordspline/qwen-merged
+base_model: lordspline/mergestein
 tags:
 - axolotl
 - generated_from_trainer
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 axolotl version: `0.4.1`
 ```yaml
-base_model: lordspline/qwen-merged
+base_model: lordspline/mergestein
 model_type: AutoModelForCausalLM
 tokenizer_type: AutoTokenizer
 
@@ -26,7 +26,13 @@ strict: false
 
 chat_template: chatml
 datasets:
-  - path: lordspline/scidata
+  # - path: lordspline/scidata
+  #   type: sharegpt
+  #   conversation: chatml
+  - path: lordspline/wizard_v2_196k_unfiltered
+    type: sharegpt
+    conversation: chatml
+  - path: lordspline/ultrainteract
     type: sharegpt
     conversation: chatml
 
@@ -64,7 +70,7 @@ gradient_checkpointing: unsloth
 gradient_checkpointing_kwargs:
   use_reentrant: true # look
 early_stopping_patience:
-resume_from_checkpoint: ./mergestein/checkpoint-8015
+resume_from_checkpoint: # ./mergestein/checkpoint-8015
 local_rank:
 logging_steps: 1
 xformers_attention:
@@ -93,9 +99,9 @@ tokens:
 
 # mergestein
 
-This model is a fine-tuned version of [lordspline/qwen-merged](https://huggingface.co/lordspline/qwen-merged) on the None dataset.
+This model is a fine-tuned version of [lordspline/mergestein](https://huggingface.co/lordspline/mergestein) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.6175
+- Loss: 1.0348
 
 ## Model description
 
@@ -127,7 +133,7 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:-----:|:---------------:|
-| 1.5213 | 1.0 | 22879 | 1.6175 |
+| 1.1202 | 1.0 | 25552 | 1.0348 |
 
 
 ### Framework versions
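The updated card points `base_model` at lordspline/mergestein and names the `AutoModelForCausalLM` / `AutoTokenizer` classes in its config. A minimal sketch of loading the resulting checkpoint and prompting it through the chatml template the config declares, assuming the repo is public and the saved tokenizer ships that chat template (the prompt text is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lordspline/mergestein"  # repo id from the diff above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Render a conversation with the tokenizer's chat template (chatml per the
# training config); assumes the saved tokenizer carries that template.
messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

output_ids = model.generate(input_ids, max_new_tokens=64)
# Strip the prompt tokens and decode only the model's reply
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
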
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:82fee2a8c0ce5933cfda49995ef3a3064a42870c443b5dcd9006337b6594ea1a
+oid sha256:fc89906f67ddbeb76efbb37239aa493a9b881a2577994f05671aea509adc0188
 size 1589947346
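The weight update itself is just a new git-lfs pointer: per the spec URL in the pointer file, `oid sha256:` is the SHA-256 of the binary's full contents and `size` is its byte length, so a download can be checked locally. A minimal sketch, assuming `pytorch_model.bin` has been fetched to the working directory:

```python
import hashlib
from pathlib import Path

# Values copied from the LFS pointer in the diff above
EXPECTED_OID = "fc89906f67ddbeb76efbb37239aa493a9b881a2577994f05671aea509adc0188"
EXPECTED_SIZE = 1589947346

path = Path("pytorch_model.bin")  # local download path (assumption)

# Hash the file in 1 MiB chunks to avoid reading ~1.6 GB into memory at once
h = hashlib.sha256()
with path.open("rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)

assert path.stat().st_size == EXPECTED_SIZE, "size mismatch"
assert h.hexdigest() == EXPECTED_OID, "sha256 mismatch"
print("pytorch_model.bin matches the LFS pointer")
```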