Undi95 committed
Commit 4bf7b42 · 1 Parent(s): e0ae35a

Update README.md

Files changed (1)
  1. README.md +21 -152
README.md CHANGED
@@ -1,167 +1,36 @@
 ---
- base_model: alpindale/Mistral-7B-v0.2-hf
 tags:
- - generated_from_trainer
- model-index:
- - name: out
-   results: []
 ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
- <details><summary>See axolotl config</summary>

- axolotl version: `0.4.0`
- ```yaml
- base_model: alpindale/Mistral-7B-v0.2-hf
- model_type: MistralForCausalLM
- tokenizer_type: LlamaTokenizer

- load_in_8bit: false
- load_in_4bit: false
- strict: false

- datasets:
-   - path: ./datasets/ToxicQAFinal.parquet
-     type: sharegpt
-     conversation: chatml
-   - path: ./datasets/aesir-3-sfw_names-replaced.json
-     type: sharegpt
-     conversation: chatml
-   - path: ./datasets/aesir-3-nsfw_names-replaced.json
-     type: sharegpt
-     conversation: chatml
-   - path: ./datasets/aesir2_modified_sharegpt.json
-     type: sharegpt
-     conversation: chatml
-   - path: ./datasets/aesir_modified_sharegpt.json
-     type: sharegpt
-     conversation: chatml
-   - path: ./datasets/no-robots-sharegpt-fixed.jsonl
-     type: sharegpt
-     conversation: chatml
-   - path: ./datasets/bluemoon.train.json
-     type: sharegpt
-     conversation: chatml
-   - path: ./datasets/toxicsharegpt-NoWarning.jsonl
-     type: sharegpt
-     conversation: chatml
-   - path: ./datasets/LimaRP-ShareGPT.json
-     type: sharegpt
-     conversation: chatml
-   - path: ./datasets/CapybaraPure_Decontaminated-ShareGPT.json
-     type: sharegpt
-     conversation: chatml
- dataset_prepared_path:
- val_set_size: 0.05
- output_dir: ./out

- sequence_len: 8192
- sample_packing: true
- pad_to_sequence_len: true
- gradient_checkpointing_kwargs:
-   use_reentrant: true
-
- wandb_project: MistralMaid-7B-0.2
- wandb_entity:
- wandb_watch:
- wandb_name:
- wandb_log_model:
-
- gradient_accumulation_steps: 1
- micro_batch_size: 3
- num_epochs: 2
- optimizer: adamw_bnb_8bit
- lr_scheduler: cosine
- learning_rate: 0.000005
-
- train_on_inputs: true
- group_by_length: false
- bf16: auto
- fp16:
- tf32: false
-
- gradient_checkpointing: true
- early_stopping_patience:
- resume_from_checkpoint:
- local_rank:
- logging_steps: 1
- xformers_attention:
- flash_attention: true
-
- warmup_steps: 10
- evals_per_epoch: 4
- eval_table_size:
- saves_per_epoch: 1
- debug:
- deepspeed:
- weight_decay: 0.0
- fsdp:
- fsdp_config:
- special_tokens:
-   bos_token: "<s>"
-   eos_token: "</s>"
-   unk_token: "<unk>"

 ```

- </details><br>
-
- # out
-
- This model is a fine-tuned version of [alpindale/Mistral-7B-v0.2-hf](https://huggingface.co/alpindale/Mistral-7B-v0.2-hf) on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 1.1414
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 5e-06
- - train_batch_size: 3
- - eval_batch_size: 3
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 2
- - total_train_batch_size: 6
- - total_eval_batch_size: 6
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 10
- - num_epochs: 2
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:-----:|:----:|:---------------:|
- | 1.4494        | 0.0   | 1    | 1.4125          |
- | 1.2942        | 0.25  | 296  | 1.1561          |
- | 1.3496        | 0.5   | 592  | 1.1433          |
- | 1.0723        | 0.75  | 888  | 1.1374          |
- | 1.3354        | 1.0   | 1184 | 1.1313          |
- | 0.9644        | 1.24  | 1480 | 1.1415          |
- | 1.1276        | 1.49  | 1776 | 1.1412          |
- | 0.9386        | 1.74  | 2072 | 1.1414          |
-
-
- ### Framework versions
-
- - Transformers 4.40.0.dev0
- - Pytorch 2.0.1+cu118
- - Datasets 2.18.0
- - Tokenizers 0.15.0

 ---
+ license: cc-by-nc-4.0
 tags:
+ - not-for-all-audiences
+ - nsfw
 ---

+ <!-- description start -->
+ ## Description

+ This repo contains fp16 files of LewdMistral-7B-0.2.

+ It is a full finetune (2 epochs) of [Mistral-7B-v0.2](https://huggingface.co/alpindale/Mistral-7B-v0.2-hf) on multiple RP datasets.

+ It was made to be merged with older 0.1 models, as an experiment to see whether new data from 0.2 could be folded into 0.1 finetunes; since it is usable on its own, I am leaving it open for further training and merging.

+ It was used to create [BigL](https://huggingface.co/Undi95/BigL-7B), a model that takes Mistral 0.2 7B as a base and merges it with Mistral 0.1 finetunes.

+ <!-- description end -->
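A minimal loading sketch, assuming the fp16 weights are published under the repo id `Undi95/LewdMistral-7B-0.2` (adjust the id if the actual repo differs):

```python
# Minimal sketch: load the fp16 weights with Hugging Face transformers.
# The repo id is assumed from the model name and may need adjusting.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Undi95/LewdMistral-7B-0.2"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # the repo ships fp16 files
    device_map="auto",          # requires accelerate; remove to load on CPU
)
```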
+ <!-- prompt-template start -->
+ ## Prompt template: Alpaca

 ```
+ Below is an instruction that describes a task. Write a response that appropriately completes the request.

+ ### Instruction:
+ {system prompt}

+ ### Input:
+ {prompt}

+ ### Response:
+ {output}
+ ```
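As a usage sketch, the snippet below fills in the Alpaca template above and generates with the `model` and `tokenizer` from the loading sketch; the prompt texts and sampling settings are placeholders, not values recommended by the author:

```python
# Minimal sketch: build an Alpaca-style prompt and generate a continuation.
# Assumes `model` and `tokenizer` from the loading sketch above; all values are placeholders.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "You are a narrator for a collaborative story.\n\n"
    "### Input:\n"
    "Describe the harbor town at dusk in two sentences.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```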

+ If you want to support me, you can do so [here](https://ko-fi.com/undiai).