macadeliccc committed on
Commit
8fdcfb8
1 Parent(s): 22904d6

Create README.md

---
base_model: Qwen/Qwen2-1.5B
datasets:
- macadeliccc/opus_samantha
- teknium/OpenHermes-2.5
- cognitivecomputations/samantha-data
- cognitivecomputations/samantha-1.5
- jondurbin/airoboros-3.2
- microsoft/orca-math-word-problems-200k
- Sao10K/Claude-3-Opus-Instruct-15K
- Locutusque/function-calling-chatml
- Migtissera/Hitchhikers
---
# Samantha Qwen2 1.5B

This model was trained on 2x L40S GPUs using FSDP and QLoRA. The adapter is available [here](https://huggingface.co/macadeliccc/Samantha-Qwen2-1.5B-QLoRa).

## Prompt Template

```
<|im_start|>system
You are a helpful AI assistant<|im_end|>
<|im_start|>user
What is the capital of France?<|im_end|>
<|im_start|>assistant
```

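The template above can also be rendered programmatically. A minimal plain-Python sketch of the ChatML layout, for illustration only (in practice, `tokenizer.apply_chat_template` from `transformers`, or vLLM's chat template, does this for you):

```python
def build_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string."""
    prompt = ""
    for m in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # Leave the assistant turn open so the model generates the reply.
    prompt += "<|im_start|>assistant\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful AI assistant"},
    {"role": "user", "content": "What is the capital of France?"},
]
print(build_chatml(messages))
```
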

## Launch Using vLLM

```bash
python -m vllm.entrypoints.openai.api_server \
    --model macadeliccc/Samantha-Qwen2-1.5B \
    --chat-template ./examples/template_chatml.jinja
```

```python
from openai import OpenAI

# Point the OpenAI client at vLLM's OpenAI-compatible API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

chat_response = client.chat.completions.create(
    model="macadeliccc/Samantha-Qwen2-1.5B",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."},
    ],
)
print("Chat response:", chat_response)
```

## Quants

TODO

## Config

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.4.0`
```yaml
base_model: Qwen/Qwen2-1.5B
trust_remote_code: true

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
  - path: macadeliccc/opus_samantha
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: uncensored_ultrachat_20k_sharegpt.json
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: flattened_openhermes_200k.json
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: opus_instruct.json
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: airoboros_uncensored.json
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: orca_math_word_problems_sharegpt.json
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: sharegpt_starcoder.json
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: samantha_1.1_uncensored.json
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: samantha_1.5.json
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: sharegpt_hitchhikers_v1.json
    type: sharegpt
    field: conversations
    conversation: chatml

chat_template: chatml

dataset_prepared_path:
val_set_size: 0.05
output_dir: ./outputs/out

sequence_len: 4096
sample_packing: true
eval_sample_packing: true
pad_to_sequence_len: true

adapter: qlora
lora_model_dir:
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 4
micro_batch_size: 1
num_epochs: 3
optimizer: adamw_torch
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: true

gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
evals_per_epoch: 4
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
  - full_shard
  - auto_wrap
fsdp_config:
  fsdp_limit_all_gathers: true
  fsdp_sync_module_states: true
  fsdp_offload_params: true
  fsdp_use_orig_params: false
  fsdp_cpu_ram_efficient_loading: true
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_transformer_layer_cls_to_wrap: Qwen2DecoderLayer
  fsdp_state_dict_type: FULL_STATE_DICT
special_tokens:
```

</details><br>
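As a quick sanity check on the hyperparameters in the config: with `micro_batch_size: 1`, `gradient_accumulation_steps: 4`, and the two L40S GPUs mentioned above, the effective global batch size works out as follows (plain arithmetic, assuming FSDP data parallelism across both GPUs):

```python
# Effective global batch size implied by the axolotl config above,
# assuming FSDP data parallelism across the 2x L40S GPUs.
micro_batch_size = 1             # per-GPU batch per forward pass
gradient_accumulation_steps = 4  # steps accumulated before each optimizer update
num_gpus = 2                     # 2x L40S

effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)  # → 8
```
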