---
base_model: Qwen/Qwen2-1.5B
datasets:
- macadeliccc/opus_samantha
- teknium/OpenHermes-2.5
- cognitivecomputations/samantha-data
- cognitivecomputations/samantha-1.5
- jondurbin/airoboros-3.2
- microsoft/orca-math-word-problems-200k
- Sao10K/Claude-3-Opus-Instruct-15K
- Locutusque/function-calling-chatml
- Migtissera/Hitchhikers
---

# Samantha Qwen2 1.5B

This model was trained on 2x L40S GPUs using FSDP and QLoRA.

An FP16 merge is available [here](https://huggingface.co/macadeliccc/Samantha-Qwen2-1.5B).

## Prompt Template

```
<|im_start|>system
You are a helpful AI assistant<|im_end|>
<|im_start|>user
What is the capital of France?<|im_end|>
<|im_start|>assistant
```

## Launch Using vLLM

Start an OpenAI-compatible API server:

```bash
python -m vllm.entrypoints.openai.api_server \
    --model macadeliccc/Samantha-Qwen2-1.5B \
    --chat-template ./examples/template_chatml.jinja
```

Then query it from the OpenAI Python client:

```python
from openai import OpenAI

# Point the OpenAI client at vLLM's OpenAI-compatible API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

chat_response = client.chat.completions.create(
    model="macadeliccc/Samantha-Qwen2-1.5B",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."},
    ],
)
print("Chat response:", chat_response)
```

## Quants

TODO

## Config

[Built with Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)
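Training with a config like the one shown below is typically launched through `accelerate`. A minimal sketch, assuming axolotl 0.4.0 is installed and the config is saved as `qwen2-samantha.yaml` (the filename is illustrative, not from this repo):

```bash
# Optional: tokenize and pack the datasets ahead of time (config filename is illustrative).
python -m axolotl.cli.preprocess qwen2-samantha.yaml

# Launch FSDP + QLoRA fine-tuning across both GPUs.
accelerate launch -m axolotl.cli.train qwen2-samantha.yaml
```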
<details><summary>See axolotl config</summary>

axolotl version: `0.4.0`

```yaml
base_model: Qwen/Qwen2-1.5B
trust_remote_code: true

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
  - path: macadeliccc/opus_samantha
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: uncensored_ultrachat_20k_sharegpt.json
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: flattened_openhermes_200k.json
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: opus_instruct.json
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: airoboros_uncensored.json
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: orca_math_word_problems_sharegpt.json
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: sharegpt_starcoder.json
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: samantha_1.1_uncensored.json
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: samantha_1.5.json
    type: sharegpt
    field: conversations
    conversation: chatml
  - path: json
    data_files: sharegpt_hitchhikers_v1.json
    type: sharegpt
    field: conversations
    conversation: chatml

chat_template: chatml
dataset_prepared_path:
val_set_size: 0.05
output_dir: ./outputs/out

sequence_len: 4096
sample_packing: true
eval_sample_packing: true
pad_to_sequence_len: true

adapter: qlora
lora_model_dir:
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 4
micro_batch_size: 1
num_epochs: 3
optimizer: adamw_torch
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: true

gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
evals_per_epoch: 4
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
  - full_shard
  - auto_wrap
fsdp_config:
  fsdp_limit_all_gathers: true
  fsdp_sync_module_states: true
  fsdp_offload_params: true
  fsdp_use_orig_params: false
  fsdp_cpu_ram_efficient_loading: true
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_transformer_layer_cls_to_wrap: Qwen2DecoderLayer
  fsdp_state_dict_type: FULL_STATE_DICT
special_tokens:
```

</details>
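After training, the QLoRA adapter can be merged back into the base weights to produce an FP16 model like the merge linked above. A minimal sketch using axolotl's merge utility; the config filename and adapter directory (taken from the config's `output_dir`) are assumptions:

```bash
# Merge the trained LoRA adapter into the base model weights (paths are illustrative).
python -m axolotl.cli.merge_lora qwen2-samantha.yaml --lora_model_dir="./outputs/out"
```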