--- license: apache-2.0 datasets: - ehartford/dolphin - LinhDuong/chatdoctor-200k - sahil2801/code_instructions_120k - medalpaca/medical_meadow_mediqa - kaiokendev/SuperCOT-dataset - tiiuae/falcon-refinedweb - bigcode/starcoderdata - togethercomputer/RedPajama-Data-1T language: - en library_name: transformers pipeline_tag: text-generation tags: - medical - code --- # Model Card for Model ID This model is an instruction-tuned Open LLaMa model with 7B parameters, with specialities in medical QA and code instruction. ## Model Details - **Model type:** LlamaForCausalLM - **Language(s) (NLP):** English - **License:** Apache 2.0 - **Finetuned from model (QLoRA):** [openlm-research/open_llama_7b_v2](https://huggingface.co/openlm-research/open_llama_7b_v2) ## How to Get Started with the Model Use the code below to get started with the model. ```py import torch from transformers import LlamaTokenizer, LlamaForCausalLM model_path = 'yhyhy3/open_llama_7b_v2_med_dolphin_qlora_merged' tokenizer = LlamaTokenizer.from_pretrained(model_path) model = LlamaForCausalLM.from_pretrained( model_path, torch_dtype=torch.float16, device_map='auto', ) prompt = '''### Instruction: Answer the following question. ### Input: What is the capital of New Jersey? ### Response:''' input_ids = tokenizer(prompt, return_tensors="pt").input_ids generation_output = model.generate( input_ids=input_ids, max_new_tokens=32 ) print(tokenizer.decode(generation_output[0])) ``` ## Training Details ### Training Data Converted the following datasets to alpaca:instruction format. 1. [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin) - ORCA style dataset generously created by [Eric Hartford](https://huggingface.co/ehartford) - Only used the 1 million GPT4 generated instructions file [flan1m-alpaca-uncensored.jsonl](https://huggingface.co/datasets/ehartford/dolphin/blob/main/flan1m-alpaca-uncensored.jsonl). 2. [LinhDuong/chatdoctor-200k](https://huggingface.co/datasets/LinhDuong/chatdoctor-200k) - Refined dataset sourced from icliniq medical QA forum 3. [sahil2801/code_instructions_120k](https://huggingface.co/datasets/sahil2801/code_instructions_120k) - Code instruction dataset generously created by Sahil Chaudhary from ThreeSixty AI 4. [medalpaca/medical_meadow_mediqa](https://huggingface.co/datasets/medalpaca/medical_meadow_mediqa) - MEDIQA is a dataset of manually generated, question-driven summaries of multi and single document answers to consumer health questions from medalpaca group. 5. [kaiokendev/SuperCOT-dataset](https://huggingface.co/datasets/kaiokendev/SuperCOT-dataset) - Code instruction dataset generously created by Kaio Ken ### Training Procedure Trained using [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) QLoRa on [RunPod](https://www.runpod.io/console/gpu-cloud) 8x A6000 on Community Cloud for 3 epochs (~14 hours - ~$70).
axolotl training config: ```yaml base_model: openlm-research/open_llama_7b_v2 base_model_config: openlm-research/open_llama_7b_v2 model_type: LlamaForCausalLM tokenizer_type: LlamaTokenizer load_in_8bit: false load_in_4bit: true strict: false push_dataset_to_hub: hub_model_id: hf_use_auth_token: datasets: - path: json type: alpaca data_files: /disk/flan1m-alpaca-uncensored.jsonl shards: 8 - path: sahil2801/code_instructions_120k type: alpaca - path: LinhDuong/chatdoctor-200k type: alpaca shards: 2 - path: kaiokendev/SuperCOT-dataset type: alpaca - path: medalpaca/medical_meadow_mediqa type: alpaca dataset_prepared_path: last_run_prepared val_set_size: 0.01 adapter: qlora lora_model_dir: sequence_len: 2048 max_packed_sequence_len: 2048 lora_r: 8 lora_alpha: 32 lora_dropout: 0.05 lora_target_modules: lora_target_linear: true lora_fan_in_fan_out: wandb_mode: true wandb_project: wandb_watch: wandb_run_id: wandb_log_model: 'openllama_checkpoint' output_dir: /disk/open_llama_7b_v2_dolphin_qlora gradient_accumulation_steps: 2 micro_batch_size: 16 num_epochs: 3 optimizer: paged_adamw_32bit torchdistx_path: lr_scheduler: cosine learning_rate: 0.0002 train_on_inputs: false group_by_length: false bf16: true fp16: false tf32: true gradient_checkpointing: true early_stopping_patience: resume_from_checkpoint: local_rank: logging_steps: 1 xformers_attention: true flash_attention: gptq_groupsize: gptq_model_v1: warmup_steps: 1000 eval_steps: 5000 save_steps: debug: deepspeed: weight_decay: 0.0000001 fsdp: fsdp_config: special_tokens: bos_token: "" eos_token: "" unk_token: "" ```
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_yhyhy3__open_llama_7b_v2_med_instruct) | Metric | Value | |-----------------------|---------------------------| | Avg. | 40.53 | | ARC (25-shot) | 46.5 | | HellaSwag (10-shot) | 76.91 | | MMLU (5-shot) | 42.32 | | TruthfulQA (0-shot) | 40.33 | | Winogrande (5-shot) | 69.3 | | GSM8K (5-shot) | 2.05 | | DROP (3-shot) | 6.29 |