---
license_name: microsoft-research-license
license_link: https://huggingface.co/microsoft/phi-2/resolve/main/LICENSE
language:
  - en
pipeline_tag: text-generation
base_model: microsoft/phi-2
tags:
  - generated_from_trainer
model-index:
  - name: phi-2-instruct
    results: []
datasets:
  - HuggingFaceH4/ultrachat_200k
library_name: adapter-transformers
license: mit
---

# phi-2-instruct

This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2), trained with supervised fine-tuning (SFT) on a filtered version of the [UltraChat 200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) dataset.

## Model description

More information is needed about the model architecture and the specific modifications made during fine-tuning.

## Intended uses & limitations

More information is needed about the intended use cases and the limitations of the model.

## Training and evaluation data

More information is needed about the datasets used for training and evaluation.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a rough reproduction sketch is included at the end of this card):
- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- training_steps: 51967

### Training results

Detailed training results and performance metrics are not provided; please reach out to the model creator for more information.

### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.15.0
- Tokenizers 0.15.0

## Evaluation and Inference Example

For an evaluation of the model and an inference example, refer to the [Inference Notebook](https://huggingface.co/venkycs/phi-2-instruct/blob/main/inference_phi_2_instruct.ipynb). A short standalone snippet is also sketched at the end of this card.

## Full Training Metrics on TensorBoard

View the full training metrics on TensorBoard [here](https://huggingface.co/venkycs/phi-2-instruct/tensorboard).

## Author's LinkedIn Profile

[venkycs](https://linkedin.com/in/venkycs)
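
## Quick Inference Sketch

As a lighter-weight alternative to the notebook above, the snippet below is a minimal inference sketch using 🤗 Transformers. The `Instruct:`/`Output:` prompt format and the generation settings are assumptions, not the confirmed template used during fine-tuning.

```python
# Minimal inference sketch. Assumptions: standard causal-LM loading for a phi-2
# derivative and an "Instruct:/Output:" prompt style (not confirmed by the card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "venkycs/phi-2-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

prompt = "Instruct: Explain what instruction tuning is in one paragraph.\nOutput:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=False)

# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```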
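
## Training Setup Sketch

The training script itself is not included in this card. The following is a minimal sketch of how an SFT run with the hyperparameters listed above might be set up using TRL's `SFTTrainer`. The chat-flattening format, `max_seq_length`, logging settings, and dtype are assumptions and may differ from the author's actual setup.

```python
# Rough reproduction sketch (assumptions: TRL SFTTrainer, a simple role-prefixed
# chat format, max_seq_length=1024 -- none of these are documented in the card).
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

model_id = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, trust_remote_code=True
)

def to_text(example):
    # Flatten each UltraChat conversation into one training string.
    return {"text": "\n".join(f"{m['role']}: {m['content']}" for m in example["messages"])}

train_dataset = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft").map(to_text)

training_args = TrainingArguments(
    output_dir="phi-2-instruct",
    per_device_train_batch_size=4,   # train_batch_size: 4
    per_device_eval_batch_size=8,    # eval_batch_size: 8
    learning_rate=2e-4,              # learning_rate: 0.0002
    lr_scheduler_type="cosine",      # lr_scheduler_type: cosine
    max_steps=51967,                 # training_steps: 51967
    seed=42,                         # seed: 42
    logging_steps=100,
    report_to="tensorboard",
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    args=training_args,
    train_dataset=train_dataset,
    dataset_text_field="text",
    max_seq_length=1024,  # assumed; not stated in the card
)
trainer.train()
```

The Adam betas and epsilon listed in the hyperparameters match the `TrainingArguments` defaults, so they are not set explicitly here.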