---
license: apache-2.0
language:
- en
pipeline_tag: summarization
widget:
- text: What is the peak phase of T-eV?
example_title: Question Answering
tags:
- arxiv
---
# Table of Contents
1. [TL;DR](#tldr)
2. [Model Description](#model-description)
3. [Usage](#usage)
4. [Training Data](#training-data)
5. [Training procedure](#training-procedure)
6. [Citation](#citation)
# TL;DR
This is a [Phi-1_5](https://huggingface.co/microsoft/phi-1_5) model fine-tuned on [camel-ai/physics](https://huggingface.co/datasets/camel-ai/physics). This model is for research purposes only and ***should not be used in production settings***.
## Model Description
- **Model type:** Language model
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Related Models:** [Phi-1_5](https://huggingface.co/microsoft/phi-1_5)
# Usage
The example script below shows how to use the model with `transformers`:
## Using the PyTorch model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

base_model = "ArtifactAI/phi-physics"
model = AutoModelForCausalLM.from_pretrained(base_model, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)

def generate(prompt):
    # Wrap the question in the instruction template used during fine-tuning.
    inputs = tokenizer(
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request. "
        "If you are adding additional white spaces, stop writing.\n\n"
        f"### Instruction:\n{prompt}\n\n### Response:\n",
        return_tensors="pt",
        return_attention_mask=False,
    )
    # Stream generated tokens to stdout as they are produced.
    streamer = TextStreamer(tokenizer, skip_prompt=True)
    _ = model.generate(**inputs, streamer=streamer, max_new_tokens=500)

generate("What are the common techniques used in identifying a new species, and how can scientists accurately categorize it within the existing taxonomy system?")
```
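The instruction template and the `### Response:` delimiter can also be factored into small standalone helpers, so the same formatting is reusable for non-streaming generation where the decoded output still contains the prompt. This is a minimal sketch; `format_prompt` and `extract_response` are illustrative names, not part of the released code:

```python
RESPONSE_TAG = "### Response:\n"

def format_prompt(instruction: str) -> str:
    """Wrap a raw question in the instruction template the model expects."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n{RESPONSE_TAG}"
    )

def extract_response(decoded: str) -> str:
    """Keep only the text after the response tag in a decoded generation."""
    return decoded.split(RESPONSE_TAG, 1)[-1].strip()

# With non-streaming generation you would pass format_prompt(...) to the
# tokenizer, then run extract_response on tokenizer.decode(output_ids).
full = format_prompt("What is Ohm's law?")
print(extract_response(full + "V = IR"))  # -> V = IR
```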
## Training Data
The model was trained on [camel-ai/physics](https://huggingface.co/datasets/camel-ai/physics), a dataset of question/answer pairs.
## Training procedure
The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: float16
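The settings above correspond to `transformers`' `BitsAndBytesConfig`. As a sketch, the equivalent object could be constructed like this (the variable name is illustrative, and options left at their defaults above are omitted):

```python
import torch
from transformers import BitsAndBytesConfig

# Mirrors the quantization settings listed above: 4-bit NF4 weights with
# double quantization and float16 compute.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)
```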
### Framework versions
- PEFT 0.6.2
# Citation
```
@misc{phi-physics,
      title={phi-physics},
      author={Matthew Kenney},
      year={2023}
}
```