---
license: apache-2.0
datasets:
- aisquared/databricks-dolly-15k
language:
- en
library_name: transformers
---
# Model Card for `dlite-v2-355m`
<!-- Provide a quick summary of what the model is/does. -->
AI Squared's `dlite-v2-355m` is a large language model derived from OpenAI's medium-sized [GPT-2](https://huggingface.co/gpt2-medium) model
and fine-tuned on a single GPU on a corpus of 15k records
([Databricks' "Dolly 15k" Dataset](https://huggingface.co/datasets/aisquared/databricks-dolly-15k)) to help it exhibit chat-based capabilities.
Just like [Databricks' Dolly V2 models](https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm),
`dlite-v2-355m` (and all other members of the `dlite-v2` family) is licensed for both **research and commercial use.** We are extremely grateful
for the work that Databricks has done to create the `databricks-dolly-15k` dataset, for without it we would not be able to create and release this
model under such an open and permissive license.
While `dlite-v2-355m` is **not a state-of-the-art model**, we believe the level of interactivity achievable with such a small, cheaply trained model
is worth showcasing, as it continues to demonstrate that creating powerful AI capabilities may be much more accessible than previously thought.
### Model Description
<!-- Provide a longer summary of what this model is. -->
- **Developed by:** AI Squared, Inc.
- **Shared by:** AI Squared, Inc.
- **Model type:** Large Language Model
- **Language(s) (NLP):** EN
- **License:** Apache v2.0
- **Finetuned from model:** GPT-2 Medium (`gpt2-medium`)
## Bias, Risks, and Limitations
<!-- This section is meant to convey both technical and sociotechnical limitations. -->
**`dlite-v2-355m` is not a state-of-the-art language model.** `dlite-v2-355m` is an experimental technology, and as with any experimental technology,
AI Squared urges potential users of this technology to test its capabilities thoroughly before usage.
Furthermore, the model can sometimes exhibit undesired behaviors. Some of these behaviors include,
but are not limited to: factual inaccuracies, biases, offensive responses, toxicity, and hallucinations.
As with any other LLM, we advise users to exercise good judgment when applying this technology.
## Usage
The code below shows how to use `dlite-v2-355m` in the way in which it was trained. While the model can be used "out of the box" with the
`transformers` library, generating responses with the function defined below will achieve better results.
### Load Model and Tokenizer from this Repository Using the `transformers` Package
```python
import re

from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = 'aisquared/dlite-v2-355m'
tokenizer = AutoTokenizer.from_pretrained(model_id, padding_side = 'left')
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code = True, device_map = 'auto')
```
### Create the Prompt Format and Other Variables
```python
PROMPT = """Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Response:
"""
END_KEY = '### End'
RESPONSE_KEY = '### Response:\n'
```
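As a quick illustration of how the template is filled in, the sketch below formats the prompt with a sample instruction and checks that the result ends with `RESPONSE_KEY`, so the model's generated text begins right where a response is expected. The template and keys are copied as shown above; `example_instruction` and `example_prompt` are hypothetical names used only for this example.

```python
# Prompt template and keys, copied as shown in the section above.
PROMPT = """Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Response:
"""
END_KEY = '### End'
RESPONSE_KEY = '### Response:\n'

# Hypothetical instruction, used only for illustration.
example_instruction = "Name three primary colors."
example_prompt = PROMPT.format(instruction=example_instruction)

# Show the exact text that would be tokenized and passed to the model.
print(example_prompt)

# The prompt stops at the response marker; the end marker is left for the model to produce.
print(example_prompt.endswith(RESPONSE_KEY))
print(END_KEY in example_prompt)
```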
### Create a Function to Retrieve a Response
```python
def create_response(
    instruction,
    model,
    tokenizer,
    do_sample = True,
    max_new_tokens = 256,
    top_p = 0.92,
    top_k = 0,
    **kwargs
):
    """
    Create a response from the model by using a formatted prompt
    """
    # Tokenize the formatted prompt and place it on the same device as the model
    input_ids = tokenizer(
        PROMPT.format(instruction=instruction), return_tensors="pt"
    ).input_ids.to(model.device)

    gen_tokens = model.generate(
        input_ids,
        pad_token_id=tokenizer.pad_token_id,
        do_sample=do_sample,
        max_new_tokens=max_new_tokens,
        top_p=top_p,
        top_k=top_k,
        **kwargs,
    )
    decoded = tokenizer.batch_decode(gen_tokens)[0]

    # The response appears after "### Response:". The model has been trained to append "### End" at the end.
    m = re.search(r"#+\s*Response:\s*(.+?)#+\s*End", decoded, flags=re.DOTALL)

    response = None
    if m:
        response = m.group(1).strip()
    else:
        # The model might not generate the "### End" sequence before reaching the max tokens. In this case, return
        # everything after "### Response:".
        m = re.search(r"#+\s*Response:\s*(.+)", decoded, flags=re.DOTALL)
        if m:
            response = m.group(1).strip()

    return response
```
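The response-extraction step can be tried without loading the model. The sketch below applies the same two regular expressions to hand-written strings; `extract_response` and the sample strings are hypothetical and stand in for real decoded model output.

```python
import re


def extract_response(decoded):
    """Return the text after '### Response:', stopping at '### End' when it is present."""
    m = re.search(r"#+\s*Response:\s*(.+?)#+\s*End", decoded, flags=re.DOTALL)
    if m is None:
        # No "### End" marker (e.g. generation hit max_new_tokens): take everything after the response key.
        m = re.search(r"#+\s*Response:\s*(.+)", decoded, flags=re.DOTALL)
    return m.group(1).strip() if m else None


# Hand-written stand-ins for decoded output (assumptions, not real generations).
complete = "### Instruction:\nSay hello.\n### Response:\nHello there!\n\n### End<|endoftext|>"
truncated = "### Instruction:\nSay hello.\n### Response:\nHello there, this is"

print(extract_response(complete))      # text between the two markers
print(extract_response(truncated))     # everything after the response marker
print(extract_response("no markers"))  # None when no response marker is found
```

With the model and tokenizer loaded as shown earlier, a call such as `create_response("Say hello.", model, tokenizer)` performs this same extraction on the generated text.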
### Model Performance Metrics
We present results from the EleutherAI LLM Evaluation Harness for all models in the DLite family.
Rows are sorted by mean score across the benchmarks, in ascending order. These metrics further show that none of the DLite models is
state of the art; rather, they suggest that chat-like behaviors in LLMs can be trained almost independently of model size.
| model | openbookqa | arc_easy | winogrande | hellaswag | arc_challenge | piqa | boolq |
|:--------------|-------------:|-----------:|-------------:|------------:|----------------:|---------:|---------:|
| gpt2 | 0.164 | 0.438131 | 0.51618 | 0.289185 | 0.190273 | 0.628945 | 0.487156 |
| dlite-v2-124m | 0.174 | 0.44697 | 0.502762 | 0.291974 | 0.192833 | 0.631665 | 0.520183 |
| dlite-v1-124m | 0.17 | 0.462542 | 0.494081 | 0.293268 | 0.223549 | 0.622416 | 0.502446 |
| gpt2-medium | 0.186 | 0.490741 | 0.531176 | 0.333101 | 0.215017 | 0.676279 | 0.585933 |
| dlite-v2-355m | 0.206 | 0.493687 | 0.524073 | 0.334993 | 0.226109 | 0.670838 | 0.582263 |
| dlite-v1-355m | 0.216 | 0.507576 | 0.496448 | 0.338478 | 0.234642 | 0.664309 | 0.600306 |
| gpt2-large | 0.194 | 0.531566 | 0.553275 | 0.363971 | 0.216724 | 0.703482 | 0.604893 |
| dlite-v2-774m | 0.212 | 0.539562 | 0.5588 | 0.365565 | 0.234642 | 0.700218 | 0.60367 |
| dlite-v1-774m | 0.218 | 0.545875 | 0.562747 | 0.375124 | 0.250853 | 0.698041 | 0.614985 |
| gpt2-xl | 0.224 | 0.582912 | 0.583268 | 0.400418 | 0.25 | 0.708379 | 0.617737 |
| dlite-v1-1.5b | 0.226 | 0.588384 | 0.584846 | 0.401414 | 0.268771 | 0.708379 | 0.624159 |
| dlite-v2-1.5b | 0.226 | 0.59596 | 0.581689 | 0.40719 | 0.273891 | 0.705114 | 0.630887 |
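
The table has no mean column, so the ordering can be checked by averaging each row. The sketch below does this for the three 355M-parameter rows, with values copied from the table. It assumes the mean is an unweighted average over the seven benchmarks, which the text above does not state explicitly.

```python
from statistics import mean

# Scores copied from the table, in column order:
# openbookqa, arc_easy, winogrande, hellaswag, arc_challenge, piqa, boolq
scores = {
    "gpt2-medium":   [0.186, 0.490741, 0.531176, 0.333101, 0.215017, 0.676279, 0.585933],
    "dlite-v2-355m": [0.206, 0.493687, 0.524073, 0.334993, 0.226109, 0.670838, 0.582263],
    "dlite-v1-355m": [0.216, 0.507576, 0.496448, 0.338478, 0.234642, 0.664309, 0.600306],
}

# Assumption: unweighted mean across the seven benchmarks.
mean_scores = {name: mean(values) for name, values in scores.items()}

# Sort ascending by mean score, matching the ordering described for the table.
ranking = sorted(mean_scores, key=mean_scores.get)

for name in ranking:
    print(f"{name}: {mean_scores[name]:.4f}")
```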