---
license: apache-2.0
datasets:
- databricks/databricks-dolly-15k
language:
- en
widget:
  - text: 'What is the capital of France ?'
    example_title: Basic question
    group: Python
---
# Summary

An instruction-following large language model based on [pythia-70m](https://huggingface.co/EleutherAI/pythia-70m) and trained on [Databricks' 15k instruction dataset](https://huggingface.co/datasets/databricks/databricks-dolly-15k),
which covers capability domains from the InstructGPT paper: brainstorming, classification, closed QA, generation, information extraction, open QA and summarization.

This model is an experiment in using a small base model ([pythia-70m](https://huggingface.co/EleutherAI/pythia-70m)) to build a model similar to Databricks' [dolly model](https://huggingface.co/databricks/dolly-v2-12b).

# Usage

To use the model with the transformers library, first make sure the `transformers` and `accelerate` libraries are installed:
```python
%pip install "accelerate>=0.16.0,<1" "transformers[torch]>=4.28.1,<5" "torch>=1.13.1,<2"
```

```python
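import torch
from transformers import pipeline

# Note: this id points at Databricks' 12B model; substitute this model's own
# Hugging Face repository id to run the pythia-70m based model instead.
generate_text = pipeline(
    model="databricks/dolly-v2-12b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # allow the custom pipeline code in the model repository to run
    device_map="auto",  # let accelerate place the weights on the available devices
)

res = generate_text("What is the capital of France ?")

# The pipeline returns a list of dicts; print the text of the first result.
print(res[0]["generated_text"])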
```

# Training

The model was trained on [Databricks' 15k instruction dataset](https://huggingface.co/datasets/databricks/databricks-dolly-15k) using a recent Dell PC with 32 GB of RAM and a Core i7 CPU.
Training took around 12 hours.
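
This card does not say how dataset records were turned into training text. As a minimal sketch, assuming a Dolly-style prompt template (an assumption, not confirmed for this model), a record with the dataset's `instruction`, `context` and `response` fields could be flattened as follows. The `INTRO` text, the section markers, the `format_record` helper and the sample record are all illustrative.

```python
# Hypothetical prompt template; the card does not state the one actually used.
INTRO = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)


def format_record(record: dict) -> str:
    """Flatten one databricks-dolly-15k style record into a single training string."""
    parts = [INTRO, f"### Instruction:\n{record['instruction']}"]
    # The context field is optional and often empty; only emit it when present.
    context = (record.get("context") or "").strip()
    if context:
        parts.append(f"### Input:\n{context}")
    parts.append(f"### Response:\n{record['response']}")
    return "\n\n".join(parts)


# Illustrative record using the dataset's field names (not taken from the dataset).
sample = {
    "instruction": "What is the capital of France ?",
    "context": "",
    "response": "The capital of France is Paris.",
    "category": "open_qa",
}

print(format_record(sample))
```

Records with a non-empty `context` gain an extra `### Input:` section between the instruction and the response; everything else stays the same.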

# Accuracy

As expected, the model performs very poorly, especially when compared to the [Databricks dolly v2 12b model](https://huggingface.co/databricks/dolly-v2-12b).

When prompted with `What is the capital of France ?`, the model answers:
```
"The World". It is an artwork for "working time" called «The Middle East Today". It comes from Paris, Belgium, in local variation, including large cities as described in English language photographs which portray a crescent and sunrise of late note, Bangourt before Paris.
“Countries like Pakistan and throughout East Africa close to Australia have constructed a watered havock which can be felt ever longer. Bombardment and booby traps tend to occupy space by wind and water, as were effectively used for material and equipment which have a green signal leading in the images."
```

Compare this with the following answer from the [Databricks dolly v2 3b model](https://huggingface.co/databricks/dolly-v2-12b):
```
The capital of France is Paris.
```

# Conclusion
The gap between the base model used in this model (pythia-70m) and the base models used by Databricks (pythia-2.8b and pythia-12b) is huge, and it makes all the difference in terms of accuracy.
The only thing worth mentioning here is the model's size: at around 160M, it is orders of magnitude smaller than the Databricks ones.
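
To put the size gap in numbers, here is a rough calculation. It assumes the nominal parameter counts implied by the model names (70M, 2.8B and 12B), which are approximations rather than measured values; the `NOMINAL_PARAMS` table and the `orders_of_magnitude` helper are illustrative.

```python
import math

# Nominal parameter counts read off the model names (approximate, not measured).
NOMINAL_PARAMS = {
    "pythia-70m": 70e6,
    "pythia-2.8b": 2.8e9,
    "pythia-12b": 12e9,
}


def orders_of_magnitude(small: str, large: str) -> float:
    """Return log10 of the parameter-count ratio between two base models."""
    return math.log10(NOMINAL_PARAMS[large] / NOMINAL_PARAMS[small])


for large in ("pythia-2.8b", "pythia-12b"):
    ratio = NOMINAL_PARAMS[large] / NOMINAL_PARAMS["pythia-70m"]
    gap = orders_of_magnitude("pythia-70m", large)
    print(f"{large}: {ratio:.0f}x larger ({gap:.1f} orders of magnitude)")
```

Under these nominal counts, pythia-2.8b is about 40 times larger than pythia-70m and pythia-12b about 171 times larger, so the gap is between one and a half and a little over two orders of magnitude.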