marouni committed
Commit 0cc91a3
1 parent: cb714b5

chore(): update model summary

Files changed (1): README.md +47 -0
README.md CHANGED
---
license: apache-2.0
datasets:
- databricks/databricks-dolly-15k
language:
- en
---
# Summary

An instruction-following large language model based on [pythia-70m](https://huggingface.co/EleutherAI/pythia-70m), fine-tuned on Databricks' [databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k) dataset, with capability domains from the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA and summarization.

This model is an experiment in using a small base model ([pythia-70m](https://huggingface.co/EleutherAI/pythia-70m)) to build a model similar to Databricks' [dolly model](https://huggingface.co/databricks/dolly-v2-12b).

# Usage

To use the model with the `transformers` library, first make sure you have the `transformers` and `accelerate` libraries installed:

```python
%pip install "accelerate>=0.16.0,<1" "transformers[torch]>=4.28.1,<5" "torch>=1.13.1,<2"
```

```python
import torch
from transformers import pipeline

# Build a text-generation pipeline for the model.
# NOTE: the original snippet points at "databricks/dolly-v2-12b"; substitute
# this repository's model id to load the checkpoint described in this card.
generate_text = pipeline(model="databricks/dolly-v2-12b", torch_dtype=torch.bfloat16,
                         trust_remote_code=True, device_map="auto")

res = generate_text("What is the capital of France?")
print(res[0]["generated_text"])
```
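
The pipeline typically forwards decoding parameters to the underlying `generate` call, so sampling can be tuned per request; the values below are illustrative, not recommended settings:

```python
# Limit the token budget and use nucleus sampling (illustrative values).
res = generate_text("What is the capital of France?", max_new_tokens=64,
                    do_sample=True, top_p=0.92)
print(res[0]["generated_text"])
```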

# Training

The model was trained using [Databricks' 15k-instruction dataset](https://huggingface.co/datasets/databricks/databricks-dolly-15k) on a recent Dell PC with 32 GB of RAM and a Core i7 CPU.
The training took around 12 hours!
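
The card does not include the training script; the following is a minimal sketch of what a causal-LM fine-tune of pythia-70m on databricks-dolly-15k could look like with the `transformers` Trainer. The prompt format, hyperparameters, and output path are assumptions, not the recipe actually used:

```python
# Minimal sketch of a CPU fine-tune of pythia-70m on databricks-dolly-15k.
# Prompt format and hyperparameters are assumptions, not the actual recipe.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m")
tokenizer.pad_token = tokenizer.eos_token  # pythia has no pad token by default
model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m")

dataset = load_dataset("databricks/databricks-dolly-15k", split="train")

def to_text(example):
    # Fold instruction, optional context, and response into one training string.
    parts = [example["instruction"], example["context"], example["response"]]
    return {"text": "\n\n".join(p for p in parts if p)}

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

columns = dataset.column_names  # instruction, context, response, category
tokenized = (dataset.map(to_text)
                    .map(tokenize, batched=True, remove_columns=columns + ["text"]))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="pythia-70m-dolly", num_train_epochs=1,
                           per_device_train_batch_size=4, no_cuda=True),
    train_dataset=tokenized,
    # mlm=False gives the plain next-token (causal LM) objective.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```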

# Accuracy

As expected, the model's performance is very bad, especially when compared to the [Databricks dolly v2 12b model](https://huggingface.co/databricks/dolly-v2-12b).

When prompted with `What is the capital of France?`, the model answers with:
```
"The World". It is an artwork for "working time" called «The Middle East Today". It comes from Paris, Belgium, in local variation, including large cities as described in English language photographs which portray a crescent and sunrise of late note, Bangourt before Paris.
“Countries like Pakistan and throughout East Africa close to Australia have constructed a watered havock which can be felt ever longer. Bombardment and booby traps tend to occupy space by wind and water, as were effectively used for material and equipment which have a green signal leading in the images."
```

Compare with the following answer from the [Databricks dolly v2 12b model](https://huggingface.co/databricks/dolly-v2-12b):
```
The capital of France is Paris.
```

# Conclusion

The accuracy gap between the base model used here (pythia-70m) and the base models used by Databricks (pythia-2.8b and pythia-12b) is huge, and it makes all the difference in the quality of the answers.
The only thing worth mentioning here is the model's size: at around 160M, it is orders of magnitude smaller than the Databricks models.
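
The size gap is easy to check locally; the sketch below only downloads the small base model (the ~12B figure for dolly-v2-12b comes from its name and model card, not from running this):

```python
# Count the parameters of the small base model; dolly-v2-12b has ~12B.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m")
print(f"pythia-70m parameters: {model.num_parameters():,}")  # roughly 70M
```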