matthayes committed
Commit af3b4a2
1 Parent(s): 79f6639

Update README.md


Expanded usage instructions

Files changed (1)
  1. README.md +22 -7
README.md CHANGED
@@ -24,27 +24,42 @@ on a [~15K record instruction corpus](https://github.com/databrickslabs/dolly/tr

## Usage

- To use the model with the `transformers` library on a machine with GPUs:

```
from transformers import pipeline

- instruct_pipeline = pipeline(model="databricks/dolly-v2-12b", trust_remote_code=True, device_map="auto")
```

You can then use the pipeline to answer instructions:

```
- instruct_pipeline("Explain to me the difference between nuclear fission and fusion.")
```

- To reduce memory usage you can load the model with `bfloat16`:

```
- import torch
- from transformers import pipeline

- instruct_pipeline = pipeline(model="databricks/dolly-v2-12b", torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")
```


## Usage

+ To use the model with the `transformers` library on a machine with GPUs, first make sure you have the `transformers` and `accelerate` libraries installed.
+ In a Databricks notebook you could run:

```
+ %pip install accelerate>=0.12.0 transformers[torch]==4.25.1
+ ```
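
If the cluster's ML runtime preinstalls different versions of these libraries (an assumption about the environment, not stated in this README), the pinned versions above may only take effect after the Python process restarts. A minimal sketch for a Databricks notebook:

```
# Hypothetical follow-up cell in the same Databricks notebook: restart the
# Python process so the versions installed above are the ones imported.
dbutils.library.restartPython()
```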
+
+ The instruction-following pipeline can be loaded using the `pipeline` function as shown below. This loads a custom `InstructionTextGenerationPipeline`
+ found in the model repo [here](https://huggingface.co/databricks/dolly-v2-12b/blob/main/instruct_pipeline.py), which is why `trust_remote_code=True` is required.
+ Including `torch_dtype=torch.bfloat16` is generally recommended where this dtype is supported, since it reduces memory usage and does not appear to impact output quality.
+ It is also fine to omit it if there is sufficient memory.
+
+ ```
+ import torch
from transformers import pipeline

+ generate_text = pipeline(model="databricks/dolly-v2-12b", torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")
```

You can then use the pipeline to answer instructions:

```
+ generate_text("Explain to me the difference between nuclear fission and fusion.")
```
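
The pipeline returns the generated response rather than printing it. Since the exact return shape depends on the custom pipeline's postprocessing (an assumption here, not confirmed by this diff), a minimal sketch simply captures and prints the raw result:

```
# Capture and print the raw result; the custom InstructionTextGenerationPipeline
# determines the exact return shape (e.g., plain string vs. list of dicts).
res = generate_text("Explain to me the difference between nuclear fission and fusion.")
print(res)
```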

+ Alternatively, if you prefer not to use `trust_remote_code=True`, you can download [instruct_pipeline.py](https://huggingface.co/databricks/dolly-v2-12b/blob/main/instruct_pipeline.py),
+ store it alongside your notebook, and construct the pipeline yourself from the loaded model and tokenizer:

```
+ from instruct_pipeline import InstructionTextGenerationPipeline
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("databricks/dolly-v2-12b", padding_side="left")
+ model = AutoModelForCausalLM.from_pretrained("databricks/dolly-v2-12b", device_map="auto")

+ generate_text = InstructionTextGenerationPipeline(model=model, tokenizer=tokenizer)
```
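
The hand-built pipeline is then queried the same way as the `pipeline(...)` version above. Note that `torch_dtype=torch.bfloat16` (a standard `from_pretrained` argument) could also be passed to `AutoModelForCausalLM.from_pretrained` here to keep the memory savings, though this diff does not show it:

```
# Query the hand-built pipeline exactly like the pipeline(...) version above.
generate_text("Explain to me the difference between nuclear fission and fusion.")
```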
65