pcuenq committed
Commit 697f441
1 Parent(s): 8050903

code-example-fixed

Files changed (1)
  1. README.md +60 -7
README.md CHANGED
@@ -16,6 +16,13 @@ Code Llama is a collection of pretrained and fine-tuned generative text models r
 | 34B | [codellama/CodeLlama-34b-hf](https://huggingface.co/codellama/CodeLlama-34b-hf) | [codellama/CodeLlama-34b-Python-hf](https://huggingface.co/codellama/CodeLlama-34b-Python-hf) | [codellama/CodeLlama-34b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-34b-Instruct-hf) |
 | 70B | [codellama/CodeLlama-70b-hf](https://huggingface.co/codellama/CodeLlama-70b-hf) | [codellama/CodeLlama-70b-Python-hf](https://huggingface.co/codellama/CodeLlama-70b-Python-hf) | [codellama/CodeLlama-70b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf) |
 
+Model capabilities:
+
+- [x] Code completion.
+- [ ] Infilling.
+- [x] Instructions / chat.
+- [ ] Python specialist.
+
 ## Model Use
 
 Install `transformers`
@@ -24,14 +31,60 @@ Install `transformers`
 pip install transformers accelerate
 ```
 
-**Warning:** The 70B Instruct model has a different prompt template than the smaller versions. We'll update this repo soon.
-
-Model capabilities:
-
-- [x] Code completion.
-- [ ] Infilling.
-- [x] Instructions / chat.
-- [ ] Python specialist.
+**Chat use:** The 70B Instruct model uses a different prompt template than the smaller versions. To use it with `transformers`, we recommend you use the built-in chat template:
+
+```py
+from transformers import AutoTokenizer, AutoModelForCausalLM
+import torch
+
+model_id = "codellama/CodeLlama-70b-Instruct-hf"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype=torch.float16,   # load the weights in half precision
+    device_map="auto",           # requires the accelerate package
+)
+
+chat = [
+    {"role": "system", "content": "You are a helpful and honest code assistant, expert in JavaScript. Please provide all answers to programming questions in JavaScript."},
+    {"role": "user", "content": "Write a function that computes the set of sums of all contiguous sublists of a given list."},
+]
+# The tokenizer's chat template renders the conversation in the prompt format
+# the 70B Instruct model was trained with
+inputs = tokenizer.apply_chat_template(chat, return_tensors="pt").to("cuda")
+
+output = model.generate(input_ids=inputs, max_new_tokens=200)
+output = output[0].to("cpu")
+print(tokenizer.decode(output))
+```
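+
+If you want to see the exact prompt string the chat template produces (it differs from the format used by the smaller Code Llama chat models), one option is to render the conversation without tokenizing it:
+
+```py
+# Returns the formatted prompt as a plain string instead of token ids
+print(tokenizer.apply_chat_template(chat, tokenize=False))
+```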
+
+You can also use the model for **text or code completion**. This example uses the `pipeline` interface from `transformers`:
+
+```py
+from transformers import AutoTokenizer
+import transformers
+import torch
+
+model_id = "codellama/CodeLlama-70b-hf"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model_id,
+    torch_dtype=torch.float16,
+    device_map="auto",
+)
+
+sequences = pipeline(
+    "def fibonacci(",
+    do_sample=True,
+    temperature=0.2,     # low temperature keeps completions close to the most likely tokens
+    top_p=0.9,
+    num_return_sequences=1,
+    eos_token_id=tokenizer.eos_token_id,
+    max_length=100,      # total length in tokens, prompt included
+)
+for seq in sequences:
+    print(f"Result: {seq['generated_text']}")
+```
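+
+Note that the 70B checkpoints are large (roughly 140 GB of weights in `float16`), so they may not fit on a single GPU. As a minimal sketch, assuming the `bitsandbytes` package is installed, you could instead load the weights quantized to 4-bit:
+
+```py
+from transformers import AutoModelForCausalLM, BitsAndBytesConfig
+
+# Quantized loading trades some accuracy for a much smaller memory footprint
+model = AutoModelForCausalLM.from_pretrained(
+    "codellama/CodeLlama-70b-Instruct-hf",
+    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
+    device_map="auto",
+)
+```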
 
 ## Model Details
 *Note: Use of this model is governed by the Meta license. Meta developed and publicly released the Code Llama family of large language models (LLMs).