Svngoku committed on
Commit 5aedbcc
1 Parent(s): badd836

Update README.md

Files changed (1)
  1. README.md +174 -127
README.md CHANGED
@@ -4,197 +4,244 @@ base_model:
  - CohereForAI/c4ai-command-r7b-12-2024
  ---

- # Model Card for Model ID
-
- <!-- Provide a quick summary of what the model is/does. -->
-
- ## Model Details
-
- ### Model Description
-
- <!-- Provide a longer summary of what this model is. -->
-
- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-
- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]
-
- ### Model Sources [optional]
-
- <!-- Provide the basic links for the model. -->
-
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]
-
- ## Uses
-
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
- ### Direct Use
-
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]
-
- ### Downstream Use [optional]
-
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
- [More Information Needed]
-
- ### Out-of-Scope Use
-
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
- [More Information Needed]
-
- ## Bias, Risks, and Limitations
-
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]
-
- ### Recommendations
-
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-
- ## How to Get Started with the Model
-
- Use the code below to get started with the model.
-
- [More Information Needed]
-
- ## Training Details
-
- ### Training Data
-
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
- [More Information Needed]
-
- ### Training Procedure
-
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- #### Preprocessing [optional]
-
- [More Information Needed]
-
- #### Training Hyperparameters
-
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- #### Speeds, Sizes, Times [optional]
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]
-
- ## Evaluation
-
- <!-- This section describes the evaluation protocols and provides the results. -->
-
- ### Testing Data, Factors & Metrics
-
- #### Testing Data
-
- <!-- This should link to a Dataset Card if possible. -->
-
- [More Information Needed]
-
- #### Factors
-
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]
-
- #### Metrics
-
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]
-
- ### Results
-
- [More Information Needed]
-
- #### Summary
-
- ## Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]
-
- ## Environmental Impact
-
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]
-
- ## Technical Specifications [optional]
-
- ### Model Architecture and Objective
-
- [More Information Needed]
-
- ### Compute Infrastructure
-
- [More Information Needed]
-
- #### Hardware
-
- [More Information Needed]
-
- #### Software
-
- [More Information Needed]
-
- ## Citation [optional]
-
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
- **BibTeX:**
-
- [More Information Needed]
-
- **APA:**
-
- [More Information Needed]
-
- ## Glossary [optional]
-
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
- [More Information Needed]
-
- ## More Information [optional]
-
- [More Information Needed]
-
- ## Model Card Authors [optional]
-
- [More Information Needed]
-
- ## Model Card Contact
-
- [More Information Needed]

+ # **Model Card for C4AI Command R7B 4-bit**

+ ## **Model Summary**

+ C4AI Command R7B is an open-weights research release of a 7-billion-parameter model with advanced capabilities optimized for a variety of use cases, including reasoning, summarization, question answering, and code. The model is trained to perform sophisticated tasks including Retrieval Augmented Generation (RAG) and tool use. The model also has powerful agentic capabilities, with the ability to use and combine multiple tools over multiple steps to accomplish more difficult tasks. It obtains top performance on enterprise-relevant code use cases. C4AI Command R7B is a multilingual model trained on 23 languages.

+ Developed by: [Cohere](https://cohere.com/) and [Cohere For AI](https://cohere.for.ai/)

+ * Point of Contact: Cohere For AI: [cohere.for.ai](https://cohere.for.ai/)
+ * License: [CC-BY-NC](https://cohere.com/c4ai-cc-by-nc-license); also requires adhering to [C4AI's Acceptable Use Policy](https://docs.cohere.com/docs/c4ai-acceptable-use-policy)
+ * Model: c4ai-command-r7b-12-2024
+ * Model Size: 7 billion parameters
+ * Context length: 128K

+ ```txt
+ Cohere2ForCausalLM(
+   (model): Cohere2Model(
+     (embed_tokens): Embedding(256000, 4096, padding_idx=0)
+     (layers): ModuleList(
+       (0-31): 32 x Cohere2DecoderLayer(
+         (self_attn): Cohere2Attention(
+           (q_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
+           (k_proj): Linear4bit(in_features=4096, out_features=1024, bias=False)
+           (v_proj): Linear4bit(in_features=4096, out_features=1024, bias=False)
+           (o_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
+         )
+         (mlp): Cohere2MLP(
+           (gate_proj): Linear4bit(in_features=4096, out_features=14336, bias=False)
+           (up_proj): Linear4bit(in_features=4096, out_features=14336, bias=False)
+           (down_proj): Linear4bit(in_features=14336, out_features=4096, bias=False)
+           (act_fn): SiLU()
+         )
+         (input_layernorm): Cohere2LayerNorm()
+       )
+     )
+     (norm): Cohere2LayerNorm()
+     (rotary_emb): Cohere2RotaryEmbedding()
+   )
+   (lm_head): Linear(in_features=4096, out_features=256000, bias=False)
+   (_cache): HybridCache()
+ )
+ ```

+ **Try C4AI Command R7B**

+ You can try out C4AI Command R7B before downloading the weights in our hosted [Hugging Face Space](https://cohereforai-c4ai-command.hf.space/models/command-r7b-12-2024).

+ **Usage**

+ Please install transformers from the source repository, which includes the necessary changes for this model.

+ ```py
+ # !pip install -U "git+https://github.com/huggingface/transformers.git" bitsandbytes accelerate
+ from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

+ # Quantization configuration
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype="float16",
+     bnb_4bit_use_double_quant=True
+ )

+ # Model ID
+ model_id = "CohereForAI/c4ai-command-r7b-12-2024"

+ # Load the tokenizer and the model (pass your Hugging Face access token; the repository is gated)
+ tokenizer = AutoTokenizer.from_pretrained(model_id, token="")
+ model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, token="")

+ # Format message with the c4ai-command-r7b-12-2024 chat template
+ messages = [{"role": "user", "content": "Hello, how are you?"}]
+ input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

+ gen_tokens = model.generate(
+     input_ids,
+     max_new_tokens=2048,
+     do_sample=True,
+     temperature=0.9,
+ )

+ gen_text = tokenizer.decode(gen_tokens[0], skip_special_tokens=True)
+ print(gen_text)
+ ```
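
+ To sanity-check the 4-bit load, you can print the model (which reproduces the module dump shown earlier) and its memory footprint. A minimal sketch using standard transformers helpers; the exact number will vary by environment:

+ ```py
+ # Printing the model shows the Linear4bit layers from the quantized load;
+ # get_memory_footprint() reports the parameter memory in bytes.
+ print(model)
+ print(f"Memory footprint: {model.get_memory_footprint() / 1024**3:.2f} GiB")
+ ```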

+ ## **Model Details**

+ **Input**: The model takes text as input only.

+ **Output**: The model generates text only.

+ **Model Architecture**: This is an auto-regressive language model that uses an optimized transformer architecture. After pretraining, the model uses supervised fine-tuning (SFT) and preference training to align model behavior to human preferences for helpfulness and safety. The model features a repeating block of three layers with **sliding window attention** (window size 4096) and **RoPE** for efficient local context modeling and relative positional encoding, followed by a fourth layer with **global attention** without positional embeddings, enabling unrestricted token interactions across the entire sequence.
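
+ This layout is also visible on the loaded config. A minimal sketch, assuming the Cohere2 config in recent transformers exposes the fields below (verify against your installed version):

+ ```py
+ # Assumed Cohere2 config fields; check your transformers version.
+ cfg = model.config
+ print(cfg.num_hidden_layers)       # expected: 32
+ print(cfg.sliding_window)          # expected: 4096 (local attention window)
+ print(cfg.sliding_window_pattern)  # expected: 4 (every fourth layer is global)
+ ```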

+ **Languages covered**: The model has been trained on 23 languages: English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, Chinese, Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian.

+ **Context length**: Command R7B supports a context length of 128K.

+ ### A well-rounded model

+ Command R7B excels on standardized and externally verifiable benchmarks such as the [HuggingFace Open LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/). Compared to other similarly sized open-weights models, Command R7B ranks first, with strong performance across all tasks.

+ | | Command R7B | Gemma 2 IT 9B | Ministral 8B | Llama 3.1 8B |
+ | :---- | :---- | :---- | :---- | :---- |
+ | Average | **31.4** | 28.9 | 22 | 28.2 |
+ | IFEval | 77.9 | 74.4 | 58.96 | **78.6** |
+ | BBH | 36.1 | **42.1** | 25.82 | 29.9 |
+ | MATH hard | **26.4** | 0.2 | 6.5 | 19.3 |
+ | GPQA | 7.7 | **14.8** | 4.5 | 2.4 |
+ | MUSR | **11.6** | 9.74 | 10.7 | 8.41 |
+ | MMLU-Pro | 28.5 | **32** | 25.5 | 30.7 |

+ *HuggingFace Leaderboard evaluation results. Competitor numbers are taken from the official leaderboard. Command R7B results are calculated by us using the official HuggingFace prompts and evaluation code.*

+ ### **Chat Capabilities:**

+ Command R7B can be configured as both a conversational model and an instruct model. The [conversational mode](https://docs.cohere.com/docs/command-r7b-hf) conditions the model on interactive behaviour: it is expected to reply in a conversational fashion, provide introductory statements and follow-up questions, and use Markdown as well as LaTeX where appropriate. It is optimized for interactive experiences, such as chatbots, where the model engages in dialogue.

+ The [instruct mode](https://docs.cohere.com/docs/command-r7b-hf), in contrast, conditions the model to provide concise yet comprehensive responses and does not use Markdown / LaTeX by default. It is designed for non-interactive, task-focused use cases like extracting information, summarizing text, translation, and categorization.

+ **Note:** by default, Command R7B is delivered without a system preamble. We recommend adding the conversational or instruct preamble as [described in our docs](https://docs.cohere.com/docs/command-r7b-hf).
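
+ For example, a preamble can be supplied as a system turn before applying the chat template. A minimal sketch; the placeholder string stands in for the actual conversational or instruct preamble from the linked docs:

+ ```py
+ # Placeholder text; copy the real conversational or instruct preamble from the docs.
+ preamble = "<conversational or instruct preamble from the docs>"
+ messages = [
+     {"role": "system", "content": preamble},
+     {"role": "user", "content": "Summarize this paragraph in one sentence."},
+ ]
+ input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
+ ```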

+ ### **RAG Capabilities:**

+ Command R7B has been trained specifically for tasks like the final step of Retrieval Augmented Generation (RAG).

+ RAG with Command R7B is supported through [chat templates](https://huggingface.co/docs/transformers/main/en/chat_templating#advanced-retrieval-augmented-generation) in Transformers. The model takes a conversation as input (with an optional user-supplied system preamble), along with a list of document snippets.

+ <details>
+ <summary><b>RAG Example [CLICK TO EXPAND]</b></summary>

+ ```py
+ # Define conversation input
+ conversation = [{"role": "user", "content": "What has Man always dreamed of?"}]

+ # Define documents for retrieval-based generation
+ documents = [
+     {"heading": "The Moon: Our Age-Old Foe", "body": "Man has always dreamed of destroying the moon. In this essay, I shall..."},
+     {"heading": "Love is all you need", "body": "Man's dream has always been to find love. This profound lesson..."}
+ ]

+ # Render the RAG prompt as a string
+ input_prompt = tokenizer.apply_chat_template(conversation=conversation, documents=documents, tokenize=False, add_generation_prompt=True)
+ # Tokenize the prompt
+ input_ids = tokenizer(input_prompt, return_tensors="pt")
+ ```

+ You can then generate text from this input as normal.
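
+ For instance (a minimal sketch; `input_ids` is the `BatchEncoding` produced above, and the sampling settings are illustrative):

+ ```py
+ # Generate from the rendered RAG prompt.
+ gen_tokens = model.generate(**input_ids, max_new_tokens=512, do_sample=True, temperature=0.3)
+ print(tokenizer.decode(gen_tokens[0], skip_special_tokens=True))
+ ```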

+ Document snippets should be short chunks rather than long documents, typically around 100-400 words per chunk, formatted as key-value pairs. The keys should be short descriptive strings; the values can be text or semi-structured.

+ You may find that simply including relevant documents directly in a user message works just as well as, or better than, using the documents parameter to render the special RAG template. The RAG template is generally a strong default. We encourage users to play with both, and to evaluate which mode works best for their specific use case.
+ </details>

+ Note that this was a very brief introduction to RAG; for more information, see the Command R7B [prompt format docs](https://docs.cohere.com/docs/command-r7b-hf) and the Transformers [RAG documentation](https://huggingface.co/docs/transformers/main/chat_templating#advanced-retrieval-augmented-generation).

+ ### **Tool Use Capabilities:**
+ Command R7B has been specifically trained with conversational tool use capabilities. This allows the model to interact with external tools like APIs, databases, or search engines.
+ Instructions on how to leverage these capabilities in Hugging Face are coming soon.
+ <!--
+ Command R7B has been specifically trained with conversational tool use capabilities. This allows the model to interact with external tools like APIs, databases, or search engines.

+ Tool use with Command R7B is supported through [chat templates](https://huggingface.co/docs/transformers/main/en/chat_templating#advanced-tool-use--function-calling) in Transformers. We recommend providing tool descriptions using JSON schema.

+ <details>
+ <summary><b>Tool Use Example [CLICK TO EXPAND]</b></summary>

+ ```py
+ import json

+ tools = [
+     {
+         "type": "function",
+         "function": {
+             "name": "query_daily_sales_report",
+             "description": "Connects to a database to retrieve overall sales volumes and sales information for a given day.",
+             "parameters": {
+                 "type": "object",
+                 "properties": {
+                     "day": {
+                         "description": "Retrieves sales data for this day, formatted as YYYY-MM-DD.",
+                         "type": "string",
+                     }
+                 },
+                 "required": ["day"]
+             },
+         }
+     }
+ ]

+ # Define conversation input
+ conversation = [{"role": "user", "content": "Can you provide a sales summary for 29th September 2023?"}]

+ # Render the Tool Use prompt as a string
+ input_prompt = tokenizer.apply_chat_template(conversation=conversation, tools=tools, tokenize=False, add_generation_prompt=True)

+ # Tokenize the prompt
+ input_ids = tokenizer(input_prompt, return_tensors="pt")
+ ```

+ You can then generate text from this input as normal.

+ If the model generates a plan and tool calls, you should add them to the chat history like so:

+ ```py
+ tool_call = {"name": "query_daily_sales_report", "arguments": {"day": "2023-09-29"}}
+ tool_plan = "I will use the query_daily_sales_report tool to find the sales summary for 29th September 2023."
+ conversation.append({"role": "assistant", "tool_calls": [{"id": "0", "type": "function", "function": tool_call}], "tool_plan": tool_plan})
+ ```

+ and then call the tool and append the result, with the tool role, like so:

+ ```py
+ # Placeholder: substitute the real JSON response returned by your tool.
+ api_response_for_query_daily_sales_report = {"day": "2023-09-29", "total_sales": "..."}
+ # Append tool results from tool call 0
+ conversation.append({"role": "tool", "tool_call_id": "0", "content": json.dumps(api_response_for_query_daily_sales_report)})
+ ```

+ After that, you can call generate() again to let the model use the tool result in the chat.
+ </details>

+ Note that this was a very brief introduction to tool calling; for more information, see the Command R7B [prompt format docs](https://docs.cohere.com/docs/command-r7b-hf) and the Transformers [tool use documentation](https://huggingface.co/docs/transformers/main/chat_templating#advanced-tool-use--function-calling).
+ -->

+ ### **Code Capabilities:**

+ Command R7B has meaningfully improved code capabilities. In addition to academic code benchmarks, we have evaluated it on enterprise-relevant scenarios, including SQL and code translation, where it outperforms other models of similar size. Try these out by requesting code snippets, code explanations, or code rewrites. For better performance, we also recommend using a low temperature (or even greedy decoding) for code-generation instructions.
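
+ As an illustration of that recommendation, here is a minimal sketch reusing the `model` and `tokenizer` from the usage section above; the prompt is only an example:

+ ```py
+ # Greedy decoding: do_sample=False makes generation deterministic,
+ # which tends to help on code-generation tasks.
+ messages = [{"role": "user", "content": "Write a SQL query that returns total sales per day."}]
+ input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
+ gen_tokens = model.generate(input_ids, max_new_tokens=512, do_sample=False)
+ print(tokenizer.decode(gen_tokens[0], skip_special_tokens=True))
+ ```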

+ ## **Model Card Contact**

+ For errors or additional questions about details in this model card, contact info@for.ai.

+ ## **Terms of Use:**

+ We hope that the release of this model will make community-based research efforts more accessible by releasing the weights of a highly performant 7-billion-parameter model to researchers all over the world. This model is governed by a [CC-BY-NC](https://cohere.com/c4ai-cc-by-nc-license) License with an acceptable use addendum, and also requires adhering to [C4AI's Acceptable Use Policy](https://docs.cohere.com/docs/c4ai-acceptable-use-policy).

+ ## **Try Chat:**

+ You can try Command R7B chat in the playground [here](https://dashboard.cohere.com/playground/chat). You can also use it in our dedicated Hugging Face Space [here](https://cohereforai-c4ai-command.hf.space/models/command-r7b-12-2024).