Update README.md
README.md
CHANGED
@@ -10,7 +10,7 @@ programming_language:
|
|
10 |
- JavaScript
|
11 |
- Python
|
12 |
- Rust
|
13 |
-
-
|
14 |
- C++
|
15 |
- C
|
16 |
- C#
|
@@ -58,10 +58,10 @@ datasets:
|
|
58 |
- bigcode/starcoderdata
|
59 |
---
|
60 |
|
61 |
-
# Model Card for DeciCoder
|
62 |
|
63 |
-
DeciCoder
|
64 |
-
trained on the Python, Java, Javascript,
|
65 |
The model uses variable Grouped Query Attention and has a context window of 4096
|
66 |
tokens. It was trained using a Fill-in-the-Middle training objective. The model's
|
67 |
architecture was generated by Deci's proprietary Neural Architecture
|
@@ -70,10 +70,17 @@ Search-based technology, AutoNAC.
|
|
70 |
## Model Details
|
71 |
|
72 |
- **Developed by:** Deci
|
73 |
-
- **Model type:** DeciCoder is an auto-regressive language model based on the transformer decoder architecture, using variable Grouped Query Attention.
|
74 |
-
- **Language(s):** Python, Java, JavaScript,
|
75 |
- **License:** Model checkpoints are licensed under the [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
|
76 |
|
77 |
## Model Architecture
|
78 |
|
79 |
| Parameters | Layers | Heads | Sequence Length | GQA num_key_value_heads | Hidden Size |
|
@@ -81,12 +88,12 @@ Search-based technology, AutoNAC.
|
|
81 |
| 6B | 32 | 32 | 4096 | Variable | 4096 |
|
82 |
|
83 |
|
84 |
-
- **Decoder layer:** Variable Grouped Query Attention
|
85 |
- **Position Embeddings:** Rotary Position Embeddings [Su et al., 2021](https://arxiv.org/abs/2104.09864)
|
86 |
|
87 |
## Uses
|
88 |
|
89 |
-
The model is intended to
|
90 |
context window of up to 4096 tokens. It is *not* an instruction model
|
91 |
and commands like "Write a function that computes the absolute value of
|
92 |
an integer," won't yield the desired results. A more effective approach
|
@@ -114,8 +121,8 @@ print(tokenizer.decode(outputs[0]))
|
|
114 |
|
115 |
### Attribution
|
116 |
|
117 |
-
DeciCoder was trained on StarCoder Training Dataset, filtered for
|
118 |
-
Python, Java, JavaScript,
|
119 |
refer to [https://huggingface.co/datasets/bigcode/starcoderdata](https://huggingface.co/datasets/bigcode/starcoderdata).
|
120 |
|
121 |
```
|
@@ -123,34 +130,28 @@ refer to [https://huggingface.co/datasets/bigcode/starcoderdata](https://hugging
|
|
123 |
### Limitations
|
124 |
|
125 |
The model has undergone training with source code from Python, Java,
|
126 |
-
JavaScript,
|
127 |
contain other languages. Therefore, the model can produce code snippets
|
128 |
-
given some context. However, there
|
129 |
code will function as expected. It might be suboptimal, contain bugs, or
|
130 |
even exploits.
|
131 |
|
132 |
## Evaluation
|
133 |
|
134 |
-
Below are DeciCoder's pass@1 on MultiPL HumanEval scores
|
135 |
|
136 |
-
| Python | JavaScript | Java | C++ | C# | Rust | Go |
|
137 |
-
|
138 |
-
| 33.
|
139 |
|
140 |
|
141 |
### Runtime Benchmarks
|
142 |
|
143 |
-
|Inference Tool
|
144 |
-
|
145 |
-
|
|
146 |
|
147 |
-
-
|
148 |
-
|
149 |
-
## Documentation
|
150 |
-
|
151 |
-
- [Notebook](https://colab.research.google.com/drive/1JCxvBsWCZKHfIcHSMVf7GZCs3ClMQPjs) CHANGE
|
152 |
-
- Blog post: [Introducing DeciCoder: The New Gold Standard in Efficient and Accurate Code Generation](https://deci.ai/blog/decicoder-efficient-and-accurate-code-generation-llm/)CHANGE
|
153 |
-
- Questions:Feel free to contact us via our [Discord Community!](https://discord.com/invite/p9ecgRhDR8/)CHANGE
|
154 |
|
155 |
## How to Cite
|
156 |
|
@@ -158,9 +159,9 @@ Please cite this model using this format.
|
|
158 |
|
159 |
```bibtex
|
160 |
@misc{DeciFoundationModels,
|
161 |
-
title = {DeciCoder},
|
162 |
author = {DeciAI Research Team},
|
163 |
year = {2023},
|
164 |
-
url={[https://huggingface.co/deci/decicoder-
|
165 |
}
|
166 |
-
```
|
|
|
10 |
- JavaScript
|
11 |
- Python
|
12 |
- Rust
|
13 |
+
- Ruby
|
14 |
- C++
|
15 |
- C
|
16 |
- C#
|
|
|
58 |
- bigcode/starcoderdata
|
59 |
---
|
60 |
|
61 |
+
# Model Card for DeciCoder-6B
|
62 |
|
63 |
+
DeciCoder-6B is a 6 billion parameter decoder-only code completion model
|
64 |
+
trained on the Python, Java, JavaScript, Rust, C++, C, and C# subset of [StarCoder Training Dataset](https://huggingface.co/datasets/bigcode/starcoderdata).
|
65 |
The model uses variable Grouped Query Attention and has a context window of 4096
|
66 |
tokens. It was trained using a Fill-in-the-Middle training objective. The model's
|
67 |
architecture was generated by Deci's proprietary Neural Architecture
|
|
|
70 |
## Model Details
|
71 |
|
72 |
- **Developed by:** Deci
|
73 |
+
- **Model type:** DeciCoder-6B is an auto-regressive language model based on the transformer decoder architecture, using variable Grouped Query Attention.
|
74 |
+
- **Language(s):** Python, Java, JavaScript, Ruby, Rust, C++, C, C#
|
75 |
- **License:** Model checkpoints are licensed under the [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
|
76 |
|
77 |
+
## Documentation
|
78 |
+
|
79 |
+
- Google Colab [Notebook](https://colab.research.google.com/drive/1ZxG9qMlom9vn4lSGlD8PrjwHBvag94ei?usp=sharing)
|
80 |
+
- Blog Post: [Introducing DeciCoder-6B: The Best Multi-Language Code Generation LLM in Its Class](https://deci.ai/blog/decicoder-6b-the-best-multi-language-code-generation-llm-in-its-class/)
|
81 |
+
- Tutorial: [How to Run DeciCoder-6B on Qualcomm AI 100](https://github.com/quic/cloud-ai-sdk/tree/1.12/models/language_processing/decoder)
|
82 |
+
- Questions: Feel free to contact us via our [Discord Community!](https://discord.com/invite/p9ecgRhDR8/)
|
83 |
+
|
84 |
## Model Architecture
|
85 |
|
86 |
| Parameters | Layers | Heads | Sequence Length | GQA num_key_value_heads | Hidden Size |
|
|
|
88 |
| 6B | 32 | 32 | 4096 | Variable | 4096 |
|
89 |
|
90 |
|
91 |
+
- **Decoder layer:** Variable Grouped Query Attention
|
92 |
- **Position Embeddings:** Rotary Position Embeddings [Su et al., 2021](https://arxiv.org/abs/2104.09864)
|
93 |
|
94 |
## Uses
|
95 |
|
96 |
+
The model is intended to perform single/multiline code completion from a
|
97 |
context window of up to 4096 tokens. It is *not* an instruction model
|
98 |
and commands like "Write a function that computes the absolute value of
|
99 |
an integer," won't yield the desired results. A more effective approach
|
|
|
121 |
|
122 |
### Attribution
|
123 |
|
124 |
+
DeciCoder-6B was trained on StarCoder Training Dataset, filtered for
|
125 |
+
Python, Java, JavaScript, Ruby, Rust, C++, C, and C#. For additional information, please
|
126 |
refer to [https://huggingface.co/datasets/bigcode/starcoderdata](https://huggingface.co/datasets/bigcode/starcoderdata).
|
127 |
|
128 |
```
|
|
|
130 |
### Limitations
|
131 |
|
132 |
The model has undergone training with source code from Python, Java,
|
133 |
+
JavaScript, Ruby, Rust, C++, C, and C#. While the primary language in the source is English, it does
|
134 |
contain other languages. Therefore, the model can produce code snippets
|
135 |
+
given some context. However, there is no assurance that the resulting
|
136 |
code will function as expected. It might be suboptimal, contain bugs, or
|
137 |
even exploits.
|
138 |
|
139 |
## Evaluation
|
140 |
|
141 |
+
Below are DeciCoder-6B's pass@1 scores on MultiPL HumanEval:
|
142 |
|
143 |
+
| Python | JavaScript | Java | C++ | C# | Rust | Go |
|
144 |
+
|:----------|:----------|:----------|:----------|:----------|:----------|:----------|
|
145 |
+
| 33.3% | 29.3% | 30.3% | 29.93% | 20.31% | 20.5% | 77.47% |
|
146 |
|
147 |
|
148 |
### Runtime Benchmarks
|
149 |
|
150 |
+
| Inference Tool | Hardware | Prompt Length | Generation Length | Throughput (tokens/sec) |
|
151 |
+
|:----------|:----------|:----------|:----------|:----------|
|
152 |
+
| Qualcomm SDK | Qualcomm AI 100 | 1024 | 1024 | 531.3 |
|
153 |
|
154 |
+
- Measured for maximal batch size on the device
|
155 |
|
156 |
## How to Cite
|
157 |
|
|
|
159 |
|
160 |
```bibtex
|
161 |
@misc{DeciFoundationModels,
|
162 |
+
title = {DeciCoder-6B},
|
163 |
author = {DeciAI Research Team},
|
164 |
year = {2023},
|
165 |
+
url = {https://huggingface.co/deci/decicoder-6B},
|
166 |
}
|
167 |
+
```
|
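The Evaluation section of the updated README reports pass@1 scores but does not say how they were computed. As a rough illustration only, the sketch below assumes the commonly used unbiased pass@k estimator, where `n` samples are generated per problem and `c` of them pass the unit tests. The function names `pass_at_k` and `mean_pass_at_k` and the toy data are hypothetical and do not come from the model card.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that passed, k: sampling budget.
    Assumption: this is the standard estimator 1 - C(n-c, k) / C(n, k);
    the model card does not confirm which estimator was used.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every size-k draw contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Toy data (hypothetical): three problems, 10 samples each.
    toy_results = [(10, 3), (10, 0), (10, 10)]
    print(f"pass@1 = {mean_pass_at_k(toy_results, 1):.3f}")
```

For k = 1 the estimator reduces to the fraction of passing samples per problem, averaged over all problems in the benchmark.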