---
title: StarCoder Demo
emoji: 💫
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 3.28.3
app_file: app.py
pinned: true
duplicated_from: bigcode/bigcode-playground
---


# ⭐StarCoder Demo💫

## Code-Completion Playground 💻 with ⭐StarCoder Models

This playground generates code with ⭐[StarCoder](https://huggingface.co/bigcode/starcoder), a **15B**-parameter model for code generation in **80+** programming languages.

ℹ️ StarCoder is not an instruction-tuned model; this demo is a code-completion tool only.

🗣️ For instructions and chat, use a prompted version of the model directly at [HuggingFace🤗Chat💬 (hf.co/chat)](https://huggingface.co/chat/?model=starcoder).

---

**Intended Use**: this app and its [supporting model](https://huggingface.co/bigcode/starcoder) are provided for demonstration purposes only and are not a replacement for human expertise. For more details on the model's limitations in terms of factuality and biases, please refer to the source [model card](https://hf.co/bigcode).

⚠️ Any use or sharing of this demo constitutes your acceptance of the BigCode [OpenRAIL-M](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement) License Agreement and the use restrictions included within.

---

## Model Formats

The model is pretrained on code, and its training data is formatted with special tokens in addition to the pure code,\
such as prefixes specifying the source of the file or tokens separating code from a commit message.\
Use these templates to explore the model's capabilities:

### 1. Prefixes 🏷️

For pure code files, use any combination of the following prefixes:

```xml
<reponame>REPONAME<filename>FILENAME<gh_stars>STARS\ncode<|endoftext|>
```

`STARS` can be one of: `0`, `1-10`, `10-100`, `100-1000`, `1000+`.
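
As a rough illustration, the prefix template can be assembled with a small helper. This is a minimal sketch: `build_prefix_prompt` and `ALLOWED_STARS` are hypothetical names, not part of the playground, and it assumes that a completion prompt leaves out the trailing `<|endoftext|>` so the model can continue the code.

```python
# Hypothetical helper (not part of the playground) for the prefix template.
ALLOWED_STARS = {"0", "1-10", "10-100", "100-1000", "1000+"}


def build_prefix_prompt(code, reponame=None, filename=None, stars=None):
    """Prepend any combination of the metadata prefixes to a code snippet."""
    parts = []
    if reponame is not None:
        parts.append(f"<reponame>{reponame}")
    if filename is not None:
        parts.append(f"<filename>{filename}")
    if stars is not None:
        if stars not in ALLOWED_STARS:
            raise ValueError(f"stars must be one of {sorted(ALLOWED_STARS)}")
        parts.append(f"<gh_stars>{stars}")
    header = "".join(parts)
    # The template separates the prefixes from the code with a newline.
    # Assumption: <|endoftext|> is omitted so the model keeps generating.
    return f"{header}\n{code}" if header else code


prompt = build_prefix_prompt(
    "def fibonacci(n):", filename="fib.py", stars="100-1000"
)
```

Since the prefixes can be combined freely, each argument is optional and a call with no metadata returns the code unchanged.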

### 2. Commits 💾

The commits data is formatted as follows:

```xml
<commit_before>code<commit_msg>text<commit_after>code<|endoftext|>
```
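
One way to use this format is to supply the original code and a commit message, then let the model complete the text after `<commit_after>`. The sketch below assumes that usage; `build_commit_prompt` is a hypothetical helper name.

```python
# Hypothetical helper (not part of the playground) for the commit template.
def build_commit_prompt(code_before, commit_msg):
    """Build a prompt that ends at <commit_after> so the model writes the edit."""
    return (
        f"<commit_before>{code_before}"
        f"<commit_msg>{commit_msg}"
        "<commit_after>"
    )


commit_prompt = build_commit_prompt(
    "def add(a, b):\n    return a - b\n",
    "Fix subtraction bug in add",
)
```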

### 3. Jupyter Notebooks 📓

The model is trained on Jupyter notebooks, both converted to Python scripts and in a structured format like:

```xml
<start_jupyter><jupyter_text>text<jupyter_code>code<jupyter_output>output<jupyter_text>
```
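
A notebook-style prompt could be built from a list of cells, ending with the tag of the cell the model should write next. This is a sketch under that assumption; `build_jupyter_prompt` and the `(kind, content)` cell representation are hypothetical.

```python
# Hypothetical helper (not part of the playground) for the notebook template.
JUPYTER_TAGS = {
    "text": "<jupyter_text>",
    "code": "<jupyter_code>",
    "output": "<jupyter_output>",
}


def build_jupyter_prompt(cells, next_kind="code"):
    """Serialize (kind, content) cells and end with the tag of the next cell."""
    prompt = "<start_jupyter>"
    for kind, content in cells:
        prompt += JUPYTER_TAGS[kind] + content
    # The trailing tag tells the model which kind of cell to generate.
    return prompt + JUPYTER_TAGS[next_kind]


notebook_prompt = build_jupyter_prompt(
    [("text", "Load the CSV file into a DataFrame")],
    next_kind="code",
)
```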

### 4. Issues 🐛

The model is also trained on GitHub issues, formatted as follows:

```xml
<issue_start><issue_comment>text<issue_comment>...<issue_closed>
```
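
An issue thread can be serialized in the same spirit. In this sketch, `build_issue_prompt` is a hypothetical helper; it assumes that leaving the prompt open with a final `<issue_comment>` invites the model to write the next comment, while `closed=True` reproduces the full template.

```python
# Hypothetical helper (not part of the playground) for the issue template.
def build_issue_prompt(comments, closed=False):
    """Serialize issue comments; leave the thread open for the model by default."""
    prompt = "<issue_start>"
    for comment in comments:
        prompt += f"<issue_comment>{comment}"
    # Assumption: an open <issue_comment> tag asks the model for the next reply.
    return prompt + ("<issue_closed>" if closed else "<issue_comment>")


issue_prompt = build_issue_prompt(["Calling parse() on an empty file crashes."])
```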

### 5. Fill-in-the-middle 🧩

Fill-in-the-middle requires rearranging the model inputs. The playground handles this for you; all you need to do is specify where to fill:

```xml
code before<FILL_HERE>code after
```
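
For readers curious about what that rearrangement might look like, here is a minimal sketch. It assumes the fill-in-the-middle special tokens `<fim_prefix>`, `<fim_suffix>`, and `<fim_middle>`, which are not named in this document, and a prefix-suffix-middle ordering; `build_fim_prompt` is a hypothetical helper, not the playground's actual code.

```python
# Hypothetical sketch of rearranging a <FILL_HERE> input for the model.
FIM_INDICATOR = "<FILL_HERE>"
# Assumed token names; they are not stated in this document.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def build_fim_prompt(text):
    """Move the code after <FILL_HERE> ahead of the gap the model should fill."""
    if text.count(FIM_INDICATOR) != 1:
        raise ValueError(f"expected exactly one {FIM_INDICATOR} marker")
    prefix, suffix = text.split(FIM_INDICATOR)
    # The model sees both sides of the gap, then generates the middle.
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


fim_prompt = build_fim_prompt(
    "def mean(xs):\n    <FILL_HERE>\n    return total / len(xs)\n"
)
```

The generated text would then be spliced back between the code before and the code after the marker.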