This would have the form {some prelude text here} \<INFILLING LOCATION\> {some text following cursor}.
The way to perform infilling generation is to place the input text into this format:

\<SUF\> {some text following cursor} \<PRE\> {some prelude text here} \<MID\> ...

The language model's output is then generated after the \<MID\> token.
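The reordering can be sketched as a small helper over plain token-id lists (the helper name is our own illustration; the sentinel ids 50253, 50254, and 50255 are position-matched to \<SUF\>, \<PRE\>, and \<MID\> from the example in this section):

```python
SUF, PRE, MID = 50253, 50254, 50255  # sentinel token ids

def build_fim_input(prelude_ids, suffix_ids):
    # Suffix-first ordering: <SUF> suffix <PRE> prelude <MID>,
    # so the model's continuation after <MID> is the infill.
    return [SUF, *suffix_ids, PRE, *prelude_ids, MID]

print(build_fim_input([1, 2], [3, 4]))  # → [50253, 3, 4, 50254, 1, 2, 50255]
```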
As a concrete example, here is a code snippet that should allow a model to perform infilling:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("CarperAI/FIM-NeoX-1.3B")
model = AutoModelForCausalLM.from_pretrained("CarperAI/FIM-NeoX-1.3B")

prelude = "this is some text preceding the cursor,"
suffix = "and this is some text after it."

# 50253 = <SUF>, 50254 = <PRE>, 50255 = <MID>
model_tokenized_input = [50253, *tokenizer(suffix)["input_ids"], 50254, *tokenizer(prelude)["input_ids"], 50255]
infilled = model.generate(torch.tensor([model_tokenized_input]))
```

We are working on making a better interface for this in future model releases or updates to the tokenizer.
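The sequence returned by `generate` includes the prompt tokens, so the infill itself is whatever follows the \<MID\> sentinel. A minimal, model-free sketch of pulling it out of a flat id list (the helper name is our own; pass the result to `tokenizer.decode` to recover text):

```python
MID = 50255  # <MID> sentinel id

def extract_infill(output_ids):
    # The generated infill is everything after the last <MID> token.
    mid_pos = len(output_ids) - 1 - output_ids[::-1].index(MID)
    return output_ids[mid_pos + 1:]

# e.g. a toy generated sequence: FIM-formatted prompt ids, then infill ids
print(extract_infill([50253, 11, 50254, 7, 8, 50255, 4, 5]))  # → [4, 5]
```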
## Intended Uses and Limitations