Update README.md
README.md
CHANGED
@@ -123,6 +123,21 @@ print(tokenizer.decode(outputs[0, input_length:], skip_special_tokens=True))
 Using this template, each turn is preceded by a `<|im_start|>` delimiter and the role of the entity
 (either `user`, for content supplied by the user, or `assistant` for LLM responses), and finished with the `<|im_end|>` token.
 
+### Post-edition
+
+For post-edition tasks you can try using the following prompt template:
+
+```python
+source = 'Catalan'
+target = 'English'
+source_sentence = 'Necessite saber qui son Rafael Nadal i Maria Magdalena.'
+machine_translation = 'I need to know who is Rafael Christmas and Maria the Muffin.'
+
+text = f"Please fix any mistakes in the following {source}-{target} machine translation or keep it unedited if it's correct.\nSource: {source_sentence} \nMT: {machine_translation} \nCorrected:"
+
+# I need to know who is Rafael Nadal and Maria Magdalena.
+```
+
 
 ## Data
 
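
The added snippet builds `text` but stops before showing how it becomes a chat turn. Below is a minimal sketch of that step, assuming the layout described in the context lines (each turn opened by `<|im_start|>` plus the role and closed by `<|im_end|>`). The helper names `build_post_edition_prompt` and `to_chat_turn` are hypothetical and not part of the README. The newline placement around the delimiters is also an assumption. In practice the tokenizer's own `apply_chat_template` method should produce the exact format.

```python
def build_post_edition_prompt(source, target, source_sentence, machine_translation):
    """Fill the post-edition template from the diff (hypothetical helper)."""
    return (
        f"Please fix any mistakes in the following {source}-{target} machine translation "
        f"or keep it unedited if it's correct.\n"
        f"Source: {source_sentence} \n"
        f"MT: {machine_translation} \n"
        f"Corrected:"
    )


def to_chat_turn(user_text):
    """Wrap one user turn in the delimiters and open the assistant turn.

    Assumption: a newline follows the role and each <|im_end|> token.
    """
    return f"<|im_start|>user\n{user_text}<|im_end|>\n<|im_start|>assistant\n"


text = build_post_edition_prompt(
    source="Catalan",
    target="English",
    source_sentence="Necessite saber qui son Rafael Nadal i Maria Magdalena.",
    machine_translation="I need to know who is Rafael Christmas and Maria the Muffin.",
)
prompt = to_chat_turn(text)
print(prompt)
```

Keeping the template in one function makes it straightforward to reuse across language pairs. The resulting `prompt` string would then be tokenized and passed to the model in the same way as the README's other examples.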