tags:
- torchtune
- grammar-correction
---

### Llama3 CoEdit
This is a Llama3 8B based model trained using [torchtune](https://pytorch.org/torchtune) on the `grammarly/coedit` dataset.

### Training details
|
20 |
+
|
21 |
+
The exact training script (`lora_finetune_distributed`) and config (`8B_lora.yaml`) are both included in this repository. Specifically, in order to add the dataset, I added the following lines to the config:
|
22 |
+
|
23 |
+
```
dataset:
  _component_: torchtune.datasets.instruct_dataset
  source: grammarly/coedit
  template: GrammarErrorCorrectionTemplate
  column_map: {"sentence": "src", "output": "tgt"}
  train_on_input: False
  split: train
```
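For intuition, the effect of `column_map` and the template can be sketched in plain Python. This is only an illustrative sketch of the data flow: the helper names (`remap_columns`, `format_gec_prompt`) are hypothetical, and the exact prompt wording used by torchtune's `GrammarErrorCorrectionTemplate` may differ.

```python
# Illustrative sketch of what column_map + a GEC template do to one
# grammarly/coedit row. Not torchtune's actual implementation.

def remap_columns(row, column_map):
    # column_map maps template keys to dataset columns,
    # e.g. {"sentence": "src", "output": "tgt"}.
    return {key: row[col] for key, col in column_map.items()}

def format_gec_prompt(sample):
    # Hypothetical stand-in for GrammarErrorCorrectionTemplate;
    # the real prompt text is an assumption here.
    return f"Correct this to standard English: {sample['sentence']}\n---\nCorrected: "

row = {"src": "Fix grammar: She go to school.", "tgt": "She goes to school."}
sample = remap_columns(row, {"sentence": "src", "output": "tgt"})
prompt = format_gec_prompt(sample)
# With train_on_input: False, the loss is computed only on the completion
# (sample["output"]), not on the prompt tokens.
```

In the real pipeline, torchtune's `instruct_dataset` builder performs this remapping and the tokenization; the sketch only shows how the columns move through the template.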

### Evaluation results