---
license: apache-2.0
datasets:
- grammarly/coedit
language:
- en
metrics:
- accuracy
tags:
- torchtune
- grammar-correction
---
|
### Llama3 CoEdit |
|
|
|
This is a Llama3 8B-based model fine-tuned with [torchtune](https://pytorch.org/torchtune) on the [`grammarly/coedit`](https://huggingface.co/datasets/grammarly/coedit) dataset.
|
|
|
### Training details |
|
|
|
The exact training script (`lora_finetune_distributed`) and config (`8B_lora.yaml`) are both included in this repository. Specifically, to add the dataset, I added the following lines to the config:
|
|
|
```
dataset:
  _component_: torchtune.datasets.instruct_dataset
  source: grammarly/coedit
  template: GrammarErrorCorrectionTemplate
  column_map: {"sentence": "src", "output": "tgt"}
  train_on_input: False
  split: train
```
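To illustrate what this config does, here is a minimal sketch of how the `column_map` and template combine one dataset row into a prompt/target pair. The exact template string is an assumption about torchtune's `GrammarErrorCorrectionTemplate`, and the sample row is illustrative rather than taken from `grammarly/coedit`:

```python
# Sketch of how the config above turns a dataset row into a training example.
# TEMPLATE approximates torchtune's GrammarErrorCorrectionTemplate (assumption);
# COLUMN_MAP is copied from the config.

TEMPLATE = "Correct this to standard English: {sentence}\n---\nCorrected: "
COLUMN_MAP = {"sentence": "src", "output": "tgt"}


def format_example(row: dict) -> tuple[str, str]:
    """Return (prompt, target) for one dataset row.

    column_map renames dataset columns ("src", "tgt") to the
    placeholder names the template expects ("sentence", "output").
    """
    fields = {placeholder: row[column] for placeholder, column in COLUMN_MAP.items()}
    prompt = TEMPLATE.format(sentence=fields["sentence"])
    return prompt, fields["output"]


row = {"src": "She no went to the market.",
       "tgt": "She did not go to the market."}
prompt, target = format_example(row)
```

With `train_on_input: False`, loss is computed only on the target tokens, not on the prompt.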
|
|
|
### Evaluation results |