---
language:
- en
license: cc-by-nc-4.0
datasets:
- facebook/asset
- wi_locness
- GEM/wiki_auto_asset_turk
- discofuse
- zaemyung/IteraTeR_plus
- jfleg
- grammarly/coedit
metrics:
- sari
- bleu
- accuracy
---
# Model Card for CoEdIT-xl-composite
This model was obtained by fine-tuning `google/flan-t5-xl` on the CoEdIT-Composite dataset. Details of the dataset can be found in our paper and repository.
**Paper:** CoEdIT: Text Editing by Task-Specific Instruction Tuning
**Authors:** Vipul Raheja, Dhruv Kumar, Ryan Koo, Dongyeop Kang
## Model Details
### Model Description
- **Language(s) (NLP)**: English
- **Finetuned from model:** google/flan-t5-xl
### Model Sources
- **Repository:** https://github.com/vipulraheja/coedit
- **Paper:** https://arxiv.org/abs/2305.09857
## How to use
We make the following models from our paper available:
<table>
<tr>
<th>Model</th>
<th>Number of parameters</th>
</tr>
<tr>
<td>CoEdIT-large</td>
<td>770M</td>
</tr>
<tr>
<td>CoEdIT-xl</td>
<td>3B</td>
</tr>
<tr>
<td>CoEdIT-xxl</td>
<td>11B</td>
</tr>
</table>
## Uses
### Text Revision Task
Given an edit instruction and an original text, our model can generate the edited version of the text.<br>
![task_specs](https://huggingface.co/grammarly/coedit-xl/resolve/main/task_examples.png)
This model can also follow composite instructions that combine multiple edit actions, as shown below:
![composite task_specs](https://huggingface.co/grammarly/coedit-xl-composite/resolve/main/composite_examples.png)
## Usage
```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("grammarly/coedit-xl-composite")
model = T5ForConditionalGeneration.from_pretrained("grammarly/coedit-xl-composite")

# A composite instruction chains several edit intents in a single prompt.
input_text = 'Fix grammatical errors in this sentence and make it simpler: When I grow up, I start to understand what he said is quite right.'
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
outputs = model.generate(input_ids, max_length=256)
edited_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(edited_text)
```
### Software
https://github.com/vipulraheja/coedit
## Citation
**BibTeX:**
```bibtex
@article{raheja2023coedit,
  title={CoEdIT: Text Editing by Task-Specific Instruction Tuning},
  author={Vipul Raheja and Dhruv Kumar and Ryan Koo and Dongyeop Kang},
  year={2023},
  eprint={2305.09857},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
**APA:**
Raheja, V., Kumar, D., Koo, R., & Kang, D. (2023). CoEdIT: Text Editing by Task-Specific Instruction Tuning. arXiv. https://arxiv.org/abs/2305.09857