---
license: apache-2.0
---

# CSC T5 - T5 for Traditional Chinese Spelling Correction

This model was obtained by instruction-tuning the `ClueAI/PromptCLUE-base-v1-5` model on a spelling error corpus.

## Model Details

### Model Description

- Language(s) (NLP): `Chinese`
- Pretrained from model: `ClueAI/PromptCLUE-base-v1-5`
- Pretrained on dataset: `1M UDN news corpus`
- Fine-tuned on dataset: `shibing624/CSC` spelling error corpus

### Model Sources

- Repository: [https://github.com/TedYeh/Chinese_spelling_Correction](https://github.com/TedYeh/Chinese_spelling_Correction)

## Usage

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("CodeTed/traditional_CSC_t5")
model = T5ForConditionalGeneration.from_pretrained("CodeTed/traditional_CSC_t5")

# The prefix "糾正句子裡的錯字: " means "Correct the typos in the sentence: ".
# The sentence contains the typo 堆動, which should read 推動 ("promote").
input_text = '糾正句子裡的錯字: 為了降低少子化,政府可以堆動獎勵生育的政策。'
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
outputs = model.generate(input_ids, max_length=256)
edited_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(edited_text)
```

### Related Project

[CodeTed/CGEDit](https://huggingface.co/CodeTed/CGEDit)
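
### Locating the Corrected Characters

The usage snippet returns the corrected sentence as a plain string, so it can be useful to see which characters were actually changed. The following is a minimal sketch using Python's standard `difflib`; it is not part of the model or its repository. The helper name `list_corrections` is hypothetical, and the `corrected` string is a hand-written stand-in for `edited_text`, assuming the model returns only the corrected sentence without the instruction prefix.

```python
from difflib import SequenceMatcher


def list_corrections(original: str, corrected: str):
    """Return (position, wrong_span, fixed_span) for every span that differs."""
    # autojunk=False disables difflib's heuristic for very long sequences.
    matcher = SequenceMatcher(None, original, corrected, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((i1, original[i1:i2], corrected[j1:j2]))
    return changes


# Input sentence without the instruction prefix.
original = "為了降低少子化,政府可以堆動獎勵生育的政策。"
# Assumed model output, written by hand for illustration only.
corrected = "為了降低少子化,政府可以推動獎勵生育的政策。"

for position, wrong, fixed in list_corrections(original, corrected):
    print(f"position {position}: {wrong!r} -> {fixed!r}")
```

In practice, `corrected` would be replaced by the `edited_text` produced in the usage snippet. Positions are zero-based character offsets into the original sentence.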