## 📖 Introduction

**Instruction-Tagger** is a model that labels each instruction with one of 33 task tags. With these tags, users can easily adjust the proportion of tasks in a dataset.

#### Example Input

> What are the main differences between Type 1 and Type 2 diabetes, and how do their treatment approaches differ?

#### Example Output

> Medicine

## 🚀 Quick Start

The following snippet shows how to load the tokenizer and model and how to tag an instruction.

```python
import torch
from transformers import DebertaV2Tokenizer, DebertaV2ForSequenceClassification

# 'deberta_cls' is the checkpoint path used in this snippet; point it at your copy of the model weights.
model = DebertaV2ForSequenceClassification.from_pretrained('deberta_cls', num_labels=33).cuda()
tokenizer = DebertaV2Tokenizer.from_pretrained('alibaba-pai/Instruction-Tagger')

# Mapping from class id to task tag.
labels = {
    14: 'Writting', 0: 'Common-Sense', 28: 'Ecology', 22: 'Medicine',
    17: 'Grammar', 3: 'Code Generation', 31: 'Others', 20: 'Paraphrase',
    19: 'Economy', 6: 'Code Debug', 21: 'Reasoning', 18: 'Computer Science',
    4: 'Technology', 13: 'Math', 32: 'Literature', 26: 'Chemistry',
    15: 'Complex Format', 25: 'Ethics', 27: 'Multilingual', 29: 'Roleplay',
    30: 'Entertainment', 23: 'Biology', 16: 'Art', 10: 'Academic Writing',
    24: 'Health', 11: 'Philosophy', 5: 'Sport', 1: 'History',
    12: 'Music', 7: 'Toxicity', 2: 'Law', 9: 'Physics',
    8: 'Counterfactual',
}


def task_cls(pp):
    """Return the task tag predicted for a single instruction string."""
    inputs = tokenizer(pp, return_tensors="pt", padding=True).to("cuda")
    with torch.no_grad():
        logits = model(**inputs).logits
    # A single input yields logits of shape (1, 33), so argmax gives the class id.
    predicted_class_id = logits.argmax().item()
    return labels[predicted_class_id]


instruct = """
What are the main differences between Type 1 and Type 2 diabetes, and how do their treatment approaches differ?
"""
tag = task_cls(instruct)
print(tag)
```

## 🔍 Evaluation

To assess the accuracy of task classification, we manually evaluated a sample of 100 entries that are not in the training set. The resulting classification precision was 92%.

## 📜 Citation

If you find our work helpful, please cite it!
```
@misc{TAPIR,
      title={Distilling Instruction-following Abilities of Large Language Models with Task-aware Curriculum Planning},
      author={Yuanhao Yue and Chengyu Wang and Jun Huang and Peng Wang},
      year={2024},
      eprint={2405.13448},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2405.13448},
}
```
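
## 🧪 Example: Adjusting Task Proportions

The Introduction mentions adjusting the proportion of tasks in a dataset. The sketch below illustrates one possible way to do that once instructions have been tagged. It is not part of the released code: the helper `rebalance_by_tag`, its parameters, and the toy data are all hypothetical, and it simply subsamples each tag to a target share, capped by how many samples are available.

```python
import random
from collections import defaultdict


def rebalance_by_tag(tagged, target_share, total, seed=0):
    """Subsample (instruction, tag) pairs so each tag approaches its target share.

    Hypothetical helper for illustration only.
    tagged: list of (instruction, tag) pairs, e.g. built with a tagger.
    target_share: dict mapping tag -> desired fraction of the output.
    total: desired number of output samples.
    """
    rng = random.Random(seed)

    # Group instructions by their tag.
    by_tag = defaultdict(list)
    for instruction, tag in tagged:
        by_tag[tag].append(instruction)

    selected = []
    for tag, share in target_share.items():
        pool = by_tag.get(tag, [])
        # Cap the quota at the number of samples available for this tag.
        quota = min(len(pool), round(share * total))
        selected.extend((text, tag) for text in rng.sample(pool, quota))

    rng.shuffle(selected)
    return selected


# Toy pre-tagged data; in practice the tags would come from the tagger.
tagged = (
    [(f"math question {i}", "Math") for i in range(8)]
    + [(f"medical question {i}", "Medicine") for i in range(2)]
)
balanced = rebalance_by_tag(tagged, {"Math": 0.5, "Medicine": 0.5}, total=4)
```

Tags missing from `target_share` are dropped from the output, and a tag with too few samples contributes only what it has, so the result can be smaller than `total`.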