AI Detect Model

Model Description

GitHub repository for exploring the source code and additional resources: https://github.com/wangkevin02/USP

The AI Detect Model is a binary classification model designed to determine whether a given text is AI-generated (label=1) or written by a human (label=0). This model plays a crucial role in providing AI detection rewards, helping to prevent reward hacking during Reinforcement Learning with Cycle Consistency (RLCC). For more details, please refer to our paper.

This model is built upon the Longformer architecture and trained using our proprietary LMSYS-USP dataset. Specifically, in a dialogue context, texts generated by the assistant are labeled as AI-generated (label=1), while user-generated texts are assigned the opposite label (label=0).

Note: Our model is subject to the following constraints:

Maximum Context Length: Supports up to 4,096 tokens. Exceeding this may degrade performance; keep inputs within this limit for best results.

Language Limitation: Optimized for English. Non-English performance may vary due to limited training data.

Quick Start

You can utilize our AI detection model as demonstrated below:

from transformers import LongformerTokenizer, LongformerForSequenceClassification
import torch
import torch.nn.functional as F

class AIDetector:
    def __init__(self, model_name="allenai/longformer-base-4096", max_length=4096):
        """
        Initialize the AIDetector with a pretrained Longformer model and tokenizer.

        Args:
            model_name (str): The name or path of the pretrained Longformer model.
            max_length (int): The maximum sequence length for tokenization.
        """
        self.tokenizer = LongformerTokenizer.from_pretrained(model_name)
        self.model = LongformerForSequenceClassification.from_pretrained(model_name)
        self.model.eval()
        self.max_length = max_length
        self.tokenizer.padding_side = "right"

    @torch.no_grad()
    def get_probability(self, texts):
        inputs = self.tokenizer(texts, padding=True, truncation=True, max_length=self.max_length, return_tensors='pt')
        outputs = self.model(**inputs)
        probabilities = F.softmax(outputs.logits, dim=1)
        return probabilities

# Example usage
if __name__ == "__main__":
    classifier = AIDetector(model_name="/path/to/ai_detector")
    target_text = [
        "I am thinking about going away for vacation",
        "How can I help you today?"
        ]
    result = classifier.get_probability(target_text)
    print(result)
    # >>> Expected Output:
    # >>> tensor([[0.9954, 0.0046],
    # >>>         [0.0265, 0.9735]])

Citation

If you find this model useful, please cite:

@misc{wang2025knowbettermodelinghumanlike,
      title={Know You First and Be You Better: Modeling Human-Like User Simulators via Implicit Profiles}, 
      author={Kuang Wang and Xianfei Li and Shenghao Yang and Li Zhou and Feng Jiang and Haizhou Li},
      year={2025},
      eprint={2502.18968},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.18968}, 
}

wangkevin02
/

AI_Detect_Model

AI Detect Model

Model Description

Quick Start

Citation

Model tree for wangkevin02/AI_Detect_Model

Dataset used to train wangkevin02/AI_Detect_Model

Collection including wangkevin02/AI_Detect_Model

USP