---
license: mit
datasets:
- wangkevin02/LMSYS-USP
language:
- en
metrics:
- accuracy
base_model:
- allenai/longformer-base-4096
---
# AI Detect Model

## Model Description

> Explore the source code and additional resources in our **GitHub repository**: https://github.com/wangkevin02/USP

The **AI Detect Model** is a binary classification model that determines whether a given text is AI-generated (label=1) or human-written (label=0). It provides the AI-detection reward that helps prevent reward hacking during Reinforcement Learning with Cycle Consistency (RLCC). For more details, please refer to [our paper](https://arxiv.org/pdf/2502.18968).

This model is built on the [Longformer](https://huggingface.co/allenai/longformer-base-4096) architecture and trained on our [LMSYS-USP](https://huggingface.co/datasets/wangkevin02/LMSYS-USP) dataset. Specifically, within each dialogue, texts generated by the assistant are labeled as AI-generated (label=1), while user-generated texts receive the human label (label=0).
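To illustrate the labeling scheme, the minimal sketch below turns a toy two-turn dialogue into `(text, label)` pairs. The `role`/`content` turn format is an assumption made for this example, not the actual LMSYS-USP schema.

```python
# Illustrative only: the dialogue format is an assumed structure, not the LMSYS-USP schema.
dialogue = [
    {"role": "user", "content": "I am thinking about going away for vacation"},
    {"role": "assistant", "content": "How can I help you today?"},
]

# Assistant turns -> label 1 (AI-generated); user turns -> label 0 (human-written).
examples = [(turn["content"], 1 if turn["role"] == "assistant" else 0) for turn in dialogue]
print(examples)
# [('I am thinking about going away for vacation', 0), ('How can I help you today?', 1)]
```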
> *Note*: Our model is subject to the following constraints:
>
> 1. **Maximum Context Length**: Supports up to **4,096 tokens**. Exceeding this may degrade performance; keep inputs within this limit for best results (see the length-check sketch after this note).
> 2. **Language Limitation**: Optimized for English. Non-English performance may vary due to limited training data.
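To respect the token limit, you can pre-check input length before scoring. The sketch below loads the base Longformer tokenizer for illustration; it only reports a warning, since the Quick Start code below already truncates overly long inputs via `truncation=True`.

```python
from transformers import LongformerTokenizer

tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")

text = "I am thinking about going away for vacation"
n_tokens = len(tokenizer(text)["input_ids"])  # token count including special tokens

if n_tokens > 4096:
    # Longer inputs are truncated by the detector, which may degrade accuracy.
    print(f"Warning: input has {n_tokens} tokens; only the first 4096 will be scored.")
```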
## Quick Start

You can use our AI detection model as shown below:
```python
from transformers import LongformerTokenizer, LongformerForSequenceClassification
import torch
import torch.nn.functional as F


class AIDetector:
    def __init__(self, model_name="allenai/longformer-base-4096", max_length=4096):
        """
        Initialize the AIDetector with a pretrained Longformer model and tokenizer.

        Args:
            model_name (str): The name or path of the pretrained Longformer model.
            max_length (int): The maximum sequence length for tokenization.
        """
        self.tokenizer = LongformerTokenizer.from_pretrained(model_name)
        self.model = LongformerForSequenceClassification.from_pretrained(model_name)
        self.model.eval()
        self.max_length = max_length
        self.tokenizer.padding_side = "right"

    @torch.no_grad()
    def get_probability(self, texts):
        """
        Return class probabilities for each text: column 0 is P(human-written), column 1 is P(AI-generated).
        """
        inputs = self.tokenizer(texts, padding=True, truncation=True, max_length=self.max_length, return_tensors="pt")
        outputs = self.model(**inputs)
        probabilities = F.softmax(outputs.logits, dim=1)
        return probabilities


# Example usage
if __name__ == "__main__":
    # Point model_name at the downloaded checkpoint of this AI Detect Model.
    classifier = AIDetector(model_name="/path/to/ai_detector")
    target_text = [
        "I am thinking about going away for vacation",
        "How can I help you today?"
    ]
    result = classifier.get_probability(target_text)
    print(result)
    # >>> Expected Output:
    # >>> tensor([[0.9954, 0.0046],
    # >>>         [0.0265, 0.9735]])
```
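Each row of the returned tensor sums to 1: column 0 is the probability that the text is human-written and column 1 the probability that it is AI-generated, so in the expected output the first sentence is judged human-written and the second AI-generated. If you only need scores or hard labels, a small follow-up sketch (continuing from `result` above):

```python
ai_scores = result[:, 1]       # P(AI-generated) for each input text
labels = result.argmax(dim=1)  # 0 = human-written, 1 = AI-generated
print(ai_scores.tolist(), labels.tolist())
```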
## Citation

If you find this model useful, please cite:

```plaintext
@misc{wang2025knowbettermodelinghumanlike,
  title={Know You First and Be You Better: Modeling Human-Like User Simulators via Implicit Profiles},
  author={Kuang Wang and Xianfei Li and Shenghao Yang and Li Zhou and Feng Jiang and Haizhou Li},
  year={2025},
  eprint={2502.18968},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2502.18968},
}
```