Detects whether the answer in a prompt-answer (QA) pair was generated by ChatGPT.

A DeBERTa-v3-Large model fine-tuned on the GPT Wiki intro and HC3 datasets.

We used a few tricks during training to optimize classification results.

Fine-tuning ran for 1 epoch with an 8:2 train/validation split. Validation accuracy was 99% on both GPT Wiki intro and HC3 after that epoch.
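
The 8:2 split mentioned above can be illustrated with a minimal sketch. The card does not say how the split was made (random, stratified, or per dataset), so the seeded random shuffle, the `train_val_split` helper, and the toy rows below are all assumptions for illustration, not the authors' procedure.

```python
import random

def train_val_split(rows, val_fraction=0.2, seed=42):
    """Shuffle rows and split them into (train, validation) lists."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be between 0 and 1")
    shuffled = list(rows)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_fraction))
    # The first n_val shuffled rows become validation, the rest training.
    return shuffled[n_val:], shuffled[:n_val]

# Hypothetical toy data: (prompt, answer, label), label 1 = ChatGPT, 0 = human.
examples = [(f"question {i}", f"answer {i}", i % 2) for i in range(10)]
train_rows, val_rows = train_val_split(examples)
print(len(train_rows), len(val_rows))  # prints "8 2"
```

Fixing the seed keeps the validation set identical across runs, which makes accuracy figures comparable between experiments.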

This model recorded an accuracy of 93.467% on the public leaderboard and 94.00% on the private leaderboard of Kaggle's Detect ChatGPT Answers contest.

Example Usage

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
import torch.nn as nn

model = AutoModelForSequenceClassification.from_pretrained("LDKSolutions/chatgpt-qa-detector-deberta-v3-large", num_labels=2) 
tokenizer = AutoTokenizer.from_pretrained("LDKSolutions/chatgpt-qa-detector-deberta-v3-large")

model.eval() 

prompt = "What do you think about Bitcoin?"
explanation = ''' As an AI language model, I do not have personal opinions or emotions. However, I can provide information and insights about Bitcoin.

Bitcoin is a decentralized digital currency that operates on a peer-to-peer network, without the need for intermediaries such as banks or financial institutions. It was created in 2009 by an unknown person or group using the pseudonym Satoshi Nakamoto.

Bitcoin has gained popularity as a means of exchange and a store of value, with some proponents seeing it as a potential alternative to traditional fiat currencies. However, it has also been subject to criticism due to its perceived lack of regulation and potential use in illegal activities.

From a technical standpoint, Bitcoin is based on blockchain technology, a distributed ledger that records all transactions made on the network. The blockchain ensures that transactions are secure and transparent, while also allowing for anonymity.

Like any investment or asset, Bitcoin has its risks and potential rewards. Its value can be volatile and subject to market fluctuations, and there is no guarantee of returns. It is important to do thorough research and seek professional advice before making any investment decisions.
'''

encoded_input = tokenizer(prompt, explanation, max_length=512, truncation=True, padding="max_length", return_tensors="pt") 

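with torch.no_grad():
    # Logits have shape (1, 2): one row per input pair, one column per class.
    output = model(**encoded_input).logits
    probs = nn.Softmax(dim=1)(output)
    # Take the argmax over the class dimension; class 1 means ChatGPT.
    predicted_class = torch.argmax(probs, dim=1).item()
    if predicted_class == 1:
        print("Likely to be generated by ChatGPT!")
    else:
        print("Likely to be generated by a human!")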
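
The example above prints only a hard label. When a confidence score or an adjustable decision threshold is useful, the two logits can be turned into a probability first. The following is a minimal sketch in plain Python: `classify_from_logits` and its `threshold` parameter are hypothetical names that are not part of this model or of transformers, and the logits passed in are made up. It assumes, as in the example above, that class index 1 means ChatGPT.

```python
import math

def classify_from_logits(logits, threshold=0.5):
    """Return (label, p_chatgpt) for one [human_logit, chatgpt_logit] pair."""
    if len(logits) != 2:
        raise ValueError("expected exactly two logits")
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    p_chatgpt = exps[1] / sum(exps)
    label = "ChatGPT" if p_chatgpt >= threshold else "Human"
    return label, p_chatgpt

# Made-up logits for illustration only.
label, p = classify_from_logits([-1.2, 2.3])
print(f"{label} (p_chatgpt={p:.3f})")
```

With the model loaded as in the example, the input list could be obtained as `output[0].tolist()`. Raising `threshold` above 0.5 trades missed ChatGPT answers for fewer false accusations against human-written text.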