
Chinese pre-trained dialogue model (CDial-GPT)

This project provides a large-scale Chinese GPT model pre-trained on the LCCC dataset.

We present a series of Chinese GPT models that are first pre-trained on a Chinese novel dataset and then post-trained on our LCCC dataset.

Similar to TransferTransfo, we concatenate all dialogue histories into one context sentence, and use this sentence to predict the response. The input of our model consists of word embedding, speaker embedding, and positional embedding of each word.
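The input layout described above can be sketched in plain Python. This is a minimal illustration, not the project's actual preprocessing code: the function name `build_input`, the integer ids for the special tokens, and the choice to tag the leading start token with its own id are all assumptions made for the example. Real ids come from the tokenizer's vocabulary.

```python
from itertools import chain

# Hypothetical special-token ids, for illustration only.
BOS, EOS, SPEAKER1, SPEAKER2 = 0, 1, 2, 3


def build_input(history, reply):
    """Flatten dialogue turns into one sequence.

    history: list of turns, each a list of token ids.
    reply:   list of token ids for the response to predict.
    Returns (input_ids, token_type_ids, position_ids), one entry per token,
    which index the word, speaker, and positional embeddings respectively.
    """
    turns = history + [reply + [EOS]]
    # Prefix every turn with an alternating speaker marker.
    tagged = [[SPEAKER2 if i % 2 else SPEAKER1] + turn
              for i, turn in enumerate(turns)]
    input_ids = [BOS] + list(chain.from_iterable(tagged))
    # Each token inherits the speaker marker of the turn it belongs to.
    token_type_ids = [BOS] + [turn[0] for turn in tagged for _ in turn]
    position_ids = list(range(len(input_ids)))
    return input_ids, token_type_ids, position_ids


input_ids, token_type_ids, position_ids = build_input(
    history=[[10, 11], [12]], reply=[13, 14]
)
```

All three lists have the same length, so each token position has exactly one word id, one speaker id, and one position id.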

Paper: A Large-Scale Chinese Short-Text Conversation Dataset

How to use

from transformers import OpenAIGPTLMHeadModel, GPT2LMHeadModel, BertTokenizer
import torch

# Load the tokenizer and the GPT-2 model weights from the Hugging Face Hub.
tokenizer = BertTokenizer.from_pretrained("thu-coai/CDial-GPT2_LCCC-base")
model = GPT2LMHeadModel.from_pretrained("thu-coai/CDial-GPT2_LCCC-base")

For more details, please refer to our repository on GitHub.

Model size: 98.6M params (Safetensors, tensor type F32)
