DistilGPT-2 Chatbot

This project demonstrates how to use the pre-trained distilgpt2 model from Hugging Face to build a simple chatbot. It includes code for loading the model and tokenizer, generating responses, and running an interactive conversation loop.

Prerequisites

Ensure you have the following libraries installed:

pip install transformers torch

Usage

  1. Load the Pre-trained Model and Tokenizer

    from transformers import GPT2LMHeadModel, GPT2Tokenizer
    
    model_name = "distilgpt2"
    model = GPT2LMHeadModel.from_pretrained(model_name)
    tokenizer = GPT2Tokenizer.from_pretrained(model_name)
    
  2. Generate a Response

    Use the following function to generate a response based on user input:

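    def generate_response(prompt, max_length=100):
        # Tokenize the prompt; the attention mask marks which tokens are real input
        inputs = tokenizer(prompt, return_tensors='pt')
        output = model.generate(
            inputs['input_ids'],
            attention_mask=inputs['attention_mask'],
            max_length=max_length,  # total length in tokens, prompt included
            pad_token_id=tokenizer.eos_token_id,
            no_repeat_ngram_size=2,
            num_return_sequences=1,
            do_sample=True,  # required for temperature, top_p and top_k to take effect
            temperature=0.7,
            top_p=0.9,
            top_k=50
        )
        # Decode only the newly generated tokens so the prompt is not echoed back
        new_tokens = output[0][inputs['input_ids'].shape[-1]:]
        response = tokenizer.decode(new_tokens, skip_special_tokens=True)
        return response.strip()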
    
  3. Interactive Conversation Loop

    Run the following code to start a chat session:

    while True:
        user_input = input("You: ")
        # Check for an exit command before generating a reply
        if user_input.strip().lower() in ["exit", "quit"]:
            break
    
        prompt = f"<user> {user_input}<AI>"
        response = generate_response(prompt)
        print(f"AI: {response}")
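
The loop above treats every message independently. As a minimal sketch of how a few earlier turns could be folded into the prompt, the helpers below reuse the <user>/<AI> markers from the loop. The names is_exit_command and build_prompt, the max_turns parameter, and the newline separator between turns are illustrative assumptions, not part of the original code, and the model is not guaranteed to respond well to this multi-turn layout.

```python
def is_exit_command(text):
    """Return True when the user asks to leave the chat."""
    return text.strip().lower() in {"exit", "quit"}


def build_prompt(history, user_input, max_turns=3):
    """Build a prompt from recent (user, ai) turns plus the new message.

    history is a list of (user_text, ai_text) tuples (hypothetical format).
    Only the last max_turns exchanges are kept so the prompt stays short.
    """
    recent = history[-max_turns:] if max_turns > 0 else []
    parts = [f"<user> {user_text}<AI> {ai_text}" for user_text, ai_text in recent]
    # The final segment ends with the <AI> marker so the model continues from there
    parts.append(f"<user> {user_input}<AI>")
    return "\n".join(parts)
```

With an empty history, build_prompt produces the same single-turn prompt the loop already uses. Keeping max_turns small matters because the prompt counts toward the max_length budget passed to generation.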
    

Configuration

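  • Temperature: Scales the randomness of sampling. Lower values make the output more deterministic.
  • Top-p and top-k: Restrict sampling to the most probable tokens. Top-k keeps the k most likely tokens; top-p keeps the smallest set of tokens whose cumulative probability reaches p. Together they balance diversity and coherence.
  • Max_length: Caps the total sequence length in tokens, including the prompt.

Temperature, top-p, and top-k only take effect when sampling is enabled (do_sample=True in model.generate).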
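
To make the effect of these settings concrete, here is a toy illustration on a four-token vocabulary, written in plain Python. It is a simplified sketch of the general techniques, not the transformers implementation; the function names and the example logits are made up for illustration.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep the k most probable tokens, zero out the rest, and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    kept = [q if i in keep else 0.0 for i, q in enumerate(probs)]
    total = sum(kept)
    return [q / total for q in kept]


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    kept = [q if i in keep else 0.0 for i, q in enumerate(probs)]
    total = sum(kept)
    return [q / total for q in kept]


logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four candidate tokens
base = softmax_with_temperature(logits, temperature=1.0)
sharper = softmax_with_temperature(logits, temperature=0.7)
print(max(base) < max(sharper))  # lower temperature favors the top token
print(top_k_filter(base, 2))     # only the two most likely tokens remain
print(top_p_filter(base, 0.9))   # tokens kept until their mass reaches 0.9
```

After filtering, the next token would be drawn at random from the remaining probabilities, which is why tighter settings give more predictable replies.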

Dataset used to train arcsu1/DistilGPT2xDialogsum