license: mit
datasets:
- OpenAssistant/oasst1
language:
- en
pipeline_tag: conversational
Ava small
Training Details
The fine-tuning process for this model involved several key parameters and settings:
- Base Model: GPT-2
- Dataset: Open Assistant's oasst1 dataset
- Learning Rate: 1e-3
- Epochs: 10
- Hardware: GPU P100
The model was trained on a P100 GPU to speed up training through the hardware's parallel processing. The learning rate was set to 1e-3 to balance fast convergence against the risk of overshooting.
Model Performance
After 10 epochs of training, the model generated more coherent and contextually relevant responses in conversations. However, its responses may still contain occasional inaccuracies or inconsistencies.
Custom Tokens and Contextualization
To facilitate structured conversations and improve response generation, the following custom tokens were added:
- <startoftext>: Marks the beginning of a conversation prompt.
- <endoftext>: Marks the end of a conversation prompt.
- <ava>: Denotes the beginning of responses generated by the AI assistant.
- </ava>: Denotes the end of AI-generated responses.
- <user>: Denotes the beginning of user input in the conversation.
- </user>: Denotes the end of user input.
Here is an example prompt:
<startoftext><user>Hello</user><ava>Hello there, How can i assist you today?</ava></endoftext>
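The prompt layout shown above can also be assembled programmatically. The following is a minimal sketch, assuming the tags are concatenated without whitespace as in the example; `build_prompt` is a hypothetical helper name and is not part of the model's repository.

```python
# Custom tags as listed in this model card.
START = "<startoftext>"
USER_OPEN, USER_CLOSE = "<user>", "</user>"
AVA_OPEN = "<ava>"


def build_prompt(user_text: str) -> str:
    """Wrap a user message in the custom tags.

    The <ava> tag is left open so the model can complete the assistant reply.
    (Hypothetical helper, for illustration only.)
    """
    return f"{START}{USER_OPEN}{user_text.strip()}{USER_CLOSE}{AVA_OPEN}"


print(build_prompt("Hello"))
# prints: <startoftext><user>Hello</user><ava>
```

Leaving `<ava>` open means the generated text is expected to continue with the assistant's reply and, ideally, close it with `</ava>`.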
Use Cases and Applications
Given its training on dialogues and conversations, this fine-tuned model is particularly well-suited for the following use cases:
- Dynamic and engaging conversations with users in chatbots or virtual assistants.
- Providing personalized information and assistance across diverse domains.
- Generating contextually relevant and creative responses to user inputs.
- Enhancing the user experience and interaction quality.
Inference script
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer


def inference(text, model, tokenizer):
    # Wrap the user message in the custom tags and leave <ava> open for the model to complete.
    data = tokenizer.encode(f'<startoftext><user>{text}</user><ava>', return_tensors='pt')
    input_ids = data.to(device)

    output = model.generate(
        input_ids=input_ids,
        do_sample=True,  # required for temperature, top_k and top_p to take effect
        temperature=0.8,
        max_length=100,
        top_k=50,
        top_p=0.95,
        repetition_penalty=1.2,
        num_return_sequences=1,
    )

    # Assumes the custom tags are regular added tokens, so skip_special_tokens does not strip them.
    decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)

    # Keep only the text between <ava> and </ava>, then cut it at the first period.
    ava_response = decoded_output.split('<ava>')[1].split('</ava>')[0]
    clean_response = ava_response.split('.')[0].strip()

    return clean_response


model_name = 'Kuduxaaa/ava-small'
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)

# Use the GPU when one is available.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model.to(device)

user_input = "What's the weather like today?"
response = inference(user_input, model, tokenizer)
print('Ava:', response)
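The chained `split` calls in the script assume that the decoded text always contains `<ava>`; if the tag is missing, indexing `[1]` raises an `IndexError`. A more defensive extraction could look like the sketch below. `extract_response` is a hypothetical helper, not part of the model's repository, and it returns the whole reply instead of cutting it at the first period.

```python
import re

# Capture everything after <ava> up to </ava>, or up to the end of the text
# when the model never emitted the closing tag.
_AVA_PATTERN = re.compile(r"<ava>(.*?)(?:</ava>|\Z)", re.DOTALL)


def extract_response(decoded: str, default: str = "") -> str:
    """Return the assistant reply found in decoded model output.

    Falls back to `default` when no <ava> tag is present.
    (Hypothetical helper, for illustration only.)
    """
    match = _AVA_PATTERN.search(decoded)
    if match is None:
        return default
    return match.group(1).strip()


print(extract_response("<startoftext><user>Hello</user><ava>Hello there</ava>"))
# prints: Hello there
```

In the script, this function could stand in for the two `split` lines, with `decoded_output` passed as the argument.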