Model Overview

This is a fine-tune of Google's FLAN-T5 model, trained on the "samsum" dataset to summarise chat logs. Other model sizes are available in this same series.

At the time of writing, no larger models are planned for this series. In our testing, this model is the best one currently available.

Intended Use

The model is intended to be used for generating summaries of chat logs. It can be employed in a wide range of applications, including but not limited to chat analysis, conversation summarization, and dialogue-based content generation.
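
As a rough illustration of this use, the sketch below formats a chat log and passes it to the model through the Hugging Face transformers library. The repository name is the one shown on this page; the "Speaker: message" input layout, the 512-token input limit, the 64-token output limit, and the helper names format_dialogue and summarise are assumptions made for this example, not documented settings of the model.

```python
from typing import List, Tuple

MODEL_ID = "KoalaAI/ChatSum-Large"  # repository name as shown on this page


def format_dialogue(turns: List[Tuple[str, str]]) -> str:
    """Render (speaker, message) pairs as one 'Speaker: message' line per turn.

    This layout is an assumption based on how samsum-style chat logs usually look.
    """
    return "\n".join(f"{speaker}: {message.strip()}" for speaker, message in turns)


def summarise(dialogue: str, max_new_tokens: int = 64) -> str:
    """Generate a summary for a single dialogue string.

    transformers is imported lazily so format_dialogue works without it installed.
    The first call downloads the model weights.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate long chats; 512 tokens is an assumed limit, not a documented one.
    inputs = tokenizer(dialogue, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call might look like summarise(format_dialogue([("Amanda", "I baked cookies."), ("Jerry", "Save me one!")])). Loading the tokenizer and model inside the function keeps the sketch short; an application that summarises many chats would load them once and reuse them.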

Training Data

The model has been fine-tuned on the samsum dataset, which contains conversations between two or more participants. The dataset is in English, and each conversation is associated with a summary that captures the main points of the discussion.
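
To make the record layout concrete, here is a small sketch that works on a single samsum-style record with a "dialogue" field and a "summary" field. The helper names speakers and to_training_pair, the output keys input_text and target_text, and the idea that each turn starts with "Name:" are assumptions for illustration; the sample record in the usage note is made up and is not taken from the dataset.

```python
from typing import Dict, List


def speakers(dialogue: str) -> List[str]:
    """Return the distinct speaker names in order of first appearance.

    Assumes each turn is a line of the form 'Name: message'.
    """
    seen: List[str] = []
    for line in dialogue.splitlines():
        name, sep, _ = line.partition(":")
        name = name.strip()
        if sep and name and name not in seen:
            seen.append(name)
    return seen


def to_training_pair(record: Dict[str, str]) -> Dict[str, str]:
    """Map one record to a (source, target) pair for sequence-to-sequence training."""
    return {
        # Normalise Windows-style line endings between turns.
        "input_text": record["dialogue"].replace("\r\n", "\n"),
        "target_text": record["summary"],
    }
```

For a made-up record such as {"dialogue": "Amanda: I baked cookies.\r\nJerry: Save me one!", "summary": "Amanda baked cookies."}, speakers returns both participant names and to_training_pair returns the chat as the input and the summary as the target.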

Limitations and Ethical Considerations

As with any language model, this fine-tuned FLAN-T5 model has certain limitations and potential ethical considerations:

  1. Limited Context Understanding: The model's performance heavily relies on the context provided in the chat logs. It may not fully understand the nuances of the conversation, leading to occasional inaccuracies in the generated summaries.

  2. Biases in Training Data: The model's fine-tuning data (samsum dataset) may contain biases present in the original data source. This could lead to biased or unfair summaries being generated.

  3. Privacy and Data Security: If the chat logs used for summarization contain sensitive or private information, using this model may pose privacy risks, and proper data anonymization measures should be taken.

  4. Responsibility in Use: The model should be used responsibly, and the generated summaries should be carefully analyzed before making any critical decisions based on them.
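
The anonymization mentioned in point 3 could start from something like the sketch below, which replaces speaker names with neutral aliases and masks e-mail addresses and phone-like numbers before a chat log is summarised. The function name anonymise, the alias scheme, and both regular expressions are assumptions for illustration only. They will miss many kinds of personal data and are not a substitute for a proper privacy review.

```python
import re
from typing import Dict

# Deliberately simple patterns; real-world data needs more robust detection.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d ().-]{6,}\d")


def anonymise(dialogue: str) -> str:
    """Mask e-mails and phone-like numbers, then replace speaker names with aliases.

    Assumes each turn is a line of the form 'Name: message'.
    """
    # First pass: assign Speaker1, Speaker2, ... in order of first appearance.
    aliases: Dict[str, str] = {}
    for line in dialogue.splitlines():
        name, sep, _ = line.partition(":")
        name = name.strip()
        if sep and name and name not in aliases:
            aliases[name] = f"Speaker{len(aliases) + 1}"

    # Second pass: mask contact details before touching names.
    text = EMAIL_RE.sub("[EMAIL]", dialogue)
    text = PHONE_RE.sub("[PHONE]", text)

    # Replace whole-word occurrences of each speaker name, including mentions
    # inside other speakers' messages.
    for name, alias in aliases.items():
        text = re.sub(rf"\b{re.escape(name)}\b", alias, text)
    return text
```

Names of people who never speak in the chat are not detected by this approach, and a message line that happens to begin with a word and a colon would be mistaken for a speaker turn.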

Validation Metrics

  • Loss: 1.218
  • Rouge1: 49.316
  • Rouge2: 26.518
  • RougeL: 42.229
  • RougeLsum: 45.716
  • Gen Len: 16.799
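
For readers unfamiliar with ROUGE, the sketch below computes a simplified ROUGE-N F1 score from n-gram overlap between a reference summary and a generated one, scaled to 0-100 like the figures above. It is only an illustration of what the metric measures: it assumes lowercase whitespace tokenization with no stemming, so it will not reproduce the reported numbers, and it is not the evaluation code used for this model. The function names are hypothetical.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1 on a 0-100 scale."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it appears in both.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)
```

With n=1 this corresponds to the idea behind Rouge1 and with n=2 to Rouge2. RougeL and RougeLsum are based on the longest common subsequence instead and are not covered by this sketch.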

Carbon Emissions

  • CO2 Emissions (in grams): 0.1659

Model Size

  • Format: Safetensors
  • Parameters: 783M
  • Tensor type: F32

Dataset used to train KoalaAI/ChatSum-Large: samsum