Text2Text Generation
GGUF
5 languages
flan-t5-xl-gguf / README.md
bayang's picture
Update README.md
a2774c1
metadata
language:
  - en
  - fr
  - ro
  - de
  - multilingual
widget:
  - text: 'Translate to German:  My name is Arthur'
    example_title: Translation
  - text: >-
      Please answer to the following question. Who is going to be the next
      Ballon d'or?
    example_title: Question Answering
  - text: >-
      Q: Can Geoffrey Hinton have a conversation with George Washington? Give
      the rationale before answering.
    example_title: Logical reasoning
  - text: >-
      Please answer the following question. What is the boiling point of
      Nitrogen?
    example_title: Scientific knowledge
  - text: >-
      Answer the following yes/no question. Can you write a whole Haiku in a
      single tweet?
    example_title: Yes/no question
  - text: >-
      Answer the following yes/no question by reasoning step-by-step. Can you
      write a whole Haiku in a single tweet?
    example_title: Reasoning task
  - text: 'Q: ( False or not False or False ) is? A: Let''s think step by step'
    example_title: Boolean Expressions
  - text: >-
      The square root of x is the cube root of y. What is y to the power of 2,
      if x = 4?
    example_title: Math reasoning
  - text: >-
      Premise:  At my age you will probably have learnt one lesson. Hypothesis: 
      It's not certain how many lessons you'll learn by your thirties. Does the
      premise entail the hypothesis?
    example_title: Premise and hypothesis
tags:
  - text2text-generation
datasets:
  - svakulenk0/qrecc
  - taskmaster2
  - djaym7/wiki_dialog
  - deepmind/code_contests
  - lambada
  - gsm8k
  - aqua_rat
  - esnli
  - quasc
  - qed
license: apache-2.0

Model Card for FLAN-T5 XL

drawing

Table of Contents

  1. TL;DR
  2. Model Details
  3. Usage
  4. Uses
  5. Bias, Risks, and Limitations
  6. Training Details
  7. Evaluation
  8. Environmental Impact
  9. Citation

TL;DR

If you already know T5, FLAN-T5 is just better at everything. For the same number of parameters, these models have been fine-tuned on more than 1000 additional tasks covering also more languages. As mentioned in the first few lines of the abstract :

Flan-PaLM 540B achieves state-of-the-art performance on several benchmarks, such as 75.2% on five-shot MMLU. We also publicly release Flan-T5 checkpoints,1 which achieve strong few-shot performance even compared to much larger models, such as PaLM 62B. Overall, instruction finetuning is a general method for improving the performance and usability of pretrained language models.

Disclaimer: Content from this model card has been written by the Hugging Face team, and parts of it were copy pasted from the T5 model card.

Model Details

Model Description

The details are in the original google/flan-t5-xl