GPT-J-Title-Teaser-1k

gptj-title-teaser-1k
Version 1.0 / 22 December 2022

A proof of concept for multitask fine-tuning of GPT-J-6B-8bit for German news title and teaser generation.

Model Details

Model Description

  • Developed by: snipaid
  • Model type: gptj
  • Language(s) (NLP): de
  • License: MIT
  • Finetuned from model: GPT-J-6B-8bit

Uses

This model is not intended for use! It is a preliminary version of gptj-title-teaser-10k, built to validate the multitask fine-tuning approach.
For practical use, please refer to gptj-title-teaser-10k.

Training Details

Training Data

The model was finetuned on a collection of 1,000 news items scraped from various German-language online news outlets.

For each news item the dataset contains title, teaser and fulltext.

[
  {
    "title": ...,
    "teaser": ...,
    "fulltext": ...
  },
  ...
]
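
A minimal sketch of how a file in this layout could be loaded and checked, assuming the data is stored as a JSON array. The `load_news_items` helper and the sample values are hypothetical and serve only as illustration; they are not part of the original dataset or tooling.

```python
import json

# Keys that every news item is expected to carry, per the layout above.
REQUIRED_KEYS = ("title", "teaser", "fulltext")


def load_news_items(raw_json: str) -> list[dict]:
    """Parse a JSON array of news items and check that each has the expected keys."""
    items = json.loads(raw_json)
    for index, item in enumerate(items):
        missing = [key for key in REQUIRED_KEYS if key not in item]
        if missing:
            raise ValueError(f"news item {index} is missing keys: {missing}")
    return items


# Hypothetical sample record; the values are placeholders, not real data.
sample = '[{"title": "Beispieltitel", "teaser": "Beispielteaser", "fulltext": "Beispieltext"}]'
items = load_news_items(sample)
```

Failing early on a missing key avoids silently producing training inputs with empty titles or teasers.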

Training Procedure

The model was finetuned on both tasks jointly using a causal language modeling (CLM) objective.
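
Under a CLM objective, the model learns to predict each token from the tokens that precede it. A minimal sketch of the resulting (context, target) pairs follows. It assumes whitespace splitting as a stand-in for the real tokenizer, and the `next_token_pairs` helper and the example string are hypothetical.

```python
def next_token_pairs(tokens: list[str]) -> list[tuple[list[str], str]]:
    """Return (context, target) pairs: each token is predicted from all tokens before it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


# Placeholder sequence; whitespace splitting is an assumption for illustration only.
tokens = "[Text]: Beispieltext [Title]: Beispieltitel".split()
pairs = next_token_pairs(tokens)

for context, target in pairs:
    print(context, "->", target)
```

Because the task output follows the article text in the same sequence, a single next-token objective can cover more than one task, provided the sequences are marked accordingly.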

Preprocessing

For each news item, two inputs were built by concatenating the fulltext with the title and with the teaser, as shown below.

f"[Text]: {item.fulltext} \n [Title]: {item.title}"
f"[Text]: {item.fulltext} \n [Teaser]: {item.teaser}"

This results in one input per task for each news item.

Note: The inserted prompt "[Text]:" marks the beginning of the news item's fulltext.
In the same manner "[Title]:" prompts the news item's title and "[Teaser]:" the news item's teaser.
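
The preprocessing step can be sketched as follows. The two f-strings are taken from the templates above; the `NewsItem` dataclass and the `build_training_inputs` and `build_dataset` helpers are hypothetical names introduced for illustration, and the sample values are placeholders.

```python
from dataclasses import dataclass


@dataclass
class NewsItem:
    title: str
    teaser: str
    fulltext: str


def build_training_inputs(item: NewsItem) -> list[str]:
    """Build one training input per task (title, teaser) for a single news item."""
    return [
        f"[Text]: {item.fulltext} \n [Title]: {item.title}",
        f"[Text]: {item.fulltext} \n [Teaser]: {item.teaser}",
    ]


def build_dataset(items: list[NewsItem]) -> list[str]:
    """Flatten the per-item inputs into one list of training strings."""
    return [text for item in items for text in build_training_inputs(item)]


# Placeholder item, not real data.
example = NewsItem(title="Beispieltitel", teaser="Beispielteaser", fulltext="Beispieltext")
inputs = build_dataset([example])
```

With this scheme, 1,000 news items yield 2,000 training inputs, one per task and item.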

Evaluation

1,000 German news articles proved to be sufficient to validate the approach. Evaluation showed that the model improved compared to the GPT-J baseline in:

  • German language capabilities (significantly)
  • title generation (significantly)
  • teaser generation (slightly)

The evaluation also suggested that there is still room for improvement with more data.
For the model trained with the same approach but 10x the amount of data, please refer to gptj-title-teaser-10k.

Environmental Impact

Carbon emissions were estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: A100 SXM4
  • Hours used: 2h 42min
  • Cloud Provider: Vast.ai
  • Compute Region: Unknown
  • Carbon Emitted: ~0.47 kg CO2e
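
The reported figure can be roughly reproduced with the common estimate of energy used multiplied by carbon intensity. The sketch below assumes the GPU draws a rated 400 W for the whole run and that the grid carbon intensity is 0.432 kg CO2e per kWh. Both values are assumptions chosen for illustration and are not stated in this card; the function name is hypothetical.

```python
def estimate_emissions_kg(power_kw: float, hours: float, kg_co2e_per_kwh: float) -> float:
    """Estimate emissions as power draw (kW) x runtime (h) x carbon intensity (kg CO2e/kWh)."""
    return power_kw * hours * kg_co2e_per_kwh


hours_used = 2 + 42 / 60          # 2h 42min, as reported above
assumed_power_kw = 0.4            # assumption: rated power of one A100 SXM4
assumed_intensity = 0.432         # assumption: kg CO2e per kWh, region unknown

emissions = estimate_emissions_kg(assumed_power_kw, hours_used, assumed_intensity)
print(f"~{emissions:.2f} kg CO2e")
```

The result depends heavily on the assumed carbon intensity, which is why an unknown compute region limits the precision of the estimate.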

Glossary

News item (aka news article): A particular piece of news, usually from a journalistic source.
Snippet: A small section of text that is related to a news item.
Title (aka headline): A few words that reflect the essence of the news story.
Teaser (aka lede): A few sentences that spark curiosity about the "best of the rest" of the news story.
