---
language:
  - ko
thumbnail: >-
  https://aeiljuispo.cloudimg.io/v7/https://s3.amazonaws.com/moonup/production/uploads/1657076373819-noauth.jpeg?w=200&h=200&f=face
tags:
  - korean
  - gpt2
license: apache-2.0
datasets:
  - eaglewatch/korean_wikipedia_dataset_for_GPT2
---

This is a GPT-2-based model trained on a Korean Wikipedia dataset.

Since there was not yet a Korean GPT-2 model pre-trained on a large corpus such as Wikipedia, I decided to train GPT-2 on Korean text.

It was trained on the Korean Wikipedia dataset (334,420 training articles; 83,605 validation articles).
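A minimal sketch of loading the model and generating Korean text with the Hugging Face `transformers` library. The repository id `eaglewatch/gpt2-ko-wikipedia` is an assumption inferred from this card's title and author and may differ; adjust it to the actual hub path.

```python
# Sketch: load the model and sample a continuation with transformers.
# The repo id below is assumed from the card's title/author, not confirmed.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "eaglewatch/gpt2-ko-wikipedia"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

prompt = "대한민국의 수도는"  # "The capital of South Korea is"
inputs = tokenizer(prompt, return_tensors="pt")

# Nucleus sampling keeps generations varied; tune top_p/max_new_tokens to taste.
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```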

Yongwoo Jeong, Sep 13th, 2022.