GPT2 Catalan small model Version 2 (Uncased)
Prerequisites
transformers==4.19.2
Model architecture
This model uses the GPT2 base model settings, except that the embedding dimension is half that of the base model.
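To get a feel for what halving the embedding dimension means for model size, the sketch below estimates the parameter count of a GPT2-style decoder. It is a rough illustration, not the model's actual configuration: it assumes the standard GPT2 base values (12 layers, 1024 positions, embedding dimension 768, so 384 when halved), tied input and output embeddings, and the 50,000-token vocabulary mentioned in the Tokenizer section. The helper `gpt2_param_count` is a hypothetical name introduced here.

```python
def gpt2_param_count(vocab_size, n_embd, n_layer=12, n_positions=1024):
    """Estimate parameters of a GPT2-style decoder with tied embeddings.

    Defaults for n_layer and n_positions are the standard GPT2 base
    values (an assumption; the model card does not list them).
    """
    token_emb = vocab_size * n_embd
    pos_emb = n_positions * n_embd
    # Attention: fused QKV projection plus output projection (weights + biases)
    attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to 4 * n_embd, then project back (weights + biases)
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    # Two LayerNorms per block, each with a scale and a bias vector
    layer_norms = 2 * (2 * n_embd)
    final_ln = 2 * n_embd
    return token_emb + pos_emb + n_layer * (attn + mlp + layer_norms) + final_ln


base = gpt2_param_count(vocab_size=50257, n_embd=768)   # original GPT2 base
small = gpt2_param_count(vocab_size=50000, n_embd=384)  # assumed settings here

print(f"GPT2 base:                   {base:,}")   # 124,439,808
print(f"halved embedding, 50k vocab: {small:,}")  # 40,887,552
```

Under these assumptions the halved model has roughly a third of the parameters of GPT2 base, because the transformer blocks scale with the square of the embedding dimension while the embedding tables scale only linearly.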
Tokenizer
The model uses a BPE tokenizer with a vocabulary size of 50,000.
Training Data
- wiki40b/ca (Catalan Wikipedia)
- Subset of oscar
- Subset of CC-100/ca: Monolingual Datasets from Web Crawl Data
Usage
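from transformers import pipeline
# GPT2 is a causal language model, so it needs the text-generation pipeline, not fill-mask.
generator = pipeline('text-generation', model='ClassCat/gpt2-small-catalan-v2')
generator("Ell està una mica")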