|
--- |
|
license: apache-2.0 |
|
base_model: |
|
- bertin-project/filiberto-124M |
|
library_name: transformers |
|
language: |
|
- es |
|
pipeline_tag: text-generation |
|
--- |
|
|
|
**Filiberto 124M** is a small specialized foundation model trained on Spanish Golden Age Dramas. |
|
|
|
Filiberto 124M has only 124 million parameters. It can run easily on CPU or generate text at scale on GPUs (>10k tokens/second).
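A minimal usage sketch with the `transformers` text-generation pipeline. It assumes the checkpoint is published under the `bertin-project/filiberto-124M` repository listed in the metadata above; the prompt and sampling settings are illustrative only.

```python
from transformers import pipeline

# Assumption: the model id matches the base_model entry in the card metadata.
generator = pipeline("text-generation", model="bertin-project/filiberto-124M")

# Illustrative prompt in the style of a Golden Age drama.
prompt = "Sale HIPÓLITO y dice:"
result = generator(prompt, max_new_tokens=50, do_sample=True, temperature=0.7)
print(result[0]["generated_text"])
```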
|
|
|
## Training |
|
The pre-training material included a collection of works taken from the [TEXORO](https://etso.es/texoro) corpus, via a collaboration with [ETSO](https://etso.es/), totalling ~5 million tokens. |
|
|
|
Pre-training ran for 5 epochs with [levanter](https://github.com/stanford-crfm/levanter) (500 steps total, each processing 1024 sequences of 512 tokens) on a TPU v4-32 for 15 minutes.
|
|
|
Tokenization is currently done with the GPT-2 tokenizer. |
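Since the card states that the stock GPT-2 tokenizer is used, the snippet below sketches what tokenization looks like on a Spanish sample. Loading the `gpt2` tokenizer from the Hub is an assumption for illustration; the model repository may ship its own copy. Note that GPT-2's byte-level BPE, trained on English, tends to split accented Spanish words into more pieces than a Spanish-specific tokenizer would.

```python
from transformers import AutoTokenizer

# Assumption: the stock "gpt2" tokenizer matches the one used for pre-training.
tok = AutoTokenizer.from_pretrained("gpt2")

text = "¿Qué es esto, cielos?"
ids = tok.encode(text)
print(len(ids), tok.convert_ids_to_tokens(ids))
```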
|
|