---
license: apache-2.0
datasets:
- wmt/wmt14
language:
- de
- en
pipeline_tag: text2text-generation
---

<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/62a7d1e152aa8695f9209345/P-TlY6ia0gLJeJxBA_04g.gif" />
</p>
<hr>

This is a custom Hugging Face port of a [PyTorch implementation of the original Transformer](https://github.com/ubaada/scratch-transformer), the model introduced in the 2017 paper "[Attention Is All You Need](https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf)". It is the 65M-parameter base version, trained for English-to-German translation.

## Usage:
```python
from transformers import AutoModel, AutoTokenizer

# trust_remote_code=True is needed because the model ships a custom modeling file
model = AutoModel.from_pretrained("ubaada/original-transformer", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("ubaada/original-transformer")

text = 'This is my cat'
# Tokenize the English input, truncating it to at most 100 tokens
inputs = tokenizer(text, return_tensors="pt", add_special_tokens=True, truncation=True, max_length=100)
output = model.generate(**inputs)
print(tokenizer.decode(output[0], skip_special_tokens=True, clean_up_tokenization_spaces=True))
# Output: ' Das ist meine Katze.'
```
(Remember to pass `trust_remote_code=True`, because the model uses a custom modeling file.)

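
To reuse the snippet above for several sentences, a small wrapper may help. The sketch below is an illustration, not part of the repository: `translate_sentences` is a hypothetical helper name, and it assumes `model` and `tokenizer` were loaded as shown above. It translates one sentence per `generate` call, since batched generation with this custom model is not documented here, and it strips the leading space visible in the sample output.

```python
def translate_sentences(texts, model, tokenizer, max_length=100):
    """Translate English sentences to German, one sentence per generate() call.

    Hypothetical helper: `model` and `tokenizer` are expected to be the objects
    returned by AutoModel.from_pretrained / AutoTokenizer.from_pretrained.
    """
    translations = []
    for text in texts:
        # Same tokenizer arguments as in the usage snippet
        inputs = tokenizer(text, return_tensors="pt", add_special_tokens=True,
                           truncation=True, max_length=max_length)
        output = model.generate(**inputs)
        german = tokenizer.decode(output[0], skip_special_tokens=True,
                                  clean_up_tokenization_spaces=True)
        # Drop the leading space that appears in the decoded sample output
        translations.append(german.strip())
    return translations
```

With the model loaded, a call would look like `translate_sentences(["This is my cat", "I like tea"], model, tokenizer)`, which returns a list of German strings in the same order as the inputs.
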

## Training:
| Parameter                   | Value                                                                        |
|-----------------------------|------------------------------------------------------------------------------|
| Dataset                     | WMT14-de-en                                                                  |
| Translation Pairs           | 4.5M (135M tokens total)                                                     |
| Epochs                      | 24                                                                           |
| Batch Size                  | 16                                                                           |
| Gradient Accumulation Steps | 8                                                                            |
| Effective Batch Size        | 128 (16 * 8)                                                                 |
| Training Script             | [train.py](https://github.com/ubaada/scratch-transformer/blob/main/train.py) |
| Optimiser                   | Adam (learning rate = 0.0001)                                                |
| Loss Type                   | Cross Entropy                                                                |
| Final Test Loss             | 1.87                                                                         |
| GPU                         | RTX 4070 (12GB)                                                              |
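
The figures in the table can be combined into a rough estimate of the training schedule. The sketch below is only back-of-the-envelope arithmetic under stated assumptions: the pair count is the rounded "4.5M" from the table, a trailing partial accumulation group is counted as one extra optimizer update, and the test loss is taken to be a mean per-token cross entropy in nats (the table does not say this). The variable names are illustrative and are not taken from `train.py`.

```python
import math

# Values copied from the training table
translation_pairs = 4_500_000   # "4.5M" in the table, so only approximate
batch_size = 16
accumulation_steps = 8
epochs = 24
final_test_loss = 1.87

# Gradients from 8 micro-batches of 16 are summed before each optimizer update
effective_batch_size = batch_size * accumulation_steps

# Assumption: a trailing partial group still triggers one update
updates_per_epoch = math.ceil(translation_pairs / effective_batch_size)
total_updates = updates_per_epoch * epochs

# Assumption: the loss is a mean per-token cross entropy in nats
perplexity = math.exp(final_test_loss)

print(effective_batch_size)   # 128
print(updates_per_epoch)      # 35157
print(total_updates)          # 843768
print(round(perplexity, 2))   # 6.49
```

Under these assumptions the run amounts to roughly 0.84M optimizer updates, and a test loss of 1.87 corresponds to a per-token perplexity of about 6.5.
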

<p align="center" style="width:500px;max-width:100%;">
<img src="https://cdn-uploads.huggingface.co/production/uploads/62a7d1e152aa8695f9209345/0p4eEHiYFaeaibjk_Rf1y.png" />
</p>

## Results

<p align="center" style="width:500px;max-width:100%;">
<img src="https://cdn-uploads.huggingface.co/production/uploads/62a7d1e152aa8695f9209345/Gip1Ox-M1_z3qdafGGh3-.png" />
</p>