license: mit
language:
- en
tags:
- Kolmogorov-Arnold Network
- Bert
- KAN
# BerKANT (training)
A BERT implementation in which most of the `torch.nn.Linear` layers have been replaced with `KANLinear`.
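The layer replacement described above can be sketched roughly as follows. This is only an illustration, not the code used for this model: `StandInKANLinear`, `swap_linear_layers`, `ToyBlock`, and the `skip` argument are all hypothetical names, and the stand-in layer is a simple shape-compatible placeholder rather than a real KAN layer. In practice you would pass the actual `KANLinear` constructor as `make_layer`, assuming it accepts `(in_features, out_features)`.

```python
import torch
from torch import nn
import torch.nn.functional as F


class StandInKANLinear(nn.Module):
    """Hypothetical placeholder with the same input/output shapes as nn.Linear.

    NOT a real KAN layer; it only lets the swap logic run end to end.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Apply a fixed nonlinearity, then a learned linear mix.
        return F.linear(F.silu(x), self.weight)


def swap_linear_layers(module: nn.Module, make_layer, skip=()) -> int:
    """Recursively replace nn.Linear children with make_layer(in, out).

    Children whose attribute name is in `skip` are left untouched, which is
    one way to replace "most" rather than all linear layers.
    Returns the number of layers replaced.
    """
    replaced = 0
    # Snapshot the children so we can reassign attributes while looping.
    for name, child in list(module.named_children()):
        if isinstance(child, nn.Linear):
            if name in skip:
                continue
            setattr(module, name, make_layer(child.in_features, child.out_features))
            replaced += 1
        else:
            replaced += swap_linear_layers(child, make_layer, skip)
    return replaced


class ToyBlock(nn.Module):
    """Tiny stand-in for a transformer block (assumption, not the BERT code)."""

    def __init__(self):
        super().__init__()
        self.query = nn.Linear(8, 8)
        self.ffn = nn.Sequential(nn.Linear(8, 16), nn.GELU(), nn.Linear(16, 8))
        self.classifier = nn.Linear(8, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.ffn(self.query(x)))


model = ToyBlock()
replaced = swap_linear_layers(model, StandInKANLinear, skip=("classifier",))
output = model(torch.randn(4, 8))
print(replaced, tuple(output.shape))
```

Passing the layer constructor as an argument keeps the traversal independent of any particular `KANLinear` implementation. Note that the replaced layers start from fresh weights, so the model has to be pretrained again.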
Currently pretraining on [JackBAI/bert_pretrain_datasets](https://huggingface.co/datasets/JackBAI/bert_pretrain_datasets) on an RTX 4090. Training should be done within 5 days of 13/05/2024. Until then :)