metadata
license: mit
language:
- en
tags:
- Kolmogorov-Arnold Network
- Bert
- KAN
BerKANT (training)
A Bert implementation where most of the torch.nn.linear
have been replaced with KANLinear
.
Currently pretraining on JackBAI/bert_pretrain_datasets on a RTX 4090. Will be do in 5 days from 13/05/2024. Until then :)
from transformers import AutoModelForSequenceClassification, AutoTokenizer
model_path = 'isemmanuelolowe/BerKANT_171M'
model = AutoModelForSequenceClassification.from_pretrained(model_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path)