File size: 1,464 Bytes
adb10ae |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 |
---
language: ko
---
# KoBigBird
Pretrained BigBird Model for Korean (**kobigbird-bert-base**)
## About
BigBird, is a sparse-attention based transformer which extends Transformer based models, such as BERT to much longer sequences.
BigBird relies on **block sparse attention** instead of normal attention (i.e. BERT's attention) and can handle sequences up to a length of 4096 at a much lower compute cost compared to BERT. It has achieved SOTA on various tasks involving very long sequences such as long documents summarization, question-answering with long contexts.
Model is warm started from Korean BERT’s checkpoint.
## How to use
WARN: Please use `BertTokenizer` instead of `BigBirdTokenizer`.
```python
from transformers import AutoModel, AutoTokenizer
# by default its in `block_sparse` mode with num_random_blocks=3, block_size=64
model = AutoModel.from_pretrained("monologg/kobigbird-bert-base")
# you can change `attention_type` to full attention like this:
model = AutoModel.from_pretrained("monologg/kobigbird-bert-base", attention_type="original_full")
# you can change `block_size` & `num_random_blocks` like this:
model = AutoModel.from_pretrained("monologg/kobigbird-bert-base", block_size=16, num_random_blocks=2)
tokenizer = AutoTokenizer.from_pretrained("monologg/kobigbird-bert-base")
text = "한국어 BigBird 모델을 공개합니다!"
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
```
|