# RoBERTa-base Korean
## Model Description
This RoBERTa model was pretrained at the **syllable** level on a variety of Korean text datasets.
It uses a custom-built Korean syllable-level vocab.
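Syllable-level tokenization means each Hangul syllable becomes its own token rather than a subword piece. A minimal sketch of what this looks like, assuming the released tokenizer follows this scheme (`your_model_name` is a placeholder repo id, as in the usage example below):

```python
from transformers import AutoTokenizer

# Placeholder repo id; replace with the actual model/tokenizer name.
tokenizer = AutoTokenizer.from_pretrained("your_model_name")

# With a syllable-level vocab, each Hangul syllable is expected to map to one token,
# e.g. "한국어" -> ["한", "국", "어"] (illustrative; actual output depends on the tokenizer).
print(tokenizer.tokenize("한국어 텍스트"))
```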
## Architecture
- **Model type**: RoBERTa
- **Architecture**: RobertaForMaskedLM
- **Model size**: 256 hidden size, 8 hidden layers, 8 attention heads
- **max_position_embeddings**: 514
- **intermediate_size**: 2048
- **vocab_size**: 1428
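The listed values roughly correspond to the following `RobertaConfig`; this is a sketch reconstructed from the numbers above, not the exact published config file:

```python
from transformers import RobertaConfig, RobertaForMaskedLM

# Configuration implied by the architecture values listed above (sketch only).
config = RobertaConfig(
    vocab_size=1428,
    hidden_size=256,
    num_hidden_layers=8,
    num_attention_heads=8,
    intermediate_size=2048,
    max_position_embeddings=514,
)
model = RobertaForMaskedLM(config)
print(f"{model.num_parameters():,} parameters")
```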
## Training Data
The following datasets were used:
- **Modu Corpus (모두의말뭉치)**: chat, online boards, everyday conversation, news, broadcast scripts, books, etc.
- **AIHUB**: SNS, YouTube comments, book sentences
- **Other**: Namuwiki, Korean Wikipedia
The combined data amounts to roughly 11GB.
## Training Details
- **BATCH_SIZE**: 112 (per GPU)
- **ACCUMULATE**: 36
- **MAX_STEPS**: 12,500
- **Train Steps × Batch Size**: **100M**
- **WARMUP_STEPS**: 2,400
- **Optimizer**: AdamW, LR 1e-3, betas (0.9, 0.98), eps 1e-6
- **LR decay**: linear
- **Hardware**: 2x RTX 8000 GPUs
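For reference, 12,500 steps × 112 per-GPU batch × 36 accumulation steps × 2 GPUs works out to roughly 100M training sequences, matching the figure above. The hyperparameters map onto `TrainingArguments` roughly as follows; the actual training script is not part of this card, so this is only an approximate sketch:

```python
from transformers import TrainingArguments

# Approximate mapping of the hyperparameters listed above (sketch only).
training_args = TrainingArguments(
    output_dir="roberta-base-korean",     # placeholder output path
    per_device_train_batch_size=112,      # BATCH_SIZE (per GPU)
    gradient_accumulation_steps=36,       # ACCUMULATE
    max_steps=12_500,                     # MAX_STEPS
    warmup_steps=2_400,                   # WARMUP_STEPS
    learning_rate=1e-3,                   # AdamW LR
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-6,
    lr_scheduler_type="linear",           # linear decay
)
```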
![Evaluation Loss Graph](https://cdn-uploads.huggingface.co/production/uploads/64a0fd6fd3149e05bc5260dd/-64jKdcJAavwgUREwaywe.png)
![Evaluation Accuracy Graph](https://cdn-uploads.huggingface.co/production/uploads/64a0fd6fd3149e05bc5260dd/LPq5M6S8LTwkFSCepD33S.png)
## Usage
```python
from transformers import AutoModel, AutoTokenizer
# Load the model and tokenizer
model = AutoModel.from_pretrained("your_model_name")
tokenizer = AutoTokenizer.from_pretrained("your_tokenizer_name")
# Tokenize the text and run the model
inputs = tokenizer("여기에 한국어 텍스트 입력", return_tensors="pt")  # example input: "enter Korean text here"
outputs = model(**inputs)
```
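Since the checkpoint was trained as `RobertaForMaskedLM`, masked-token prediction can also be run through the `fill-mask` pipeline. A sketch using the same placeholder names as above:

```python
from transformers import pipeline

# Masked-token prediction (sketch; replace the placeholder repo ids).
fill_mask = pipeline("fill-mask", model="your_model_name", tokenizer="your_tokenizer_name")

# With a syllable-level vocab, one mask corresponds to one syllable.
mask = fill_mask.tokenizer.mask_token
print(fill_mask(f"한국어 텍{mask}트"))
```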