# RoBERTa-base Korean

## Model Description

This RoBERTa model was pretrained at the *syllable* level on a variety of Korean text datasets.
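Syllable-level here means each Hangul syllable block is treated as its own token, which fits the small vocab_size of 1428 listed below. A rough illustration of the idea, not the model's actual tokenizer:

```python
# Each Hangul syllable block is a single Unicode character, so a naive
# syllable-level split is simply a character-level split.
text = "한국어 텍스트"
syllables = list(text)
print(syllables)  # ['한', '국', '어', ' ', '텍', '스', '트']
```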

## Architecture

- **Model type**: RoBERTa
- **Architecture**: RobertaForMaskedLM
- **Model size**: hidden size 256, 8 hidden layers, 8 attention heads
- **max_position_embeddings**: 514
- **intermediate_size**: 2048
- **vocab_size**: 1428
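These hyperparameters map directly onto a Hugging Face `RobertaConfig`. A minimal sketch of the equivalent configuration; any field not listed above is assumed to keep the library default:

```python
from transformers import RobertaConfig, RobertaForMaskedLM

# Configuration reconstructed from the values listed above; fields not
# mentioned in this README keep the transformers defaults.
config = RobertaConfig(
    vocab_size=1428,
    hidden_size=256,
    num_hidden_layers=8,
    num_attention_heads=8,
    intermediate_size=2048,
    max_position_embeddings=514,
)

# Randomly initialized model with this architecture, e.g. for pretraining.
model = RobertaForMaskedLM(config)
```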

## Training Data

The following datasets were used:

- **Modu Corpus (모두의말뭉치)**: chat, message boards, everyday conversation, news, broadcast scripts, books, etc.
- **AIHUB**: SNS, YouTube comments, book sentences
- **Other**: Namuwiki, Korean Wikipedia

Combined, the data comes to roughly 11GB.

## Training Details

- **BATCH_SIZE**: 54 (per GPU)
- **ACCUMULATE**: 74
- **MAX_STEPS**: 12,500
- **Train Steps × Batch Size**: 100M (see the sanity check below)
- **WARMUP_STEPS**: 2,400
- **Optimizer**: AdamW, LR 1e-3, betas (0.9, 0.98), eps 1e-6
- **LR schedule**: linear decay
- **Hardware**: 2x RTX 8000 GPUs
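The headline 100M figure follows from the numbers above, assuming ACCUMULATE is the number of gradient-accumulation steps and both GPUs contribute to the effective batch:

```python
# Sanity check: total training sequences seen, assuming ACCUMULATE is
# gradient accumulation and data parallelism spans both GPUs.
per_gpu_batch = 54
accumulate = 74
num_gpus = 2
max_steps = 12_500

effective_batch = per_gpu_batch * accumulate * num_gpus  # 7,992 sequences per step
total_sequences = effective_batch * max_steps            # 99,900,000
print(f"{total_sequences:,}")                            # ~100M, as stated above
```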

## Usage

```python
from transformers import AutoModel, AutoTokenizer

# Load the model and tokenizer
model = AutoModel.from_pretrained("your_model_name")
tokenizer = AutoTokenizer.from_pretrained("your_tokenizer_name")

# Tokenize the text and run a forward pass
inputs = tokenizer("여기에 한국어 텍스트 입력", return_tensors="pt")
outputs = model(**inputs)
```
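Since the checkpoint is a `RobertaForMaskedLM`, loading it through `AutoModelForMaskedLM` allows masked-syllable prediction directly. A sketch using the same placeholder names as above:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Same placeholder names as above; substitute the real repository id.
model = AutoModelForMaskedLM.from_pretrained("your_model_name")
tokenizer = AutoTokenizer.from_pretrained("your_tokenizer_name")

# Mask a single syllable and ask the model to fill it in.
text = f"한국어 텍스트 {tokenizer.mask_token}력"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the mask position and take the top-scoring token.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```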
|