pko-t5-large

Source Code

pko-t5 is a T5 v1.1 model trained exclusively on Korean data.

To tokenize Korean, pko-t5 uses BBPE (byte-level BPE), which has no out-of-vocabulary (OOV) tokens, instead of sentencepiece. The model was pretrained on Korean corpora (Namuwiki, Wikipedia, the Modu Corpus (모두의말뭉치), and others) using only unsupervised learning with T5's span corruption task.
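To make the span corruption objective concrete, here is a minimal sketch of how one training pair is formed. It assumes the `<extra_id_N>` sentinel naming used by the original T5, whitespace-separated tokens instead of BBPE tokens, and hand-picked spans; actual pretraining samples the spans randomly, and the exact settings used for pko-t5 are not stated here.

```python
from typing import List, Tuple


def span_corrupt(
    tokens: List[str], spans: List[Tuple[int, int]]
) -> Tuple[List[str], List[str]]:
    """Build a (corrupted input, target) pair from non-overlapping spans.

    Each (start, end) span (end exclusive) is replaced in the input by one
    sentinel token. The target lists each sentinel followed by the tokens
    it replaced, and ends with one extra closing sentinel.
    """
    inputs: List[str] = []
    targets: List[str] = []
    cursor = 0
    for sentinel_id, (start, end) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{sentinel_id}>"  # assumed sentinel naming
        inputs.extend(tokens[cursor:start])  # keep text before the span
        inputs.append(sentinel)              # mask the span in the input
        targets.append(sentinel)
        targets.extend(tokens[start:end])    # the model must recover these
        cursor = end
    inputs.extend(tokens[cursor:])           # keep text after the last span
    targets.append(f"<extra_id_{len(spans)}>")
    return inputs, targets


tokens = "Thank you for inviting me to your party last week".split()
corrupted, target = span_corrupt(tokens, [(2, 4), (8, 9)])
print(" ".join(corrupted))
print(" ".join(target))
```

Because the model only ever sees this fill-in-the-blank objective during pretraining, it has not learned any downstream task format on its own.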

Before using pko-t5, fine-tune it on your target task.
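Fine-tuning a T5-style model usually starts by casting every task as text in, text out. The sketch below shows one possible way to build such pairs. Only the `qa question: ` prefix comes from this model card; the `classify topic: ` prefix, the `TOPIC_LABELS` list, and the helper names are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Seq2SeqExample:
    source: str  # text fed to the encoder
    target: str  # text the decoder should learn to produce


# Hypothetical label set for a topic-classification task.
TOPIC_LABELS: List[str] = ["politics", "economy", "sports"]


def make_qa_example(question: str, answer: str) -> Seq2SeqExample:
    # "qa question: " is the prefix that appears in this model card.
    return Seq2SeqExample(source=f"qa question: {question}", target=answer)


def make_topic_example(text: str, label_id: int) -> Seq2SeqExample:
    # "classify topic: " is a made-up prefix; the label is emitted as text.
    return Seq2SeqExample(
        source=f"classify topic: {text}", target=TOPIC_LABELS[label_id]
    )


examples = [
    make_qa_example("당신의 이름은 무엇인가요?", "T5 입니다."),
    make_topic_example("Stocks rallied after the rate decision.", 1),
]
for ex in examples:
    print(f"{ex.source!r} -> {ex.target!r}")
```

Keeping a distinct prefix per task is also what allows several tasks to be mixed into one fine-tuning run, since the prefix tells the model which kind of output is expected.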

Usage

The model is accessible through the transformers API. For the tokenizer, use T5TokenizerFast rather than T5Tokenizer. For the model, use T5ForConditionalGeneration as-is.

Example

from transformers import T5TokenizerFast, T5ForConditionalGeneration

tokenizer = T5TokenizerFast.from_pretrained('paust/pko-t5-large')
model = T5ForConditionalGeneration.from_pretrained('paust/pko-t5-large')

# Input: "qa question: What is your name?"  Target: "I am T5."
# return_tensors="pt" is needed because the model expects tensors, not Python lists.
input_ids = tokenizer(["qa question: 당신의 이름은 무엇인가요?"], return_tensors="pt").input_ids
labels = tokenizer(["T5 입니다."], return_tensors="pt").input_ids
outputs = model(input_ids=input_ids, labels=labels)

print(f"loss={outputs.loss} logits={outputs.logits}")

KLUE evaluation (dev)

| Model                | ynat (macro F1) | sts (pearsonr/F1) | nli (acc) | ner (entity-level F1) | re (micro F1) | dp (LAS) | mrc (EM/F1) |
|----------------------|-----------------|-------------------|-----------|-----------------------|---------------|----------|-------------|
| Baseline             | 87.30           | 93.20/86.13       | 89.50     | 86.06                 | 71.06         | 87.93    | 75.26/-     |
| FT pko-t5-small (77M)  | 86.21         | 77.99/77.01       | 69.20     | 82.60                 | 66.46         | 93.15    | 43.81/46.58 |
| FT pko-t5-base (250M)  | 87.29         | 90.25/83.43       | 79.73     | 87.80                 | 67.23         | 97.28    | 61.53/64.74 |
| FT pko-t5-large (800M) | 87.12         | 92.05/85.24       | 84.96     | 88.18                 | 75.17         | 97.60    | 68.01/71.44 |
| MT pko-t5-small      | 84.54           | 68.50/72.02       | 51.16     | 74.69                 | 66.11         | 80.40    | 43.60/46.28 |
| MT pko-t5-base       | 86.89           | 83.96/80.30       | 72.03     | 85.27                 | 66.59         | 95.05    | 61.11/63.94 |
| MT pko-t5-large      | 87.57           | 91.93/86.29       | 83.63     | 87.41                 | 71.34         | 96.99    | 70.70/73.72 |
  • FT: single-task fine-tuning / MT: multi-task fine-tuning
  • Baseline: the SOTA scores on the dev set reported in the KLUE paper
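For readers unfamiliar with the ynat metric, the following is a minimal sketch of macro F1 under its standard definition: compute F1 separately for each class, then take the unweighted mean. It returns a value in [0, 1], whereas the table reports scores scaled to 0-100. Whether the KLUE evaluation scripts handle edge cases (such as classes that never occur) in the same way is an assumption.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0  # no examples: define the score as 0 in this sketch
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class is empty
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


gold = [0, 0, 1, 1, 2]
pred = [0, 1, 1, 1, 2]
print(f"macro F1 = {macro_f1(gold, pred):.4f}")
```

Because every class contributes equally to the mean, macro F1 penalizes a model that does well on frequent classes but poorly on rare ones.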

License

pko-t5, created by PAUST, is released under the MIT license.
