KPMGhyesukim committed on
Commit
8706234
·
1 Parent(s): 1a6d7e9

Update README.md

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -23,11 +23,13 @@ tags:
23
  - mdeberta
24
  license: mit
25
  ---
26
- > 💡 The project below was carried out by KPMG Lighthouse Korea.
27
- > KPMG Lighthouse Korea develops Edge Technology NLP/Vision AI models to solve a variety of problems in the financial area.
28
 
29
  # mDeBERTa-v3-base-kor-further
30
 
 
 
 
 
31
  ## What is DeBERTa?
32
  - [DeBERTa](https://arxiv.org/abs/2006.03654) applies `Disentangled Attention` and an `Enhanced Mask Decoder` to learn the positional information of words effectively. Unlike the absolute position embeddings used in BERT and RoBERTa, DeBERTa represents the relative position of words as learnable vectors and trains the model on them. As a result, it outperforms BERT and RoBERTa.
33
  - [DeBERTa-v3](https://arxiv.org/abs/2111.09543) improves training efficiency by replacing the MLM (Masked Language Model) objective of the previous version with an ELECTRA-style RTD (Replaced Token Detection) pre-training task and by applying Gradient-Disentangled Embedding Sharing.
 
23
  - mdeberta
24
  license: mit
25
  ---
 
 
26
 
27
  # mDeBERTa-v3-base-kor-further
28
 
29
+ > 💡 The project below was carried out by KPMG Lighthouse Korea.
30
+ > KPMG Lighthouse Korea develops Edge Technology NLP/Vision AI models to solve a variety of problems in the financial area.
31
+
32
+
33
  ## What is DeBERTa?
34
  - [DeBERTa](https://arxiv.org/abs/2006.03654) applies `Disentangled Attention` and an `Enhanced Mask Decoder` to learn the positional information of words effectively. Unlike the absolute position embeddings used in BERT and RoBERTa, DeBERTa represents the relative position of words as learnable vectors and trains the model on them. As a result, it outperforms BERT and RoBERTa.
35
  - [DeBERTa-v3](https://arxiv.org/abs/2111.09543) improves training efficiency by replacing the MLM (Masked Language Model) objective of the previous version with an ELECTRA-style RTD (Replaced Token Detection) pre-training task and by applying Gradient-Disentangled Embedding Sharing.
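
The two ideas in the list above can be illustrated with a small, dependency-free sketch. It is not taken from this repository or from the DeBERTa codebase. The function names `relative_position_index` and `rtd_labels`, the window size `k`, and the sample sentence are all assumptions made for illustration. The first function buckets the signed distance between two token positions into a fixed range, which is the index a relative-position embedding table would be looked up with. The second builds the per-token binary targets an RTD discriminator would be trained on.

```python
from __future__ import annotations


def relative_position_index(i: int, j: int, k: int) -> int:
    """Map the signed distance i - j to a bucket in [0, 2k).

    Distances at or beyond the window k are clipped to the edge buckets,
    so a table of 2k learnable vectors covers every pair of positions.
    """
    d = i - j
    if d <= -k:
        return 0
    if d >= k:
        return 2 * k - 1
    return d + k


def rtd_labels(original: list[str], corrupted: list[str]) -> list[int]:
    """Replaced Token Detection targets: 1 where a token was replaced, else 0."""
    if len(original) != len(corrupted):
        raise ValueError("sequences must have the same length")
    return [int(o != c) for o, c in zip(original, corrupted)]


if __name__ == "__main__":
    # Hypothetical example: a generator swapped "cooked" for "ate".
    original = ["the", "chef", "cooked", "the", "meal"]
    corrupted = ["the", "chef", "ate", "the", "meal"]
    print(rtd_labels(original, corrupted))

    # Bucket indices for position 2 against every position, with window k = 2.
    print([relative_position_index(2, j, k=2) for j in range(5)])
```

Because every token receives an RTD label, the discriminator gets a training signal from the whole sequence rather than only from masked positions, which is the efficiency gain that ELECTRA-style pre-training is usually credited with.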