Update README.md
README.md CHANGED
```diff
@@ -47,7 +47,7 @@ pipeline_tag: text-generation
 
 Dr. Yunsung Ji (Saxo), a data scientist at Linkbricks, a company specializing in AI and big data analytics, <br>
 A Korean language model built with CPT (Continued Pre-Training) on 8 H100-80G GPUs from the meta-llama/Llama-3.2-3B-Instruct base model<br>
-Based on a diverse Korean corpus including 50 million Korean news items, about
+Based on a diverse Korean corpus including 50 million Korean news items, about 35% of the total parameters were re-tuned; this basic Korean model should be tuned for specific uses via SFT and DPO.<br>
 -Tokenizer uses the base model as-is, without vocabulary expansion<br>
 -128k-Context Window<br>
 -Supports Korean Function Call and Tool Calling<br>
@@ -56,7 +56,7 @@ Korean language model CPT (Continued Pre-Training) with 8 H100-80Gs using the meta-llama/Llama-3.2-3B-Instruct base model
 
 Dr. Yunsung Ji (Saxo), a data scientist at Linkbricks, a company specializing in AI and big data analytics <br>
 Korean language model CPT (Continued Pre-Training) with 8 H100-80Gs using the meta-llama/Llama-3.2-3B-Instruct base model<br>
-A basic Korean language model with about
+A basic Korean language model with about 35% of the total parameters re-tuned on a diverse Korean corpus including 50 million Korean news articles; it should be customized for specific uses through SFT and DPO.
 <br><br>
 -Tokenizer uses the base model without word expansion<br>
 -128k-Context Window<br>
```
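Since the card advertises Korean Function Call and Tool Calling support, here is a minimal sketch of the kind of OpenAI-style tool schema and chat messages usually fed to a Llama chat template. The tool name, its parameters, and the sample messages are illustrative assumptions, not taken from the model card.

```python
import json

# Hypothetical tool definition in the common OpenAI-style "tools" schema;
# the name and parameters below are illustrative only.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative tool name
        "description": "Return the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. 'Seoul'"}
            },
            "required": ["city"],
        },
    },
}

# A short Korean chat turn ("Tell me the weather in Seoul").
messages = [
    {"role": "system", "content": "You are a helpful Korean assistant."},
    {"role": "user", "content": "서울 날씨 알려줘"},
]

# With transformers one would typically render this via
# tokenizer.apply_chat_template(messages, tools=[weather_tool], ...);
# here we only serialize the payload to show its shape.
payload = json.dumps({"messages": messages, "tools": [weather_tool]}, ensure_ascii=False)
print(payload)
```

In actual use the rendered prompt would be passed to the model, and a tool call in the response would be parsed back into the `name`/`arguments` structure above.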