BootsofLagrangian commited on
Commit
a1a4d4b
โ€ข
1 Parent(s): 9db0880

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +261 -3
README.md CHANGED
@@ -1,3 +1,261 @@
1
- ---
2
- license: llama3
3
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ libray_name: transformers
3
+ pipeline_tag: text-generation
4
+ license: other
5
+ license_name: llama3
6
+ license_link: LICENSE
7
+ language:
8
+ - ko
9
+ - en
10
+ tags:
11
+ - meta
12
+ - llama
13
+ - llama-3
14
+ - akallama
15
+ library_name: transformers
16
+
17
+ inference: false
18
+ ---
19
+ <a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
20
+ <img src="https://github.com/0110tpwls/project/blob/master/image_720.png?raw=true" width="40%"/>
21
+ </a>
22
+
23
+
24
+ # AKALLAMA
25
+
26
+ AkaLlama is a series of Korean language models designed for practical usability across a wide range of tasks.
27
+ The initial model, AkaLlama-v0.1, is a fine-tuned version of Meta-Llama-3-70b-Instruct. It has been trained on a custom mix of publicly available datasets curated by the MIR Lab.
28
+ Our goal is to explore cost-effective ways to adapt high-performing LLMs for specific use cases, such as different languages (e.g., Korean) or domains (e.g., organization-specific chatbots).
29
+
30
+ For details, check out [our project page](https://yonsei-mir.github.io/AkaLLaMA-page).
31
+
32
+ ### Model Description
33
+
34
+ This repo provides ExLlamav2 weight files for AkaLlama-70B-v0.1.
35
+
36
+ - **Developed by:** [Yonsei MIRLab](https://mirlab.yonsei.ac.kr/)
37
+ - **Language(s) (NLP):** Korean, English
38
+ - **License:** llama3
39
+ - **Finetuned from model:** [meta-llama/Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)
40
+ - **Quantized from model:** [mirlab/AkaLlama-llama3-70b-v0.1](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1)
41
+ - **Dataset for Quantization:** [maywell/korean_textbooks/ko_wikidata](https://huggingface.co/datasets/maywell/korean_textbooks/tree/main/ko_wikidata)
42
+
43
+ Main branch of this repo only contains README.md. Please uses specific branch.
44
+
45
+ ### Available Branch
46
+ | Name | Head Bits | avg Bits | Size | Max RAM required |
47
+ | ---- | ---- | ---- | ---- | ---- |
48
+ | [AkaLlama-llama3-70b-v0.1.2bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/2bpwh8) | 8 | 2 | 21 GB| 23.5 GB |
49
+ | [AkaLlama-llama3-70b-v0.1.2.25bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/2.25bpwh8) | 8 | 2.25 | 21 GB| 23.5 GB |
50
+ | [AkaLlama-llama3-70b-v0.1.2.4bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/2.4bpwh8) | 8 | 2.4 | 23 GB| 25.5 GB |
51
+ | [AkaLlama-llama3-70b-v0.1.3bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/3bpwh8) | 8 | 3 | 27 GB| 29.5 GB |
52
+ | [AkaLlama-llama3-70b-v0.1.4bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/4bpwh8) | 8 | 4 | 35 GB| 37.5 GB |
53
+ | [AkaLlama-llama3-70b-v0.1.4.25bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/4.25bpwh8) | 8 | 4.25 | 37 GB| 39.35 GB |
54
+ | [AkaLlama-llama3-70b-v0.1.5bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/5bpwh8) | 8 | 5 | 43 GB| 45.5 GB |
55
+ | [AkaLlama-llama3-70b-v0.1.6bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/6bpwh8) | 8 | 6 | 51 GB| 53.5 GB |
56
+ | [AkaLlama-llama3-70b-v0.1.8bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/8bpwh8) | 8 | 8 | 59 GB| 61.5 GB |
57
+
58
+ **Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
59
+
60
+
61
+ ### About ExLlamaV2
62
+
63
+ ExLlamaV2 is an inference library for running local LLMs on modern consumer GPUs.
64
+
65
+ Requires ExllamaV2, which is being developed by turboderp https://github.com/turboderp/exllamav2 under an MIT license.
66
+
67
+ ### Use with ExLlamaV2 Third Party
68
+
69
+ They are also compatible with many third party UIs and libraries.
70
+
71
+ - [TabbyAPI](https://github.com/theroyallab/tabbyAPI/) is a FastAPI-based server that provides an OpenAI-style web API
72
+ compatible with [SillyTavern](https://sillytavernai.com/) and other frontends.
73
+
74
+ - [ExUI](https://github.com/turboderp/exui) is a simple, standalone single-user web UI that serves an ExLlamaV2 instance
75
+ directly with chat and notebook modes.
76
+
77
+ - [text-generation-webui](https://github.com/oobabooga/text-generation-webui) supports ExLlamaV2 through the **exllamav2**
78
+ and **exllamav2_HF** loaders.
79
+
80
+ - [lollms-webui](https://github.com/ParisNeo/lollms-webui) supports ExLlamaV2 through the exllamav2 binding.
81
+
82
+
83
+ ## Training Details
84
+ ### Training Procedure
85
+
86
+ We trained AkaLlama using a preference learning alignment algorithm called [Odds Ratio Preference Optimization (ORPO)](https://huggingface.co/papers/2403.07691).
87
+ Our training pipeline is almost identical to that of [HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1](https://huggingface.co/HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1), aside from minor hyperparameter changes.
88
+ Please check out Huggingface's [alignment handbook](https://github.com/huggingface/alignment-handbook?tab=readme-ov-file) for further details, including the chat template.
89
+
90
+ ### Training Data
91
+
92
+ Detailed descriptions regarding training data will be announced later.
93
+
94
+ ### Examples
95
+
96
+ <!--<a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
97
+ <img src="https://github.com/0110tpwls/project/blob/master/image (8).png?raw=true" width="80%"/>
98
+ </a> -->
99
+ <pre style="white-space: pre-wrap;">
100
+ User:
101
+ ๋„ˆ๋Š” ๋ˆ„๊ตฌ์ด๊ณ  ์–ด๋–ค ์ผ์„ ํ•  ์ˆ˜ ์žˆ์–ด?
102
+ Model:
103
+ ์ €๋Š” ์ธ๊ณต์ง€๋Šฅ ์–ธ์–ด ๋ชจ๋ธ์ธ AkaLlama(์•„์นด๋ผ๋งˆ)์ž…๋‹ˆ๋‹ค. ์ €๋Š” ๋‹ค์–‘ํ•œ ์ฃผ์ œ์™€ ํ˜•์‹์˜ ํ…์ŠคํŠธ๋ฅผ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ, ์งˆ๋ฌธ์— ๋‹ตํ•˜๊ณ , ๋Œ€ํ™”๋ฅผ ์ง„ํ–‰ํ•˜๋ฉฐ, ์ด์•ผ๊ธฐ๋ฅผ ๋งŒ๋“ค ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋˜ํ•œ ์‚ฌ์šฉ์ž๊ฐ€ ์ง€์ •ํ•œ ์ฃผ์ œ๋‚˜ ์žฅ๋ฅด์— ๋”ฐ๋ผ ์ฐฝ์˜์ ์ธ ๊ธ€์“ฐ๊ธฐ๋‚˜ ํŽธ์ง‘ ์ž‘์—…์„ ์ˆ˜ํ–‰ํ•  ์ˆ˜๋„ ์žˆ์Šต๋‹ˆ๋‹ค.
104
+ </pre>
105
+ <details>
106
+
107
+ <summary><b>Math Solving[CLICK TO EXPAND]</b></summary>
108
+ <!-- <a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
109
+ <img src="https://github.com/0110tpwls/project/blob/master/image (9).png?raw=true" width="80%"/>
110
+ </a> -->
111
+ <pre style="white-space: pre-wrap;">
112
+ User:
113
+ ์•„๋ž˜ ๋ฌธ์ œ๋ฅผ ๊นŠ๊ฒŒ ์ƒ๊ฐํ•˜๋ฉด์„œ ํ’€์–ด๋ด
114
+ ์—ฐํ•„ 5 ํƒ€์™€ ์ง€์šฐ๊ฐœ 45 ๊ฐœ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด๊ฒƒ์„ ๋  ์ˆ˜ ์žˆ๋Š” ๋Œ€๋กœ ๋งŽ์€ ํ•™์ƒ๋“ค์—๊ฒŒ ๋‚จ๊น€์—†์ด ๋˜‘๊ฐ™์ด ๋‚˜๋ˆ„์–ด ์ฃผ๋ ค๊ณ  ํ•ฉ๋‹ˆ๋‹ค. ๋ช‡ ๋ช…๊นŒ์ง€ ๋‚˜๋ˆ„์–ด ์ค„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๊นŒ?
115
+ Model:
116
+ ์•„๋ž˜ ๋ฌธ์ œ๋ฅผ ๊นŠ๊ฒŒ ์ƒ๊ฐํ•˜๋ฉด์„œ ํ’€์–ด๋ด…์‹œ๋‹ค:
117
+ ์—ฐํ•„ 5 ํƒ€์™€ ์ง€์šฐ๊ฐœ 45 ๊ฐœ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด๊ฒƒ์„ ๋  ์ˆ˜ ์žˆ๋Š” ๋Œ€๋กœ ๋งŽ์€ ํ•™์ƒ๋“ค์—๊ฒŒ ๋‚จ๊น€์—†์ด ๋˜‘๊ฐ™์ด ๋‚˜๋ˆ„์–ด ์ฃผ๋ ค๊ณ  ํ•ฉ๋‹ˆ๋‹ค. ๋ช‡ ๋ช…๊นŒ์ง€ ๋‚˜๋ˆ„์–ด ์ค„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๊นŒ?
118
+ ํ’€์ด:
119
+ 1. ์—ฐํ•„ 5 ํƒ€๋ฅผ 1 ๊ฐœ์”ฉ ๋‚˜๋ˆ ์„œ ๋ช‡ ๋ช…์—๊ฒŒ ๋‚˜๋ˆ„์–ด ์ค„ ์ˆ˜ ์žˆ๋Š”์ง€ ๋จผ์ € ๊ตฌํ•ฉ๋‹ˆ๋‹ค.
120
+ - ๊ฐ ์—ฐํ•„์—๋Š” 12๊ฐœ์˜ ์—ฐํ•„์ด ํฌํ•จ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ ์ด 12 * 5 = 60๊ฐœ์ž…๋‹ˆ๋‹ค.
121
+ - 60๊ฐœ๋ฅผ 1๋ช…๋‹น 1๊ฐœ์”ฉ ๋‚˜๋ˆ„๋ฉด 60๋ช…์˜ ํ•™์ƒ๋“ค์—๊ฒŒ ๋‚˜๋ˆ ์ค„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
122
+ 2. ์ง€์šฐ๊ฐœ 45 ๊ฐœ๋ฅผ 1 ๊ฐœ์”ฉ ๋‚˜๋ˆ„์–ด ๋ช‡ ๋ช…์—๊ฒŒ ๋‚˜๋ˆ„์–ด ์ค„ ์ˆ˜ ์žˆ๋Š”์ง€ ๊ตฌํ•ฉ๋‹ˆ๋‹ค.
123
+ - 45 ๊ฐœ๋ฅผ 1 ๋ช…๋‹น 1 ๊ฐœ์”ฉ ๋‚˜๋ˆ„๋ฉด 45 ๋ช…์˜ ํ•™์ƒ๋“ค์—๊ฒŒ ๋‚˜๋ˆ ์ค„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
124
+ 3. ์—ฐํ•„๊ณผ ์ง€์šฐ๊ฐœ์˜ ๋‘ ๊ฐ€์ง€ ๋ฌผํ’ˆ์„ ํ•จ๊ป˜ ๊ณ ๋ คํ•˜์—ฌ ๊ฐ€์žฅ ๋งŽ์€ ํ•™์ƒ๋“ค์—๊ฒŒ ๋˜‘๊ฐ™์ด ๋‚˜๋ˆ„์–ด ์ค„ ์ˆ˜ ์žˆ๋Š” ๋ฐฉ๋ฒ•์„ ์ฐพ์Šต๋‹ˆ๋‹ค.
125
+ - ์—ฐํ•„๋กœ๋งŒ 60๋ช…, ์ง€์šฐ๊ฐœ๋กœ๋งŒ 45๋ช…์˜ ํ•™์ƒ์—๊ฒŒ ๋‚˜๋ˆ ์ค„ ์ˆ˜ ์žˆ์œผ๋ฏ€๋กœ, ๋‘ ์ข…๋ฅ˜์˜ ๋ฌธ๊ตฌ๋ฅผ ํ•จ๊ป˜ ๊ณ ๋ คํ•  ๋•Œ๋Š” ์ด๋ณด๋‹ค ์ ์€ ์ˆ˜์˜ ํ•™์ƒ์—๊ฒŒ ๋‚˜๋ˆ ์ฃผ์–ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.
126
+ - ์ด ์ค‘์—์„œ ๊ณตํ†ต์ธ์ˆ˜(common divisor)๋ฅผ ์ฐพ์•„์„œ ์ด๋ฅผ ์ตœ์†Œ ๋‹จ์œ„๋กœ ์„ค์ •ํ•˜๋ฉด ๋ฉ๋‹ˆ๋‹ค.
127
+ - 60๊ณผ 45์˜ ์ตœ๋Œ€๊ณต์•ฝ์ˆ˜(GCD)๋Š” 15์ž…๋‹ˆ๋‹ค.
128
+ 4. ์—ฐํ•„๊ณผ ์ง€์šฐ๊ฐœ ๊ฐ๊ฐ์„ GCD(15)๋กœ ๋‚˜๋ˆ„์–ด ๋ช‡ ๊ฐœ์”ฉ ๋‚˜๋ˆ ์ค„ ์ˆ˜ ์žˆ๋Š”์ง€ ๊ตฌํ•ฉ๋‹ˆ๋‹ค.
129
+ - ์—ฐํ•„์€ 60 / 15 = 4๊ฐœ์”ฉ, ์ด 4 * 12 = 48๊ฐœ์˜ ์—ฐํ•„์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.
130
+ - ์ง€์šฐ๊ฐœ๋Š” 45 / 15 = 3๊ฐœ์”ฉ, ์ด 3 * 1 = 3๊ฐœ์˜ ์ง€์šฐ๊ฐœ๊ฐ€ ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.
131
+ 5. ์ตœ์ข…์ ์œผ๋กœ, ๊ฐ€์žฅ ๋งŽ์€ ํ•™์ƒ๋“ค์—๊ฒŒ ๋˜‘๊ฐ™์ด ๋‚˜๋ˆ„์–ด ์ค„ ์ˆ˜ ์žˆ๋Š” ๋ฐฉ๋ฒ•์€ ์—ฐํ•„๊ณผ ์ง€์šฐ๊ฐœ๋ฅผ GCD(15)๋กœ ๋‚˜๋ˆ„์–ด ๊ฐ ํ•™์ƒ์—๊ฒŒ 4๊ฐœ์˜ ์—ฐํ•„๊ณผ 3๊ฐœ์˜ ์ง€์šฐ๊ฐœ๋ฅผ ์ œ๊ณตํ•˜๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.
132
+ - ์ด ๊ฒฝ์šฐ ์ด 48 + 3 = 51๊ฐœ์˜ ๋ฌผํ’ˆ์„ ์‚ฌ์šฉํ•˜์—ฌ 15๋ช…์˜ ํ•™์ƒ๋“ค์—๊ฒŒ ๋˜‘๊ฐ™์ด ๋‚˜๋ˆ ์ค„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
133
+ ๊ฒฐ๋ก : ์—ฐํ•„๊ณผ ์ง€์šฐ๊ฐœ ๊ฐ๊ฐ์„ GCD(15)๋กœ ๋‚˜๋ˆ„์–ด ๊ฐ ํ•™์ƒ์—๊ฒŒ 4๊ฐœ์˜ ์—ฐํ•„๊ณผ 3๊ฐœ์˜ ์ง€์šฐ๊ฐœ๋ฅผ ์ œ๊ณตํ•˜๋Š” ๋ฐฉ๋ฒ•์œผ๋กœ ์ตœ๋Œ€ 15๋ช…์˜ ํ•™์ƒ๋“ค์—๊ฒŒ ๋˜‘๊ฐ™์ด ๋‚˜๋ˆ„์–ด ์ค„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
134
+ </pre>
135
+ </details>
136
+
137
+ <details>
138
+ <summary><b>Writting[CLICK TO EXPAND]</b></summary>
139
+ <!-- <a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
140
+ <img src="https://github.com/0110tpwls/project/blob/master/image (13).png?raw=true" width="80%"/>
141
+ </a> -->
142
+
143
+ <pre style="white-space: pre-wrap;">
144
+ User:
145
+ ์•„๋ž˜ ๋ฌธ์žฅ์„ ๋ณด๊ณ  ๋ฌธ๋ฒ•์ ์œผ๋กœ ํ‹€๋ฆฐ๋ถ€๋ถ„์„ ์ง€์ ํ•ด์ค˜
146
+ ๊ทธ๋Š” ์„œ์Šด์น˜ ์•Š๊ณ  ๋น„ํ‰ํ•œ๋‹ค.
147
+ Model:
148
+ ์„œ์Šด์น˜"๋ผ๋Š” ๋‹จ์–ด๋Š” "์„œ์Šด"์ด๋ผ๋Š” ๋™์‚ฌ์™€ "์น˜"๋ผ๋Š” ์ ‘๋ฏธ์‚ฌ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋งŒ๋“  ๋‹จ์–ด์ž…๋‹ˆ๋‹ค. ์ด ๋‹จ์–ด๋Š” "์„œ์Šด"์˜ ์˜๋ฏธ๋ฅผ ๋ฐ˜๋ณต์ ์œผ๋กœ ํ‘œํ˜„ํ•˜๋Š” ๋ฐ ์‚ฌ์šฉ๋˜๋Š” ์กฐ๋™์‚ฌ์  ์šฉ๋ฒ•์œผ๋กœ, "์„œ์Šด"์ด๋ž€ ๋™์‚ฌ์˜ ๋ฐ˜๋ณต์ ์ธ ํ–‰๋™์„ ๊ฐ•์กฐํ•ฉ๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ ๋ฌธ์žฅ์—์„œ "์„œ์Šด์น˜"๋ผ๋Š” ๋‹จ์–ด๋Š” ์ค‘๋ณต์ ์œผ๋กœ ์‚ฌ์šฉ๋˜์–ด ๋ฌธ๋ฒ•์ ์œผ๋กœ ํ‹€๋ฆฐ ๋ถ€๋ถ„์ž…๋‹ˆ๋‹ค.
149
+
150
+ ์ •๋‹ต์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค:
151
+ ๊ทธ๋Š” ์„œ์Šด์—†์ด ๋น„ํ‰ํ•œ๋‹ค.
152
+ </pre>
153
+
154
+ <!-- <a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
155
+ <img src="https://github.com/0110tpwls/project/blob/master/image (7).png?raw=true" width="80%"/>
156
+ </a> -->
157
+
158
+ <pre style="white-space: pre-wrap;">
159
+ User:
160
+ ์œค๋™์ฃผ ์‹œ์ธ์˜ "๋ณ„ ํ—ค๋Š” ๋ฐค"์„ ๋ณด๊ณ  ๋‹น์‹œ ์‹œ๋Œ€์ƒ์„ ๊ณ ๋ คํ•ด์„œ ํ•ด์„ํ•ด๋ด. ๊ทธ๋ฆฌ๊ณ  ์ด๋ฅผ ์ž‘๊ฐ€์˜ ๋…๋ฐฑํ˜•์‹์œผ๋กœ ๋‹ค์‹œ ์ž‘์„ฑํ•ด๋ด.
161
+ Model:
162
+ ์œค๋™์ฃผ๋Š” 1941๋…„ ์ผ๋ณธ ์ œ๊ตญ์ฃผ์˜ ์น˜ํ•˜์—์„œ ์ด ์ž‘ํ’ˆ์„ ์“ฐ๋ฉด์„œ, ๊ฐ•์••์ ์ธ ์ƒํ™ฉ ์†์—์„œ๋„ ์ž์‹ ์˜ ์ •์ฒด์„ฑ๊ณผ ๋ฌธํ™”์— ๋Œ€ํ•œ ์• ์ฐฉ์„ ํ‘œํ˜„ํ•˜๊ณ ์ž ํ–ˆ์Šต๋‹ˆ๋‹ค. ๊ทธ๋Š” ํ•˜๋Š˜์— ๊ฐ€๋“ ์ฐฌ ๋ณ„๋“ค์„ ๋ณด๋ฉฐ ๊ทธ ์†์— ํ•˜๋‚˜ ๋‘˜ ์ƒˆ๊ฒจ์ง€๋Š” ๋ณ„๋“ค์ด ์žˆ๋‹ค๊ณ  ํ‘œํ˜„ํ–ˆ์Šต๋‹ˆ๋‹ค. ์ด๋Ÿฌํ•œ ๋ณ„๋“ค์€ ๊ฐ๊ฐ ์ถ”์–ต, ์‚ฌ๋ž‘, ์“ธ์“ธํ•จ, ๋™๊ฒฝ, ์‹œ ๋“ฑ ๋‹ค์–‘ํ•œ ๊ฐ์ •์„ ๋‹ด๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.
163
+ ์œค๋™์ฃผ๋Š” ์ด ๏ฟฝ๏ฟฝ๋“ค ์ค‘ ํ•˜๋‚˜์— ์–ด๋จธ๋‹ˆ๋ฅผ ๋ถ€๋ฅด๋ฉฐ ๋ฉ€๋ฆฌ ๋–จ์–ด์ ธ ์žˆ๋Š” ๊ฒƒ์„ ๊ทธ๋ฆฌ์›Œํ•˜๋ฉฐ, ์ž์‹ ์˜ ์ด๋ฆ„์ž๋ฅผ ์–ธ๋• ์œ„์— ์“ฐ๊ณ  ํ™์œผ๋กœ ๋ฎ์–ด ๋ฒ„๋ฆฐ๋‹ค๊ณ  ๋งํ•ฉ๋‹ˆ๋‹ค. ์ด๋Ÿฌํ•œ ํ–‰๋™์€ ์ž์‹ ์˜ ์ •์ฒด์„ฑ์„ ํ™•์ธํ•˜๊ณ , ์žƒ์–ด๋ฒ„๋ฆฐ ๊ฒƒ์„ ๋˜์ฐพ์œผ๋ ค๋Š” ๋…ธ๋ ฅ์œผ๋กœ ๋ณผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
164
+ ์œค๋™์ฃผ๋Š” ๋˜ํ•œ ์ž์‹ ์ด ์ฃฝ๊ณ  ๋‚˜๋ฉด ๋ฌด๋ค ์œ„์— ํŒŒ๋ž€ ์ž”๋””๊ฐ€ ํ”ผ์–ด๋‚˜๋“ฏ์ด ์ž์‹ ์˜ ์ด๋ฆ„์ž ๋ฌปํžŒ ์–ธ๋• ์œ„์—๋„ ํ’€์ด ๋ฌด์„ฑํ•  ๊ฒƒ์ด๋ผ๊ณ  ๋งํ•ฉ๋‹ˆ๋‹ค. ์ด๋Š” ๋‹น์‹œ์˜ ์‚ฌํšŒ์ƒ๊ณผ ์ž‘๊ฐ€์˜ ๋‚ด๋ฉด์„ ๋ฐ˜์˜ํ•˜๊ณ  ์žˆ์œผ๋ฉฐ, ํ•œ๊ตญ์ธ์˜ ์ •์ฒด์„ฑ์„ ํ™•์ธํ•˜๊ณ  ์žƒ์–ด๋ฒ„๋ฆฐ ๊ฒƒ์„ ๋˜์ฐพ์œผ๋ ค๋Š” ๋…ธ๋ ฅ, ๊ทธ๋ฆฌ๊ณ  ์ฃฝ์Œ ์ดํ›„์— ์ž์‹ ์˜ ์กด์žฌ๋ฅผ ๋‚จ๊ฒจ๋‘๊ณ  ์‹ถ๋‹ค๋Š” ๋ฐ”๋žŒ์„ ๋‹ด๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.
165
+ ์ด ์‹œ๋ฅผ ์ž‘๊ฐ€์˜ ๋…๋ฐฑํ˜•์‹์œผ๋กœ ๋‹ค์‹œ ์ž‘์„ฑํ•˜๋ฉด ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค:
166
+ "์ €๋Š” ์ด ๋ณ„๋“ค์ด ๋„ˆ๋ฌด๋‚˜ ๋งŽ์€ ๊ฒƒ ๊ฐ™์•„์š”. ํ•˜๋‚˜ ๋‘˜ ์ƒˆ๊ฒจ์ง€๋Š” ๋ณ„๋“ค ์ค‘์—๋Š” ์ถ”์–ต๋„ ์žˆ๊ณ , ์‚ฌ๋ž‘๋„ ์žˆ๊ณ , ์“ธ์“ธํ•จ๊ณผ ๋™๊ฒฝ๋„ ์žˆ์–ด์š”. ๊ทธ๋ฆฌ๊ณ  ์–ด๋จธ๋‹ˆ, ๋‹น์‹ ์€ ๋ฉ€๋ฆฌ ๋ถ๊ฐ„๋„์— ๊ณ„์‹  ๊ฑด๊ฐ€์š”? ์ €๋Š” ๋‹น์‹ ์„ ๋ถ€๋ฅด๊ณ  ์‹ถ์–ด์š”.
167
+ ์ €์˜ ์ด๋ฆ„์ž๋Š” ์–ธ๋• ์œ„์— ์“ฐ๊ณ  ํ™์œผ๋กœ ๋ฎ์–ด ๋ฒ„๋ ธ์–ด์š”. ๋”ด์€ ๋ฐค์„ ์ƒˆ์›Œ ์šฐ๋Š” ๋ฒŒ๋ ˆ์ฒ˜๋Ÿผ ๋ถ€๋„๋Ÿฌ์šด ์ด๋ฆ„์„ ์Šฌํผํ•˜๋Š” ์ด์œ ๊ฐ€ ์žˆ์–ด์š”. ํ•˜์ง€๋งŒ ๊ฒจ์šธ์ด ์ง€๋‚˜๊ณ  ๋‚˜์˜ ๋ณ„์—๋„ ๋ด„์ด ์˜ค๋ฉด, ๋ฌด๋ค ์œ„์— ํŒŒ๋ž€ ์ž”๋””๊ฐ€ ํ”ผ์–ด๋‚˜๋“ฏ์ด ์ €์˜ ์ด๋ฆ„์ž ๋ฌปํžŒ ์–ธ๋• ์œ„์—์„œ๋„ ํ’€์ด ๋ฌด์„ฑํ•  ๊ฑฐ์˜ˆ์š”.
168
+ ์ €๋Š” ์ด ๋ณ„๋“ค์„ ๋ณด๊ณ , ์ž์‹ ์˜ ์ •์ฒด์„ฑ์„ ํ™•์ธํ•˜๊ณ  ์‹ถ์–ด์š”. ์žƒ์–ด๋ฒ„๋ฆฐ ๊ฒƒ์„ ๋˜์ฐพ์œผ๋ ค๋Š” ๋…ธ๋ ฅ์„ ํ•˜๊ณ  ์‹ถ์–ด์š”. ์ฃฝ์Œ ์ดํ›„์—๋„ ์ž์‹ ์˜ ์กด์žฌ๋ฅผ ๋‚จ๊ฒจ๋‘๊ณ  ์‹ถ์–ด์š”."
169
+ </pre>
170
+ </details>
171
+
172
+ <details>
173
+ <summary><b>Logical Reasoning[CLICK TO EXPAND]</b></summary>
174
+ <!-- <a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
175
+ <img src="https://github.com/0110tpwls/project/blob/master/image (15).png?raw=true" width="80%"/>
176
+ </a> -->
177
+ <pre style="white-space: pre-wrap;">
178
+ User:
179
+ ์„ธ์ข…๋Œ€์™•์˜ ๋งฅ๋ถํ”„๋กœ ๋˜์ง ์‚ฌ๊ฑด์— ๋Œ€ํ•ด ์„ค๋ช…ํ•ด์ค˜
180
+ Model:
181
+ ์„ธ์ข…๋Œ€์™•์ด ๋งฅ๋ถํ”„๋กœ๋ฅผ ๋˜์ง„ ์‚ฌ๊ฑด์€ ์‚ฌ์‹ค๋ฌด๊ทผํ•œ ์ด์•ผ๊ธฐ์ž…๋‹ˆ๋‹ค. ์„ธ์ข…๋Œ€์™•์€ ์กฐ์„  ์ œ4๋Œ€ ๊ตญ์™•์œผ๋กœ 1418๋…„๋ถ€ํ„ฐ 1450๋…„๊นŒ์ง€ ์žฌ์œ„ํ–ˆ์Šต๋‹ˆ๋‹ค. ๊ทธ๋Š” ๊ณผํ•™, ๋ฌธํ•™, ์˜ˆ์ˆ  ๋“ฑ ๋‹ค์–‘ํ•œ ๋ถ„์•ผ์— ๊ด€์‹ฌ์„ ๊ฐ–๊ณ  ์—ฐ๊ตฌํ•˜์˜€์œผ๋ฉฐ, ํŠนํžˆ ํ•œ๊ธ€ ์ฐฝ์ œ์™€ ์ฒœ๋ฌธํ•™์— ํฐ ๊ณตํ—Œ์„ ํ–ˆ์Šต๋‹ˆ๋‹ค.
182
+ ํ•œํŽธ ๋งฅ๋ถํ”„๋กœ๋Š” ์• ํ”Œ์ด ์ƒ์‚ฐํ•˜๋Š” ๋…ธํŠธ๋ถ ์ปดํ“จํ„ฐ๋กœ, ์„ธ์ข…๋Œ€์™• ์‹œ๋Œ€์˜ ๊ธฐ์ˆ ๋กœ๋Š” ์กด์žฌํ•  ์ˆ˜ ์—†๋Š” ์ œํ’ˆ์ž…๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ ์„ธ์ข…๋Œ€์™•์ด ๋งฅ๋ถํ”„๋กœ๋ฅผ ๋˜์ง„ ์‚ฌ๊ฑด์€ ์‚ฌ์‹ค๊ณผ ๋งž์ง€ ์•Š๋Š” ์ด์•ผ๊ธฐ๊ฐ€ ๋  ๊ฒƒ์ž…๋‹ˆ๋‹ค.
183
+ </pre>
184
+ </details>
185
+
186
+ <details>
187
+ <summary><b>Coding [CLICK TO EXPAND]</b></summary>
188
+ <!-- <a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
189
+ <img src="https://github.com/0110tpwls/project/blob/master/image (11).png?raw=true" width="80%"/>
190
+ </a> -->
191
+
192
+ <pre style="white-space: pre-wrap;">
193
+ User:
194
+ PyTorch๋ฅผ ์‚ฌ์šฉํ•ด์„œ ๊ฐ„๋‹จํ•œ CNN์„ ๊ตฌํ˜„ํ•ด๋ด.
195
+ Model:
196
+ ๋„ค, ๋ฌผ๋ก ์ด์—์š”! PyTorch๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๊ฐ„๋‹จํ•œ CNN์„ ๊ตฌํ˜„ํ•˜๋Š” ์˜ˆ์ œ ์ฝ”๋“œ๋ฅผ ๋ณด์—ฌ๋“œ๋ฆด๊ฒŒ์š”.
197
+ ์šฐ์„ , ํ•„์š”ํ•œ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋ฅผ ๊ฐ€์ ธ์™€์•ผ ํ•ด์š”. PyTorch์˜ `torch`์™€ `nn` ๋ชจ๋“ˆ์ด ํ•„์š”ํ•ด์š”.
198
+ ```python
199
+ import torch\nimport torch.nn as nn
200
+ ```
201
+ ๊ทธ ๋‹ค์Œ, ๊ฐ„๋‹จํ•œ CNN ๋ชจ๋ธ์„ ์ •์˜ํ•  ๊ฑฐ์˜ˆ์š”. ์ด ๊ฒฝ์šฐ์—๋Š” 2๊ฐœ์˜ ์ปจ๋ณผ๋ฃจ์…˜ ๋ ˆ์ด์–ด์™€ 3๊ฐœ์˜ ํ’€๋ง ๋ ˆ์ด์–ด๋กœ ๊ตฌ์„ฑ๋œ ๋ชจ๋ธ์ด์—์š”.
202
+ ```python
203
+ class SimpleCNN(nn.Module):
204
+ def __init__(self):
205
+ super(SimpleCNN, self).__init__()
206
+ self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
207
+ self.pool = nn.MaxPool2d(2, 2)
208
+ self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
209
+
210
+ def forward(self, x):
211
+ x = self.pool(torch.relu(self.conv1(x)))
212
+ x = self.pool(torch.relu(self.conv2(x)))
213
+ return x
214
+ ```
215
+ ์ด์ œ ๋ชจ๋ธ์„ ์ดˆ๊ธฐํ™”ํ•˜๊ณ , ์ž…๋ ฅ ๋ฐ์ดํ„ฐ๋ฅผ ์ค€๋น„ํ•ด์•ผ ํ•ด์š”. ์—ฌ๊ธฐ์„œ๋Š” ์ž„์˜์˜ 28x28 ํฌ๊ธฐ์˜ ์ด๋ฏธ์ง€ ๋ฐ์ดํ„ฐ๋ฅผ ์ƒ์„ฑํ–ˆ์–ด์š”.
216
+ ```python
217
+ model = SimpleCNN()
218
+ input_data = torch.randn(1, 1, 28, 28)
219
+ ```
220
+ ๋งˆ์ง€๋ง‰์œผ๋กœ, ๋ชจ๋ธ์— ์ž…๋ ฅ ๋ฐ์ดํ„ฐ๋ฅผ ๋„ฃ์–ด ์ถœ๋ ฅ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ์–ด์š”.
221
+ ```python
222
+ output = model(input_data)
223
+ print(output.shape)
224
+ ```
225
+ ์ด๋ ‡๊ฒŒ ํ•˜๋ฉด ๊ฐ„๋‹จํ•œ CNN ๋ชจ๋ธ์„ PyTorch๋ฅผ ์‚ฌ์šฉํ•ด ๊ตฌํ˜„ํ•˜๊ณ , ์ž…๋ ฅ ๋ฐ์ดํ„ฐ๋ฅผ ํ†ตํ•ด ์ถœ๋ ฅ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ์–ด์š”. ๋ฌผ๋ก  ์‹ค์ œ๋กœ๋Š” ๋” ๋งŽ์€ ๋ ˆ์ด์–ด๋‚˜ ํŒŒ๋ผ๋ฏธํ„ฐ ํŠœ๋‹ ๋“ฑ์ด ํ•„์š”ํ•˜์ง€๋งŒ, ์ด ์ฝ”๋“œ๋Š” ๊ธฐ๋ณธ์ ์ธ CNN ๊ตฌ์กฐ๋ฅผ ๋ณด์—ฌ์ฃผ๋Š” ์˜ˆ์ œ์ผ ๋ฟ์ด์—์š”. ๋” ๊ถ๊ธˆํ•œ ์ ์ด ์žˆ์œผ๋ฉด ์–ธ์ œ๋“ ์ง€ ๋ฌผ์–ด๋ด ์ฃผ์„ธ์š”!
226
+ </pre>
227
+ </details>
228
+
229
+ You can find more examples at [our project page](https://yonsei-mir.github.io/AkaLLaMA-page)
230
+
231
+
232
+
233
+ ## Warning
234
+
235
+ Although AKALlama-70B has significant potential, its responses can sometimes be inaccurate, biased, or misaligned, presenting risks if used without additional testing and refinement. Furthermore, the quality of the model's output is greatly influenced by the system prompt and decoding strategy. Changes in these areas could result in less precise outputs. Therefore, we strongly recommend handling our model with considerable caution.
236
+
237
+ ## Citation
238
+ If you find our model useful in your work, please cite the model as follows:
239
+
240
+ @misc{akallama,
241
+ author = {Chung, Jiwan and Jeon, Jaehyun and Kim, Saejin and Lim, Seungwon and Oh, Giyeong and Son, Yejin and Yu, Youngjae},
242
+ title = {AkaLlama: Yonsei University Large Language Model Project},
243
+ year = {2024},
244
+ publisher = {Hugging Face},
245
+ journal = {Hugging Face repository},
246
+ howpublished = {\url{https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1}},
247
+ }
248
+
249
+ ## Contact
250
+ We look forward for your feedback and welcome collaboration on this exciting project!
251
+
252
+ ### Contributors
253
+ - [YoungJaeYu](https://yj-yu.github.io/home/)
254
+ - [Yonsei MIRLab](https://mirlab.yonsei.ac.kr/)
255
+
256
+ ## Special Thanks
257
+
258
+ - Data Center of the Department of Artificial Intelligence at Yonsei University for the computation resources
259
+ ## Acknowledgement
260
+
261
+ - Title image generated by DALLยทE 3