BootsofLagrangian's picture
Update README.md
968a58f verified
|
raw
history blame
No virus
16.1 kB
---
libray_name: transformers
pipeline_tag: text-generation
license: other
license_name: llama3
license_link: LICENSE
language:
- ko
- en
tags:
- meta
- llama
- llama-3
- akallama
library_name: transformers
inference: false
---
<a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
<img src="https://github.com/0110tpwls/project/blob/master/image_720.png?raw=true" width="40%"/>
</a>
# AKALLAMA
AkaLlama is a series of Korean language models designed for practical usability across a wide range of tasks.
The initial model, AkaLlama-v0.1, is a fine-tuned version of Meta-Llama-3-70b-Instruct. It has been trained on a custom mix of publicly available datasets curated by the MIR Lab.
Our goal is to explore cost-effective ways to adapt high-performing LLMs for specific use cases, such as different languages (e.g., Korean) or domains (e.g., organization-specific chatbots).
For details, check out [our project page](https://yonsei-mir.github.io/AkaLLaMA-page).
### Model Description
This repo provides ExLlamav2 weight files for AkaLlama-70B-v0.1.
- **Developed by:** [Yonsei MIRLab](https://mirlab.yonsei.ac.kr/)
- **Language(s) (NLP):** Korean, English
- **License:** llama3
- **Finetuned from model:** [meta-llama/Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct)
- **Quantized from model:** [mirlab/AkaLlama-llama3-70b-v0.1](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1)
- **Dataset for Quantization:** [maywell/korean_textbooks/ko_wikidata](https://huggingface.co/datasets/maywell/korean_textbooks/tree/main/ko_wikidata)
Main branch of this repo **only** contains README.md. Please uses specific branch.
### Available Branch
| Name | Head Bits | avg Bits | Size | Max RAM required |
| :----: | :----: | :----: | :----: | :----: |
| [AkaLlama-llama3-70b-v0.1.2bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/2bpwh8) | 8 | 2 | 21 GB| 23.5 GB |
| [AkaLlama-llama3-70b-v0.1.2.25bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/2.25bpwh8) | 8 | 2.25 | 21 GB| 23.5 GB |
| [AkaLlama-llama3-70b-v0.1.2.4bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/2.4bpwh8) | 8 | 2.4 | 23 GB| 25.5 GB |
| [AkaLlama-llama3-70b-v0.1.3bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/3bpwh8) | 8 | 3 | 27 GB| 29.5 GB |
| [AkaLlama-llama3-70b-v0.1.4bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/4bpwh8) | 8 | 4 | 35 GB| 37.5 GB |
| [AkaLlama-llama3-70b-v0.1.4.25bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/4.25bpwh8) | 8 | 4.25 | 37 GB| 39.35 GB |
| [AkaLlama-llama3-70b-v0.1.5bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/5bpwh8) | 8 | 5 | 43 GB| 45.5 GB |
| [AkaLlama-llama3-70b-v0.1.6bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/6bpwh8) | 8 | 6 | 51 GB| 53.5 GB |
| [AkaLlama-llama3-70b-v0.1.8bpwh8](https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1-exl2/tree/8bpwh8) | 8 | 8 | 59 GB| 61.5 GB |
**Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
### About ExLlamaV2
ExLlamaV2 is an inference library for running local LLMs on modern consumer GPUs.
Requires ExllamaV2, which is being developed by turboderp https://github.com/turboderp/exllamav2 under an MIT license.
### Use with ExLlamaV2 Third Party
They are also compatible with many third party UIs and libraries.
- [TabbyAPI](https://github.com/theroyallab/tabbyAPI/) is a FastAPI-based server that provides an OpenAI-style web API
compatible with [SillyTavern](https://sillytavernai.com/) and other frontends.
- [ExUI](https://github.com/turboderp/exui) is a simple, standalone single-user web UI that serves an ExLlamaV2 instance
directly with chat and notebook modes.
- [text-generation-webui](https://github.com/oobabooga/text-generation-webui) supports ExLlamaV2 through the **exllamav2**
and **exllamav2_HF** loaders.
- [lollms-webui](https://github.com/ParisNeo/lollms-webui) supports ExLlamaV2 through the exllamav2 binding.
## Training Details
### Training Procedure
We trained AkaLlama using a preference learning alignment algorithm called [Odds Ratio Preference Optimization (ORPO)](https://huggingface.co/papers/2403.07691).
Our training pipeline is almost identical to that of [HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1](https://huggingface.co/HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1), aside from minor hyperparameter changes.
Please check out Huggingface's [alignment handbook](https://github.com/huggingface/alignment-handbook?tab=readme-ov-file) for further details, including the chat template.
### Training Data
Detailed descriptions regarding training data will be announced later.
### Examples
<!--<a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
<img src="https://github.com/0110tpwls/project/blob/master/image (8).png?raw=true" width="80%"/>
</a> -->
<pre style="white-space: pre-wrap;">
User:
๋„ˆ๋Š” ๋ˆ„๊ตฌ์ด๊ณ  ์–ด๋–ค ์ผ์„ ํ•  ์ˆ˜ ์žˆ์–ด?
Model:
์ €๋Š” ์ธ๊ณต์ง€๋Šฅ ์–ธ์–ด ๋ชจ๋ธ์ธ AkaLlama(์•„์นด๋ผ๋งˆ)์ž…๋‹ˆ๋‹ค. ์ €๋Š” ๋‹ค์–‘ํ•œ ์ฃผ์ œ์™€ ํ˜•์‹์˜ ํ…์ŠคํŠธ๋ฅผ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ, ์งˆ๋ฌธ์— ๋‹ตํ•˜๊ณ , ๋Œ€ํ™”๋ฅผ ์ง„ํ–‰ํ•˜๋ฉฐ, ์ด์•ผ๊ธฐ๋ฅผ ๋งŒ๋“ค ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋˜ํ•œ ์‚ฌ์šฉ์ž๊ฐ€ ์ง€์ •ํ•œ ์ฃผ์ œ๋‚˜ ์žฅ๋ฅด์— ๋”ฐ๋ผ ์ฐฝ์˜์ ์ธ ๊ธ€์“ฐ๊ธฐ๋‚˜ ํŽธ์ง‘ ์ž‘์—…์„ ์ˆ˜ํ–‰ํ•  ์ˆ˜๋„ ์žˆ์Šต๋‹ˆ๋‹ค.
</pre>
<details>
<summary><b>Math Solving[CLICK TO EXPAND]</b></summary>
<!-- <a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
<img src="https://github.com/0110tpwls/project/blob/master/image (9).png?raw=true" width="80%"/>
</a> -->
<pre style="white-space: pre-wrap;">
User:
์•„๋ž˜ ๋ฌธ์ œ๋ฅผ ๊นŠ๊ฒŒ ์ƒ๊ฐํ•˜๋ฉด์„œ ํ’€์–ด๋ด
์—ฐํ•„ 5 ํƒ€์™€ ์ง€์šฐ๊ฐœ 45 ๊ฐœ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด๊ฒƒ์„ ๋  ์ˆ˜ ์žˆ๋Š” ๋Œ€๋กœ ๋งŽ์€ ํ•™์ƒ๋“ค์—๊ฒŒ ๋‚จ๊น€์—†์ด ๋˜‘๊ฐ™์ด ๋‚˜๋ˆ„์–ด ์ฃผ๋ ค๊ณ  ํ•ฉ๋‹ˆ๋‹ค. ๋ช‡ ๋ช…๊นŒ์ง€ ๋‚˜๋ˆ„์–ด ์ค„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๊นŒ?
Model:
์•„๋ž˜ ๋ฌธ์ œ๋ฅผ ๊นŠ๊ฒŒ ์ƒ๊ฐํ•˜๋ฉด์„œ ํ’€์–ด๋ด…์‹œ๋‹ค:
์—ฐํ•„ 5 ํƒ€์™€ ์ง€์šฐ๊ฐœ 45 ๊ฐœ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด๊ฒƒ์„ ๋  ์ˆ˜ ์žˆ๋Š” ๋Œ€๋กœ ๋งŽ์€ ํ•™์ƒ๋“ค์—๊ฒŒ ๋‚จ๊น€์—†์ด ๋˜‘๊ฐ™์ด ๋‚˜๋ˆ„์–ด ์ฃผ๋ ค๊ณ  ํ•ฉ๋‹ˆ๋‹ค. ๋ช‡ ๋ช…๊นŒ์ง€ ๋‚˜๋ˆ„์–ด ์ค„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๊นŒ?
ํ’€์ด:
1. ์—ฐํ•„ 5 ํƒ€๋ฅผ 1 ๊ฐœ์”ฉ ๋‚˜๋ˆ ์„œ ๋ช‡ ๋ช…์—๊ฒŒ ๋‚˜๋ˆ„์–ด ์ค„ ์ˆ˜ ์žˆ๋Š”์ง€ ๋จผ์ € ๊ตฌํ•ฉ๋‹ˆ๋‹ค.
- ๊ฐ ์—ฐํ•„์—๋Š” 12๊ฐœ์˜ ์—ฐํ•„์ด ํฌํ•จ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ ์ด 12 * 5 = 60๊ฐœ์ž…๋‹ˆ๋‹ค.
- 60๊ฐœ๋ฅผ 1๋ช…๋‹น 1๊ฐœ์”ฉ ๋‚˜๋ˆ„๋ฉด 60๋ช…์˜ ํ•™์ƒ๋“ค์—๊ฒŒ ๋‚˜๋ˆ ์ค„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
2. ์ง€์šฐ๊ฐœ 45 ๊ฐœ๋ฅผ 1 ๊ฐœ์”ฉ ๋‚˜๋ˆ„์–ด ๋ช‡ ๋ช…์—๊ฒŒ ๋‚˜๋ˆ„์–ด ์ค„ ์ˆ˜ ์žˆ๋Š”์ง€ ๊ตฌํ•ฉ๋‹ˆ๋‹ค.
- 45 ๊ฐœ๋ฅผ 1 ๋ช…๋‹น 1 ๊ฐœ์”ฉ ๋‚˜๋ˆ„๋ฉด 45 ๋ช…์˜ ํ•™์ƒ๋“ค์—๊ฒŒ ๋‚˜๋ˆ ์ค„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
3. ์—ฐํ•„๊ณผ ์ง€์šฐ๊ฐœ์˜ ๋‘ ๊ฐ€์ง€ ๋ฌผํ’ˆ์„ ํ•จ๊ป˜ ๊ณ ๋ คํ•˜์—ฌ ๊ฐ€์žฅ ๋งŽ์€ ํ•™์ƒ๋“ค์—๊ฒŒ ๋˜‘๊ฐ™์ด ๋‚˜๋ˆ„์–ด ์ค„ ์ˆ˜ ์žˆ๋Š” ๋ฐฉ๋ฒ•์„ ์ฐพ์Šต๋‹ˆ๋‹ค.
- ์—ฐํ•„๋กœ๋งŒ 60๋ช…, ์ง€์šฐ๊ฐœ๋กœ๋งŒ 45๋ช…์˜ ํ•™์ƒ์—๊ฒŒ ๋‚˜๋ˆ ์ค„ ์ˆ˜ ์žˆ์œผ๋ฏ€๋กœ, ๋‘ ์ข…๋ฅ˜์˜ ๋ฌธ๊ตฌ๋ฅผ ํ•จ๊ป˜ ๊ณ ๋ คํ•  ๋•Œ๋Š” ์ด๋ณด๋‹ค ์ ์€ ์ˆ˜์˜ ํ•™์ƒ์—๊ฒŒ ๋‚˜๋ˆ ์ฃผ์–ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.
- ์ด ์ค‘์—์„œ ๊ณตํ†ต์ธ์ˆ˜(common divisor)๋ฅผ ์ฐพ์•„์„œ ์ด๋ฅผ ์ตœ์†Œ ๋‹จ์œ„๋กœ ์„ค์ •ํ•˜๋ฉด ๋ฉ๋‹ˆ๋‹ค.
- 60๊ณผ 45์˜ ์ตœ๋Œ€๊ณต์•ฝ์ˆ˜(GCD)๋Š” 15์ž…๋‹ˆ๋‹ค.
4. ์—ฐํ•„๊ณผ ์ง€์šฐ๊ฐœ ๊ฐ๊ฐ์„ GCD(15)๋กœ ๋‚˜๋ˆ„์–ด ๋ช‡ ๊ฐœ์”ฉ ๋‚˜๋ˆ ์ค„ ์ˆ˜ ์žˆ๋Š”์ง€ ๊ตฌํ•ฉ๋‹ˆ๋‹ค.
- ์—ฐํ•„์€ 60 / 15 = 4๊ฐœ์”ฉ, ์ด 4 * 12 = 48๊ฐœ์˜ ์—ฐํ•„์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.
- ์ง€์šฐ๊ฐœ๋Š” 45 / 15 = 3๊ฐœ์”ฉ, ์ด 3 * 1 = 3๊ฐœ์˜ ์ง€์šฐ๊ฐœ๊ฐ€ ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.
5. ์ตœ์ข…์ ์œผ๋กœ, ๊ฐ€์žฅ ๋งŽ์€ ํ•™์ƒ๋“ค์—๊ฒŒ ๋˜‘๊ฐ™์ด ๋‚˜๋ˆ„์–ด ์ค„ ์ˆ˜ ์žˆ๋Š” ๋ฐฉ๋ฒ•์€ ์—ฐํ•„๊ณผ ์ง€์šฐ๊ฐœ๋ฅผ GCD(15)๋กœ ๋‚˜๋ˆ„์–ด ๊ฐ ํ•™์ƒ์—๊ฒŒ 4๊ฐœ์˜ ์—ฐํ•„๊ณผ 3๊ฐœ์˜ ์ง€์šฐ๊ฐœ๋ฅผ ์ œ๊ณตํ•˜๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.
- ์ด ๊ฒฝ์šฐ ์ด 48 + 3 = 51๊ฐœ์˜ ๋ฌผํ’ˆ์„ ์‚ฌ์šฉํ•˜์—ฌ 15๋ช…์˜ ํ•™์ƒ๋“ค์—๊ฒŒ ๋˜‘๊ฐ™์ด ๋‚˜๋ˆ ์ค„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
๊ฒฐ๋ก : ์—ฐํ•„๊ณผ ์ง€์šฐ๊ฐœ ๊ฐ๊ฐ์„ GCD(15)๋กœ ๋‚˜๋ˆ„์–ด ๊ฐ ํ•™์ƒ์—๊ฒŒ 4๊ฐœ์˜ ์—ฐํ•„๊ณผ 3๊ฐœ์˜ ์ง€์šฐ๊ฐœ๋ฅผ ์ œ๊ณตํ•˜๋Š” ๋ฐฉ๋ฒ•์œผ๋กœ ์ตœ๋Œ€ 15๋ช…์˜ ํ•™์ƒ๋“ค์—๊ฒŒ ๋˜‘๊ฐ™์ด ๋‚˜๋ˆ„์–ด ์ค„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
</pre>
</details>
<details>
<summary><b>Writting[CLICK TO EXPAND]</b></summary>
<!-- <a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
<img src="https://github.com/0110tpwls/project/blob/master/image (13).png?raw=true" width="80%"/>
</a> -->
<pre style="white-space: pre-wrap;">
User:
์•„๋ž˜ ๋ฌธ์žฅ์„ ๋ณด๊ณ  ๋ฌธ๋ฒ•์ ์œผ๋กœ ํ‹€๋ฆฐ๋ถ€๋ถ„์„ ์ง€์ ํ•ด์ค˜
๊ทธ๋Š” ์„œ์Šด์น˜ ์•Š๊ณ  ๋น„ํ‰ํ•œ๋‹ค.
Model:
์„œ์Šด์น˜"๋ผ๋Š” ๋‹จ์–ด๋Š” "์„œ์Šด"์ด๋ผ๋Š” ๋™์‚ฌ์™€ "์น˜"๋ผ๋Š” ์ ‘๋ฏธ์‚ฌ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋งŒ๋“  ๋‹จ์–ด์ž…๋‹ˆ๋‹ค. ์ด ๋‹จ์–ด๋Š” "์„œ์Šด"์˜ ์˜๋ฏธ๋ฅผ ๋ฐ˜๋ณต์ ์œผ๋กœ ํ‘œํ˜„ํ•˜๋Š” ๋ฐ ์‚ฌ์šฉ๋˜๋Š” ์กฐ๋™์‚ฌ์  ์šฉ๋ฒ•์œผ๋กœ, "์„œ์Šด"์ด๋ž€ ๋™์‚ฌ์˜ ๋ฐ˜๋ณต์ ์ธ ํ–‰๋™์„ ๊ฐ•์กฐํ•ฉ๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ ๋ฌธ์žฅ์—์„œ "์„œ์Šด์น˜"๋ผ๋Š” ๋‹จ์–ด๋Š” ์ค‘๋ณต์ ์œผ๋กœ ์‚ฌ์šฉ๋˜์–ด ๋ฌธ๋ฒ•์ ์œผ๋กœ ํ‹€๋ฆฐ ๋ถ€๋ถ„์ž…๋‹ˆ๋‹ค.
์ •๋‹ต์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค:
๊ทธ๋Š” ์„œ์Šด์—†์ด ๋น„ํ‰ํ•œ๋‹ค.
</pre>
<!-- <a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
<img src="https://github.com/0110tpwls/project/blob/master/image (7).png?raw=true" width="80%"/>
</a> -->
<pre style="white-space: pre-wrap;">
User:
์œค๋™์ฃผ ์‹œ์ธ์˜ "๋ณ„ ํ—ค๋Š” ๋ฐค"์„ ๋ณด๊ณ  ๋‹น์‹œ ์‹œ๋Œ€์ƒ์„ ๊ณ ๋ คํ•ด์„œ ํ•ด์„ํ•ด๋ด. ๊ทธ๋ฆฌ๊ณ  ์ด๋ฅผ ์ž‘๊ฐ€์˜ ๋…๋ฐฑํ˜•์‹์œผ๋กœ ๋‹ค์‹œ ์ž‘์„ฑํ•ด๋ด.
Model:
์œค๋™์ฃผ๋Š” 1941๋…„ ์ผ๋ณธ ์ œ๊ตญ์ฃผ์˜ ์น˜ํ•˜์—์„œ ์ด ์ž‘ํ’ˆ์„ ์“ฐ๋ฉด์„œ, ๊ฐ•์••์ ์ธ ์ƒํ™ฉ ์†์—์„œ๋„ ์ž์‹ ์˜ ์ •์ฒด์„ฑ๊ณผ ๋ฌธํ™”์— ๋Œ€ํ•œ ์• ์ฐฉ์„ ํ‘œํ˜„ํ•˜๊ณ ์ž ํ–ˆ์Šต๋‹ˆ๋‹ค. ๊ทธ๋Š” ํ•˜๋Š˜์— ๊ฐ€๋“ ์ฐฌ ๋ณ„๋“ค์„ ๋ณด๋ฉฐ ๊ทธ ์†์— ํ•˜๋‚˜ ๋‘˜ ์ƒˆ๊ฒจ์ง€๋Š” ๋ณ„๋“ค์ด ์žˆ๋‹ค๊ณ  ํ‘œํ˜„ํ–ˆ์Šต๋‹ˆ๋‹ค. ์ด๋Ÿฌํ•œ ๋ณ„๋“ค์€ ๊ฐ๊ฐ ์ถ”์–ต, ์‚ฌ๋ž‘, ์“ธ์“ธํ•จ, ๋™๊ฒฝ, ์‹œ ๋“ฑ ๋‹ค์–‘ํ•œ ๊ฐ์ •์„ ๋‹ด๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.
์œค๋™์ฃผ๋Š” ์ด ๋ณ„๋“ค ์ค‘ ํ•˜๋‚˜์— ์–ด๋จธ๋‹ˆ๋ฅผ ๋ถ€๋ฅด๋ฉฐ ๋ฉ€๋ฆฌ ๋–จ์–ด์ ธ ์žˆ๋Š” ๊ฒƒ์„ ๊ทธ๋ฆฌ์›Œํ•˜๋ฉฐ, ์ž์‹ ์˜ ์ด๋ฆ„์ž๋ฅผ ์–ธ๋• ์œ„์— ์“ฐ๊ณ  ํ™์œผ๋กœ ๋ฎ์–ด ๋ฒ„๋ฆฐ๋‹ค๊ณ  ๋งํ•ฉ๋‹ˆ๋‹ค. ์ด๋Ÿฌํ•œ ํ–‰๋™์€ ์ž์‹ ์˜ ์ •์ฒด์„ฑ์„ ํ™•์ธํ•˜๊ณ , ์žƒ์–ด๋ฒ„๋ฆฐ ๊ฒƒ์„ ๋˜์ฐพ์œผ๋ ค๋Š” ๋…ธ๋ ฅ์œผ๋กœ ๋ณผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
์œค๋™์ฃผ๋Š” ๋˜ํ•œ ์ž์‹ ์ด ์ฃฝ๊ณ  ๋‚˜๋ฉด ๋ฌด๋ค ์œ„์— ํŒŒ๋ž€ ์ž”๋””๊ฐ€ ํ”ผ์–ด๋‚˜๋“ฏ์ด ์ž์‹ ์˜ ์ด๋ฆ„์ž ๋ฌปํžŒ ์–ธ๋• ์œ„์—๋„ ํ’€์ด ๋ฌด์„ฑํ•  ๊ฒƒ์ด๋ผ๊ณ  ๋งํ•ฉ๋‹ˆ๋‹ค. ์ด๋Š” ๋‹น์‹œ์˜ ์‚ฌํšŒ์ƒ๊ณผ ์ž‘๊ฐ€์˜ ๋‚ด๋ฉด์„ ๋ฐ˜์˜ํ•˜๊ณ  ์žˆ์œผ๋ฉฐ, ํ•œ๊ตญ์ธ์˜ ์ •์ฒด์„ฑ์„ ํ™•์ธํ•˜๊ณ  ์žƒ์–ด๋ฒ„๋ฆฐ ๊ฒƒ์„ ๋˜์ฐพ์œผ๋ ค๋Š” ๋…ธ๋ ฅ, ๊ทธ๋ฆฌ๊ณ  ์ฃฝ์Œ ์ดํ›„์— ์ž์‹ ์˜ ์กด์žฌ๋ฅผ ๋‚จ๊ฒจ๋‘๊ณ  ์‹ถ๋‹ค๋Š” ๋ฐ”๋žŒ์„ ๋‹ด๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.
์ด ์‹œ๋ฅผ ์ž‘๊ฐ€์˜ ๋…๋ฐฑํ˜•์‹์œผ๋กœ ๋‹ค์‹œ ์ž‘์„ฑํ•˜๋ฉด ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค:
"์ €๋Š” ์ด ๋ณ„๋“ค์ด ๋„ˆ๋ฌด๋‚˜ ๋งŽ์€ ๊ฒƒ ๊ฐ™์•„์š”. ํ•˜๋‚˜ ๋‘˜ ์ƒˆ๊ฒจ์ง€๋Š” ๋ณ„๋“ค ์ค‘์—๋Š” ์ถ”์–ต๋„ ์žˆ๊ณ , ์‚ฌ๋ž‘๋„ ์žˆ๊ณ , ์“ธ์“ธํ•จ๊ณผ ๋™๊ฒฝ๋„ ์žˆ์–ด์š”. ๊ทธ๋ฆฌ๊ณ  ์–ด๋จธ๋‹ˆ, ๋‹น์‹ ์€ ๋ฉ€๋ฆฌ ๋ถ๊ฐ„๋„์— ๊ณ„์‹  ๊ฑด๊ฐ€์š”? ์ €๋Š” ๋‹น์‹ ์„ ๋ถ€๋ฅด๊ณ  ์‹ถ์–ด์š”.
์ €์˜ ์ด๋ฆ„์ž๋Š” ์–ธ๋• ์œ„์— ์“ฐ๊ณ  ํ™์œผ๋กœ ๋ฎ์–ด ๋ฒ„๋ ธ์–ด์š”. ๋”ด์€ ๋ฐค์„ ์ƒˆ์›Œ ์šฐ๋Š” ๋ฒŒ๋ ˆ์ฒ˜๋Ÿผ ๋ถ€๋„๋Ÿฌ์šด ์ด๋ฆ„์„ ์Šฌํผํ•˜๋Š” ์ด์œ ๊ฐ€ ์žˆ์–ด์š”. ํ•˜์ง€๋งŒ ๊ฒจ์šธ์ด ์ง€๋‚˜๊ณ  ๋‚˜์˜ ๋ณ„์—๋„ ๋ด„์ด ์˜ค๋ฉด, ๋ฌด๋ค ์œ„์— ํŒŒ๋ž€ ์ž”๋””๊ฐ€ ํ”ผ์–ด๋‚˜๋“ฏ์ด ์ €์˜ ์ด๋ฆ„์ž ๋ฌปํžŒ ์–ธ๋• ์œ„์—์„œ๋„ ํ’€์ด ๋ฌด์„ฑํ•  ๊ฑฐ์˜ˆ์š”.
์ €๋Š” ์ด ๋ณ„๋“ค์„ ๋ณด๊ณ , ์ž์‹ ์˜ ์ •์ฒด์„ฑ์„ ํ™•์ธํ•˜๊ณ  ์‹ถ์–ด์š”. ์žƒ์–ด๋ฒ„๋ฆฐ ๊ฒƒ์„ ๋˜์ฐพ์œผ๋ ค๋Š” ๋…ธ๋ ฅ์„ ํ•˜๊ณ  ์‹ถ์–ด์š”. ์ฃฝ์Œ ์ดํ›„์—๋„ ์ž์‹ ์˜ ์กด์žฌ๋ฅผ ๋‚จ๊ฒจ๋‘๊ณ  ์‹ถ์–ด์š”."
</pre>
</details>
<details>
<summary><b>Logical Reasoning[CLICK TO EXPAND]</b></summary>
<!-- <a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
<img src="https://github.com/0110tpwls/project/blob/master/image (15).png?raw=true" width="80%"/>
</a> -->
<pre style="white-space: pre-wrap;">
User:
์„ธ์ข…๋Œ€์™•์˜ ๋งฅ๋ถํ”„๋กœ ๋˜์ง ์‚ฌ๊ฑด์— ๋Œ€ํ•ด ์„ค๋ช…ํ•ด์ค˜
Model:
์„ธ์ข…๋Œ€์™•์ด ๋งฅ๋ถํ”„๋กœ๋ฅผ ๋˜์ง„ ์‚ฌ๊ฑด์€ ์‚ฌ์‹ค๋ฌด๊ทผํ•œ ์ด์•ผ๊ธฐ์ž…๋‹ˆ๋‹ค. ์„ธ์ข…๋Œ€์™•์€ ์กฐ์„  ์ œ4๋Œ€ ๊ตญ์™•์œผ๋กœ 1418๋…„๋ถ€ํ„ฐ 1450๋…„๊นŒ์ง€ ์žฌ์œ„ํ–ˆ์Šต๋‹ˆ๋‹ค. ๊ทธ๋Š” ๊ณผํ•™, ๋ฌธํ•™, ์˜ˆ์ˆ  ๋“ฑ ๋‹ค์–‘ํ•œ ๋ถ„์•ผ์— ๊ด€์‹ฌ์„ ๊ฐ–๊ณ  ์—ฐ๊ตฌํ•˜์˜€์œผ๋ฉฐ, ํŠนํžˆ ํ•œ๊ธ€ ์ฐฝ์ œ์™€ ์ฒœ๋ฌธํ•™์— ํฐ ๊ณตํ—Œ์„ ํ–ˆ์Šต๋‹ˆ๋‹ค.
ํ•œํŽธ ๋งฅ๋ถํ”„๋กœ๋Š” ์• ํ”Œ์ด ์ƒ์‚ฐํ•˜๋Š” ๋…ธํŠธ๋ถ ์ปดํ“จํ„ฐ๋กœ, ์„ธ์ข…๋Œ€์™• ์‹œ๋Œ€์˜ ๊ธฐ์ˆ ๋กœ๋Š” ์กด์žฌํ•  ์ˆ˜ ์—†๋Š” ์ œํ’ˆ์ž…๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ ์„ธ์ข…๋Œ€์™•์ด ๋งฅ๋ถํ”„๋กœ๋ฅผ ๋˜์ง„ ์‚ฌ๊ฑด์€ ์‚ฌ์‹ค๊ณผ ๋งž์ง€ ์•Š๋Š” ์ด์•ผ๊ธฐ๊ฐ€ ๋  ๊ฒƒ์ž…๋‹ˆ๋‹ค.
</pre>
</details>
<details>
<summary><b>Coding [CLICK TO EXPAND]</b></summary>
<!-- <a href="https://huggingface.co/collections/mirlab/akallama-66338859b09221f3607fdfcd">
<img src="https://github.com/0110tpwls/project/blob/master/image (11).png?raw=true" width="80%"/>
</a> -->
<pre style="white-space: pre-wrap;">
User:
PyTorch๋ฅผ ์‚ฌ์šฉํ•ด์„œ ๊ฐ„๋‹จํ•œ CNN์„ ๊ตฌํ˜„ํ•ด๋ด.
Model:
๋„ค, ๋ฌผ๋ก ์ด์—์š”! PyTorch๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๊ฐ„๋‹จํ•œ CNN์„ ๊ตฌํ˜„ํ•˜๋Š” ์˜ˆ์ œ ์ฝ”๋“œ๋ฅผ ๋ณด์—ฌ๋“œ๋ฆด๊ฒŒ์š”.
์šฐ์„ , ํ•„์š”ํ•œ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋ฅผ ๊ฐ€์ ธ์™€์•ผ ํ•ด์š”. PyTorch์˜ `torch`์™€ `nn` ๋ชจ๋“ˆ์ด ํ•„์š”ํ•ด์š”.
```python
import torch\nimport torch.nn as nn
```
๊ทธ ๋‹ค์Œ, ๊ฐ„๋‹จํ•œ CNN ๋ชจ๋ธ์„ ์ •์˜ํ•  ๊ฑฐ์˜ˆ์š”. ์ด ๊ฒฝ์šฐ์—๋Š” 2๊ฐœ์˜ ์ปจ๋ณผ๋ฃจ์…˜ ๋ ˆ์ด์–ด์™€ 3๊ฐœ์˜ ํ’€๋ง ๋ ˆ์ด์–ด๋กœ ๊ตฌ์„ฑ๋œ ๋ชจ๋ธ์ด์—์š”.
```python
class SimpleCNN(nn.Module):
def __init__(self):
super(SimpleCNN, self).__init__()
self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
self.pool = nn.MaxPool2d(2, 2)
self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
def forward(self, x):
x = self.pool(torch.relu(self.conv1(x)))
x = self.pool(torch.relu(self.conv2(x)))
return x
```
์ด์ œ ๋ชจ๋ธ์„ ์ดˆ๊ธฐํ™”ํ•˜๊ณ , ์ž…๋ ฅ ๋ฐ์ดํ„ฐ๋ฅผ ์ค€๋น„ํ•ด์•ผ ํ•ด์š”. ์—ฌ๊ธฐ์„œ๋Š” ์ž„์˜์˜ 28x28 ํฌ๊ธฐ์˜ ์ด๋ฏธ์ง€ ๋ฐ์ดํ„ฐ๋ฅผ ์ƒ์„ฑํ–ˆ์–ด์š”.
```python
model = SimpleCNN()
input_data = torch.randn(1, 1, 28, 28)
```
๋งˆ์ง€๋ง‰์œผ๋กœ, ๋ชจ๋ธ์— ์ž…๋ ฅ ๋ฐ์ดํ„ฐ๋ฅผ ๋„ฃ์–ด ์ถœ๋ ฅ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ์–ด์š”.
```python
output = model(input_data)
print(output.shape)
```
์ด๋ ‡๊ฒŒ ํ•˜๋ฉด ๊ฐ„๋‹จํ•œ CNN ๋ชจ๋ธ์„ PyTorch๋ฅผ ์‚ฌ์šฉํ•ด ๊ตฌํ˜„ํ•˜๊ณ , ์ž…๋ ฅ ๋ฐ์ดํ„ฐ๋ฅผ ํ†ตํ•ด ์ถœ๋ ฅ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ์–ด์š”. ๋ฌผ๋ก  ์‹ค์ œ๋กœ๋Š” ๋” ๋งŽ์€ ๋ ˆ์ด์–ด๋‚˜ ํŒŒ๋ผ๋ฏธํ„ฐ ํŠœ๋‹ ๋“ฑ์ด ํ•„์š”ํ•˜์ง€๋งŒ, ์ด ์ฝ”๋“œ๋Š” ๊ธฐ๋ณธ์ ์ธ CNN ๊ตฌ์กฐ๋ฅผ ๋ณด์—ฌ์ฃผ๋Š” ์˜ˆ์ œ์ผ ๋ฟ์ด์—์š”. ๋” ๊ถ๊ธˆํ•œ ์ ์ด ์žˆ์œผ๋ฉด ์–ธ์ œ๋“ ์ง€ ๋ฌผ์–ด๋ด ์ฃผ์„ธ์š”!
</pre>
</details>
You can find more examples at [our project page](https://yonsei-mir.github.io/AkaLLaMA-page)
## Warning
Although AKALlama-70B has significant potential, its responses can sometimes be inaccurate, biased, or misaligned, presenting risks if used without additional testing and refinement. Furthermore, the quality of the model's output is greatly influenced by the system prompt and decoding strategy. Changes in these areas could result in less precise outputs. Therefore, we strongly recommend handling our model with considerable caution.
## Citation
If you find our model useful in your work, please cite the model as follows:
@misc{akallama,
author = {Chung, Jiwan and Jeon, Jaehyun and Kim, Saejin and Lim, Seungwon and Oh, Giyeong and Son, Yejin and Yu, Youngjae},
title = {AkaLlama: Yonsei University Large Language Model Project},
year = {2024},
publisher = {Hugging Face},
journal = {Hugging Face repository},
howpublished = {\url{https://huggingface.co/mirlab/AkaLlama-llama3-70b-v0.1}},
}
## Contact
We look forward for your feedback and welcome collaboration on this exciting project!
### Contributors
- [YoungJaeYu](https://yj-yu.github.io/home/)
- [Yonsei MIRLab](https://mirlab.yonsei.ac.kr/)
## Special Thanks
- Data Center of the Department of Artificial Intelligence at Yonsei University for the computation resources
## Acknowledgement
- Title image generated by DALLยทE 3