# Report: experiment details and classification results on the test data
midm, which had previously been fine-tuned to understand food-order sentences, was trained on 3,000 samples from the NSMC (movie review dataset) train split. Training on the first 2,000 samples did not reach the expected accuracy, so 1,000 more samples were added, which raised accuracy by about 2%. The model was then evaluated on 1,000 samples from the test split and reached an accuracy of 89.30%, which can be seen in the screenshot below. This experiment shows how much accuracy an LLM that was trained on a different dataset for a different task can reach when it is fine-tuned again on a new dataset for a new task.
[Screenshot: test accuracy 89.30%]
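For context, the sketch below shows one way the NSMC subsets and the accuracy figure above could be reproduced with the `datasets` library. The dataset id and the `classify_review` helper are assumptions, not the exact script that was used.

```python
# Minimal evaluation sketch (not the original script). The dataset id and the
# classify_review helper are placeholders for whatever inference code was used.
from datasets import load_dataset

nsmc = load_dataset("nsmc")                        # Naver movie review dataset (assumed id)
train_subset = nsmc["train"].select(range(3000))   # 2,000 initial + 1,000 additional samples
test_subset = nsmc["test"].select(range(1000))     # 1,000-sample evaluation set

def classify_review(text: str) -> int:
    """Placeholder: run the fine-tuned midm model and map its output to 0/1."""
    raise NotImplementedError

correct = sum(classify_review(ex["document"]) == ex["label"] for ex in test_subset)
print(f"accuracy: {correct / len(test_subset) * 100:.2f} %")   # reported: 89.30 %
```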
# lora-midm-7b-food-order-understanding
This model is a fine-tuned version of [KT-AI/midm-bitext-S-7B-inst-v1](https://huggingface.co/KT-AI/midm-bitext-S-7B-inst-v1) on the NSMC movie review dataset.
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

Fine-tuned on 3,000 samples from the NSMC train split and evaluated on 1,000 samples from the NSMC test split, as described in the report above.
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 2
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- training_steps: 300
- mixed_precision_training: Native AMP
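
As an illustration only, the hyperparameters above map onto Hugging Face `TrainingArguments` roughly as follows; the `output_dir` value is an assumption, and the LoRA/PEFT wrapping of the base model is omitted.

```python
# Illustrative mapping of the hyperparameters listed above onto TrainingArguments.
# output_dir is an assumption; the LoRA adapter configuration is not shown here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="lora-midm-7b-food-order-understanding",  # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=2,   # total train batch size: 1 * 2 = 2
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    max_steps=300,
    fp16=True,                       # mixed precision via native AMP
    optim="adamw_torch",             # betas=(0.9, 0.999) and epsilon=1e-08 are the defaults
)
```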
### Training results

### Framework versions
- Transformers 4.35.2
- Pytorch 2.1.0+cu118
- Datasets 2.15.0
- Tokenizers 0.15.0
## Model tree for kiyeon1221/lora-midm-7b-food-order-understanding

Base model: [KT-AI/midm-bitext-S-7B-inst-v1](https://huggingface.co/KT-AI/midm-bitext-S-7B-inst-v1)
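
A minimal loading sketch, assuming the LoRA adapter is published on the Hub under kiyeon1221/lora-midm-7b-food-order-understanding; dtype and device choices are assumptions.

```python
# Load the base midm model and attach the LoRA adapter with peft.
# The adapter repo id is taken from the model tree above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "KT-AI/midm-bitext-S-7B-inst-v1"
adapter_id = "kiyeon1221/lora-midm-7b-food-order-understanding"

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, trust_remote_code=True
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA weights
model.eval()
```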